thr3ads.net - R help - [R] Removing rows if certain elements are found in character string [Jul 2012]

Claudia Penaloza

2012-Jul-02 22:48 UTC

[R] Removing rows if certain elements are found in character string

I would like to remove rows from the following data frame (df) if there are
only two specific elements found in the df$ch character string (I want to
remove rows with only "0" & "D" or "0" & "d"). Alternatively, I would like
to remove rows if the first non-zero element is "D" or "d".


                                                 ch     count
1  0000000000D0000000000000000000000000000000000000 0.007368;
2  0000000000d0000000000000000000000000000000000000 0.002456;
3  000000000T00000000000000000000000000000000000000 0.007368;
4  000000000TD0000000000000000000000000000000000000 0.007368;
5  000000000T00000000000000000000000000000000000000 0.002456;
6  000000000Td0000000000000000000000000000000000000 0.002456;
7  00000000T000000000000000000000000000000000000000 0.007368;
8  00000000T0D0000000000000000000000000000000000000 0.007368;
9  00000000T000000000000000000000000000000000000000 0.002456;
10 00000000T0d0000000000000000000000000000000000000 0.002456;


I tried the following but it doesn't work if there is more than one
character per string:
>df <- df[!df$ch %in% c("0","D"),]
>df <- df[!df$ch %in% c("0","d"),]
Any help greatly appreciated,
Claudia
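
Editorial note: the attempt above likely fails because `%in%` compares each whole string against the set, so it can only match strings that are exactly "0" or exactly "D". A minimal sketch of the difference, using shortened toy strings that are not part of the original data:

```r
# Toy data (hypothetical, shortened strings for readability)
ch <- c("00D0", "00d0", "0T00", "0TD0", "D", "0")

# %in% tests whole-string equality, so only the one-character
# strings "D" and "0" are flagged.
whole_match <- ch %in% c("0", "D")

# grepl() with a character class looks at the characters inside each
# string: TRUE when the string consists solely of 0 and D, or solely
# of 0 and d.
only_0_and_D <- grepl("^[0D]*$", ch) | grepl("^[0d]*$", ch)

print(data.frame(ch, whole_match, only_0_and_D))
```

Negating `only_0_and_D` and using it as a row index would then drop the unwanted rows.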


Rui Barradas

2012-Jul-02 23:24 UTC


Hello,

Try regular expressions instead.
In this data.frame, I've changed row 4 so that it has 'D' as its
first non-zero character.

dd <- read.table(text="
ch     count
1  0000000000D0000000000000000000000000000000000000 0.007368
2  0000000000d0000000000000000000000000000000000000 0.002456
3  000000000T00000000000000000000000000000000000000 0.007368
4  000000000DT0000000000000000000000000000000000000 0.007368
5  000000000T00000000000000000000000000000000000000 0.002456
6  000000000Td0000000000000000000000000000000000000 0.002456
7  00000000T000000000000000000000000000000000000000 0.007368
8  00000000T0D0000000000000000000000000000000000000 0.007368
9  00000000T000000000000000000000000000000000000000 0.002456
10 00000000T0d0000000000000000000000000000000000000 0.002456
", header=TRUE)
dd

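# i1: strings made up only of the characters 0, D and d
i1 <- grepl("^([0D]|[0d])*$", dd$ch)
# i2: strings whose first non-zero character is D or d
i2 <- grepl("^0*[Dd]", dd$ch)

dd[!i1, ]          # drop rows matching the first condition
dd[!i2, ]          # drop rows matching the second condition
dd[!(i1 | i2), ]   # drop rows matching either condition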


Hope this helps,

Rui Barradas
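
Editorial note: because the group in the first pattern is repeated, it also matches strings that mix "D" and "d", as well as all-zero strings. If that matters, a stricter pattern is one option. The sketch below uses hypothetical shortened strings, and `i1_strict` is an assumption about the intent, not something from the thread:

```r
# Toy strings (hypothetical): include a D/d mix and an all-zero string
ch <- c("00D0", "00d0", "0Dd0", "0000", "0DT0", "0TD0")

# Pattern from the post: any mix of 0, D and d, including all zeros
i1 <- grepl("^([0D]|[0d])*$", ch)

# Stricter sketch: at least one D and otherwise only 0 and D,
# or at least one d and otherwise only 0 and d
i1_strict <- grepl("^(0*D[0D]*|0*d[0d]*)$", ch)

# First non-zero character is D or d
i2 <- grepl("^0*[Dd]", ch)

print(data.frame(ch, i1, i1_strict, i2))
```

On the ten rows of the original data the two versions of the first condition select the same rows; they differ only on inputs like "0Dd0" or "0000".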


arun

2012-Jul-02 23:29 UTC


Hi,

Try this:

dat1<-read.table(text="
1  0000000000D0000000000000000000000000000000000000 0.007368;
2  0000000000d0000000000000000000000000000000000000 0.002456;
3  000000000T00000000000000000000000000000000000000 0.007368;
4  000000000TD0000000000000000000000000000000000000 0.007368;
5  000000000T00000000000000000000000000000000000000 0.002456;
6  000000000Td0000000000000000000000000000000000000 0.002456;
7  00000000T000000000000000000000000000000000000000 0.007368;
8  00000000T0D0000000000000000000000000000000000000 0.007368;
9  00000000T000000000000000000000000000000000000000 0.002456;
10 00000000T0d0000000000000000000000000000000000000 0.002456;
",sep="",header=FALSE)

colnames(dat1)<-c("num","Ch", "count")

#I guess this is what you wanted.

dat1[grepl("TD|Td|T",dat1$Ch),]
   num                                               Ch     count
3    3 000000000T00000000000000000000000000000000000000 0.007368;
4    4 000000000TD0000000000000000000000000000000000000 0.007368;
5    5 000000000T00000000000000000000000000000000000000 0.002456;
6    6 000000000Td0000000000000000000000000000000000000 0.002456;
7    7 00000000T000000000000000000000000000000000000000 0.007368;
8    8 00000000T0D0000000000000000000000000000000000000 0.007368;
9    9 00000000T000000000000000000000000000000000000000 0.002456;
10  10 00000000T0d0000000000000000000000000000000000000 0.002456;

#If you want to remove D or d rows
dat1[!grepl("D|d",dat1$Ch),]
  num                                               Ch     count
3   3 000000000T00000000000000000000000000000000000000 0.007368;
5   5 000000000T00000000000000000000000000000000000000 0.002456;
7   7 00000000T000000000000000000000000000000000000000 0.007368;
9   9 00000000T000000000000000000000000000000000000000 0.002456;

A.K.
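
Editorial note: two details of the patterns above may be worth checking on your own data. The alternation "TD|Td|T" selects the same rows as plain "T", and `!grepl("D|d", ...)` drops every row containing a "D" or "d" anywhere, including rows where "T" comes first. A small sketch with hypothetical shortened strings:

```r
# Toy strings (hypothetical, shortened)
ch <- c("00D0", "00d0", "0T00", "0TD0", "0Td0")

# "TD|Td|T" and "T" flag exactly the same strings
same_as_T <- identical(grepl("TD|Td|T", ch), grepl("T", ch))

# Keep rows with no D or d anywhere in the string
keep_no_d_anywhere <- !grepl("D|d", ch)

# Keep rows unless the first non-zero character is D or d
keep_first_not_d <- !grepl("^0*[Dd]", ch)

print(data.frame(ch, keep_no_d_anywhere, keep_first_not_d))
```

Which of the two filters is appropriate depends on whether rows such as "0TD0" should survive.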



David Winsemius

2012-Jul-03 02:58 UTC


You seem to be missing test cases for the second set of conditions, but
this works for the first set (and might for the second):

 > dat[ grepl("[^0dD]", dat$ch) & !grepl("^0+d|^0+D", dat$ch) , ]
                                                 ch    count
3  000000000T00000000000000000000000000000000000000 0.007368
4  000000000TD0000000000000000000000000000000000000 0.007368
5  000000000T00000000000000000000000000000000000000 0.002456
6  000000000Td0000000000000000000000000000000000000 0.002456
7  00000000T000000000000000000000000000000000000000 0.007368
8  00000000T0D0000000000000000000000000000000000000 0.007368
9  00000000T000000000000000000000000000000000000000 0.002456
10 00000000T0d0000000000000000000000000000000000000 0.002456

-- 

David Winsemius, MD
West Hartford, CT
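
Editorial note: the combined filter can be exercised on a few strings that do cover the second condition. This is a sketch with hypothetical shortened strings; `has_other` and `starts_with_d` are illustrative names, and `^0*` is used instead of `^0+` so that a string with no leading zeros is also handled:

```r
# Toy data (hypothetical): the last two rows have D or d as the
# first non-zero character, followed by a T
dat <- data.frame(
  ch = c("00D0", "00d0", "0T00", "0TD0", "0DT0", "0dT0"),
  stringsAsFactors = FALSE
)

# TRUE when the string contains something other than 0, d or D
has_other <- grepl("[^0dD]", dat$ch)

# TRUE when the first non-zero character is d or D
starts_with_d <- grepl("^0*[dD]", dat$ch)

kept <- dat[has_other & !starts_with_d, , drop = FALSE]
print(kept)
```

Only "0T00" and "0TD0" remain: the first two rows fail the first test, and the last two fail the second.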

MacQueen, Don

2012-Jul-05 18:13 UTC


Perhaps I've missed something, but if it's really true that the goal is to
remove rows if the first non-zero element is "D" or "d", then how about
this:

tmp <- gsub('0','',df$ch)
first <- substr(tmp,1,1)
subset(df, tolower(first) != 'd')

and of course it could be rolled up into a single expression, but I wrote
it in several steps to make it easy to follow. No need to wrap one's brain
around regular expressions (which is hard for me!)

-Don
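
Editorial note: the single-expression form mentioned above might look like the sketch below. The data frame is a hypothetical shortened stand-in for the original, and plain indexing is used instead of `subset()`:

```r
# Toy data (hypothetical, shortened strings; counts reused from the thread)
df <- data.frame(
  ch = c("00D0", "00d0", "0T00", "0TD0", "0DT0", "0000"),
  count = c(0.007368, 0.002456, 0.007368, 0.007368, 0.002456, 0.002456),
  stringsAsFactors = FALSE
)

# Strip the zeros, take the first remaining character, lower-case it,
# and keep the rows where it is not "d"
kept <- df[tolower(substr(gsub("0", "", df$ch), 1, 1)) != "d", ]
print(kept)
```

An all-zero string yields an empty first character, so such a row is kept.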

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





