thr3ads.net - R help - [R] How to get multiple partial matches? [Sep 2006]

Sarah Tucker

2006-Sep-06 23:39 UTC

[R] How to get multiple partial matches?

Hi, 

I'm very new to R, and am not at all a software
programmer of any sort.  I appreciate any help you
may have.  I have figured out how to get my data into
a data frame and order it alphabetically according to a
particular column.  Now, I would like to separate out
certain rows based on partial character matches.  Here
is an (extremely) abbreviated example of my data set:

        Probe Ch1 Median - B Ch1 Mean - B
72     5S_F_1            501          567
7700   5S_F_2            338          611
7517   5S_F_3            412          467
10687  5S_F_4            380          428
4870   5S_F_5            315          368
6035   5S_F_6            300          359
3826   5S_F_7            350          386
8754   5S_F_8            450          473
6399   5S_F_9            439          494
749   5S_F_10            334          384

I would like to be able to select out all rows with,
for example, "5S_F_" in the Probe column (there are
non-"5S_F_" containing values in the real, larger data
set).

I think pmatch does this for instances where there is
only 1 match, but I would like to recover all the
matches.  I have tried to use charmatch, match,
pmatch, agrep and grep for this purpose, but with no
luck.

When I grep for "5S_F_" with value = T, I get
"character(0)"
Adding wildcards (either "*" or ".") does not change
this outcome.

I thought maybe the underscores were messing it up, so
I tried to grep "5S*" with value = T, and I get a long
list of numbers back

 [1] "55"   "95"   "56"   "57"   "58"   "59"   "65"   "75"   "85"   "105"
[11] "115"  "125"  "135"  "5"    "5"    "5"    "5"    "5"    "5"    "5"

These numbers make no sense to me.  They don't seem to
correlate with where the "5S"s occur in the
data frame, and they don't look like any values in the
Probe column (there are no numeric values in the Probe
column, just strings of character and digit combinations).

How can I select out all the rows with the same
partial character match?

jim holtman

2006-Sep-07 00:01 UTC

Try using 'grep' and regular expressions:
> x <- "72     5S_F_1            501          567
+ 7700   5S_F_2            338          611
+ 7517   5S_F_3            412          467
+ 10687  5S_F_4            380          428
+ 4870   5S_F_5            315          368
+ 6035   5S_F_6            300          359
+ 3826   5S_F_7            350          386
+ 8754   5S_F_8            450          473
+ 6399   5S_F_9            439          494
+ 749   5S_F_10            334          384
+ "
> df <- read.table(textConnection(x))
> df
      V1      V2  V3  V4
1     72  5S_F_1 501 567
2   7700  5S_F_2 338 611
3   7517  5S_F_3 412 467
4  10687  5S_F_4 380 428
5   4870  5S_F_5 315 368
6   6035  5S_F_6 300 359
7   3826  5S_F_7 350 386
8   8754  5S_F_8 450 473
9   6399  5S_F_9 439 494
10   749 5S_F_10 334 384
> # select only ones with '5S_F_1'
> df[grep('5S_F_1', as.character(df$V2)),]
    V1      V2  V3  V4
1   72  5S_F_1 501 567
10 749 5S_F_10 334 384
>
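
To pull out every row whose probe name starts with "5S_F_" (not just
the ones containing "5S_F_1"), the same approach should work with a
shorter pattern.  This is only a minimal sketch: it assumes the probe
names sit in column V2 as in the session above, and the "18S_R_1" row
and the names 'probes' and 'same' are made up for illustration.

```r
# Rebuild a small version of the data; the "18S_R_1" row is a made-up
# non-matching probe so the filter has something to drop.
x <- "72     5S_F_1   501 567
7700   5S_F_2   338 611
749    5S_F_10  334 384
9001   18S_R_1  120 150"
df <- read.table(text = x, stringsAsFactors = FALSE)

# grep() returns the indices of the matching elements; "^" anchors the
# pattern at the start of the string.
probes <- df[grep("^5S_F_", df$V2), ]

# grepl() returns a logical vector instead, which also works as a row index.
same <- df[grepl("^5S_F_", df$V2), ]

print(probes)
```

Note that grep() is given the single column (df$V2) rather than the
whole data frame; the pattern is matched against each element of that
column, and the resulting indices select the rows.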

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
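
On the puzzling output from "5S*" in the original post: grep treats
its pattern as a regular expression, not a shell wildcard, so "5S*"
means "a 5 followed by zero or more S" and matches any string that
contains a 5.  The sketch below illustrates this on a made-up vector
'v' (not the poster's data); fixed = TRUE and glob2rx() are two ways
to get the literal or wildcard behaviour instead.

```r
# Made-up mix of probe names and numeric-looking strings.
v <- c("5S_F_1", "55", "105", "18S_R_1", "42")

# As a regular expression, "5S*" means "a 5 followed by zero or more S",
# so every string containing a 5 matches.
regex_hits <- grep("5S*", v, value = TRUE)

# fixed = TRUE matches the literal text "5S" with no special characters.
literal_hits <- grep("5S", v, value = TRUE, fixed = TRUE)

# glob2rx() converts a shell-style wildcard into an equivalent regex.
glob_hits <- grep(glob2rx("5S*"), v, value = TRUE)

print(regex_hits)
print(literal_hits)
print(glob_hits)
```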
