Praveen Surendran
2009-Jul-08 13:04 UTC
[R] R regular expression to extract words with the query string.
Hi,

Is there a way in R to extract the word that matches an expression, where the expression is a substring of a word in the parent string?

Let's say I have

    i <- "transcript:ENST0000112334 pid:ENSP000012345"

What I need is the string "pid:ENSP000012345" from i, using the query "ENSP".

Appreciate your comments.

Praveen Surendran
School of Medicine and Medical Sciences
University College Dublin
Belfield, Dublin 4
Ireland.
Henrique Dallazuanna
2009-Jul-08 13:18 UTC
[R] R regular expression to extract words with the query string.
Try this:

    sapply(strsplit(i, ' '), grep, pattern = 'ENSP', value = TRUE)

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
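As a rough illustration of why this one-liner works, the sketch below runs it step by step on the string from the original question. The variable names `tokens` and `hit` are only illustrative and do not come from the thread.

```r
# The example string from the original question.
i <- "transcript:ENST0000112334 pid:ENSP000012345"

# strsplit() returns a list with one character vector per input string;
# here that is a single vector holding the two space-separated words.
tokens <- strsplit(i, " ")

# sapply() calls grep() on each vector; value = TRUE makes grep() return
# the matching words themselves instead of their positions.
hit <- sapply(tokens, grep, pattern = "ENSP", value = TRUE)
print(hit)
```

If the query should be treated as a literal string rather than a regular expression, passing `fixed = TRUE` through to `grep()` is one option.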
Praveen Surendran
2009-Jul-08 13:27 UTC
[R] R regular expression to extract words with the query string.
Thanks Henrique.

This is indeed short and quite simple compared to what I was using, which goes like...

    unlist(strsplit(i, split = " "))[grep("ENSP", unlist(strsplit(i, split = " ")))]

:-)

Cheers,
Praveen.
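For comparison, the longer expression above splits the string twice. A minimal sketch that splits once and wraps the idea in a function might look like the following; the name `extract_words` and its arguments are hypothetical, and it assumes the query is a literal string rather than a regular expression.

```r
# Hypothetical helper: the name and arguments are illustrative only.
extract_words <- function(x, query, sep = " ") {
  # Split the input once and keep the words in a plain character vector.
  words <- unlist(strsplit(x, split = sep, fixed = TRUE))
  # Keep only the words that contain the query as a literal substring.
  words[grepl(query, words, fixed = TRUE)]
}

i <- "transcript:ENST0000112334 pid:ENSP000012345"
result <- extract_words(i, "ENSP")
print(result)
```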
Jorge Ivan Velez
2009-Jul-08 14:05 UTC
[R] R regular expression to extract words with the query string.
Dear Praveen,

Try also:

    strsplit(i, ' ')[[1]][2]
    # [1] "pid:ENSP000012345"

HTH,
Jorge
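One thing worth keeping in mind is that this suggestion selects the second word by position and does not use the query "ENSP" at all. The sketch below, which uses a made-up reordered string `j`, shows how the result changes when the fields appear in a different order.

```r
i <- "transcript:ENST0000112334 pid:ENSP000012345"
# Made-up variant of the input with the two fields swapped (assumption for illustration).
j <- "pid:ENSP000012345 transcript:ENST0000112334"

# Positional indexing always returns whatever sits in second place.
second_i <- strsplit(i, " ")[[1]][2]
second_j <- strsplit(j, " ")[[1]][2]

print(second_i)
print(second_j)
```

So the positional form is only suitable when the field order is known to be fixed.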
Gabor Grothendieck
2009-Jul-08 14:08 UTC
[R] R regular expression to extract words with the query string.
Try this:

    library(gsubfn)
    i <- "transcript:ENST0000112334 pid:ENSP000012345"
    strapply(i, paste("\\w*", "ENSP", "\\w*", sep = ""), c, simplify = unlist)

This says to match any number (possibly zero) of word characters followed by ENSP followed by more word characters. c just returns the match without further processing, and unlist unlists the result, giving a character vector (which otherwise would be a list).

See http://gsubfn.googlecode.com for more info.
Gabor Grothendieck
2009-Jul-09 03:14 UTC
[R] R regular expression to extract words with the query string.
The solution in my previous message does not include the pid: prefix, because ":" is not a word character and so \\w* stops there. This modification, which matches any run of non-space characters instead, works:

    > strapply(i, paste("[^ ]*", "ENSP", "[^ ]*", sep = ""), c, simplify = unlist)
    [1] "pid:ENSP000012345"
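For readers without gsubfn installed, the difference between the two patterns can be sketched in base R with `gregexpr()` and `regmatches()`. This is an alternative technique, not the `strapply()` call from the thread, and the variable names are illustrative.

```r
i <- "transcript:ENST0000112334 pid:ENSP000012345"

# \\w matches letters, digits and underscore, so the match cannot extend
# across the ":" in front of ENSP.
word_only <- regmatches(i, gregexpr("\\w*ENSP\\w*", i))[[1]]

# [^ ] matches any character except a space, so the whole
# space-delimited word is returned, including the "pid:" prefix.
non_space <- regmatches(i, gregexpr("[^ ]*ENSP[^ ]*", i))[[1]]

print(word_only)
print(non_space)
```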