Matching regular expressions

Dear useRs!

I have the following problem. I would like to find objects in my environment whose names contain two given strings. For example, I might want to find objects that have both "MY" and "TARGET" in their names. I do not care about the order of these two substrings in the name, nor about what is in front of, behind or between them; the only important thing is that both words are present. I apologize if this is covered in the help pages (then I did not understand it after reading them several times) or if it was answered previously (then I did not find it).

Since "ls" with the argument "pattern" essentially uses "grep" (if I am not mistaken), I have an example for "grep":

  text <- c("somethigMYsomthing elseTARGET another thing",
            "MY somthing TARGET another thing",
            "somethig somthing elseTARGETMY another thing",
            "somethigMTARGETY another thing")

  grep(pattern="MY&TARGET", x=text)
  # I would like to get 1 2 3 and not 4

  # or actually their names, using
  text[grep(pattern="MY&TARGET", x=text)]
  # of course, the "pattern" in this case is wrong

I know I can do

  text[grep(pattern="MY", x=text)][grep(pattern="TARGET", x=text[grep(pattern="MY", x=text)])]

However, I hope there exists a more elegant way.

Thanks in advance for any suggestions!

Best,
Ales Ziberna
"Ales Ziberna" <aleszib at gmail.com> writes:> Matching regular expressions > > Dear useRs! > > I have the following problem. I would like to find objects in my environment > that have two strings in it. For example, I might want to find objects that > have in their names "MY" and "TARGET". I do not care about the ordering of > these two substrings in the name, neither what is in front, behind or > between them, the only thing important is that both words are present. I > apologize if this is covered in help pages (then I did not understand it by > reading them several times) or it was answered previously (then I did not > find it). > > Since "ls" with argument pattern essentially uses "grep" (if I am not > mistaken), I have an example for "grep" > > text<-c("somethigMYsomthing elseTARGET another thing","MY somthing TARGET > another thing","somethig somthing elseTARGETMY another > thing","somethigMTARGETY another thing") > > grep(pattern="MY&TARGET", x=text) > #I would like to get 1 2 3 and not 4 or actually their names using > text[grep(pattern="MY&TARGET", x=text)] > #of course, the "pattern" in this case is wrong > > I know I can do > > text[grep(pattern="MY", x=text)][grep(pattern="TARGET", > x=text[grep(pattern="MY",x=text)])] > > However I hope there exists a more elegant way.Perhaps this? text[intersect(grep("MY",text), grep("TARGET",text))] -- O__ ---- Peter Dalgaard ??ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
"Ales Ziberna" <aleszib at gmail.com> writes:> Dear useRs! > > I have the following problem. I would like to find objects in > my environment > that have two strings in it. For example, I might want to > find objects that > have in their names "MY" and "TARGET". I do not care about > the ordering of > these two substrings in the name, neither what is in front, behind or > between them, the only thing important is that both words are > present. I > apologize if this is covered in help pages (then I did not > understand it by > reading them several times) or it was answered previously > (then I did not > find it). > > Since "ls" with argument pattern essentially uses "grep" (if I am not > mistaken), I have an example for "grep" > > text<-c("somethigMYsomthing elseTARGET another thing","MY > somthing TARGET > another thing","somethig somthing elseTARGETMY another > thing","somethigMTARGETY another thing") > > grep(pattern="MY&TARGET", x=text) > #I would like to get 1 2 3 and not 4 or actually their names using > text[grep(pattern="MY&TARGET", x=text)] > #of course, the "pattern" in this case is wrong > > I know I can do > > text[grep(pattern="MY", x=text)][grep(pattern="TARGET", > x=text[grep(pattern="MY",x=text)])] > > However I hope there exists a more elegant way. > > Thanks in advance for any suggestions! > > Best, > Ales ZibernaHow about: text[grep("(MY|TARGET)", text)] That works on my Redhat box, R version 2.2.0. --Todd -- Why does clip mean both cut apart and fasten together?
I guess I have not been clear enough. I want both words to be present in each result. So if we have:

  text <- c("somethigMYsomthing elseTARGET another thing",
            "MY somthing TARGET another thing",
            "somethig somthing elseTARGETMY another thing",
            "somethigMTARGETY another thing",
            "somthingMY somthing else")

the last element should not be returned.

The best suggestion was given by Gabor Grothendieck:

  grep("MY.*TARGET|TARGET.*MY", text)

while the one by Peter Dalgaard also does the trick:

  text[intersect(grep("MY",text), grep("TARGET",text))]

I was just surprised that "or" (|) works and "and" (&) does not.

Thanks to all!

Best,
Ales Ziberna

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Taylor, Z Todd
Sent: Wednesday, January 11, 2006 7:50 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] Regular expressions

"Ales Ziberna" <aleszib at gmail.com> writes:

> [...]
> However I hope there exists a more elegant way.

How about:

  text[grep("(MY|TARGET)", text)]

That works on my Redhat box, R version 2.2.0.

--Todd

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
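On the "or" versus "and" point: regular expression syntax has an alternation operator ("|") but no conjunction operator, so "&" in a pattern is simply a literal ampersand character. The sketch below compares the pattern quoted above with one possible alternative and applies the pattern to object names through ls(). The lookahead variant is an assumption on my part rather than something proposed in the thread (it relies on grep(..., perl = TRUE)), and the environment `e` with its object names is made up purely for illustration.

```r
text <- c("somethigMYsomthing elseTARGET another thing",
          "MY somthing TARGET another thing",
          "somethig somthing elseTARGETMY another thing",
          "somethigMTARGETY another thing",
          "somthingMY somthing else")

# Pattern from the thread: spell out both possible orderings with "|".
both_orders <- grep("MY.*TARGET|TARGET.*MY", text)

# Assumed alternative: two lookaheads anchored at the start of the string.
# Each (?=...) must succeed, which acts like a logical "and".
lookahead <- grep("^(?=.*MY)(?=.*TARGET)", text, perl = TRUE)

# The original goal: object names containing both words.
# 'e' and the names below are hypothetical, created only for this example.
e <- new.env()
for (nm in c("MY_TARGET_1", "TARGET_of_MY", "MY_only", "MTARGETY")) {
  assign(nm, 0, envir = e)
}
found <- ls(envir = e, pattern = "MY.*TARGET|TARGET.*MY")

print(both_orders)
print(lookahead)
print(found)
```

Both patterns select elements 1, 2 and 3 of this vector, and ls() returns only the two names that contain both words. The two-orderings pattern needs one alternative per ordering, so it grows quickly with more than two words, whereas the lookahead form just gains one more (?=...) group per word.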