Hi all,

Let Names be a character vector. For example,

> Names
[1] "g 604 be-0 -p1 (602 matches)"   "g 606 Phli-0 -p2 (517 matches)"
[3] "g 608 alu-0 (659 matches)"

I am trying to use the gsub or grep functions for two problems:

1. First, I would like to delete the parenthesized part of each string
   (parentheses included):
[1] "g 604 be-0 -p1"   "g 606 Phli-0 -p2"
[3] "g 608 alu-0"

2. Second, I would like to extract the characters between the parentheses:
[1] "602 matches" "517 matches"
[3] "659 matches"

Any ideas?

Best regards,
Olivier

-- 
-------------------------------------------------------------
Martin Olivier
INRA - Unité protéomique          LIRMM - IFA/MAB
2, Place Viala                    161, rue Ada
34060 Montpellier Cédex 1         34392 Montpellier Cédex 5

Tel : 04 99 61 27 01              Tel : 04 67 41 86 71
martinol at ensam.inra.fr         martin at lirmm.fr
Dear Olivier,

I believe that the following will give you what you want:

At 04:30 PM 10/13/2003 +0200, Martin Olivier wrote:
>1. First, I would like to delete all the characters between parentheses.

    gsub(" *$", "", gsub("\\(.*\\)$", "", Names))  # also deletes trailing blanks

>2. And, I would like to extract the characters between parentheses

    posn <- regexpr("\\(.*\\)$", Names)
    substring(Names, first=posn+1, last=posn+attr(posn,"match.length")-2)

I hope that this helps,
 John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
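For anyone who wants to run the regexpr approach above end to end, here is a minimal, self-contained sketch. The vector is retyped from the original post; the names `stripped` and `inside` are introduced here for illustration only.

```r
# Sample data retyped from the original post
Names <- c("g 604 be-0 -p1 (602 matches)",
           "g 606 Phli-0 -p2 (517 matches)",
           "g 608 alu-0 (659 matches)")

# Problem 1: remove the trailing "(...)" group, then any blanks left at the end
stripped <- gsub(" *$", "", gsub("\\(.*\\)$", "", Names))

# Problem 2: regexpr() returns the start position of each match and, in the
# "match.length" attribute, its length; skipping one character at each end
# leaves out the parentheses themselves
posn   <- regexpr("\\(.*\\)$", Names)
inside <- substring(Names,
                    first = posn + 1,
                    last  = posn + attr(posn, "match.length") - 2)

print(stripped)  # "g 604 be-0 -p1"   "g 606 Phli-0 -p2" "g 608 alu-0"
print(inside)    # "602 matches" "517 matches" "659 matches"
```

regexpr() reports -1 for a string that has no trailing parenthesized group, so it is probably worth checking `posn > 0` before trusting the extracted text on messier data.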
Well, this works for the first one:

> sub(" \\([A-Za-z0-9_ ]*\\)", "", Names)

and from there the second one is fairly obvious, I hope.

QUESTION: having recently been using Source Edit, I wanted to write [\\w]* instead of [A-Za-z0-9_ ]*, but that doesn't seem to work in R. ?grep points to ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/ but I can't access that (server/gateway restriction). So, could anyone tell me exactly what is allowed in R regular expressions? A URL to the POSIX standards would be useful too.

In fact, it would be even more useful if R's particular choice of RE syntax, together with R's multiple backslashes, were documented somewhere in the R help itself. Yes, I will write it if someone gives me the info or points me in the right direction.

> -----Original Message-----
> From: Martin Olivier [mailto:martinol at ensam.inra.fr]
> Sent: 13 October 2003 15:31
> To: r-help
> Subject: [R] help with gsub and grep functions
>
> > Names
> [1] "g 604 be-0 -p1 (602 matches)"   "g 606 Phli-0 -p2 (517 matches)"
> [3] "g 608 alu-0 (659 matches)"
>
> 1. First, I would like to delete all the characters between
> parentheses.
>
> 2. And, I would like to extract the characters between parentheses

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 644449
Fax: +44 (0) 1379 644445
email: Simon.Fear at synequanon.com
web: http://www.synequanon.com
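On the character-class question, two spellings are sketched below as a hedged illustration, assuming a current version of R: the POSIX class `[:alnum:]`, and `\w` with `perl = TRUE`, which asks `sub` to use the PCRE engine. Every regex backslash is doubled because the R string parser consumes one level of escaping before the regex engine sees the pattern. The names `by_posix` and `by_perl` are made up for this example.

```r
# Sample data retyped from the original post
Names <- c("g 604 be-0 -p1 (602 matches)",
           "g 606 Phli-0 -p2 (517 matches)",
           "g 608 alu-0 (659 matches)")

# POSIX bracket expression: letters, digits, underscore and space
by_posix <- sub(" \\([[:alnum:]_ ]*\\)", "", Names)

# Perl-style \w (word character) inside a class, handled by the PCRE engine
by_perl <- sub(" \\([\\w ]*\\)", "", Names, perl = TRUE)

print(by_posix)
print(identical(by_posix, by_perl))

# The R string "\\(" holds two characters, a backslash and "(",
# which is what the regex engine needs to see for a literal parenthesis
print(nchar("\\("))  # 2
```

In current R the supported syntax is described on the `?regex` help page, which is probably the closest thing to the in-help summary asked for above.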
On Monday 13 October 2003 16:30, Martin Olivier wrote:
> 1. First, I would like to delete all the characters between parentheses.
>
> 2. And, I would like to extract the characters between parentheses
>
> Any idea?

There might be a better solution, but the following commands do what you want (at least in the 3 cases that you showed above):

    sub(" [(].*", "", Names)
    sub("[)]+", "", sub("[^(]*[(]", "", Names))

Inside a bracket expression the parentheses need no backslash; writing "\)" in an R string is rejected as an unrecognized escape.

Arne

-- 
Arne Henningsen
Department of Agricultural Economics
Christian-Albrechts-University Kiel
24098 Kiel, Germany
Tel: +49-431-880-4445
Fax: +49-431-880-1397
ahenningsen at email.uni-kiel.de
http://www.uni-kiel.de/agrarpol/ahenningsen/
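As a possible alternative to the nested sub() calls above (not the method proposed in this message), the text in parentheses can also be pulled out in one call with a capture group and the `\\1` backreference. This is only a sketch; `inside` and `outside` are illustrative names, and the parentheses are written as `[(]` and `[)]` so that no backslashes are needed for them.

```r
# Sample data retyped from the original post
Names <- c("g 604 be-0 -p1 (602 matches)",
           "g 606 Phli-0 -p2 (517 matches)",
           "g 608 alu-0 (659 matches)")

# Match the whole string, capture what sits between "(" and ")",
# and replace the whole match with the captured group
inside <- sub(".*[(](.*)[)].*", "\\1", Names)

# Remove the parenthesized group together with the blanks in front of it
outside <- sub(" *[(].*[)]", "", Names)

print(inside)   # "602 matches" "517 matches" "659 matches"
print(outside)  # "g 604 be-0 -p1"   "g 606 Phli-0 -p2" "g 608 alu-0"
```

A string that contains no parentheses does not match either pattern, so sub() returns it unchanged; that case would need separate handling if it can occur.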
If you split the strings using strsplit:

    s <- strsplit(Names, " *[()]")  # drop the " *" if a trailing space in the first piece is OK

then the two results are:

    sapply(s, "[", -2)
    sapply(s, "[", 2)

---
Date: Mon, 13 Oct 2003 16:30:37 +0200
From: Martin Olivier <martinol at ensam.inra.fr>
Subject: [R] help with gsub and grep functions

> Names
[1] "g 604 be-0 -p1 (602 matches)"   "g 606 Phli-0 -p2 (517 matches)"
[3] "g 608 alu-0 (659 matches)"

1. First, I would like to delete all the characters between parentheses.

2. And, I would like to extract the characters between parentheses

Any idea?
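To make the strsplit idea above concrete, the sketch below shows what the split produces for the sample data before the pieces are picked out. The vapply() variant and the names `first_part` and `in_parens` are additions for illustration, not part of the original suggestion.

```r
# Sample data retyped from the original post
Names <- c("g 604 be-0 -p1 (602 matches)",
           "g 606 Phli-0 -p2 (517 matches)",
           "g 608 alu-0 (659 matches)")

# Split at "(" or ")", swallowing any blanks just before the delimiter;
# each list element then holds the text before and the text inside the parentheses
s <- strsplit(Names, " *[()]")
print(s[[1]])  # "g 604 be-0 -p1" "602 matches"

# "[" is the subsetting function: -2 drops the second piece, 2 keeps only it
first_part <- sapply(s, "[", -2)
in_parens  <- sapply(s, "[", 2)

# vapply() does the same job but insists on one string per element
in_parens_checked <- vapply(s, "[", FUN.VALUE = character(1), 2)

print(first_part)  # "g 604 be-0 -p1"   "g 606 Phli-0 -p2" "g 608 alu-0"
print(in_parens)   # "602 matches" "517 matches" "659 matches"
```

This relies on each string containing exactly one parenthesized group at the end; with several groups the pieces would land at other positions in each list element.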