Hi,

I'm trying to parse lines of the form:

    dan001.hin (0): fingerprint={256, 411, 426, 947, 973, 976}

What I need is the sequence of numbers between {}. I'm using grep as

    match <- grep("{([0-9,\s]*)}",s,perl=T,value=T)

where s is a character vector.

But all I get is the whole string s. I tried using regexpr in an attempt to get just the sequence I wanted:

    match <- regexpr("{([0-9,\s]*)}", s , perl=T)

but then I get -1 as the return value, indicating that there was no match.

If grep is able to return the matched element (though I don't know why the whole string is being returned), why is regexpr failing?

Finally, could anybody provide a hint as to how I should modify the regex to get the sequence between {}? (I've used the same regex in Python code to get the sequence and it works fine.)

Thanks,

-------------------------------------------------------------------
Rajarshi Guha <rxg218 at psu.edu> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
Entropy requires no maintenance.
-- Markoff Chaney
Is this what you're looking for?

    > s = "dan001.hin (0): fingerprint={256, 411, 426, 947, 973, 976}"
    > sub(".*{([0-9,\\s]+)}", "\\1", s, perl=T)
    [1] "256, 411, 426, 947, 973, 976"

+ seth

On Fri, Feb 06, 2004 at 01:49:30PM -0500, Rajarshi Guha wrote:
> Finally, could anybody provide a hint as to how I should modify the
> regex to get the sequence between {}.
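On the grep versus regexpr part of the question, here is a minimal sketch rather than a definitive answer. grep(value = TRUE) returns every element of the vector that contains a match, which is why the whole string comes back. regexpr() only reports where the first match starts and how long it is, and regmatches() can turn that into the matched text. The lookaround pattern and the names m and inside are my own choices for illustration, not from the thread, and the sketch assumes an R version that provides regmatches(). Note also that a backslash inside an R string literal has to be doubled ("\\s", not "\s").

```r
s <- "dan001.hin (0): fingerprint={256, 411, 426, 947, 973, 976}"

# grep(value = TRUE) selects whole elements of s that match the pattern;
# it never returns just the matched portion.
whole <- grep("\\{[0-9,\\s]*\\}", s, perl = TRUE, value = TRUE)

# regexpr() returns the start position of the first match, with the match
# length stored as an attribute. The lookbehind/lookahead keep the braces
# out of the match itself.
m <- regexpr("(?<=\\{)[0-9,\\s]*(?=\\})", s, perl = TRUE)

# regmatches() uses that position and length to extract the substring.
inside <- regmatches(s, m)
inside
```

If regexpr() finds no match it returns -1 for that element, and regmatches() then simply yields nothing for it.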
Try this:

    gsub(".*{|}","",s)

which deletes everything up to and including the { and then deletes the }.

---
Date: Fri, 06 Feb 2004 13:49:30 -0500
From: Rajarshi Guha <rxg218 at psu.edu>
To: R <r-help at stat.math.ethz.ch>
Subject: [R] a grep/regexpr problem

Finally, could anybody provide a hint as to how I should modify the regex to get the sequence between {}.
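A small variation on the gsub approach, offered as a sketch under my own assumptions. The braces are escaped here because, depending on the regex engine R is using, a bare { may be read as the start of a repetition count or rejected outright. The final step, which turns the extracted text into an integer vector, is my addition for illustration, and the names inside and fingerprint are hypothetical.

```r
s <- "dan001.hin (0): fingerprint={256, 411, 426, 947, 973, 976}"

# Delete everything up to and including the "{", and delete the "}".
# "\\{" and "\\}" always denote the literal brace characters.
inside <- gsub(".*\\{|\\}", "", s)

# Split on a comma followed by optional whitespace, then convert the
# pieces of this single string to integers.
fingerprint <- as.integer(strsplit(inside, ",[[:space:]]*")[[1]])
fingerprint
```

Because gsub() is vectorised, the first step also works when s holds many such lines; the strsplit() result would then need to be handled per element, for example with lapply().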