I have a set of character strings like below:

> data3[1]
[1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"

I am trying to extract the text 03-27-2002 and convert it into a date for the same record. I keep looking at the grep function, but I cannot quite get it to work.

grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)

Any hints?

Shawn Way
On Fri, 2007-03-09 at 15:23 -0500, Shawn Way wrote:
> I have a set of character strings like below:
>
> > data3[1]
> [1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
>
> I am trying to extract the text 03-27-2002 and convert this into a date
> for the same record. I keep looking at the grep function, however I
> cannot quite get it to work.
>
> grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
>
> Any hints?

At least two different ways:

Vec <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

1. Using substr(), if your source vector is a fixed format:

# Get the 11th thru the 20th character
> substr(Vec, 11, 20)
[1] "03-27-2002"

2. Using sub() for a more generalized approach:

# Use a back reference, returning the value pattern within the
# parens
> sub(".+([0-9]{2}-[0-9]{2}-[0-9]{4}).+", "\\1", Vec)
[1] "03-27-2002"

See ?substr, ?sub and ?regex

HTH,

Marc Schwartz
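The original question also asked for a date, which the two approaches above stop short of. A minimal sketch of that last step, assuming the text is always month-day-year; the second element of Vec is made up for illustration and is not from the thread:

```r
# Hypothetical input vector; only the first element comes from the thread.
Vec <- c("CB01_0171_03-27-2002-(Sample 26609)-(126)",
         "CB02_0172_11-05-2003-(Sample 26610)-(127)")

# Pull out the MM-DD-YYYY piece with a back reference, as in approach 2.
date_txt <- sub(".+([0-9]{2}-[0-9]{2}-[0-9]{4}).+", "\\1", Vec)

# Parse the text as a Date; the format string assumes month-day-year order.
dates <- as.Date(date_txt, format = "%m-%d-%Y")

print(date_txt)
print(dates)
```

One thing to watch: sub() returns its input unchanged when the pattern does not match, so an element without a date would come back whole and then turn into NA in as.Date(). Checking for NA afterwards is a cheap way to spot such records.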
Try replacing \d with \\d throughout your pattern. The R parser is trying to interpret the \ before the grep function ever sees it. By backslashing the backslashes, the parser ends up putting a single backslash in the pattern for grep to see.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Shawn Way
> Sent: Friday, March 09, 2007 1:12 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Extracting text from a character string
>
> I have a set of character strings like below:
>
> > data3[1]
> [1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
>
> I am trying to extract the text 03-27-2002 and convert this
> into a date for the same record. I keep looking at the grep
> function, however I cannot quite get it to work.
>
> grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
>
> Any hints?
>
> Shawn Way
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
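To make the escaping point concrete, here is a small sketch. It also shows something the advice above does not cover: even with a working pattern, grep(..., value=TRUE) returns the whole matching element, not the matched piece. The use of regexpr() with regmatches() for the extraction is my suggestion, not something from the thread:

```r
x <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

# In source code "\\d" is two typed characters after the quote, but the
# string R stores is a backslash followed by d, which is what grep needs.
pat <- "\\d\\d-\\d\\d-\\d\\d\\d\\d"
cat(pat, "\n")   # shows the pattern as the regex engine receives it

# grep() with value = TRUE returns every element that contains a match,
# in full, so here it hands back the original string.
whole <- grep(pat, x, perl = TRUE, value = TRUE)

# regexpr() locates the first match and regmatches() cuts it out.
piece <- regmatches(x, regexpr(pat, x, perl = TRUE))

print(whole)
print(piece)
```

So the doubled backslashes fix the pattern, and regmatches() (or sub() with a back reference) is what actually isolates the date text.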
Try this:

library(gsubfn)
x <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"
unlist(strapply(x, "..-..-...."))

The gsubfn home page is at:
http://code.google.com/p/gsubfn/

On 3/9/07, Shawn Way <shawnwaypublic at yahoo.com> wrote:
> I have a set of character strings like below:
>
> > data3[1]
> [1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
>
> I am trying to extract the text 03-27-2002 and convert this into a date for the same record. I keep looking at the grep function, however I cannot quite get it to work.
>
> grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
>
> Any hints?
>
> Shawn Way
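A note on the "..-..-...." pattern: each dot matches any character, so it keys only on where the hyphens sit. That is enough for the sample string, but it would also accept non-digits. A small base R sketch of the difference, with no gsubfn needed; the string odd is a made-up example, not data from the thread:

```r
x   <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"
odd <- "CB01_0171_ab-cd-efgh-(Sample 26609)-(126)"   # hypothetical bad record

loose  <- "..-..-...."                    # any characters around the hyphens
strict <- "[0-9]{2}-[0-9]{2}-[0-9]{4}"    # digits only

# On the real sample both patterns pick out the same text.
loose_hit  <- regmatches(x, regexpr(loose, x))
strict_hit <- regmatches(x, regexpr(strict, x))

# On the malformed record only the loose pattern still matches.
loose_on_odd  <- grepl(loose, odd)
strict_on_odd <- grepl(strict, odd)

print(c(loose_hit, strict_hit))
print(c(loose_on_odd, strict_on_odd))
```

If the records are known to be clean, the short pattern is fine; the digit version is just a little more defensive.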
Actually, I am thinking of strsplit().

On 3/9/07, Shawn Way <shawnwaypublic at yahoo.com> wrote:
> I have a set of character strings like below:
>
> > data3[1]
> [1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
>
> I am trying to extract the text 03-27-2002 and convert this into a date for the same record. I keep looking at the grep function, however I cannot quite get it to work.
>
> grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
>
> Any hints?
>
> Shawn Way

--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)
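The strsplit() idea was not spelled out, so here is one way it might look. This is only a sketch: it assumes the date is always the third underscore-separated field and always ten characters long, which holds for the sample string but is not confirmed for the rest of the data:

```r
x <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

# Split on the literal underscore; strsplit() returns a list with one
# character vector per input element, so take the first.
parts <- strsplit(x, "_", fixed = TRUE)[[1]]

# The third field starts with the date; keep its first ten characters.
date_txt <- substr(parts[3], 1, 10)

print(parts)
print(date_txt)
```

Compared with the regular-expression answers, this depends more heavily on the field layout staying fixed, so it is probably best when the identifiers are machine generated.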