Hello,

I have a vector of dates and I would like to grep the year component from this vector (= all digits after the last punctuation character):

dates <- c("28.7.08","28.7.2008","28/7/08", "28/7/2008", "28/07/2008",
           "28-07-2008", "28-07-08")

The resulting vector should look like

"08" "2008" "08" "2008" "2008" "2008" "08"

I tried something like this (Perl style) with no success:

grep("[[:punct:]]?\\d", dates, value=T, perl=T)

Any ideas?

-Lauri

Lauri Nikkinen:
> I have a vector of dates and I would like to grep the year component
> from this vector (= all digits after the last punctuation character)
>
> dates <- c("28.7.08","28.7.2008","28/7/08", "28/7/2008", "28/07/2008",
> "28-07-2008", "28-07-08")
>
> the resulting vector should look like
>
> "08" "2008" "08" "2008" "2008" "2008" "08"

unlist(lapply(strsplit(dates,c("\\.|/|-")),function(x){x[length(x)]}))
[1] "08" "2008" "08" "2008" "2008" "2008" "08"

Best regards,
Kornelius.

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

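The split-and-take-last idea in this reply can also be sketched with a character class and vapply(). This is only an illustrative variant, not the poster's code: the "[[:punct:]]" split pattern and the variable names parts and years are assumptions made for the example.

```r
dates <- c("28.7.08", "28.7.2008", "28/7/08", "28/7/2008",
           "28/07/2008", "28-07-2008", "28-07-08")

# Split each date on any punctuation character (assumed separator set).
parts <- strsplit(dates, "[[:punct:]]")

# Keep the last piece of each split; vapply() checks that every
# result is a single character string.
years <- vapply(parts, function(x) x[length(x)], character(1))
print(years)
```

Using vapply() rather than unlist(lapply()) makes the expected result type explicit, which can help catch malformed input early.
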
Lauri Nikkinen wrote:
> I tried something like (Perl style) with no success
>
> grep("[[:punct:]]?\\d", dates, value=T, perl=T)
>
> Any ideas?

> sub(".*[[:punct:]]([0-9]*$)", "\\1", dates)
[1] "08" "2008" "08" "2008" "2008" "2008" "08"
> sub(".*[[:punct:]](.*)$", "\\1", dates)
[1] "08" "2008" "08" "2008" "2008" "2008" "08"
> sub(".*[[:punct:]]", "", dates)
[1] "08" "2008" "08" "2008" "2008" "2008" "08"
> substring(dates,regexpr("[0-9]*$", dates))
[1] "08" "2008" "08" "2008" "2008" "2008" "08"

(grep() won't do. It only tells you _whether_ the pattern matches.)

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907

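To make the remark about grep() concrete, here is a small sketch contrasting the original attempt with one of the sub() calls above. The variable names matched and years are introduced only for this illustration.

```r
dates <- c("28.7.08", "28.7.2008", "28/7/08", "28/7/2008",
           "28/07/2008", "28-07-2008", "28-07-08")

# grep(value = TRUE) returns the matching elements whole,
# not the part of each element that matched the pattern.
matched <- grep("[[:punct:]]?\\d", dates, value = TRUE, perl = TRUE)
print(identical(matched, dates))

# sub() replaces the matched part. The greedy ".*" runs as far as it
# can, so ".*[[:punct:]]" consumes everything up to and including
# the last punctuation character, leaving only the year.
years <- sub(".*[[:punct:]]", "", dates)
print(years)
```

Every element of dates contains a digit, so the grep() call simply hands back the input unchanged, which is why it looked like it had no effect.
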
On Fri, Nov 28, 2008 at 5:51 AM, Peter Dalgaard <P.Dalgaard at biostat.ku.dk> wrote:
>> sub(".*[[:punct:]]([0-9]*$)", "\\1", dates)
> [1] "08" "2008" "08" "2008" "2008" "2008" "08"
>> sub(".*[[:punct:]](.*)$", "\\1", dates)
> [1] "08" "2008" "08" "2008" "2008" "2008" "08"
>> sub(".*[[:punct:]]", "", dates)
> [1] "08" "2008" "08" "2008" "2008" "2008" "08"
>> substring(dates,regexpr("[0-9]*$", dates))
> [1] "08" "2008" "08" "2008" "2008" "2008" "08"

Here is one more. This uses strapply from the gsubfn package, which returns the matches directly. The simplify = c argument causes it to return them as a character vector instead of a list:

library(gsubfn)
strapply(dates, "[0-9]+$", simplify = c)
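
For readers without gsubfn installed, a similar "return the match directly" effect can be sketched in base R with regexpr() and regmatches(). This is an alternative technique, not something proposed in the thread, and it assumes an R version that provides regmatches(); the names m and years are illustrative.

```r
dates <- c("28.7.08", "28.7.2008", "28/7/08", "28/7/2008",
           "28/07/2008", "28-07-2008", "28-07-08")

# Locate the run of digits at the end of each string.
m <- regexpr("[0-9]+$", dates)

# Extract the matched substrings. Note that regmatches() silently
# drops elements with no match, so the result can be shorter than
# the input if some dates do not end in digits.
years <- regmatches(dates, m)
print(years)
```

If the input might contain strings that do not end in digits, checking `m == -1` first would show which elements were dropped.
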