Hello R users,

I have to import a file with one column containing dates written in French short format, such as:

7-déc-07
11-déc-07
14-déc-07
18-déc-07
21-déc-07
24-déc-07
26-déc-07
28-déc-07
31-déc-07
2-janv-08
4-janv-08
7-janv-08
9-janv-08
11-janv-08
14-janv-08
16-janv-08
18-janv-08

There are other columns for other (numeric) variables in the data file. In my read.csv2 statement, I indicate that the date column must be imported "as.is" to keep it as character.

I would like to transform this into a date object in R. So far I've used chron for my date and time needs, but I am willing to change if another object/package will ease the task of importing these dates.

My reading of the chron help led me to believe that the only month names it understands are English ones.

Are there other "formats" I can use with chron, or must I somehow edit this character variable to replace French month names by English ones (or numbers from 1 to 12)?

Thanks in advance,

Denis

p.s. I read this in digest mode, so I'll get your replies faster if you cc to my email
Suppose we have:

dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
        "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
        "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
        "16-janv-08", "18-janv-08")

Try this (assuming the just-released chron 2.3-17):

library(chron)
Sys.setlocale("LC_ALL", "French")
as.chron(as.Date(dd, "%d-%b-%y"))

# or, with chron 2.3-16, replace the last line with:
chron(unclass(as.Date(dd, "%d-%b-%y")))

If those don't work (the above didn't work on my Vista system, but this is system dependent and might work on yours), then try this alternative:

> library(chron)
> library(gsubfn)
> Sys.setlocale('LC_ALL','French')
[1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"
> french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
> f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
> strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
 [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
 [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
[17] 01/18/08

On Jan 30, 2008 11:29 PM, Denis Chabot <chabotd at globetrotter.net> wrote:
> Hello R users,
>
> I have to import a file with one column containing dates written in
> French short format, such as:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
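
A side note on the Sys.setlocale("LC_ALL", "French") call in the reply above: it changes the locale for the rest of the session. A minimal sketch of a wrapper that switches only LC_TIME and restores it afterwards might look like the following. The name parse_in_locale is hypothetical, and the locale string is platform dependent ("French" is a Windows name; something like "fr_FR.UTF-8" is more usual on Linux), so treat both as assumptions.

```r
# Hypothetical helper (not part of chron or base R): parse dates under a
# temporary LC_TIME locale and put the previous locale back afterwards.
parse_in_locale <- function(x, format, locale) {
  old <- Sys.getlocale("LC_TIME")
  # restore the caller's LC_TIME even if parsing fails
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  # Sys.setlocale() returns "" (with a warning) if the locale cannot be set
  ok <- suppressWarnings(Sys.setlocale("LC_TIME", locale))
  if (identical(ok, "")) stop("locale not available: ", locale)
  as.Date(x, format = format)
}

# The "C" locale exists everywhere and uses English month abbreviations
parse_in_locale("7-Dec-07", "%d-%b-%y", "C")
```

For the French data one would call something like parse_in_locale(dd, "%d-%b-%y", "French"). Whether that returns dates or NA still depends on the abbreviations the system itself uses for that locale.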
The output from sessionInfo(), which the posting guide asks for, would have been very helpful here.

I think the problem is likely to be that these are not standard French abbreviations according to my systems. On Linux I get

> format(Sys.Date(), "%d-%b-%y")
[1] "31-jan-08"
> format(Sys.Date()-50, "%d-%b-%y")
[1] "12-déc-07"

and on Windows

> format(Sys.Date(), "%d-%b-%y")
[1] "31-janv.-08"
> format(Sys.Date()-50, "%d-%b-%y")
[1] "12-déc.-07"

And yes, chron is US-centric and so only allows English names.

Assuming you know exactly what is meant by 'French short format', I think the simplest thing to do is to set up a table mapping the French abbreviations to the English ones:

tr <- month.abb
names(tr)[1] <- "janv"   # complete it for the remaining months
x <- "9-janv-08"
x2 <- strsplit(x, "-")
# replace the French abbreviation with the English one in each date
x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
# %b matches English names here, so LC_TIME must be an English or C locale
as.Date(x3, format = "%d-%b-%y")

On Wed, 30 Jan 2008, Denis Chabot wrote:
> Hello R users,
>
> I have to import a file with one column containing dates written in
> French short format, such as:
> [...]

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
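
For reference, the lookup-table idea above could be completed roughly as follows. This is only a sketch: it maps the abbreviations to month numbers instead of English names, so the result does not depend on the session's LC_TIME setting. Only "janv" and "déc" appear in the posted data; the other ten spellings, and the names fr_months and parse_fr_date, are assumptions that would need checking against the real file.

```r
# French month abbreviation -> month number.
# ASSUMPTION: only "janv" and "déc" are confirmed by the posted data;
# the remaining spellings are guesses and should be verified.
fr_months <- c(janv = 1, "févr" = 2, mars = 3, avr = 4, mai = 5, juin = 6,
               juil = 7, "août" = 8, sept = 9, oct = 10, nov = 11, "déc" = 12)

# Hypothetical helper: split "day-month-year", look the month up in the
# table, and parse the result with a purely numeric format.
parse_fr_date <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  day <- vapply(parts, `[`, character(1), 1)
  mon <- vapply(parts, `[`, character(1), 2)
  yr  <- vapply(parts, `[`, character(1), 3)
  # drop a trailing period so that "janv." and "janv" both match
  mon <- sub("\\.$", "", mon)
  # unknown abbreviations give NA from the lookup and hence an NA date
  as.Date(paste(day, fr_months[mon], yr, sep = "-"), format = "%d-%m-%y")
}

x <- c("31-déc-07", "2-janv-08", "9-janv.-08")
parse_fr_date(x)
```

The Date vector can then be passed to as.chron() (chron 2.3-17) or chron(unclass(...)) (chron 2.3-16), as in the earlier reply. Any abbreviation missing from the table shows up as NA, which makes spelling mismatches easy to spot.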