Liviu Andronic
2012-Jun-21  12:48 UTC
[R] convert 'character' vector containing mixed formats to 'Date'
Dear all
I have a 'character' vector containing mixed formats (thanks Excel!)
and I'd like to translate it into a default "%Y-%m-%d" Date vector.
x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
       "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
       NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")
In the above you will see that some dates are of format="%d/%m/%Y",
others of format="%Y%m%d" and some NA values. Can you suggest a
straight-forward way of transforming these to a uniform 'character' or
'Date' vector? I tried to do the following, but it outputs very
strange results:
> x
 [1] "1/3/2005"   "13/04/2004" "2/5/2005"   "2/5/2005"   "7/5/2007"   "22/04/2004"
 [7] "21/04/2005" "20080430"   "13/05/2003" "20080529"   NA           NA
[13] "19/05/1999" "17/05/2000" "17/05/2000"
> sum(xa <- grepl('/', x))
[1] 11
> sum(xb <- grepl('200', substr(x, 1,4)))
[1] 2
> sum(xc <- is.na(x))
[1] 2
> x[xa] <- as.Date(x[xa], format="%d/%m/%Y")
> x[xb] <- as.Date(x[xb], format="%Y%m%d")
> x
 [1] "12843" "12521" "12905" "12905" "13640" "12530" "12894" "13999" "12185" "14028"
[11] NA      NA      "10730" "11094" "11094"
The culprit is likely that the 'x' vector is 'character' throughout,
but I'm not sure how to work around it. For example, I couldn't figure
out how to create an empty 'Date' vector.
Regards
Liviu
-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
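The numbers in the output above are consistent with how R stores dates: a 'Date' is a count of days since 1970-01-01, and assigning it into part of a 'character' vector drops the class and keeps only that count as a string. A minimal sketch of the effect, using illustrative names (`d`, `v_raw`, `v_iso`) that are not part of the thread:

```r
# A single date in the d/m/Y layout used in the question
d <- as.Date("1/3/2005", format = "%d/%m/%Y")

# The underlying representation is a day count since 1970-01-01
unclass(d)

# Assigning a Date into a character vector coerces it to that day count
v_raw <- c("1/3/2005", "20080430")
v_raw[1] <- d
v_raw

# Formatting the Date first keeps the ISO "%Y-%m-%d" text instead
v_iso <- c("1/3/2005", "20080430")
v_iso[1] <- as.character(d)
v_iso
```

The target vector stays 'character' in both cases; only the second assignment preserves a readable date.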
Liviu Andronic
2012-Jun-21  13:08 UTC
[R] convert 'character' vector containing mixed formats to 'Date'
On Thu, Jun 21, 2012 at 2:48 PM, Liviu Andronic <landronimirc at gmail.com> wrote:
> The culprit is likely that the 'x' vector is 'character' throughout,
> but I'm not sure how to work around. For example, I couldn't figure
> how to create an empty 'Date' vector. Regards
>
I think I managed to crack this by myself. I only needed to add an as.character() call:
> x[xa] <- as.character(as.Date(x[xa], format="%d/%m/%Y"))
> x[xb] <- as.character(as.Date(x[xb], format="%Y%m%d"))
> x
 [1] "2005-03-01" "2004-04-13" "2005-05-02" "2005-05-02" "2007-05-07" "2004-04-22"
 [7] "2005-04-21" "2008-04-30" "2003-05-13" "2008-05-29" NA           NA
[13] "1999-05-19" "2000-05-17" "2000-05-17"
Liviu
Petr PIKAL
2012-Jun-21  13:10 UTC
[R] convert 'character' vector containing mixed formats to 'Date'
Hi
> [...]
> I tried to do the following, but it outputs very
> strange results:
> > x[xa] <- as.Date(x[xa], format="%d/%m/%Y")
> > x[xb] <- as.Date(x[xb], format="%Y%m%d")
> > x
> [1] "12843" "12521" "12905" "12905" "13640" "12530" "12894" "13999"
> "12185" "14028"
> [11] NA NA "10730" "11094" "11094"
>
You can use another as.Date with origin specified.
as.Date(ifelse(ind, as.Date(x, format="%d/%m/%Y"), as.Date(x, format="%Y%m%d")), origin="1970-01-01")
 [1] "2005-03-01" "2004-04-13" "2005-05-02" "2005-05-02" "2007-05-07"
 [6] "2004-04-22" "2005-04-21" "2008-04-30" "2003-05-13" "2008-05-29"
[11] NA           NA           "1999-05-19" "2000-05-17" "2000-05-17"
Regards
Petr
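The ifelse() suggestion does not show how `ind` is built. A self-contained sketch follows, assuming `ind` marks the entries that contain a slash (the same test the original post used for `xa`); that definition is a guess, not something stated in the reply:

```r
x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
       "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
       NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")

# Assumption: TRUE where the string uses the d/m/Y layout (NA gives FALSE)
ind <- grepl("/", x)

# ifelse() drops the 'Date' class, so 'days' holds plain day counts
days <- ifelse(ind,
               as.Date(x, format = "%d/%m/%Y"),
               as.Date(x, format = "%Y%m%d"))

# Rebuild the 'Date' class from the counts, measured from 1970-01-01
res <- as.Date(days, origin = "1970-01-01")
res
```

Both as.Date() calls run over the whole vector and return NA where the format does not match; ifelse() then picks the successful parse for each element.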
Duncan Murdoch
2012-Jun-21  13:13 UTC
[R] convert 'character' vector containing mixed formats to 'Date'
On 12-06-21 8:48 AM, Liviu Andronic wrote:
> [...]
> The culprit is likely that the 'x' vector is 'character' throughout,
> but I'm not sure how to work around. For example, I couldn't figure
> how to create an empty 'Date' vector. Regards
You probably don't want the vector to be empty, so something like this would work:
y <- as.Date(rep(NA, 15))
Then things like
y[xa] <- as.Date(x[xa], format="%d/%m/%Y")
etc. should work.
Duncan Murdoch
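Spelled out in full, the pre-allocated 'Date' vector approach might look like the sketch below. The 8-digit pattern used for `xb` is an assumption that replaces the original '200' prefix test so that years outside 2000-2009 would also match; it is not taken from the thread.

```r
x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
       "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
       NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")

xa <- grepl("/", x)            # entries in the d/m/Y layout
xb <- grepl("^[0-9]{8}$", x)   # assumption: any 8-digit string is Ymd

# An all-NA 'Date' vector of the right length, filled in by format
y <- as.Date(rep(NA, length(x)))
y[xa] <- as.Date(x[xa], format = "%d/%m/%Y")
y[xb] <- as.Date(x[xb], format = "%Y%m%d")
y
```

Because `y` is a 'Date' from the start, the assignments keep the class, and entries matched by neither pattern (here the two NA values) simply stay NA.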
arun
2012-Jun-21  17:10 UTC
[R] convert 'character' vector containing mixed formats to 'Date'
Hi,
You could also try it with strptime to get the result:
x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
       "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
       NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")
x <- as.character(na.omit(x))
x1 <- strptime(x, "%d/%m/%Y")
x2 <- strptime(x, "%Y%m%d")
x1[is.na(x1)] <- c(x2[8], x2[10])
x1
 [1] "2005-03-01" "2004-04-13" "2005-05-02" "2005-05-02" "2007-05-07"
 [6] "2004-04-22" "2005-04-21" "2008-04-30" "2003-05-13" "2008-05-29"
[11] "1999-05-19" "2000-05-17" "2000-05-17"
A.K.
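The same "parse with one format, then fill the failures from the next" idea can be written without hard-coded positions and without dropping the NA entries. A rough sketch, where the helper name `parse_mixed_dates` and its `formats` argument are hypothetical and not from any package:

```r
# Hypothetical helper: try each format in turn on the still-unparsed entries
parse_mixed_dates <- function(x, formats = c("%d/%m/%Y", "%Y%m%d")) {
  out <- as.Date(rep(NA, length(x)))
  for (fmt in formats) {
    todo <- is.na(out) & !is.na(x)   # not parsed yet, and not missing
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

x <- c("1/3/2005", "13/04/2004", "20080430", NA, "19/05/1999")
r <- parse_mixed_dates(x)
r
```

The order of `formats` matters whenever a string could be read under more than one layout, so the most specific format should probably come first.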
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.