I am trying to clean up some dates and I am clearly
doing something wrong. I have laid out an example
that seems to show what is happening with the "real"
data. The coding is lousy but it looks like it
should have worked.
Can anyone suggest a) why I am getting that NA
appearing after the strptime() command and b) why the
NA is disappearing in the sort()? It happens with
na.rm=TRUE and na.rm=FALSE.
-------------------------------------------------
aa <- data.frame( c("12/05/2001", " ", "30/02/1995",
                    NA, "14/02/2007", "M" ) )
names(aa) <- "times"
aa[is.na(aa)] <- "M"
aa[aa==" "] <- "M"
bb <- unlist(subset(aa, aa[,1] !="M"))
dates <- strptime(bb, "%d/%m/%Y")
dates
sort(dates)
--------------------------------------------------
Session Info
R version 2.4.1 (2006-12-18)
i386-pc-mingw32
locale:
LC_COLLATE=English_Canada.1252;
LC_CTYPE=English_Canada.1252;
LC_MONETARY=English_Canada.1252;
LC_NUMERIC=C;LC_TIME=English_Canada.1252
attached base packages:
[1] "stats" "graphics" "grDevices" "utils"
    "datasets" "methods" "base"
other attached packages:
gdata Hmisc
"2.3.1" "3.3-2"
(Yes I know I'm out of date but I don't like
upgrading just as I am finishing a project)
Thanks
Perhaps you want one of these:

> sort(as.Date(aa$times, "%d/%m/%Y"))
[1] "1995-03-02" "2001-05-12" "2007-02-14"
> sort(as.Date(aa$times, "%d/%m/%Y"), na.last = TRUE)
[1] "1995-03-02" "2001-05-12" "2007-02-14" NA           NA
[6] NA
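A minimal sketch of how the as.Date() route could replace the "M" placeholder step altogether: with an explicit format, blanks, NA and non-date strings all come back as NA, so they can be dropped in one step. The names `raw`, `parsed` and `clean` are made up for illustration. How an impossible date such as 30/02/1995 is handled may differ between R versions and platforms, so the sketch makes no assumption about it.

```r
# Hypothetical character vector mirroring the values in the question
raw <- c("12/05/2001", " ", "30/02/1995", NA, "14/02/2007", "M")

# With an explicit format, anything that cannot be parsed becomes NA
parsed <- as.Date(raw, format = "%d/%m/%Y")

# Keep only the entries that parsed, then sort them
clean <- sort(parsed[!is.na(parsed)])
print(clean)
```

Working on a plain character vector also sidesteps the factor column that data.frame() may create from strings, depending on the R version and its stringsAsFactors setting.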
Hi John,
a) The NA appears because '30/02/1995' is not a valid date: February
has no 30th day, so strptime() returns NA for it.
> strptime('30/02/1995' , "%d/%m/%Y")
[1] NA
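One way to find such entries in the real data is to compare what went in with what came out of strptime(). A small sketch, assuming the input is a character vector; the names `x`, `parsed` and `bad` are only illustrative, and tz = "UTC" is my addition to keep the result independent of the local time zone:

```r
# Hypothetical input: the three date-like strings from the question
x <- c("12/05/2001", "30/02/1995", "14/02/2007")

# Parse with the same format as in the question
parsed <- strptime(x, "%d/%m/%Y", tz = "UTC")

# Strings that were present in the input but could not be parsed
bad <- x[is.na(parsed) & !is.na(x)]
print(bad)
```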
b) dates, which has the following classes, is sorted by sort.POSIXlt, which
in turn sets na.last to NA by default, so the NA is dropped. ?order details
how NAs are handled in ordering data via na.last.
> class(dates)
[1] "POSIXt" "POSIXlt"
> methods(sort)
[1] sort.default sort.POSIXlt
> sort.POSIXlt
function (x, decreasing = FALSE, na.last = NA, ...)
x[order(as.POSIXct(x), na.last = na.last, decreasing = decreasing)]
<environment: namespace:base>
After resetting the Feb. date the code works.
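
If the NA should be kept rather than silently dropped, na.last can be passed to sort() explicitly. A short sketch using the same three strings; the names `dropped` and `kept` are illustrative, and tz = "UTC" is my addition to keep the example independent of the local time zone:

```r
dates <- strptime(c("12/05/2001", "30/02/1995", "14/02/2007"),
                  "%d/%m/%Y", tz = "UTC")

# The default for this method is na.last = NA, which removes missing values
dropped <- sort(dates)

# na.last = TRUE keeps them and puts them at the end
kept <- sort(dates, na.last = TRUE)

# Count the missing values in each result
print(sum(is.na(dropped)))
print(sum(is.na(kept)))
```

Note that na.rm is not an argument sort() acts on here, which would explain why switching it between TRUE and FALSE made no difference.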
HTH,
-jason
----- Original Message -----
From: "John Kane" <jrkrideau at yahoo.ca>
To: "R R-help" <r-help at stat.math.ethz.ch>
Sent: Thursday, June 07, 2007 2:17 PM
Subject: [R] character to time problem