Carlos André Zucco
2013-Aug-25 18:21 UTC
[R] POSIXct bug for conversion of specific combinations of date and time
Hello everyone,

I'm having trouble with what seems to be a bug in as.POSIXct() date-time conversion. I have massive GPS datasets in which each location has its own date and time attribute. When I convert them to POSIXct format, 1300 cases (of about half a million locations) simply return NA values.

I picked a small sample of failed and normal cases to demonstrate the problem (see below). Can anyone understand what's happening?

Thanks

## Data input
> date <- rep("2012/10/21", 14)
> hour <- c("00:02:38","00:11:05","00:19:33","00:28:00","00:36:27","00:44:57",
+           "00:53:27","01:03:28","01:10:15","01:16:34","01:24:00","01:30:13","01:47:58",
+           "01:52:43")
> time <- as.data.frame(cbind(date, hour))

# as.POSIXct formatting
> time$convert <- as.POSIXct(paste(time[,1], time[,2]), format="%Y/%m/%d %H:%M:%S")
> time
         date     hour             convert
1  2012/10/21 00:02:38                <NA>
2  2012/10/21 00:11:05                <NA>
3  2012/10/21 00:19:33                <NA>
4  2012/10/21 00:28:00                <NA>
5  2012/10/21 00:36:27                <NA>
6  2012/10/21 00:44:57                <NA>
7  2012/10/21 00:53:27                <NA>
8  2012/10/21 01:03:28 2012-10-21 01:03:28
9  2012/10/21 01:10:15 2012-10-21 01:10:15
10 2012/10/21 01:16:34 2012-10-21 01:16:34
11 2012/10/21 01:24:00 2012-10-21 01:24:00
12 2012/10/21 01:30:13 2012-10-21 01:30:13
13 2012/10/21 01:47:58 2012-10-21 01:47:58
14 2012/10/21 01:52:43 2012-10-21 01:52:43

## Note that the problem occurs specifically with data for 21/Oct/2012 between midnight and 1am

# alternatively, converting with strptime
> time$convert <- strptime(paste(time[,1], time[,2]), format="%Y/%m/%d %H:%M:%S")
> time
         date     hour             convert
1  2012/10/21 00:02:38 2012-10-21 00:02:38
2  2012/10/21 00:11:05 2012-10-21 00:11:05
3  2012/10/21 00:19:33 2012-10-21 00:19:33
4  2012/10/21 00:28:00 2012-10-21 00:28:00
5  2012/10/21 00:36:27 2012-10-21 00:36:27
6  2012/10/21 00:44:57 2012-10-21 00:44:57
7  2012/10/21 00:53:27 2012-10-21 00:53:27
8  2012/10/21 01:03:28 2012-10-21 01:03:28
9  2012/10/21 01:10:15 2012-10-21 01:10:15
10 2012/10/21 01:16:34 2012-10-21 01:16:34
11 2012/10/21 01:24:00 2012-10-21 01:24:00
12 2012/10/21 01:30:13 2012-10-21 01:30:13
13 2012/10/21 01:47:58 2012-10-21 01:47:58
14 2012/10/21 01:52:43 2012-10-21 01:52:43

# seems ok, however try any further commands:
> range(time$convert)
[1] NA NA
> min(time$convert)
[1] NA
> time$convert[1] - time$convert[2]
Time difference of NA secs

Just in case it helps, my session information is:
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] grid_2.15.1    lattice_0.20-6 nlme_3.1-104

Carlos André Zucco
---------------------------------------------------------------------------------
Biologist, MSc in Ecology
Laboratório de Ecologia e Conservação de Populações (LECP/IB/UFRJ)
Doctoral student, Graduate Program in Ecology, UFRJ
---------------------------------------------------------------------------------
arun
2013-Aug-26 06:23 UTC
[R] POSIXct bug for conversion of specific combinations of date and time
Hi,

I couldn't reproduce the problem. I am using R 3.0.1.

> time$convert <- as.POSIXct(paste(time[,1], time[,2]), format="%Y/%m/%d %H:%M:%S")
> time
         date     hour             convert
1  2012/10/21 00:02:38 2012-10-21 00:02:38
2  2012/10/21 00:11:05 2012-10-21 00:11:05
3  2012/10/21 00:19:33 2012-10-21 00:19:33
4  2012/10/21 00:28:00 2012-10-21 00:28:00
5  2012/10/21 00:36:27 2012-10-21 00:36:27
6  2012/10/21 00:44:57 2012-10-21 00:44:57
7  2012/10/21 00:53:27 2012-10-21 00:53:27
8  2012/10/21 01:03:28 2012-10-21 01:03:28
9  2012/10/21 01:10:15 2012-10-21 01:10:15
10 2012/10/21 01:16:34 2012-10-21 01:16:34
11 2012/10/21 01:24:00 2012-10-21 01:24:00
12 2012/10/21 01:30:13 2012-10-21 01:30:13
13 2012/10/21 01:47:58 2012-10-21 01:47:58
14 2012/10/21 01:52:43 2012-10-21 01:52:43

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.8.8 stringr_0.6.2    reshape2_1.2.2

loaded via a namespace (and not attached):
[1] plyr_1.8    tools_3.0.1
Jeff Newmiller
2013-Aug-26 06:38 UTC
[R] POSIXct bug for conversion of specific combinations of date and time
There have been a couple of threads recently on proper usage of POSIXct. I suggest you read the archives.

After you read the archives: in your case, you don't seem to have zone offset data in your time info, so you probably need to use Sys.setenv to set an appropriate default time zone. The NA values are likely due to your current (unspecified) computer time zone. If you want to follow up, we are probably going to need you to indicate what Sys.getenv("TZ") and sessionInfo() report (per the Posting Guide mentioned at the bottom of this email).

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
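One way to check the time-zone explanation is to parse the same strings under several explicit zones. The sketch below assumes (the thread does not confirm it) that the poster's machine used a Brazilian zone such as America/Sao_Paulo, where daylight saving time began at 00:00 on 2012-10-21, so wall-clock times from 00:00:00 to 00:59:59 did not exist on that day. What as.POSIXct() returns for a nonexistent local time depends on the platform and R version, so no output is claimed for that case.

```r
# Two timestamps from the original post: one inside the suspected
# daylight-saving gap, one after it.
stamps <- c("2012/10/21 00:02:38", "2012/10/21 01:03:28")
fmt <- "%Y/%m/%d %H:%M:%S"

# Fixed-offset zones have no daylight-saving transitions, so every
# wall-clock time exists and parsing should not produce NA.
utc   <- as.POSIXct(stamps, format = fmt, tz = "UTC")
fixed <- as.POSIXct(stamps, format = fmt, tz = "Etc/GMT+3")  # POSIX sign: this is UTC-3

# Assumed zone of the poster's machine (hypothetical). The first stamp falls
# in the skipped hour; depending on platform it may come back NA or shifted.
sao_paulo <- as.POSIXct(stamps, format = fmt, tz = "America/Sao_Paulo")

print(data.frame(stamps,
                 utc_na       = is.na(utc),
                 fixed_na     = is.na(fixed),
                 sao_paulo_na = is.na(sao_paulo)))

# The same wall-clock reading at UTC-3 is three hours later on the UTC axis.
offset_hours <- (as.numeric(fixed) - as.numeric(utc)) / 3600
print(offset_hours)
```

If the data were recorded without daylight saving adjustments, as GPS logs often are, passing a fixed-offset zone through the tz argument is likely the least invasive option, since it leaves both the R session and the operating system settings alone.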
Jeff Newmiller
2013-Aug-26 15:31 UTC
[R] POSIXct bug for conversion of specific combinations of date and time
I deal with non-daylight-savings time data all the time using Windows with its system time set to daylight time. Sys.setenv(TZ="Etc/GMT+4") sets the zone for the R process only and does not affect the system time settings. Using this method lets me handle data from all around the world, with or without daylight savings in the source time zone.

If you are having trouble understanding what R is doing, then give us a reproducible example. The NA values are a warning to you about inconsistencies between the data and your assumptions about it, not an indication that "R cannot deal with these shifts very well."

Please keep the r-help list included... I don't do private consulting online.

"Carlos André Zucco" <cazucco14 at gmail.com> wrote:
> Thanks Jeff,
>
> your reply guided me to a solution. In fact it was reading my system time
> zone. But even if I specified the local time zone in the tz argument, the
> Brazilian time zone has daylight saving shifts. R cannot deal with these
> shifts very well. The solution was to turn off the daylight saving
> correction in the Windows settings, and then everything worked well.
> Thanks again for your attention
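The process-level setting described in this reply can be sketched as follows. The zone Etc/GMT+3 is an assumption chosen to match Brazilian standard time (UTC-3); the thread itself only mentions Etc/GMT+4. The previous TZ value is restored at the end so the change does not leak into the rest of the session.

```r
# Remember the current TZ environment variable (NA if it is not set).
old_tz <- Sys.getenv("TZ", unset = NA)

# Affects this R process only; the operating system clock is untouched.
Sys.setenv(TZ = "Etc/GMT+3")  # assumed zone: fixed UTC-3, no daylight saving

stamps <- c("2012/10/21 00:02:38", "2012/10/21 00:53:27", "2012/10/21 01:03:28")

# No tz argument: the conversion picks up the TZ value set above.
parsed <- as.POSIXct(stamps, format = "%Y/%m/%d %H:%M:%S")

print(parsed)
print(range(parsed))
print(parsed[2] - parsed[1])

# Restore the previous setting.
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```

Compared with switching off daylight saving in the Windows settings, this keeps the change local to one R session and makes the assumed offset visible in the script itself.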