I have a unexpected behaviour reading times with colon from an Excel file, using the package readxl. In an Excel sheet, I have a column with times in hours:minutes, e.g: Arrival_time 13:39 13:51 When read from R with readxl::read_excel, this gives a tibble column with full date by defaut being the last day of 1899. OK. Why not, I know that POSIX variables are starting in 1900 after R doc (however I wonder why here the defaut is one day before January 1, 1900> tmp$Arrival_time [1] "1899-12-31 13:39:00 UTC" "1899-12-31 13:51:00 UTC"Well, this is not exactly what I want to. I do not care about the year and the day... Therefore I decided to import this column as "text" explicitely (in order to manage it within R then). And this is what I get now:> read_excel("saisie_data_durban_rapaces_LPO.xlsx",sheet=2,col_types="text") > tmp$Arrival_time [1] "0.56875000000000009" "0.57708333333333328"Can someone tell me what happens ? I would really appreciate to understand the trick... [[alternative HTML version deleted]]
Probably (but not entirely sure), Excel is storing your text as a number of days, so 13:38 is a little more than half a day. Open your spreadsheet in excel and save those columns as text instead of times, that (should) fix your issue. On Fri, Apr 1, 2022, 02:12 Patrick Giraudoux < patrick.giraudoux at univ-fcomte.fr> wrote:> I have a unexpected behaviour reading times with colon from an Excel > file, using the package readxl. > > In an Excel sheet, I have a column with times in hours:minutes, e.g: > > Arrival_time > 13:39 > 13:51 > > When read from R with readxl::read_excel, this gives a tibble column > with full date by defaut being the last day of 1899. OK. Why not, I know > that POSIX variables are starting in 1900 after R doc (however I wonder > why here the defaut is one day before January 1, 1900 > > > tmp$Arrival_time [1] "1899-12-31 13:39:00 UTC" "1899-12-31 13:51:00 UTC" > > Well, this is not exactly what I want to. I do not care about the year > and the day... Therefore I decided to import this column as "text" > explicitely (in order to manage it within R then). And this is what I > get now: > > > > read_excel("saisie_data_durban_rapaces_LPO.xlsx",sheet=2,col_types="text") > > tmp$Arrival_time [1] "0.56875000000000009" "0.57708333333333328" > > Can someone tell me what happens ? > > I would really appreciate to understand the trick... > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
Both R and Excel assume a date is associated with every time object. In Excel, when you show a date it is an integer number of days since 1899-12-31 (due to a mistake made early in programming it). Whenever you show a time, it it merely displaying the time portion (fraction of a day) of a date/time. The date part of that value may or may not be 1899-12-31. With this in mind, you are tilting at windmills hoping to import a "pure time" because no such thing exists in either program. You can choose to render a `POSIXct` as showing only the time portion when you convert it to character if you so choose. On March 31, 2022 11:12:28 PM PDT, Patrick Giraudoux <patrick.giraudoux at univ-fcomte.fr> wrote:>I have a unexpected behaviour reading times with colon from an Excel >file, using the package readxl. > >In an Excel sheet, I have a column with times in hours:minutes, e.g: > >Arrival_time >13:39 >13:51 > >When read from R with readxl::read_excel, this gives a tibble column >with full date by defaut being the last day of 1899. OK. Why not, I know >that POSIX variables are starting in 1900 after R doc (however I wonder >why here the defaut is one day before January 1, 1900 > >> tmp$Arrival_time [1] "1899-12-31 13:39:00 UTC" "1899-12-31 13:51:00 UTC" > >Well, this is not exactly what I want to. I do not care about the year >and the day... Therefore I decided to import this column as "text" >explicitely (in order to manage it within R then). And this is what I >get now: > >> read_excel("saisie_data_durban_rapaces_LPO.xlsx",sheet=2,col_types="text") > tmp$Arrival_time [1] "0.56875000000000009" "0.57708333333333328" > >Can someone tell me what happens ? > >I would really appreciate to understand the trick... > > > [[alternative HTML version deleted]] > >______________________________________________ >R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code.-- Sent from my phone. Please excuse my brevity.
Hello, The following function is probably not very solid, to work with dates and times is not trivial, but it converts those Excel numbers correctly. I don't know with what numbers it fails. xl_fracday_to_time <- function(x, digits = 0L) { old_opts <- options(digits = 20) on.exit(options(old_opts)) if(is.character(x)) { x <- as.numeric(x) } y <- x * 24 hours <- floor(y + .Machine$double.eps^0.5) y <- (y - hours) * 60 mins <- floor(y + .Machine$double.eps^0.5) y <- (y - mins) * 60 secs <- round(y, digits = digits) sprintf("%02d:%02d:%02d", hours, mins, secs) } x <- c("0.56875000000000009", "0.57708333333333328") xl_fracday_to_time(x) #[1] "13:39:00" "13:51:00" Hope this helps, Rui Barradas ?s 07:12 de 01/04/2022, Patrick Giraudoux escreveu:> I have a unexpected behaviour reading times with colon from an Excel > file, using the package readxl. > > In an Excel sheet, I have a column with times in hours:minutes, e.g: > > Arrival_time > 13:39 > 13:51 > > When read from R with readxl::read_excel, this gives a tibble column > with full date by defaut being the last day of 1899. OK. Why not, I know > that POSIX variables are starting in 1900 after R doc (however I wonder > why here the defaut is one day before January 1, 1900 > >> tmp$Arrival_time [1] "1899-12-31 13:39:00 UTC" "1899-12-31 13:51:00 UTC" > > Well, this is not exactly what I want to. I do not care about the year > and the day... Therefore I decided to import this column as "text" > explicitely (in order to manage it within R then). And this is what I > get now: > >> read_excel("saisie_data_durban_rapaces_LPO.xlsx",sheet=2,col_types="text") > tmp$Arrival_time [1] "0.56875000000000009" "0.57708333333333328" > > Can someone tell me what happens ? > > I would really appreciate to understand the trick... > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.