(I reported the test failure mentioned below to R-help but was advised that this list is the right one to address the issue; in the meantime I investigated the matter somewhat more closely, including searching recent R-devel postings, since I haven't been following this list.)

Last May there were two reports here of problems with Sys.timezone, one where the zoneinfo directory is in a nonstandard location (https://stat.ethz.ch/pipermail/r-devel/2017-May/074267.html) and the other where the system lacks the file /etc/localtime (https://stat.ethz.ch/pipermail/r-devel/2017-May/074275.html). My system exhibits a third case: it lacks /etc/timezone and does not set TZ systemwide, but it does have /etc/localtime, which is a copy of, rather than a symlink to, a file under zoneinfo. On this system Sys.timezone() returns NA and the Sys.timezone test in reg-tests-1d fails.

However, on my system I can get the (abbreviated) timezone in R by using as.POSIXlt, e.g. as.POSIXlt(Sys.time())$zone. If Sys.timezone took advantage of this, e.g. as below, it would be useful on such systems as mine and the regression test would pass.

my.Sys.timezone <-
function (location = TRUE)
{
    tz <- Sys.getenv("TZ", names = FALSE)
    if (!location || nzchar(tz))
        return(Sys.getenv("TZ", unset = NA_character_))
    lt <- normalizePath("/etc/localtime")
    if (grepl(pat <- "^/usr/share/zoneinfo/", lt) ||
        grepl(pat <- "^/usr/share/zoneinfo.default/", lt))
        sub(pat, "", lt)
    else if (lt == "/etc/localtime")
        if (!file.exists("/etc/timezone"))
            return(as.POSIXlt(Sys.time())$zone)
        else if (dir.exists("/usr/share/zoneinfo") && {
            info <- file.info(normalizePath("/etc/timezone"), extra_cols = FALSE)
            (!info$isdir && info$size <= 200L)
        } && {
            tz1 <- tryCatch(readBin("/etc/timezone", "raw", 200L),
                            error = function(e) raw(0L))
            length(tz1) > 0L && all(tz1 %in% as.raw(c(9:10, 13L, 32:126)))
        } && {
            tz2 <- gsub("^[[:space:]]+|[[:space:]]+$", "", rawToChar(tz1))
            tzp <- file.path("/usr/share/zoneinfo", tz2)
            file.exists(tzp) && !dir.exists(tzp) &&
                identical(file.size(normalizePath(tzp)), file.size(lt))
        })
            tz2
        else NA_character_
}

One problem with this is that the zone component of as.POSIXlt only holds the abbreviated timezone, not the Olson name. I don't know how to get the Olson name using only R functions, but maybe it would be good enough to return the abbreviated timezone where possible, e.g. as above.

(On my system I can get the Olson name of the timezone in R with a shell pipeline, e.g.: system("find /usr/share/zoneinfo/ -type f | xargs md5sum | grep $(md5sum /etc/localtime | cut -d ' ' -f 1) | head -n 1 | cut -d '/' -f 5,6"), but the last part of this is tailored to my configuration and the whole thing is not OS-neutral, so it isn't suitable for Sys.timezone.)

Steve Berman
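
To illustrate the limitation just mentioned, here is a minimal sketch. It assumes the system's time zone database knows "Europe/Zurich"; the zone name and the two dates are chosen only for illustration. The zone component changes with daylight saving time while the Olson name stays fixed, which is why the abbreviation cannot stand in for a location.

```r
# Two instants in the same Olson zone, one in winter and one in summer.
# "Europe/Zurich" and both dates are illustrative assumptions.
winter <- as.POSIXlt("2017-01-15 12:00:00", tz = "Europe/Zurich")
summer <- as.POSIXlt("2017-07-15 12:00:00", tz = "Europe/Zurich")

# Abbreviated names, as held in the 'zone' component
abbrevs <- c(winter = winter$zone, summer = summer$zone)
print(abbrevs)

# The zone name the object was created with is the first element
# of its "tzone" attribute
print(attr(winter, "tzone")[1L])
```

With a complete time zone database the two abbreviations should differ between winter and summer, and the same pair of abbreviations is shared by many distinct locations.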

>>>>> Stephen Berman <stephen.berman at gmx.net>
>>>>>     on Sun, 15 Oct 2017 01:53:12 +0200 writes:

> One problem with this is that the zone component of as.POSIXlt only
> holds the abbreviated timezone, not the Olson name.

Yes, indeed. So this should really be returned only for Sys.timezone(location = FALSE); for the default location = TRUE it should still give NA (i.e. NA_character_) in your setup.

Interestingly, the Windows version of Sys.timezone(location = FALSE) uses something like your proposal, and I tend to think that (again only for location = FALSE) this should be used on non-Windows as well, at least instead of returning NA there.

Also for me on 3 different Linuxen (Fedora 24, Fedora 26, and Ubuntu 14.04 LTS), I get

  > Sys.timezone()
  [1] "Europe/Zurich"
  > Sys.timezone(FALSE)
  [1] NA

whereas on Windows I get "Europe/Berlin" for the first (why on earth? I'm really in Zurich) and "CEST" ("Central European Summer Time") for the second instead of NA, simply by using a smarter version of your proposal.
The Windows source is in R's sources at src/library/base/R/windows/system.R:

Sys.timezone <- function(location = TRUE)
{
    tz <- Sys.getenv("TZ", names = FALSE)
    if(nzchar(tz)) return(tz)
    if(location) return(.Internal(tzone_name()))
    z <- as.POSIXlt(Sys.time())
    zz <- attr(z, "tzone")
    if(length(zz) == 3L) zz[2L + z$isdst] else zz[1L]
}

From what I read, the last three lines also work in your setup, where it seems zz would be of length 1, right?

I'd really propose to use these 3 lines in the non-Windows version of Sys.timezone, at the end, *instead* of NA_character_ (or a slightly safer version which gives NA_character_ if zz is of length 0, e.g. if there is no "tzone" attribute).

> I don't know how to get the Olson name using only R functions, but maybe
> it would be good enough to return the abbreviated timezone where possible,
> e.g. as above. (On my system I can get the Olson name of the timezone in R
> with a shell pipeline, e.g.: system("find /usr/share/zoneinfo/ -type f |
> xargs md5sum | grep $(md5sum /etc/localtime | cut -d ' ' -f 1) | head -n 1 |
> cut -d '/' -f 5,6"), but the last part of this is tailored to my
> configuration and the whole thing is not OS-neutral, so it isn't suitable
> for Sys.timezone.)

Definitely not. I still recommend you think of a more portable solution for the `location = TRUE` (default) case in Sys.timezone(). Returning the non-location form (e.g. "CEST") when something like "Europe/Zurich" is expected is really not a good idea, and you are lucky that the regression test passes "accidentally" ...

Martin

--
Martin <Maechler at stat.math.ethz.ch>  http://stat.ethz.ch/~maechler
Seminar für Statistik, ETH Zürich  and  R Core Team
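
The "slightly safer version" suggested above might look like the following sketch. The helper name tz_abbrev and its time argument are hypothetical, and the guard against an unknown (negative or missing) isdst value is an extra assumption that is not discussed in the thread. It is meant for a single time point, not a vector.

```r
# Hypothetical helper, not part of base R: return the abbreviated
# time zone for one time point, or NA_character_ if none is recorded.
tz_abbrev <- function(time = Sys.time()) {
    z <- as.POSIXlt(time)
    zz <- attr(z, "tzone")
    # No "tzone" attribute at all: nothing to report
    if (length(zz) == 0L) return(NA_character_)
    if (length(zz) == 3L) {
        # zz is c(name, standard abbreviation, DST abbreviation)
        dst <- z$isdst
        # Assumption: treat an unknown DST flag as standard time
        if (is.na(dst) || dst < 0L) dst <- 0L
        zz[2L + dst]
    } else zz[1L]
}

print(tz_abbrev())
```

Treating an unknown DST flag as standard time is one possible choice; returning NA_character_ in that case would be equally defensible.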

>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>     on Mon, 16 Oct 2017 19:13:31 +0200 writes:

> I still recommend you think of a more portable solution for the
> `location = TRUE` (default) case in Sys.timezone(). Returning the
> non-location form (e.g. "CEST") when something like "Europe/Zurich"
> is expected is really not a good idea, and you are lucky that the
> regression test passes "accidentally" ...

In the meantime, I have committed a common version (Windows and non-Windows) of Sys.timezone() to the R development sources (aka "R-devel"). That now uses as.POSIXlt(Sys.time()) very similarly to the above "Windows only" case, but __only__ for 'location = FALSE', which is not the default.

The most current development source is always available (via 'svn', or alternatively for browsing in your web browser) from https://svn.r-project.org/R/trunk/src/library/base/R/datetime.R

As you say yourself, the above workaround using system("... xargs md5sum ...") is really too platform specific, but I'd guess there should be a less error-prone way to get the long timezone name on your system. If that remains "contained" (i.e. small) and works with files and R's file tools, e.g. the file.*() ones [but not system()], I'd consider a patch to the above source file, sent by you to the R-devel mailing list or, after having gotten an account there by asking, via bug report & patch attachment at https://bugs.r-project.org/

Best,
Martin
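
One possible direction for such a file-based lookup is sketched below, using only R's file tools and no system() call. This is an illustration under stated assumptions, not the code in the R sources: the helper name find_olson_name is hypothetical, the default paths are assumptions about a typical Linux layout, and skipping the "posix/" and "right/" subdirectories is a guess about how the zoneinfo tree is organised. The idea is to compare the bytes of /etc/localtime with the files under the zoneinfo directory, using the file size as a cheap pre-filter.

```r
# Hypothetical helper, not part of base R. Default paths are assumptions.
find_olson_name <- function(localtime = "/etc/localtime",
                            zoneinfo = "/usr/share/zoneinfo") {
    if (!file.exists(localtime) || !dir.exists(zoneinfo))
        return(NA_character_)
    target_size <- file.size(localtime)
    target <- readBin(localtime, "raw", n = target_size)
    # Relative paths below 'zoneinfo' double as candidate zone names
    candidates <- list.files(zoneinfo, recursive = TRUE)
    # Assumption: ignore the duplicated "posix/" and "right/" trees
    candidates <- candidates[!grepl("^(posix|right)/", candidates)]
    full <- file.path(zoneinfo, candidates)
    # Only files of the same size can be byte-identical;
    # which() also drops NA sizes (e.g. broken symlinks)
    same_size <- which(file.size(full) == target_size)
    for (i in same_size) {
        bytes <- readBin(full[i], "raw", n = target_size)
        if (identical(bytes, target)) return(candidates[i])
    }
    NA_character_
}

print(find_olson_name())
```

Several zone files can be byte-identical, so the first match may be an alias rather than the name the administrator chose; a real patch would need a rule for picking among them.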