ripley@stats.ox.ac.uk
2004-Jan-11 17:50 UTC
[Rd] strange behaviour when converting from char to POSIX (PR#6427)
On Sun, 11 Jan 2004, Dirk Eddelbuettel wrote:

> On Fri, Jan 09, 2004 at 06:01:27PM +0100, christoph.schmutz@meteoschweiz.ch wrote:
> > Full_Name: Christoph Schmutz, MeteoSchweiz, Switzerland
> > Version: R1.7.1, R1.8.1
> > OS: windows2000, solaris sunOS 5.8
> > Submission from: (NULL) (141.249.133.6)
> >
> > I'm not sure if I don't get the clue, but please consider this:
> > > strptime("19930870150","%Y%j%H%M")
> > [1] "1993-03-28 01:50:00"
> > > strptime("19930870250","%Y%j%H%M")
> > [1] "1993-03-28 01:50:00"
> > > strptime("19930870350","%Y%j%H%M")
> > [1] "1993-03-28 03:50:00"
>
> You are presumably hitting the switch from daylight-saving to regular time in
> that year.

The reverse, as I posted yesterday.

> I was just mucking about with that, but R (1.9.0 as of Jan 8, 2004, on
> Debian unstable) still crashes reliably even when I explicitly set the TZ
> variable (and it also crashed for TZ=GMT):
>
> > Sys.getenv("TZ")
> TZ
> ""
> > Sys.putenv("TZ"="CDT6CST")
> > Sys.getenv("TZ")
>        TZ
> "CDT6CST"
> > format(strptime("199308070150","%Y%m%d%H%M"), "%Y-%m-%d %H:%M:%S",
>     tz="GMT", usetz=TRUE)
> [1] "1993-08-07 01:50:00"
> > format(strptime("199308070150","%Y%m%d%H%M"), "%Y-%m-%d %H:%M:%S %Z",
>     tz="GMT", usetz=TRUE)
> Segmentation fault
>
> Not nice.

And the bug is in glibc, not R (the segfault is in strftime in libc).
It works perfectly on Solaris:

[1] "1993-08-07 01:50:00 CST"

and on Windows (where that is not a valid time zone so I used mine).

[1] "1993-08-07 01:50:00 GMT Daylight Time"

Presumably glibc has some undocumented assumption that we are not
fulfilling, but I am by now very tired of the bugs in its date-time code.

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Prof Brian Ripley
2004-Jan-11 21:21 UTC
[Rd] strange behaviour when converting from char to POSIX
On Sun, 11 Jan 2004 ripley@stats.ox.ac.uk wrote:

> On Sun, 11 Jan 2004, Dirk Eddelbuettel wrote:
>
> > I was just mucking about with that, but R (1.9.0 as of Jan 8, 2004, on
> > Debian unstable) still crashes reliably even when I explicitly set the TZ
> > variable (and it also crashed for TZ=GMT):
> >
> > > Sys.getenv("TZ")
> > TZ
> > ""
> > > Sys.putenv("TZ"="CDT6CST")
> > > Sys.getenv("TZ")
> >        TZ
> > "CDT6CST"
> > > format(strptime("199308070150","%Y%m%d%H%M"), "%Y-%m-%d %H:%M:%S",
> >     tz="GMT", usetz=TRUE)
> > [1] "1993-08-07 01:50:00"
> > > format(strptime("199308070150","%Y%m%d%H%M"), "%Y-%m-%d %H:%M:%S %Z",
> >     tz="GMT", usetz=TRUE)
> > Segmentation fault
> >
> > Not nice.
>
> And the bug is in glibc, not R (the segfault is in strftime in libc).
> It works perfectly on Solaris:
>
> [1] "1993-08-07 01:50:00 CST"
>
> and on Windows (where that is not a valid time zone so I used mine).
>
> [1] "1993-08-07 01:50:00 GMT Daylight Time"
>
> Presumably glibc has some undocumented assumption that we are not
> fulfilling, but I am by now very tired of the bugs in its date-time code.

I've found the glibc bug. Looking at the gdb output

(gdb) print tm
$1 = {tm_sec = 0, tm_min = 50, tm_hour = 1, tm_mday = 7, tm_mon = 7,
  tm_year = 93, tm_wday = 6, tm_yday = 218, tm_isdst = 1,
  __tm_gmtoff = 138337816, __tm_zone = 0x1 <Address 0x1 out of bounds>}

Now __tm_zone is not a POSIX field, and it is not documented in time.h
either. Yet strftime.c in glibc 2.3.2 includes

      zone = NULL;
    #if HAVE_TM_ZONE
      /* The POSIX test suite assumes that setting
         the environment variable TZ to a new value before calling strftime()
         will influence the result (the %Z format) even if the information in
         TP is computed with a totally different time zone.
         This is bogus: though POSIX allows bad behavior like this,
         POSIX does not require it.  Do the right thing instead.  */
      zone = (const char *) tp->tm_zone;
    #endif
    #if HAVE_TZNAME
      if (ut)
        {
          if (! (zone && *zone))
            zone = "GMT";
        }

so it looks at the field even though there is no reason why it should be
set according to POSIX or ISO C. If one NULLs it, the code behaves
correctly. Note that strptime in the same version of glibc does not set
it, hence the problem.

The extra field seems to have been around since 1996-09-11, so I think a
test for glibc >= 2.0 is safe here.

Brian

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
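
For illustration only, the workaround of NULLing the extra field before strftime sees it might look roughly like the following in C. This is a sketch under stated assumptions, not the change actually made in R: the helper name parse_and_format is hypothetical, and it zeroes the whole struct tm with memset instead of naming __tm_zone, since that field is not part of POSIX or ISO C and may be spelled differently (or be absent) on other platforms.

```c
#define _XOPEN_SOURCE 700   /* ask glibc's <time.h> to declare strptime() */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper (not from R or glibc): parse `input` according to
 * `in_fmt`, then format the result into `out` according to `out_fmt`.
 * Returns the number of bytes written by strftime(), or 0 on failure.
 */
size_t parse_and_format(const char *input, const char *in_fmt,
                        const char *out_fmt, char *out, size_t outlen)
{
    struct tm tm;

    assert(out != NULL && outlen > 0);

    /*
     * Zero every field first, including any non-standard ones such as
     * glibc's gmtoff/zone members. strptime() only fills in the fields
     * named by the format, so without this step those extra members
     * would hold whatever happened to be on the stack.
     */
    memset(&tm, 0, sizeof tm);

    if (strptime(input, in_fmt, &tm) == NULL)
        return 0;   /* input did not match the format */

    return strftime(out, outlen, out_fmt, &tm);
}
```

A caller would pass something like parse_and_format("199308070150", "%Y%m%d%H%M", "%Y-%m-%d %H:%M:%S %Z", buf, sizeof buf). What %Z prints then depends on the platform and the TZ setting, but the zone pointer strftime reads is at least a null pointer and no longer an arbitrary address such as 0x1.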