Davis Vaughan
2022-Oct-05 21:04 UTC
[Rd] A potential POSIXlt->Date bug introduced in r-devel
Hi all,

I think I have discovered a bug in the conversion from POSIXlt to Date that has been introduced in r-devel.

It affects lubridate, but surprisingly didn't cause test failures there. Instead it caused test failures in users of lubridate, like slider, arrow, and admiral (see https://github.com/tidyverse/lubridate/issues/1069), and at least in slider I have been asked by CRAN to correct this issue before 2022-10-16.

In r-devel we get the following:

```
data <- list(
  sec = 0,
  min = 0L,
  hour = 0L,
  mday = 31L,
  mon = c(0L, NA, 2L),
  year = 113L,
  wday = 4L,
  yday = 30L,
  isdst = 0L
)

x <- .POSIXlt(xx = data, tz = "UTC")
x
#> [1] "2013-01-31 UTC" NA "2013-03-31 UTC"

# Looks right
as.POSIXct(x)
#> [1] "2013-01-31 UTC" NA "2013-03-31 UTC"

# Weird, where is the `NA`?
as.Date(x)
#> [1] "2013-01-31" "1970-01-01" "2013-03-31"
```

The POSIXlt object is length 3, but is only partially filled out. The other elements are all recycled to length 3 upon conversion to POSIXct or Date.

But when converting to Date, we lose the `NA` value. I think the `as.Date()` conversion seems inconsistent with the `as.POSIXct()` conversion.

It looks like this comes up because the conversion to Date now defaults to using `sec` if any of the date-like fields are `NA_INTEGER`, but this means the `NA` in the `mon` field is ignored.

https://github.com/wch/r-source/blob/e10a971dee6a0ab851279c183cc21954d66b3be4/src/main/datetime.c#L1293-L1295

Thanks all,
Davis Vaughan
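Since the example above shows `as.POSIXct()` keeping the `NA`, one possible stopgap is to route the conversion through POSIXct. This is only a minimal sketch under that assumption; the detour and the variable name `d` are illustrative and were not proposed in the thread.

```r
# Same partially filled POSIXlt as in the report above
data <- list(
  sec = 0, min = 0L, hour = 0L,
  mday = 31L, mon = c(0L, NA, 2L), year = 113L,
  wday = 4L, yday = 30L, isdst = 0L
)
x <- .POSIXlt(xx = data, tz = "UTC")

# Assumption: going through POSIXct first preserves the NA, as the
# as.POSIXct() output in the report suggests. The tz is passed explicitly
# so the Date is computed in the same zone as the POSIXlt.
d <- as.Date(as.POSIXct(x), tz = "UTC")

is.na(d)
#> [1] FALSE  TRUE FALSE
```

The explicit `tz = "UTC"` matters for `as.Date()` on a POSIXct: with a different zone the calendar date could shift for times near midnight.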
Martin Maechler
2022-Oct-06 08:15 UTC
[Rd] A potential POSIXlt->Date bug introduced in r-devel
>>>>> Davis Vaughan
>>>>>     on Wed, 5 Oct 2022 17:04:11 -0400 writes:

> Hi all,
> I think I have discovered a bug in the conversion from POSIXlt to Date that
> has been introduced in r-devel.

> It affects lubridate, but surprisingly didn't cause test failures there.
> Instead it caused test failures in users of lubridate, like slider, arrow,
> and admiral (see https://github.com/tidyverse/lubridate/issues/1069), and
> at least in slider I have been asked by CRAN to correct this issue before
> 2022-10-16.

> In r-devel we get the following:

> ```
> data <- list(
>   sec = 0,
>   min = 0L,
>   hour = 0L,
>   mday = 31L,
>   mon = c(0L, NA, 2L),
>   year = 113L,
>   wday = 4L,
>   yday = 30L,
>   isdst = 0L
> )

> x <- .POSIXlt(xx = data, tz = "UTC")
> x
> #> [1] "2013-01-31 UTC" NA "2013-03-31 UTC"

> # Looks right
> as.POSIXct(x)
> #> [1] "2013-01-31 UTC" NA "2013-03-31 UTC"

> # Weird, where is the `NA`?
> as.Date(x)
> #> [1] "2013-01-31" "1970-01-01" "2013-03-31"
> ```

I agree that the above is wrong, i.e., a bug in current R-devel.

> The POSIXlt object is length 3, but is only partially filled out.
> The other elements are all recycled to length 3 upon
> conversion to POSIXct or Date.

> But when converting to Date, we lose the `NA` value. I think the
> `as.Date()` conversion seems inconsistent with the `as.POSIXct()`
> conversion.

Yes.
There was another, very much related conversation here on R-devel, initiated by Suharto Anggono just a few days ago.
This subject, i.e., "partially filled out" POSIXlt objects, was one of the topics there, too.

See my reply there, notably at the end:

https://stat.ethz.ch/pipermail/r-devel/2022-October/082072.html

There I mention that "recycling" of partially filled POSIXlt objects has only partially been implemented in R more generally, and I was actually asking for comments and further discussion.
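To make the recycling of "partially filled" objects concrete, here is a rough sketch of what filling out all fields to a common length could look like at the R level. The helper `fill_posixlt()` is hypothetical and not part of base R; it only illustrates the idea and makes no claim about how R implements it internally.

```r
# Hypothetical helper (not in base R): recycle every field of a POSIXlt
# to the length of its longest field, so the object is fully filled out.
fill_posixlt <- function(x) {
  tz <- attr(x, "tzone")
  fields <- unclass(x)
  n <- max(lengths(fields))
  # rep_len() recycles each field to length n; lapply() keeps the names
  filled <- lapply(fields, rep_len, length.out = n)
  .POSIXlt(filled, tz = tz)
}

data <- list(
  sec = 0, min = 0L, hour = 0L,
  mday = 31L, mon = c(0L, NA, 2L), year = 113L,
  wday = 4L, yday = 30L, isdst = 0L
)
x <- .POSIXlt(xx = data, tz = "UTC")

filled <- fill_posixlt(x)

# The NA in `mon` is still present after recycling, so any conversion
# working element by element can see it.
unclass(filled)$mon
#> [1]  0 NA  2
```

A fully filled object makes it easy to check, element by element, which entries have an `NA` in a date field. Note that the sketch drops any attributes other than the time zone.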
> It looks like this comes up because the conversion to Date now defaults to
> using `sec` if any of the date-like fields are `NA_INTEGER`,

yes, because only that allows us to also deal with +/- Inf etc., as was recently added as a new feature; see the NEWS of R 4.2.0:

  • Not strictly fixing a bug, format()ing and print()ing of
    non-finite Date and POSIXt values NaN and +/-Inf no longer
    show as NA but the respective string, e.g., Inf, for
    consistency with numeric vector's behaviour, fulfilling the
    wish of PR#18308.

i.e., see also R's bugzilla
https://bugs.r-project.org/show_bug.cgi?id=18308
which actually *also* mentioned an NA problem in Date/Time objects.

> but this means the `NA` in the `mon` field is ignored.

which I agree is bogus and we'll fix.

Still, I did not get any feedback when asking about documentation etc. on POSIXlt objects ... and I *had* mentioned that I agreed that the current partial implementation of "partially filled" objects, i.e., recycling of POSIXlt components, should probably be made part of the "definition" of POSIXlt.

Have I overlooked an existing definition / contract about these?

Martin

--
Martin Mächler
ETH Zurich and R Core team