Eric Berger
2022-Oct-15 07:55 UTC
[R] Unintended behaviour of stats::time not returning integers for the first cycle
Alternatively

correct.year <- floor(time(x)+1e-6)

On Sat, Oct 15, 2022 at 10:26 AM Andreï V. Kostyrka <andrei.kostyrka at gmail.com> wrote:

> Dear all,
>
> I was using stats::time to obtain the year as a floor of it, and
> encountered a problem: due to a rounding error (most likely due to its
> reliance on base::seq.int internally, but correct me if I am wrong),
> the actual number corresponding to the beginning of a year X can still be
> (X-1).9999999..., resulting in the following undesirable behaviour.
>
> One of the simplest ways of getting the year from a ts object is
> floor(time(...)). However, if the starting time cannot be represented
> exactly as a sum of powers of 2, then a supposedly integer time does not
> have a .000000... fractional part:
>
> x <- ts(2:252, start = c(2002, 2), freq = 12)
> d <- seq.Date(as.Date("2002-02-01"), to = as.Date("2022-12-01"), by = "month")
> true.year <- rep(2002:2022, each = 12)[-1]
> wrong.year <- floor(as.numeric(time(x)))
> tail(cbind(as.character(d), true.year, wrong.year), 15) # Look at 2022-01-01
> print(as.numeric(time(x))[240], 20) # 2021.9999999999997726, the floor of which is 2021
>
> Yes, I have read the 'R inferno' book and know the famous '0.3 != 0.7 -
> 0.4' example, but I believe that the expected / intended behaviour would be
> to return round years for the first observation in a year. This
> could be achieved by rounding the near-integer times to integers.
>
> Since users working with dates expect to get exact integer years for
> the first cycle of a ts, this should be changed. Thank you in advance for
> considering a fix.
>
> Yours sincerely,
> Andreï V. Kostyrka
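The two ways of recovering the year can be compared on the series from the post. This is only a minimal sketch: `year.eps` is the workaround from the reply above, while `year.int` is an assumed alternative that is not part of the thread. It skips floating-point times entirely and does integer arithmetic on the observation index, using `start(x)` and `frequency(x)`.

```r
# Same series as in the post: monthly data from Feb 2002 to Dec 2022
x <- ts(2:252, start = c(2002, 2), frequency = 12)
true.year <- rep(2002:2022, each = 12)[-1]

# Workaround from the reply: nudge the times upward before flooring
year.eps <- floor(as.numeric(time(x)) + 1e-6)

# Assumed alternative (not from the thread): integer arithmetic on the index.
# start(x) is c(2002, 2) here, so observation i lies
# (i + 2 - 2) %/% 12 whole years after 2002.
s <- start(x)
i <- seq_len(NROW(x))
year.int <- s[1L] + (i + s[2L] - 2) %/% frequency(x)

# Both agree with the reference years for every observation
stopifnot(all(year.eps == true.year), all(year.int == true.year))
```

The index-based variant assumes a regular series whose `start(x)` has the two-element (year, period) form, as it does for monthly or quarterly data.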
Andreï V. Kostyrka
2022-Oct-18 12:26 UTC
[R] Unintended behaviour of stats::time not returning integers for the first cycle
Sure, this works, and I was thinking about this solution, but it seems like a dirty one-time trick. I was wondering whether the following 3 lines could be considered for inclusion by the core developers, but did not know which mailing list to write to. Here is my proposal:

correctTime <- function (x, offset = 0, ...) { # Changes stats:::time.default
  n <- if (is.matrix(x)) nrow(x) else length(x)
  xtsp <- attr(hasTsp(x), "tsp")
  y <- seq.int(xtsp[1L], xtsp[2L], length.out = n) + offset/xtsp[3L]
  round.y <- round(y)
  near.integer <- abs(round.y - y) < sqrt(.Machine$double.eps)
  y[near.integer] <- round.y[near.integer]
  tsp(y) <- xtsp
  y
}

x <- ts(2:252, start = c(2002, 2), freq = 12)
d <- seq.Date(as.Date("2002-02-01"), to = as.Date("2022-12-01"), by = "month")
true.year <- rep(2002:2022, each = 12)[-1]
wrong.year <- floor(as.numeric(time(x)))
print(as.numeric(time(x))[240], 20) # 2021.9999999999997726, the floor of which is 2021
print(correctTime(x)[240], 20) # 2022

On Sat, Oct 15, 2022 at 11:56 AM Eric Berger <ericjberger at gmail.com> wrote:

> Alternatively
>
> correct.year <- floor(time(x)+1e-6)
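The post checks only the 240th observation. A minimal sketch of a check over the whole series follows; `correctTime()` is copied from the proposal above so that the block runs on its own, and `fixed.year` is a name introduced here for illustration.

```r
# correctTime() as proposed in the post, reproduced so the block is self-contained
correctTime <- function(x, offset = 0, ...) {
  n <- if (is.matrix(x)) nrow(x) else length(x)
  xtsp <- attr(hasTsp(x), "tsp")
  y <- seq.int(xtsp[1L], xtsp[2L], length.out = n) + offset / xtsp[3L]
  # Snap values that are within sqrt(machine epsilon) of an integer
  round.y <- round(y)
  near.integer <- abs(round.y - y) < sqrt(.Machine$double.eps)
  y[near.integer] <- round.y[near.integer]
  tsp(y) <- xtsp
  y
}

x <- ts(2:252, start = c(2002, 2), frequency = 12)
true.year <- rep(2002:2022, each = 12)[-1]
fixed.year <- floor(as.numeric(correctTime(x)))

# Every observation, not only the 240th, should land in the right year
stopifnot(all(fixed.year == true.year))

# The first observation of each year is now an exact integer
january <- as.integer(cycle(x)) == 1L
jan.times <- as.numeric(correctTime(x))[january]
stopifnot(all(jan.times == round(jan.times)))
```

The tolerance `sqrt(.Machine$double.eps)` is about 1.5e-8, far smaller than the 1/12 spacing between monthly time points, so only the near-integer values are changed.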