Dear all,

I am a bit unsure whether this qualifies as a bug, but it is definitely strange behaviour. That is why I wanted to discuss it.

With the following function, I want to test for evenly spaced numbers, starting from anywhere.

.is_continous_evenly_spaced <- function(n){
  if(length(n) < 2) return(FALSE)
  n <- n[order(n)]
  n <- n - min(n)
  step <- n[2] - n[1]
  test <- seq(from = min(n), to = max(n), by = step)
  if(length(n) == length(test) &&
     all(n == test)){
    return(TRUE)
  }
  return(FALSE)
}

> .is_continous_evenly_spaced(c(1,2,3,4))
[1] TRUE
> .is_continous_evenly_spaced(c(1,3,4,5))
[1] FALSE
> .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
[1] FALSE

I expect the results for 1 and 2, but not for 3. Upon investigation it turns out that n == test is TRUE for every pair except the pair at 0.2.

The types reported are always double; however, n[2] == 0.1 reports FALSE as well.

The whole problem is solved by switching from all(n == test) to all(as.character(n) == as.character(test)). However, that is weird, isn't it?

Does this work as intended? Thanks in advance for any help, advice and suggestions.

Best regards,
Felix
On Fri, 31 Aug 2018 at 15:10, Felix Ernst (<felix.gm.ernst at outlook.com>) wrote:
>
> Does this work as intended?

I guess this has something to do with how the sequence is built and the inherent error of floating point arithmetic. In fact, if you return test minus n, you'll get:

[1] 0.000000e+00 0.000000e+00 2.220446e-16 0.000000e+00

and the error gets bigger when you continue the sequence; e.g., this is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):

[1] 0.000000e+00 0.000000e+00 2.220446e-16 2.220446e-16 4.440892e-16
[6] 4.440892e-16 4.440892e-16 0.000000e+00

So, independently of whether this is considered a bug or not, instead of

length(n) == length(test) && all(n == test)

I would use the following condition:

isTRUE(all.equal(n, test))

Iñaki

> Best regards,
> Felix
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Iñaki Ucar
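The mismatch described in this reply can be reproduced with a short sketch. It follows the same steps as the original function (sort, subtract the minimum, rebuild the sequence with seq()); the variable names exact_match and approx_match are illustrative only and do not appear in the thread.

```r
# Illustrative sketch: reproduce the mismatch between n and the rebuilt sequence.
n <- c(1, 1.1, 1.2, 1.3)
n <- sort(n) - min(n)        # 1.1 - 1 is not exactly 0.1 in binary floating point
step <- n[2] - n[1]
test <- seq(from = min(n), to = max(n), by = step)

print(test - n)              # tiny non-zero differences, on the order of 1e-16

exact_match  <- all(n == test)               # exact comparison
approx_match <- isTRUE(all.equal(n, test))   # comparison within a tolerance
print(c(exact = exact_match, approx = approx_match))
```

all.equal() compares within a tolerance (by default sqrt(.Machine$double.eps), roughly 1.5e-8), which is why it accepts differences of this size while == does not.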
Agreed that it's rounding error, and all.equal would be the way to go. I wouldn't call it a bug; it's simply part of working with floating point numbers, and any language has the same issue.

And while we're at it, I think the function can be a lot shorter:

.is_continous_evenly_spaced <- function(n){
  length(n) > 1 && isTRUE(all.equal(n[order(n)], seq(from = min(n), to = max(n), length.out = length(n))))
}

Cheers,
Emil
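As a possible variation on the shorter function, the tolerance can be made an explicit argument so that callers can decide how close is close enough. This is only a sketch: the name is_evenly_spaced and the tolerance parameter are assumptions for illustration, not something proposed in the thread. The tolerance argument itself is a documented argument of all.equal() for numeric input.

```r
# Hypothetical helper: evenly-spaced test with an explicit tolerance.
is_evenly_spaced <- function(x, tolerance = sqrt(.Machine$double.eps)) {
  if (length(x) < 2L) return(FALSE)
  x <- sort(x)
  # Rebuild the ideal sequence from the end points and the number of elements.
  expected <- seq(from = x[1], to = x[length(x)], length.out = length(x))
  # all.equal() returns TRUE or a character description, hence isTRUE().
  isTRUE(all.equal(x, expected, tolerance = tolerance))
}

print(is_evenly_spaced(c(1, 1.1, 1.2, 1.3)))
print(is_evenly_spaced(c(1, 3, 4, 5)))
```

Note that all.equal() measures a relative difference over the whole vector, so a single slightly misplaced value in a long vector of large numbers may still pass; whether that is acceptable depends on the use case.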
FYI, more fun with floats:

> 0.1+0.1==0.2
[1] TRUE
> 0.1+0.1+0.1+0.1==0.4
[1] TRUE
> 0.1+0.1+0.1==0.3
[1] FALSE
> 0.1+0.1+0.1==0.1*3
[1] TRUE
> 0.3==0.1*3
[1] FALSE

¯\_(ツ)_/¯

But this is not R's fault. See: https://0.30000000000000004.com

Iñaki
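To see why comparisons like these disagree, it can help to print the stored values with more digits than R shows by default. The following is a minimal sketch; the variable names a, b, exact and near are illustrative.

```r
# Print the stored doubles with 17 significant digits to expose the difference.
a <- 0.1 + 0.1 + 0.1
b <- 0.3
cat(sprintf("%.17g", a), sprintf("%.17g", b), "\n")

exact <- a == b                    # exact comparison, as in the session above
near  <- isTRUE(all.equal(a, b))   # comparison within the default tolerance
print(c(exact = exact, near = near))
```

Neither 0.1 nor 0.3 has an exact binary representation, so each literal and each intermediate sum is rounded to the nearest double, and different routes to "the same" number can land on neighbouring doubles.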
> On Aug 31, 2018, at 9:36 AM, Iñaki Ucar <iucar at fedoraproject.org> wrote:
>
> [...]
Hi,

This is essentially FAQ 7.31:

https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

Review that and the references therein to gain some insight into binary representations of floating point numbers.

Rather than the more complicated code you have above, try the following:

evenlyspaced <- function(x) {
  gaps <- diff(sort(x))
  all(gaps[-1] == gaps[1])
}

Note the use of ?diff:

> diff(c(1, 2, 3, 4))
[1] 1 1 1
> diff(c(1, 3, 4, 5))
[1] 2 1 1
> diff(c(1, 1.1, 1.2, 1.3))
[1] 0.1 0.1 0.1

However, in reality, due to the floating point representation issues noted above:

> print(diff(c(1, 1.1, 1.2, 1.3)), 20)
[1] 0.100000000000000088818 0.099999999999999866773
[3] 0.100000000000000088818

So the differences between the numbers are not exactly 0.1.

Using the function above, you get:

> evenlyspaced(c(1, 2, 3, 4))
[1] TRUE
> evenlyspaced(c(1, 3, 4, 5))
[1] FALSE
> evenlyspaced(c(1, 1.1, 1.2, 1.3))
[1] FALSE

As has been noted, if you want the gap comparison to be based upon some margin of error, use ?all.equal rather than the explicit equals comparison that I have in the function above.
Something along the lines of:

evenlyspaced <- function(x) {
  gaps <- diff(sort(x))
  # all.equal() returns a character string, not FALSE, on a mismatch, so wrap it in isTRUE()
  all(vapply(gaps[-1], function(g) isTRUE(all.equal(g, gaps[1])), logical(1)))
}

In which case, you now get:

> evenlyspaced(c(1, 1.1, 1.2, 1.3))
[1] TRUE

Regards,

Marc Schwartz
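A vectorised alternative to calling all.equal() once per gap is to compare all gaps against the first one with an explicit tolerance. This is a sketch under stated assumptions: the name evenly_spaced_tol, the tol parameter, and the choice to scale the tolerance by the largest gap are illustrative, not part of the thread.

```r
# Hypothetical helper: compare every gap to the first gap within a scaled tolerance.
evenly_spaced_tol <- function(x, tol = sqrt(.Machine$double.eps)) {
  if (length(x) < 2L) return(FALSE)
  gaps <- diff(sort(x))
  # Scale the tolerance by the largest gap so the test is relative, not absolute.
  all(abs(gaps - gaps[1]) <= tol * max(abs(gaps)))
}

print(evenly_spaced_tol(c(1, 2, 3, 4)))
print(evenly_spaced_tol(c(1, 3, 4, 5)))
print(evenly_spaced_tol(c(1, 1.1, 1.2, 1.3)))
```

Comparing gaps rather than rebuilt sequences keeps the check local: each gap is judged on its own, so one irregular step cannot be averaged away by the rest of the vector.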