This is likely yet another instance of round off error, but it caught me by surprise. tmt% R --vanilla (headers skipped, version 3.0.2 on Linux)> load('qtest.rda') > length(temp)[1] 3622> max(temp) >= quantile(temp, .98)98% FALSE I can send the file to anyone who would like to understand this more deeply. The top 3% of the vector is a single repeated value. Terry Therneau
>>>>> Therneau, Terry M , Ph D <therneau at mayo.edu> >>>>> on Wed, 30 Apr 2014 12:00:33 -0500 writes:> This is likely yet another instance of round off error, > but it caught me by surprise. tmt% R --vanilla (headers > skipped, version 3.0.2 on Linux) >> load('qtest.rda') length(temp) > [1] 3622 >> max(temp) >= quantile(temp, .98) > 98% FALSE > I can send the file to anyone who would like to understand > this more deeply. The top 3% of the vector is a single > repeated value. Hmm... this may be the same as PR#15746 : https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15746 on which I commented that it was an example of FAQ 7.31 : > Martin Maechler 2014-04-10 07:10:41 UTC > What really happens is simply > > x0 <- 0.000384478262997113 > > h <- (1:9)/10 > > diff(range((1-h)*x0 + h*x0)) > [1] 1.084202e-19 > > > and that of course *is* just R FAQ 7.31 > I'm not sure this is something we want to amend; we *could* > ensure > min(x) <= quantile(x, *) <= max(x) > with a bit of extra code that makes *all* calls of quantile > slower than > now... just for the sake of this one very very rare exception ? > Alternatively we just document it .. > Maybe this needs public discussion on R-devel and so I did bring it to R-devel. If look at the fuller report, you also see that I claim that this seems to happen only for very rare numbers..., I had a loop running, searching for such numbers as the above x0 and it did not find one in many hours..., OTOH, you now found a second example (of the same phenomenon I think). So, yes, please, send me the qtest.rda... or to all of us a dput(.) of the top 5% of the data which may suffice for a reproducible example. Also, it seems that for these rare examples we may need to rewrite the code of quantile.default() ... making it less nice, less transparent, not something I think we should commit lightly. Martin > Terry Therneau > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
[this was meant as "reply to all" to R-devel :]>>>>> Therneau, Terry M , Ph D <therneau at mayo.edu> >>>>> on Wed, 30 Apr 2014 12:00:33 -0500 writes:> This is likely yet another instance of round off error, > but it caught me by surprise. tmt% R --vanilla (headers > skipped, version 3.0.2 on Linux) >> load('qtest.rda') length(temp) > [1] 3622 >> max(temp) >= quantile(temp, .98) > 98% FALSE > I can send the file to anyone who would like to understand > this more deeply. The top 3% of the vector is a single > repeated value. Hmm... this may be the same as PR#15746 : https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15746 on which I commented that it was an example of FAQ 7.31 : > Martin Maechler 2014-04-10 07:10:41 UTC > What really happens is simply > > x0 <- 0.000384478262997113 > > h <- (1:9)/10 > > diff(range((1-h)*x0 + h*x0)) > [1] 1.084202e-19 > > > and that of course *is* just R FAQ 7.31 > I'm not sure this is something we want to amend; we *could* > ensure > min(x) <= quantile(x, *) <= max(x) > with a bit of extra code that makes *all* calls of quantile > slower than > now... just for the sake of this one very very rare exception ? > Alternatively we just document it .. > Maybe this needs public discussion on R-devel and so I did bring it to R-devel. If look at the fuller report, you also see that I claim that this seems to happen only for very rare numbers..., I had a loop running, searching for such numbers as the above x0 and it did not find one in many hours..., OTOH, you now found a second example (of the same phenomenon I think). So, yes, please, send me the qtest.rda... or to all of us a dput(.) of the top 5% of the data which may suffice for a reproducible example. Also, it seems that for these rare examples we may need to rewrite the code of quantile.default() ... making it less nice, less transparent, not something I think we should commit lightly. Martin > Terry Therneau > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel