I've discovered an infelicity (I guess) in qbeta(): it's not a bug, since there's a clear warning about lack of convergence of the numerical algorithm ("full precision may not have been achieved"). I can work around this, but I'm curious why it happens and whether there's a better workaround -- it doesn't seem to be in a particularly extreme corner of parameter space. It happens, e.g., for these parameters: phi <- 1.1 i <- 0.01 t <- 0.001 shape1 = i/phi ## 0.009090909 shape2 = (1-i)/phi ## 0.9 qbeta(t,shape1,shape2) ## 5.562685e-309 ## brute-force uniroot() version, see below Qbeta0(t,shape1,shape2) ## 0.9262824 The qbeta code is pretty scary to read: the warning "full precision may not have been achieved" is triggered here: https://github.com/wch/r-source/blob/f8d4d7d48051860cc695b99db9be9cf439aee743/src/nmath/qbeta.c#L530 Any thoughts? Should I report this on the bug list? A more general illustration: http://www.math.mcmaster.ca/bolker/misc/qbeta.png ==fun <- function(phi,i=0.01,t=0.001, f=qbeta) { f(t,shape1=i/phi,shape2=(1-i)/phi, lower.tail=FALSE) } ## brute-force beta quantile function Qbeta0 <- function(t,shape1,shape2,lower.tail=FALSE) { fn <- function(x) {pbeta(x,shape1,shape2,lower.tail=lower.tail)-t} uniroot(fn,interval=c(0,1))$root } Qbeta <- Vectorize(Qbeta0,c("t","shape1","shape2")) curve(fun,from=1,to=4) curve(fun(x,f=Qbeta),add=TRUE,col=2)
>>>>> Ben Bolker >>>>> on Wed, 25 Mar 2020 21:09:16 -0400 writes:> I've discovered an infelicity (I guess) in qbeta(): it's not a bug, > since there's a clear warning about lack of convergence of the numerical > algorithm ("full precision may not have been achieved"). I can work > around this, but I'm curious why it happens and whether there's a better > workaround -- it doesn't seem to be in a particularly extreme corner of > parameter space. It happens, e.g., for these parameters: > phi <- 1.1 > i <- 0.01 > t <- 0.001 > shape1 = i/phi ## 0.009090909 > shape2 = (1-i)/phi ## 0.9 > qbeta(t,shape1,shape2) ## 5.562685e-309 > ## brute-force uniroot() version, see below > Qbeta0(t,shape1,shape2) ## 0.9262824 > The qbeta code is pretty scary to read: the warning "full precision > may not have been achieved" is triggered here: > https://github.com/wch/r-source/blob/f8d4d7d48051860cc695b99db9be9cf439aee743/src/nmath/qbeta.c#L530 > Any thoughts? Well, qbeta() is mostly based on inverting pbeta() and pbeta() has *several* "dangerous" corners in its parameter spaces {in some cases, it makes sense to look at the 4 different cases log.p = TRUE/FALSE // lower.tail = TRUE/FALSE separately ..} pbeta() itself is based on the most complex numerical code in all of base R, i.e., src/nmath/toms708.c and that algorithm (TOMS 708) had been sophisticated already when it was published, and it has been improved and tweaked several times since being part of R, notably for the log.p=TRUE case which had not been in the focus of the publication and its algorithm. [[ NB: part of this you can read when reading help(pbeta) to the end ! ]] I've spent many "man weeks", or even "man months" on pbeta() and qbeta(), already and have dreamed to get a good student do a master's thesis about the problem and potential solutions I've looked into in the mean time. My current gut feeling is that in some cases, new approximations are necessary (i.e. tweaking of current approximations is not going to help sufficiently). Also not (in the R sources) tests/p-qbeta-strict-tst.R a whole file of "regression tests" about pbeta() and qbeta() {where part of the true values have been computed with my CRAN package Rmpfr (for high precision computation) with the Rmpfr::pbetaI() function which gives arbitrarily precise pbeta() values but only when (a,b) are integers -- that's the "I" in pbetaI(). Yes, it's intriguing ... and I'll look into your special findings a bit later today. > Should I report this on the bug list? Yes, please. Not all problem of pbeta() / qbeta() are part yet, of R's bugzilla data base, and maybe this will help to draw more good applied mathematicians look into it. Martin Maechler ETH Zurich and R Core team (I'd call myself the "dpq-hacker" within R core -- related to my CRAN package 'DPQ') > A more general illustration: > http://www.math.mcmaster.ca/bolker/misc/qbeta.png > == > fun <- function(phi,i=0.01,t=0.001, f=qbeta) { > f(t,shape1=i/phi,shape2=(1-i)/phi, lower.tail=FALSE) > } > ## brute-force beta quantile function > Qbeta0 <- function(t,shape1,shape2,lower.tail=FALSE) { > fn <- function(x) {pbeta(x,shape1,shape2,lower.tail=lower.tail)-t} > uniroot(fn,interval=c(0,1))$root > } > Qbeta <- Vectorize(Qbeta0,c("t","shape1","shape2")) > curve(fun,from=1,to=4) > curve(fun(x,f=Qbeta),add=TRUE,col=2) > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
It's a pretty extreme case, try e.g. curve(pbeta(x, shape1, shape2), n=10001), and (probably -- I can't be bothered to work out the relation between beta shapes and F df parameters this morning...) outside what is normally encountered in statistical analyses. Notice though, that you have lower=FALSE in your Qbeta0, so you should have it in qbeta as well:> qbeta(t,shape1,shape2, lower=FALSE)[1] 0.9999949 Warning message: In qbeta(t, shape1, shape2, lower = FALSE) : full precision may not have been achieved in 'qbeta' which of course is still wrong (I dont't think there is a problem in the other tail, Qbeta0(t,...., lower=TRUE) returns zero. You can see the effect also with curve(qbeta(x, shape1, shape2), n=10001, from=.99, to=1) which kind of suggests one of those regime-switching bugs, where different methods are used for various parts of the domain, and the cross-over is not done quite right. At any rate, qbeta is one of R's very basic workhorses, so we do want it to work right, so by all means file a report. -pd> On 26 Mar 2020, at 02:09 , Ben Bolker <bbolker at gmail.com> wrote: > > > I've discovered an infelicity (I guess) in qbeta(): it's not a bug, > since there's a clear warning about lack of convergence of the numerical > algorithm ("full precision may not have been achieved"). I can work > around this, but I'm curious why it happens and whether there's a better > workaround -- it doesn't seem to be in a particularly extreme corner of > parameter space. It happens, e.g., for these parameters: > > phi <- 1.1 > i <- 0.01 > t <- 0.001 > shape1 = i/phi ## 0.009090909 > shape2 = (1-i)/phi ## 0.9 > qbeta(t,shape1,shape2) ## 5.562685e-309 > ## brute-force uniroot() version, see below > Qbeta0(t,shape1,shape2) ## 0.9262824 > > The qbeta code is pretty scary to read: the warning "full precision > may not have been achieved" is triggered here: > > https://github.com/wch/r-source/blob/f8d4d7d48051860cc695b99db9be9cf439aee743/src/nmath/qbeta.c#L530 > > Any thoughts? Should I report this on the bug list? > > > A more general illustration: > http://www.math.mcmaster.ca/bolker/misc/qbeta.png > > ==> fun <- function(phi,i=0.01,t=0.001, f=qbeta) { > f(t,shape1=i/phi,shape2=(1-i)/phi, lower.tail=FALSE) > } > ## brute-force beta quantile function > Qbeta0 <- function(t,shape1,shape2,lower.tail=FALSE) { > fn <- function(x) {pbeta(x,shape1,shape2,lower.tail=lower.tail)-t} > uniroot(fn,interval=c(0,1))$root > } > Qbeta <- Vectorize(Qbeta0,c("t","shape1","shape2")) > curve(fun,from=1,to=4) > curve(fun(x,f=Qbeta),add=TRUE,col=2) > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Given that a number of us are housebound, it might be a good time to try to improve the approximation. It's not an area where I have much expertise, but in looking at the qbeta.c code I see a lot of root-finding, where I do have some background. However, I'm very reluctant to work alone on this, and will ask interested others to email off-list. If there are others, I'll report back. Ben: Do you have an idea of parameter region where approximation is poor? I think that it would be smart to focus on that to start with. Martin: On a separate precision matter, did you get my query early in year about double length accumulation of inner products of vectors in Rmpfr? R-help more or less implied that Rmpfr does NOT use extra length. I've been using David Smith's FM Fortran where the DOT_PRODUCT does use double length, but it would be nice to have that in R. My attempts to find "easy" workarounds have not been successful, but I'll admit that other things took precedence. Best, John Nash On 2020-03-26 4:02 a.m., Martin Maechler wrote:>>>>>> Ben Bolker >>>>>> on Wed, 25 Mar 2020 21:09:16 -0400 writes: > > > I've discovered an infelicity (I guess) in qbeta(): it's not a bug, > > since there's a clear warning about lack of convergence of the numerical > > algorithm ("full precision may not have been achieved"). I can work > > around this, but I'm curious why it happens and whether there's a better > > workaround -- it doesn't seem to be in a particularly extreme corner of > > parameter space. It happens, e.g., for these parameters: > > > phi <- 1.1 > > i <- 0.01 > > t <- 0.001 > > shape1 = i/phi ## 0.009090909 > > shape2 = (1-i)/phi ## 0.9 > > qbeta(t,shape1,shape2) ## 5.562685e-309 > > ## brute-force uniroot() version, see below > > Qbeta0(t,shape1,shape2) ## 0.9262824 > > > The qbeta code is pretty scary to read: the warning "full precision > > may not have been achieved" is triggered here: > > > https://github.com/wch/r-source/blob/f8d4d7d48051860cc695b99db9be9cf439aee743/src/nmath/qbeta.c#L530 > > > Any thoughts? > > Well, qbeta() is mostly based on inverting pbeta() and pbeta() > has *several* "dangerous" corners in its parameter spaces > {in some cases, it makes sense to look at the 4 different cases > log.p = TRUE/FALSE // lower.tail = TRUE/FALSE separately ..} > > pbeta() itself is based on the most complex numerical code in > all of base R, i.e., src/nmath/toms708.c and that algorithm > (TOMS 708) had been sophisticated already when it was published, > and it has been improved and tweaked several times since being > part of R, notably for the log.p=TRUE case which had not been in > the focus of the publication and its algorithm. > [[ NB: part of this you can read when reading help(pbeta) to the end ! ]] > > I've spent many "man weeks", or even "man months" on pbeta() and > qbeta(), already and have dreamed to get a good student do a > master's thesis about the problem and potential solutions I've > looked into in the mean time. > > My current gut feeling is that in some cases, new approximations > are necessary (i.e. tweaking of current approximations is not > going to help sufficiently). > > Also not (in the R sources) tests/p-qbeta-strict-tst.R > a whole file of "regression tests" about pbeta() and qbeta() > {where part of the true values have been computed with my CRAN > package Rmpfr (for high precision computation) with the > Rmpfr::pbetaI() function which gives arbitrarily precise pbeta() > values but only when (a,b) are integers -- that's the "I" in pbetaI(). > > Yes, it's intriguing ... and I'll look into your special > findings a bit later today. > > > > Should I report this on the bug list? > > Yes, please. Not all problem of pbeta() / qbeta() are part yet, > of R's bugzilla data base, and maybe this will help to draw > more good applied mathematicians look into it. > > > > Martin Maechler > ETH Zurich and R Core team > (I'd call myself the "dpq-hacker" within R core -- related to > my CRAN package 'DPQ') > > > > A more general illustration: > > http://www.math.mcmaster.ca/bolker/misc/qbeta.png > > > ==> > fun <- function(phi,i=0.01,t=0.001, f=qbeta) { > > f(t,shape1=i/phi,shape2=(1-i)/phi, lower.tail=FALSE) > > } > > ## brute-force beta quantile function > > Qbeta0 <- function(t,shape1,shape2,lower.tail=FALSE) { > > fn <- function(x) {pbeta(x,shape1,shape2,lower.tail=lower.tail)-t} > > uniroot(fn,interval=c(0,1))$root > > } > > Qbeta <- Vectorize(Qbeta0,c("t","shape1","shape2")) > > curve(fun,from=1,to=4) > > curve(fun(x,f=Qbeta),add=TRUE,col=2) > > > ______________________________________________ > > R-devel at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >
On 2020-03-26 4:02 a.m., Martin Maechler wrote:>>>>>> Ben Bolker >>>>>> on Wed, 25 Mar 2020 21:09:16 -0400 writes: > > > I've discovered an infelicity (I guess) in qbeta(): it's not a bug, > > since there's a clear warning about lack of convergence of the numerical > > algorithm ("full precision may not have been achieved"). I can work > > around this, but I'm curious why it happens and whether there's a better > > workaround -- it doesn't seem to be in a particularly extreme corner of > > parameter space. It happens, e.g., for these parameters: > > > phi <- 1.1 > > i <- 0.01 > > t <- 0.001 > > shape1 = i/phi ## 0.009090909 > > shape2 = (1-i)/phi ## 0.9 > > qbeta(t,shape1,shape2) ## 5.562685e-309 > > ## brute-force uniroot() version, see below > > Qbeta0(t,shape1,shape2) ## 0.9262824 > > > The qbeta code is pretty scary to read: the warning "full precision > > may not have been achieved" is triggered here: > > > https://github.com/wch/r-source/blob/f8d4d7d48051860cc695b99db9be9cf439aee743/src/nmath/qbeta.c#L530 > > > Any thoughts? > > Well, qbeta() is mostly based on inverting pbeta() and pbeta() > has *several* "dangerous" corners in its parameter spaces > {in some cases, it makes sense to look at the 4 different cases > log.p = TRUE/FALSE // lower.tail = TRUE/FALSE separately ..} > > pbeta() itself is based on the most complex numerical code in > all of base R, i.e., src/nmath/toms708.c and that algorithm > (TOMS 708) had been sophisticated already when it was published, > and it has been improved and tweaked several times since being > part of R, notably for the log.p=TRUE case which had not been in > the focus of the publication and its algorithm. > [[ NB: part of this you can read when reading help(pbeta) to the end ! ]] > > I've spent many "man weeks", or even "man months" on pbeta() and > qbeta(), already and have dreamed to get a good student do a > master's thesis about the problem and potential solutions I've > looked into in the mean time. > > My current gut feeling is that in some cases, new approximations > are necessary (i.e. tweaking of current approximations is not > going to help sufficiently). > > Also not (in the R sources) tests/p-qbeta-strict-tst.R > a whole file of "regression tests" about pbeta() and qbeta() > {where part of the true values have been computed with my CRAN > package Rmpfr (for high precision computation) with the > Rmpfr::pbetaI() function which gives arbitrarily precise pbeta() > values but only when (a,b) are integers -- that's the "I" in pbetaI(). > > Yes, it's intriguing ... and I'll look into your special > findings a bit later today. > > > > Should I report this on the bug list? > > Yes, please. Not all problem of pbeta() / qbeta() are part yet, > of R's bugzilla data base, and maybe this will help to draw > more good applied mathematicians look into it.Will report. I'm not at all surprised that this is a super-tough problem. The only part that was surprising to me was that my naive uniroot-based solution worked (for this particular corner of parameter space where qbeta() has trouble: it was terrible elsewhere, so now I'm using a hybrid solution where I use my brute-force uniroot thing if I get a warning from qbeta(). I hesitated to even bring it up because I know you're really busy, but I figured it was better to tag it now and let you deal with it some time later. Bugzilla report at https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17746 cheers Ben Bolker> > > > Martin Maechler > ETH Zurich and R Core team > (I'd call myself the "dpq-hacker" within R core -- related to > my CRAN package 'DPQ') > > > > A more general illustration: > > http://www.math.mcmaster.ca/bolker/misc/qbeta.png > > > ==> > fun <- function(phi,i=0.01,t=0.001, f=qbeta) { > > f(t,shape1=i/phi,shape2=(1-i)/phi, lower.tail=FALSE) > > } > > ## brute-force beta quantile function > > Qbeta0 <- function(t,shape1,shape2,lower.tail=FALSE) { > > fn <- function(x) {pbeta(x,shape1,shape2,lower.tail=lower.tail)-t} > > uniroot(fn,interval=c(0,1))$root > > } > > Qbeta <- Vectorize(Qbeta0,c("t","shape1","shape2")) > > curve(fun,from=1,to=4) > > curve(fun(x,f=Qbeta),add=TRUE,col=2) > > > ______________________________________________ > > R-devel at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel >