thr3ads.net - R devel - [Rd] unstable corner of parameter space for qbeta? [Mar 2020]

If this information is useful, please help other people find it:
Share via:

J C Nash

2020-Mar-26 13:29 UTC

[Rd] unstable corner of parameter space for qbeta?

Given that a number of us are housebound, it might be a good time to try to
improve the approximation. It's not an area where I have much expertise, but
in
looking at the qbeta.c code I see a lot of root-finding, where I do have some
background. However, I'm very reluctant to work alone on this, and will ask
interested others to email off-list. If there are others, I'll report back.

Ben: Do you have an idea of parameter region where approximation is poor?
I think that it would be smart to focus on that to start with.

Martin: On a separate precision matter, did you get my query early in year about
double
length accumulation of inner products of vectors in Rmpfr? R-help more or
less implied that Rmpfr does NOT use extra length. I've been using David
Smith's FM Fortran where the DOT_PRODUCT does use double length, but it
would be nice to have that in R. My attempts to find "easy"
workarounds have
not been successful, but I'll admit that other things took precedence.

Best,

John Nash



On 2020-03-26 4:02 a.m., Martin Maechler wrote:>>>>>> Ben Bolker 
>>>>>>     on Wed, 25 Mar 2020 21:09:16 -0400 writes:
> 
>     > I've discovered an infelicity (I guess) in qbeta(): it's
not a bug,
>     > since there's a clear warning about lack of convergence of the
numerical
>     > algorithm ("full precision may not have been achieved").
I can work
>     > around this, but I'm curious why it happens and whether
there's a better
>     > workaround -- it doesn't seem to be in a particularly extreme
corner of
>     > parameter space. It happens, e.g., for 	these parameters:
> 
>     > phi <- 1.1
>     > i <- 0.01
>     > t <- 0.001
>     > shape1 = i/phi  ##  0.009090909
>     > shape2 = (1-i)/phi  ## 0.9
>     > qbeta(t,shape1,shape2)  ##  5.562685e-309
>     > ##  brute-force uniroot() version, see below
>     > Qbeta0(t,shape1,shape2)  ## 0.9262824
> 
>     > The qbeta code is pretty scary to read: the warning "full
precision
>     > may not have been achieved" is triggered here:
> 
>     >
https://github.com/wch/r-source/blob/f8d4d7d48051860cc695b99db9be9cf439aee743/src/nmath/qbeta.c#L530
> 
>     > Any thoughts?
> 
> Well,  qbeta() is mostly based on inverting pbeta()  and pbeta()
> has *several* "dangerous" corners in its parameter spaces
> {in some cases, it makes sense to look at the 4 different cases
>  log.p = TRUE/FALSE  //  lower.tail = TRUE/FALSE  separately ..}
> 
> pbeta() itself is based on the most complex numerical code in
> all of base R, i.e., src/nmath/toms708.c  and that algorithm
> (TOMS 708) had been sophisticated already when it was published,
> and it has been improved and tweaked several times since being
> part of R, notably for the log.p=TRUE case which had not been in
> the focus of the publication and its algorithm.
> [[ NB: part of this you can read when reading  help(pbeta)  to the end ! ]]
> 
> I've spent many "man weeks", or even "man months"
on pbeta() and
> qbeta(), already and have dreamed to get a good student do a
> master's thesis about the problem and potential solutions I've
> looked into in the mean time.
> 
> My current gut feeling is that in some cases, new approximations
> are necessary (i.e. tweaking of current approximations is not
> going to help sufficiently).
> 
> Also not (in the R sources)  tests/p-qbeta-strict-tst.R
> a whole file of "regression tests" about  pbeta() and qbeta()
> {where part of the true values have been computed with my CRAN
> package Rmpfr (for high precision computation) with the
> Rmpfr::pbetaI() function which gives arbitrarily precise pbeta()
> values but only when  (a,b) are integers -- that's the "I" in
pbetaI().
> 
> Yes, it's intriguing ... and I'll look into your special
> findings a bit later today.
> 
> 
>   > Should I report this on the bug list?
> 
> Yes, please.  Not all problem of pbeta() / qbeta() are part yet,
> of R's bugzilla data base,  and maybe this will help to draw
> more good applied mathematicians look into it.
> 
> 
> 
> Martin Maechler
> ETH Zurich and R Core team
> (I'd call myself the "dpq-hacker" within R core -- related to
>  my CRAN package 'DPQ')
> 
> 
>     > A more general illustration:
>     > http://www.math.mcmaster.ca/bolker/misc/qbeta.png
> 
>     > ==>     > fun <- function(phi,i=0.01,t=0.001, f=qbeta) {
>     > f(t,shape1=i/phi,shape2=(1-i)/phi, lower.tail=FALSE)
>     > }
>     > ## brute-force beta quantile function
>     > Qbeta0 <- function(t,shape1,shape2,lower.tail=FALSE) {
>     > fn <- function(x)
{pbeta(x,shape1,shape2,lower.tail=lower.tail)-t}
>     > uniroot(fn,interval=c(0,1))$root
>     > }
>     > Qbeta <-
Vectorize(Qbeta0,c("t","shape1","shape2"))
>     > curve(fun,from=1,to=4)
>     > curve(fun(x,f=Qbeta),add=TRUE,col=2)
> 
>     > ______________________________________________
>     > R-devel at r-project.org mailing list
>     > https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Martin Maechler

2020-Mar-26 14:34 UTC

head link

[Rd] unstable corner of parameter space for qbeta?

>>>>> J C Nash 
>>>>>     on Thu, 26 Mar 2020 09:29:53 -0400 writes:
    > Given that a number of us are housebound, it might be a good time to
try to
    > improve the approximation. It's not an area where I have much
expertise, but in
    > looking at the qbeta.c code I see a lot of root-finding, where I do
have some
    > background. However, I'm very reluctant to work alone on this, and
will ask
    > interested others to email off-list. If there are others, I'll
report back.

Hi John.
Yes, qbeta() {in its "main branches"}  does zero finding, but
zero finding of   pbeta(...) - p*   and I tried to explain in my
last e-mail that the real problem is that already pbeta() is not
accurate enough in some unstable corners ...
The order fixing should typically be
1) fix pbeta()
2) look at qbeta() which now may not even need a fix because its
   problems may have been entirely a consequence of pbeta()'s inaccuracies.
   And if there are cases where the qbeta() problems are not
   only pbeta's "fault", it is still true that the fixes that
   would still be needed crucially depend on the detailed
   working of the function whose zero(s) are sought, i.e.,  pbeta()

    > Ben: Do you have an idea of parameter region where approximation is
poor?
    > I think that it would be smart to focus on that to start with.

----------------------------

Rmpfr  matrix-/vector - products:

    > Martin: On a separate precision matter, did you get my query early in
year about double
    > length accumulation of inner products of vectors in Rmpfr? R-help more
or
    > less implied that Rmpfr does NOT use extra length. I've been using
David
    > Smith's FM Fortran where the DOT_PRODUCT does use double length,
but it
    > would be nice to have that in R. My attempts to find "easy"
workarounds have
    > not been successful, but I'll admit that other things took
precedence.

Well, the current development version of 'Rmpfr' on R-forge now
contains facilities to enlarge the precision of the computations
by a factor 'fPrec' with default 'fPrec = 1';
notably, instead of  x %*% y   (where the `%*%` cannot have more
than two arguments) does have a counterpart  matmult(x,y, ....)
which allows more arguments, namely 'fPrec', or directly
'precBits';
and of course there are  crossprod() and tcrossprod() one should
use when applicable and they also got the  'fPrec' and
'precBits' arguments.

{The %*% etc precision increase still does not work optimally
 efficiency wise, as it simply increases the precision of all
 computations by just increasing the precision of x and y (the inputs)}.

The whole  Matrix and Matrix-vector arithmetic is still
comparibly slow in Rmpfr .. mostly because I valued human time
(mine!) much higher than computer time in its implementation.
That's one reason I would never want to double the precision
everywhere as it decreases speed even more, and often times
unnecessarily: doubling the accuracy is basically "worst-case
scenario" precaution

Martin

J C Nash

2020-Mar-26 14:40 UTC

head link

[Rd] unstable corner of parameter space for qbeta?

Despite the need to focus on pbeta, I'm still willing to put in some effort.
But I find it really helps to have 2-3 others involved, since the questions back
and forth keep matters moving forward. Volunteers?

Thanks to Martin for detailed comments.

JN


On 2020-03-26 10:34 a.m., Martin Maechler wrote:>>>>>> J C Nash 
>>>>>>     on Thu, 26 Mar 2020 09:29:53 -0400 writes:
> 
>     > Given that a number of us are housebound, it might be a good time
to try to
>     > improve the approximation. It's not an area where I have much
expertise, but in
>     > looking at the qbeta.c code I see a lot of root-finding, where I
do have some
>     > background. However, I'm very reluctant to work alone on this,
and will ask
>     > interested others to email off-list. If there are others, I'll
report back.
> 
> Hi John.
> Yes, qbeta() {in its "main branches"}  does zero finding, but
> zero finding of   pbeta(...) - p*   and I tried to explain in my
> last e-mail that the real problem is that already pbeta() is not
> accurate enough in some unstable corners ...
> The order fixing should typically be
> 1) fix pbeta()
> 2) look at qbeta() which now may not even need a fix because its
>    problems may have been entirely a consequence of pbeta()'s
inaccuracies.
>    And if there are cases where the qbeta() problems are not
>    only pbeta's "fault", it is still true that the fixes that
>    would still be needed crucially depend on the detailed
>    working of the function whose zero(s) are sought, i.e.,  pbeta()
> 
>     > Ben: Do you have an idea of parameter region where approximation
is poor?
>     > I think that it would be smart to focus on that to start with.
> 
> ----------------------------
> 
> Rmpfr  matrix-/vector - products:
> 
>     > Martin: On a separate precision matter, did you get my query early
in year about double
>     > length accumulation of inner products of vectors in Rmpfr? R-help
more or
>     > less implied that Rmpfr does NOT use extra length. I've been
using David
>     > Smith's FM Fortran where the DOT_PRODUCT does use double
length, but it
>     > would be nice to have that in R. My attempts to find
"easy" workarounds have
>     > not been successful, but I'll admit that other things took
precedence.
> 
> Well, the current development version of 'Rmpfr' on R-forge now
> contains facilities to enlarge the precision of the computations
> by a factor 'fPrec' with default 'fPrec = 1';
> notably, instead of  x %*% y   (where the `%*%` cannot have more
> than two arguments) does have a counterpart  matmult(x,y, ....)
> which allows more arguments, namely 'fPrec', or directly
'precBits';
> and of course there are  crossprod() and tcrossprod() one should
> use when applicable and they also got the  'fPrec' and
> 'precBits' arguments.
> 
> {The %*% etc precision increase still does not work optimally
>  efficiency wise, as it simply increases the precision of all
>  computations by just increasing the precision of x and y (the inputs)}.
> 
> The whole  Matrix and Matrix-vector arithmetic is still
> comparibly slow in Rmpfr .. mostly because I valued human time
> (mine!) much higher than computer time in its implementation.
> That's one reason I would never want to double the precision
> everywhere as it decreases speed even more, and often times
> unnecessarily: doubling the accuracy is basically "worst-case
> scenario" precaution
> 
> Martin
>

Maybe Matching Threads

Search for more maybe matching threads

R devel - Mar 2020 - unstable corner of parameter space for qbeta?

[Rd] unstable corner of parameter space for qbeta?

[Rd] unstable corner of parameter space for qbeta?

[Rd] unstable corner of parameter space for qbeta?

Maybe Matching Threads