thr3ads.net - R devel - [Rd] inconsistency in pchisq (PR#7099) [Jul 2004]

Richard.Mott@well.ox.ac.uk

2004-Jul-16 16:30 UTC

[Rd] inconsistency in pchisq (PR#7099)

Full_Name: Richard Mott
Version: 1.9.0
OS: Windows XP
Submission from: (NULL) (81.178.233.208)


Shouldn't these give the same answer?
> pchisq(67.60644,df=1,lower.tail=F,ncp=0)
[1] 3.219647e-15
> pchisq(67.60644,df=1,lower.tail=F)
[1] 1.996145e-16

maechler@stat.math.ethz.ch

2004-Jul-16 18:17 UTC

>>>>> "Richard" == Richard Mott
<Richard.Mott@well.ox.ac.uk>
>>>>>     on Fri, 16 Jul 2004 16:30:05 +0200 (CEST) writes:
    Richard> Full_Name: Richard Mott Version: 1.9.0 OS: Windows
    Richard> XP Submission from: (NULL) (81.178.233.208)


    Richard> Shouldn't these give the same answer?

well, no,..
though I agree it should be mentioned in the documentation

    > pchisq(67.60644,df=1,lower.tail=F,ncp=0)
    [1] 3.219647e-15
    > pchisq(67.60644,df=1,lower.tail=F)
    [1] 1.996145e-16

The first call uses the internal code for the non-central chisq,
the second one the code for the central chisq.
In extreme cases (like the above), the results differ more than just
negligibly.
But we want to have both available, partly for experimental reasons.
BTW 1: There are still concrete plans (by me) to use better (and much
     larger) code for improved incomplete beta and gamma
     functions which are really at the base of many of these
     p*() functions [and often consequently the q*() ones] in R.

BTW 2: the same (difference between "no ncp" and "ncp=0")
       applies to   pt(*, ncp).

--
Martin Maechler

maechler@stat.math.ethz.ch

2004-Jul-17 13:21 UTC

>>>>> "Richard" == Richard Mott
<Richard.Mott@well.ox.ac.uk>
>>>>>     on Sat, 17 Jul 2004 11:12:23 +0100 (BST) writes:
    Richard> Martin - I agree that the p-values are essentially
    Richard> identical. i raised the issue because i can get
    Richard> negative p-values with the ncp=0 version. It is
    Richard> hard to reproduce this problem in the sense that
    Richard> the value of the chi-squared statistic that causes
    Richard> the phenomenon is identical to the one i sent you
    Richard> to 5 sig figures - i don't know how to print the
    Richard> full value in order to send it to you. Negative
    Richard> small probabilities could of course be treated as
    Richard> 0, but this creates problems when i take logs

sure, and much more of a problem, i.e., a clear bug.
--> so after all your bug report was well valid [-> CC'ed back to
R-bugs]
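
On the reproducibility point in the quoted message (not knowing how to print the full value): a minimal sketch, assuming the statistic is held in an ordinary numeric variable. The name x and its value are stand-ins, not the actual statistic from the glm() fit. Seventeen significant digits are enough to identify a double uniquely, so the printed string can be pasted into a report.

```r
# Hypothetical stand-in for a statistic computed elsewhere (e.g. from glm()).
x <- 67.60644

# Show the value with 17 significant digits instead of the default 7.
print(x, digits = 17)

# The same digits as a character string, convenient for pasting into an e-mail.
s <- sprintf("%.17g", x)
cat(s, "\n")
```
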

To see it more extremely, try the following :

 > curve(pchisq(x, df=1 ,       lower=FALSE), 65, 70, ylim=c(-1,4)*4e-16, col=2)
 > curve(pchisq(x, df=1, ncp=0, lower=FALSE), 65, 70, add=TRUE)

The reason for this behavior is that internal pnchisq(*, lower=FALSE)
is at the moment simply equivalent to  1 - pnchisq(*, lower=TRUE),
and since the machine epsilon is about 2e-16 it's no wonder that
cancellation swamps everything for these extreme abscissa values.
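
The cancellation can be illustrated without the non-central code at all. The sketch below is an illustration under assumptions, not the internal pnchisq() implementation: it uses the central chisq path (no ncp argument) and compares the upper tail obtained by subtraction with the one computed directly. The variable names are made up for the example, and results from the ncp=0 path may differ between R versions.

```r
# Illustration only: central chi-squared, upper tail computed two ways.
x <- 67.60644

lower  <- pchisq(x, df = 1)                      # lower tail, extremely close to 1
naive  <- 1 - lower                              # upper tail by subtraction
direct <- pchisq(x, df = 1, lower.tail = FALSE)  # upper tail computed directly

eps <- .Machine$double.eps                       # about 2.2e-16

# Doubles just below 1 are spaced eps/2 apart, so 'naive' can only be a
# whole multiple of eps/2; anything finer is lost in the subtraction.
print(c(naive = naive, direct = direct, eps = eps))
print(naive / (eps / 2))
```

Because the subtraction cannot resolve anything smaller than eps/2, a computed lower tail that lands even marginally above 1 would turn into a small negative "probability", which is consistent with the negative values reported above.
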

Note that we could easily change the case  'ncp=0' to use the
central chisq,  but that's not the case for e.g. ncp = 0.001.

==> pchisq(*, ncp) definitely needs to be improved.

I have code to do it (using "Wiener germ approximations"), but have
not finished testing it yet.

    Richard> If you like i can send you the data and program
    Richard> used to generate the problem - the chi-squared
    Richard> value is generated from a call to glm()

    Richard> Richard

    >>>>>>> "Richard" == Richard Mott
<rmott@well.ox.ac.uk> on
    >>>>>>> Sat, 17 Jul 2004 01:42:18 +0100 writes:
    >>
    Richard> Martin - thanks - so is it always better to use
    Richard> pchisq(67.60644,df=1,lower.tail=F) if indeed ncp=0
    Richard> ?
    >>  Yes, (for the algorithms currently in use; and as I
    >> said, these are destined to be improved).
    >> 
    >> However I do wonder: In which situations would it matter
    >> to have P = 2e-16 vs P = 3e-15 ??
    >> 
    >> Very often everything depends on underlying model
    >> assumptions, and such extreme tail probabilities are
    >> typically extremely dependent, i.e. would vary heavily by
    >> small changes in the underlying model.
    >> 
    >> Regards, Martin
