Kurt.Hornik@wu-wien.ac.at
2003-Aug-21 20:20 UTC
[Rd] The two chisq.test p values differ when the contingency table is transposed! (PR#3896)
>>>>> dmurdoch writes:

>> Date: Wed, 16 Jul 2003 01:27:25 +0200 (MET DST)
>> From: shitao@ucla.edu

>>> x
>>      [,1] [,2]
>> [1,]  149  151
>> [2,]    1    8
>>> c2x<-chisq.test(x, simulate.p.value=T, B=100000)$p.value
>>> for(i in (1:20)){c2x<-c(c2x,chisq.test(x, simulate.p.value=T,B=100000)$p.value)}
>>> c2tx<-chisq.test(t(x), simulate.p.value=T, B=100000)$p.value
>>> for(i in (1:20)){c2tx<-c(c2tx,chisq.test(t(x), simulate.p.value=T, B=100000)$p.value)}
>>> cbind(c2x,c2tx)
>>          c2x    c2tx
>> [1,] 0.03711 0.01683
>> [2,] 0.03717 0.01713

> The problem is in ctest/R/chisq.test.R, where the p-value is
> calculated as

>     STATISTIC <- sum((x - E) ^ 2 / E)
>     PARAMETER <- NA
>     PVAL <- sum(tmp$results >= STATISTIC) / B

> Here tmp$results is a collection of simulated chi-squared values, but
> because of different rounding, the statistics corresponding to tables
> equal to the observed table are slightly smaller than the value
> calculated in STATISTIC, and effectively the p-value is calculated as

>     PVAL <- sum(tmp$results > STATISTIC) / B

> instead.

> What's the appropriate fix here?

>     PVAL <- sum(tmp$results > STATISTIC - .Machine$double.eps^0.5) / B

> works on this example, but is there something better?

Argh.  Very interesting ...

I think it works to use

    STATISTIC <- sum(sort((x - E) ^ 2 / E, decreasing = TRUE))

instead: this starts by summing the big values, and hence, if anything,
slightly 'underestimates' the real value, which is fine for the
comparisons.

Fix committed to r-devel.

Thanks for looking into this.

-k
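[For illustration, here is a minimal R sketch of the underlying effect; it is
not the actual chisq.test internals, and the expected counts E are recomputed
here with outer() just for the example:]

    ## Floating-point addition is not associative, so summing the cell
    ## contributions in a different order (as happens when the table is
    ## transposed, or when the simulated statistics are computed separately)
    ## can give a result that differs by a few ulps, and a strict ">="
    ## comparison then silently drops exact ties.
    x <- matrix(c(149, 1, 151, 8), nrow = 2)
    E <- outer(rowSums(x), colSums(x)) / sum(x)   # expected counts under independence

    terms  <- (x - E)^2 / E                       # cell contributions for x
    termsT <- (t(x) - t(E))^2 / t(E)              # same numbers, summed in another order

    sum(terms) - sum(termsT)                      # may be a few multiples of
                                                  # .Machine$double.eps rather than 0

    ## The committed fix adds the contributions from largest to smallest, so
    ## the observed statistic, if anything, slightly underestimates the exact
    ## value, which keeps the ">=" comparison against the simulated values safe:
    STATISTIC <- sum(sort((x - E)^2 / E, decreasing = TRUE))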