shitao@ucla.edu
2003-Jul-16 01:27 UTC
[Rd] The two chisq.test p values differ when the contingency table is transposed! (PR#3486)
Full_Name: Tao Shi
Version: 1.7.0
OS: Windows XP Professional
Submission from: (NULL) (149.142.163.65)

> x
     [,1] [,2]
[1,]  149  151
[2,]    1    8
> c2x<-chisq.test(x, simulate.p.value=T, B=100000)$p.value
> for(i in (1:20)){c2x<-c(c2x,chisq.test(x,simulate.p.value=T,B=100000)$p.value)}
> c2tx<-chisq.test(t(x), simulate.p.value=T, B=100000)$p.value
> for(i in (1:20)){c2tx<-c(c2tx,chisq.test(t(x), simulate.p.value=T,
+ B=100000)$p.value)}
> cbind(c2x,c2tx)
          c2x    c2tx
 [1,] 0.03711 0.01683
 [2,] 0.03717 0.01713
 [3,] 0.03709 0.01609
 [4,] 0.03833 0.01657
 [5,] 0.03696 0.01668
 [6,] 0.03698 0.01660
 [7,] 0.03704 0.01805
 [8,] 0.03731 0.01699
 [9,] 0.03746 0.01683
[10,] 0.03671 0.01676
[11,] 0.03589 0.01689
[12,] 0.03684 0.01670
[13,] 0.03678 0.01709
[14,] 0.03742 0.01658
[15,] 0.03734 0.01664
[16,] 0.03723 0.01778
[17,] 0.03690 0.01615
[18,] 0.03650 0.01621
[19,] 0.03759 0.01740
[20,] 0.03712 0.01653
[21,] 0.03788 0.01702
dmurdoch@pair.com
2003-Aug-15 16:41 UTC
[Rd] The two chisq.test p values differ when the contingency table is transposed! (PR#3486)
>Date: Wed, 16 Jul 2003 01:27:25 +0200 (MET DST)
>From: shitao@ucla.edu

>> x
>     [,1] [,2]
>[1,]  149  151
>[2,]    1    8
>> c2x<-chisq.test(x, simulate.p.value=T, B=100000)$p.value
>> for(i in (1:20)){c2x<-c(c2x,chisq.test(x,
>simulate.p.value=T,B=100000)$p.value)}
>> c2tx<-chisq.test(t(x), simulate.p.value=T, B=100000)$p.value
>> for(i in (1:20)){c2tx<-c(c2tx,chisq.test(t(x), simulate.p.value=T,
>+ B=100000)$p.value)}
>> cbind(c2x,c2tx)
>          c2x    c2tx
> [1,] 0.03711 0.01683
> [2,] 0.03717 0.01713

The problem is in ctest/R/chisq.test.R, where the p-value is calculated as

    STATISTIC <- sum((x - E) ^ 2 / E)
    PARAMETER <- NA
    PVAL <- sum(tmp$results >= STATISTIC) / B

Here tmp$results is a collection of simulated chi-square values. Because of
different rounding, the statistics for simulated tables equal to the observed
table come out slightly smaller than the value stored in STATISTIC, so those
tables are not counted and the p-value is effectively calculated as

    PVAL <- sum(tmp$results > STATISTIC) / B

instead.

What's the appropriate fix here?

    PVAL <- sum(tmp$results > STATISTIC - .Machine$double.eps^0.5) / B

works on this example, but is there something better?

Duncan Murdoch
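To illustrate the rounding issue described above, here is a minimal sketch in R. It computes the chi-square statistic for the reported table in two algebraically equivalent ways, which may differ in the last few bits, and then compares an exact `>=` count with the tolerance-based count proposed in the message. The helper name `pval_tol` and the toy vector `sim` are hypothetical and used only for illustration; this is not the code in chisq.test.R.

```r
# Observed table from the report
x <- matrix(c(149, 1, 151, 8), nrow = 2)

# Expected counts under independence
E <- outer(rowSums(x), colSums(x)) / sum(x)

# Two algebraically equivalent forms of the chi-square statistic.
# In floating point they are not guaranteed to be bit-identical.
s_direct <- sum((x - E)^2 / E)
s_alt    <- sum(x^2 / E) - sum(x)

# Hypothetical helper: count simulated statistics that are at least as
# large as the observed one, allowing a small absolute tolerance
# (the same tolerance proposed in the message above).
pval_tol <- function(sim, observed, tol = .Machine$double.eps^0.5) {
  sum(sim > observed - tol) / length(sim)
}

# Toy "simulated" values (an assumption for illustration): one statistic
# for a table equal to the observed table but computed by the other
# formula, one clearly smaller value, one clearly larger value.
sim <- c(s_alt, 1, 10)

p_exact <- sum(sim >= s_direct) / length(sim)  # depends on rounding of s_alt
p_tol   <- pval_tol(sim, s_direct)             # counts s_alt regardless
```

With the tolerance, the tie is counted no matter which way the rounding falls, whereas the exact comparison can drop it. An absolute tolerance is only one possible choice; a tolerance relative to the size of the statistic would be another.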