Dear R helpers,
I would like to ask why polr occasionally generates results that look very
odd.
I have been trying to compare the power of proportional odds logistic
regression with
the Wilcoxon test. I generated random samples, applied both tests and
extracted and
compared the p-values, thus:-
library(MASS)
c1=rep(NA,100); c2=c1
for (run in 1:100)
{
dat=c(rbinom(20,12,0.65),rbinom(20,12,0.35))
grp=c(rep(0,20),rep(1,20))
fit1=polr(ordered(dat)~grp, control=c(maxiter=10000, trace=0))
fit2=wilcox.test(dat~grp)
#extract t-value from fit1 and find associated p-value
c1[run]=pt(as.numeric(unlist(summary(fit1))[((nlevels(ordered(dat)))*2)+1]),
fit1$df.residual)
c2[run]=fit2$p.value
if (c1[run]>0.2 & c2[run]<0.01)
{print(rbind(c1,c2)); print(rbind(grp,dat)); print(fit2);
print(summary(fit1)); stop()}
} # end for run
The p-values from polr are mostly comparable with those from Wilcoxon test.
But, sometimes,
polr gives a very small t-value, when the groups are obviously different.
For example:-
> rbind(c1,c2)
[,1] [,2] [,3] [,4] [,5]
[,6]
c1 5.593865e-05 2.442332e-04 0.001733831 1.606033e-04 2.412809e-04
6.636155e-05
c2 3.091763e-06 7.549811e-05 0.001548798 4.279157e-05 8.251237e-05
1.947336e-07
[,7] [,8] [,9] [,10] [,11] [,12] [,13]
[,14]
c1 5.622105e-05 9.373143e-05 3.422841e-05 4.630825e-01 NA NA NA
NA
c2 1.522268e-06 1.319416e-05 4.393431e-07 9.619527e-08 NA NA NA
NA
> rbind(grp,dat)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[,14]
grp 0 0 0 0 0 0 0 0 0 0 0 0 0
0
dat 8 7 6 9 9 6 10 8 7 8 6 9 7
9
[,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
[,27]
grp 0 0 0 0 0 0 1 1 1 1 1 1
1
dat 9 7 6 7 12 8 3 6 6 3 5 3
3
[,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39]
[,40]
grp 1 1 1 1 1 1 1 1 1 1 1 1
1
dat 3 4 5 5 3 5 5 2 4 5 2 4
3
> fit2
Wilcoxon rank sum test with continuity correction
data: dat by grp
W = 396, p-value = 9.62e-08
alternative hypothesis: true mu is not equal to 0
Re-fitting to get Hessian
> summary(fit1)
Call:
polr(formula = ordered(dat) ~ grp, control = c(maxiter = 10000,
trace = 0))
Coefficients:
Value Std. Error t value
grp -15.82468 169.3329 -0.0934531
Intercepts:
Value Std. Error t value
2|3 -18.0223 169.3342 -0.1064
3|4 -16.0254 169.3329 -0.0946
4|5 -15.4192 169.3327 -0.0911
5|6 -13.6274 169.3314 -0.0805
6|7 -1.3866 0.5591 -2.4803
7|8 -0.2011 0.4495 -0.4474
8|9 0.6187 0.4688 1.3198
9|10 2.1965 0.7453 2.9472
10|12 2.9441 1.0259 2.8698
Residual Deviance: 124.4085
As far as I can see, there is no error message from polr. Could someone let
me know
what I am doing wrong?
Thanks, in advance,
Jonathan Williams
OPTIMA
Radcliffe Infirmary
Woodstock Road
OXFORD OX2 6HE
Tel +1865 (2)24356
Jonathan Williams
OPTIMA
Radcliffe Infirmary
Woodstock Road
OXFORD OX2 6HE
Tel +1865 (2)24356