Hello.
I am probably wrong... I am wondering if it could have a mistake
in the code of the ansari.test function. For me, it seems that the function
do not recover the p value at the correct side of the normal law N(0, 1) when it
use
the normal approximation (presence of ties) in a one tailed test.
Exemple :
quanti<-c(197, 205, 228, 234, 237, 195, 233, 226, 244, 227, 259, 185, 198,
253,
207, 250, 215, 239, 261, 247, 215, 241, 223, 247, 231, 221, 272, 215, 222, 205,
256, 253, 237, 262, 266, 243, 242, 236)
quali<-as.factor(c("sud", "sud", "sud",
"sud", "sud", "sud", "sud",
"sud",
"sud", "sud", "sud", "sud",
"sud", "sud", "sud", "sud",
"sud", "sud", "sud",
"ouest", "ouest", "ouest", "ouest",
"ouest", "ouest", "ouest", "ouest",
"ouest",
"ouest", "ouest", "ouest", "ouest",
"ouest", "ouest", "ouest", "ouest",
"ouest",
"ouest"))
obs2<-data.frame(quanti,quali)
obs2
xr<-sort(obs2[,1])
ci<-obs2[,2]
cr<-ci[order(obs2[,1])]
n<-length(obs2[,1])
if(n%%2==0) {rAB<-c(1:(n/2),(n/2):1)} else{rAB<-c(1:((n-1)/2), (n+1)/2,
((n-1)/2):1)}
obs3<-data.frame(rAB,xr,cr)
ties<-as.numeric(names(which((table(obs2[,1])>1)==TRUE))) # extrait les xi
r?p?t?s
obs4<-obs3
for(i in
ties){obs4$rAB[(which(obs3$xr==i))]<-mean(obs3$rAB[(which(obs3$xr==i))])}
obs4
classe1<- split(obs2[,1], obs2[,2])$sud
classe2<- split(obs2[,1], obs2[,2])$ouest
ansari.test(classe1, classe2, alternative = "greater")
# AB statistic :
AB1.calc<-sum(subset(obs4[,1],obs4[,3]=="sud"))
AB1.calc
# Z statistic :
n1<-length(classe1)
n2<-length(classe2)
n<-n1+n2
n         # n is even
mAB1<-n1*(n+2)/4
tj<- table(obs4$xr)
rABj<- obs4$rAB[c(which(diff(obs4$xr)!=0),n)]
s2.AB<-(n1*n2*(16*sum(tj*rABj^2)-n*(n+2)^2))/(16*n*(n-1))
Z.calc<-(AB1.calc-mAB1)/ (s2.AB)^0.5
Z.calc
# p value :
pnorm(Z.calc, lower.tail = FALSE)
pnorm(Z.calc)
Z is negative. So at the right of the normal law, the p value should be
over 0.5 but the ansari.test function gives a p-value = 0.3737 .
I used the same formule for Z as in the code. But what I can see, is it
may have a mistake in this part of the code :
p <- pnorm(normalize(STATISTIC, r, TIES))
        PVAL <- switch(alternative,
                       two.sided = 2 * min(p, 1 - p),
                       less = 1 - p,
                       greater = p)
pnorm() is written without "lowertail = FALSE". So it should be :
less = p
greater = 1-p
Am I wrong ???
Thanks very much for your help.
Gael.
Gael Millot
UMR 7147 et Universite Paris 6
Equipe Recombinaison et instabilite genetique
Pav Trouillet Rossignol 5eme etage
Institut Curie
26 rue d'Ulm
75248 Paris Cedex 05
tel : 01 42 34 66 34
fax : 01 42 34 66 44
Email : gael.millot at curie.fr