thr3ads.net - R help - [R] Weird differing results when using the Wilcoxon-test [Aug 2010]

If this information is useful, please help other people find it:
Share via:

Cedric Laczny

2010-Aug-17 11:48 UTC

[R] Weird differing results when using the Wilcoxon-test

Hi,

I became a little bit confused when working with the Wilcoxon test in R.
As far as I understood, there are mainly two versions:
1) wilcox.test{stats}, which is the default and an approximation, especially, 
when ties are involved
2) wilcox_test{coin}, which does calculate the distribution _exactly_ even, 
with ties.

I have the following scenario:

#---BeginCode---
# big example
size = 60
big1 = rnorm(size, 0, 1)
big2 = rnorm(size, 0.5, 1

g1f = rep(1, size)
g2f = rep(2, size)
big = c(big1, big2)
data_frame = data.frame(big, gr=as.factor(c(g1f, g2f)))

wilcox_approx = wilcox.test(big1, big2)
wilcox_exact = wilcox_test(big ~ gr, data=data_frame,
distribution="exact")
#---EndCode---

I found here http://www-stat.stanford.edu/~susan/courses/s141/hononpara.pdf 
that wilcox.test (at least for the signed rank test) relies on exact 
(p-)values until N = n1 + n2 = 50.
I can reproduce this, when using e.g. size = 15. The p-values then are the 
same, as I would expect it, having read the info from the link.

#---BeginCode---
print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote=F)
print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote=F)
#---EndCode---

That said, if I set e.g. size = 60, then the p-values of wilcox.test and 
wilcox_test differ, as expected.

What's puzzling me particularly is the differing results when wanting to 
calculate the p-value manually, for bigger sample sizes.

So, if we get the W-score from wilcox.test:

#---BeginCode---
W_big = wilcox.test(big1, big2))$statistic
#---EndCode---

and "convert" it to a Z-score, like this:

#---BeginCode---
mu_big = (size^2)/2
sd_big = sqrt(size*size*(size + size + 1)/12)
N = size + size
sd_big_corr = sqrt( (size * size) / (N * (N - 1)) * (N^3 - N) / 12 )

Z_big = (((W_big - mu_big)/sd_big)
#---EndCode---

The Z-Score (Z_big) is equal to the statistic of wilcox_test.
So far so good. And now comes the main problem.
When I follow the documentation correctly, the p-value for a given W-score/-
statistic ist calculated using the normal-approximation with the Z-score.
However, when I do that, I get a different result than what I would expect.
Because I would expect the p-value of wilcox.test to be equal to 
2*pnorm(Z_big), which is in fact _not_ equal. Please see:

#---BeginCode---
p_value_manual = 2 * pnorm(Z_big)

print("--- Resulting pvalues --- ", quote=F)
print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote=F)
print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote=F)
print(paste("P-value manual:", p_value_manual), quote=F)
#---EndCode---

So how is the calculation of the p-value performed in wilcox.test, when the 
sample sizes are big? Because this might explain why the value differs from 
that being calculated manually.

Best regards,

Cedric

Thomas Lumley

2010-Aug-17 15:50 UTC

head link

[R] Weird differing results when using the Wilcoxon-test

After fixing the parentheses in your code so it does run, it seems that the
difference is that wilcox.test defaults to using a continuity correction and
your manual calculation does not.   Use wilcox.test(big1, big2, correct=FALSE).

    -thomas

On Tue, 17 Aug 2010, Cedric Laczny wrote:
> Hi,
>
> I became a little bit confused when working with the Wilcoxon test in R.
> As far as I understood, there are mainly two versions:
> 1) wilcox.test{stats}, which is the default and an approximation,
especially,
> when ties are involved
> 2) wilcox_test{coin}, which does calculate the distribution _exactly_ even,
> with ties.
>
> I have the following scenario:
>
> #---BeginCode---
> # big example
> size = 60
> big1 = rnorm(size, 0, 1)
> big2 = rnorm(size, 0.5, 1
>
> g1f = rep(1, size)
> g2f = rep(2, size)
> big = c(big1, big2)
> data_frame = data.frame(big, gr=as.factor(c(g1f, g2f)))
>
> wilcox_approx = wilcox.test(big1, big2)
> wilcox_exact = wilcox_test(big ~ gr, data=data_frame,
distribution="exact")
> #---EndCode---
>
> I found here http://www-stat.stanford.edu/~susan/courses/s141/hononpara.pdf
> that wilcox.test (at least for the signed rank test) relies on exact
> (p-)values until N = n1 + n2 = 50.
> I can reproduce this, when using e.g. size = 15. The p-values then are the
> same, as I would expect it, having read the info from the link.
>
> #---BeginCode---
> print(paste("Wilcox approx p-value:", wilcox_approx$p.value),
quote=F)
> print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)),
quote=F)
> #---EndCode---
>
> That said, if I set e.g. size = 60, then the p-values of wilcox.test and
> wilcox_test differ, as expected.
>
> What's puzzling me particularly is the differing results when wanting
to
> calculate the p-value manually, for bigger sample sizes.
>
> So, if we get the W-score from wilcox.test:
>
> #---BeginCode---
> W_big = wilcox.test(big1, big2))$statistic
> #---EndCode---
>
> and "convert" it to a Z-score, like this:
>
> #---BeginCode---
> mu_big = (size^2)/2
> sd_big = sqrt(size*size*(size + size + 1)/12)
> N = size + size
> sd_big_corr = sqrt( (size * size) / (N * (N - 1)) * (N^3 - N) / 12 )
>
> Z_big = (((W_big - mu_big)/sd_big)
> #---EndCode---
>
> The Z-Score (Z_big) is equal to the statistic of wilcox_test.
> So far so good. And now comes the main problem.
> When I follow the documentation correctly, the p-value for a given
W-score/-
> statistic ist calculated using the normal-approximation with the Z-score.
> However, when I do that, I get a different result than what I would expect.
> Because I would expect the p-value of wilcox.test to be equal to
> 2*pnorm(Z_big), which is in fact _not_ equal. Please see:
>
> #---BeginCode---
> p_value_manual = 2 * pnorm(Z_big)
>
> print("--- Resulting pvalues --- ", quote=F)
> print(paste("Wilcox approx p-value:", wilcox_approx$p.value),
quote=F)
> print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)),
quote=F)
> print(paste("P-value manual:", p_value_manual), quote=F)
> #---EndCode---
>
> So how is the calculation of the p-value performed in wilcox.test, when the
> sample sizes are big? Because this might explain why the value differs from
> that being calculated manually.
>
> Best regards,
>
> Cedric
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle

Cedric Laczny

2010-Aug-18 09:55 UTC

head link

[R] Weird differing results when using the Wilcoxon-test

I was able to trace down the unexpected behavior to the following line
SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) - 
                sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y - 
                  1))))
My calculations of the Z-score for the normal approximation where based on 
using the standard deviation for ranks _without_ ties. The above formula seems 
to account for ties and thus, yields a slightly different z-score. However, the 
data seems to include at most 1 tie (based on rnorm), so it would be the same 
result as if it contained no tie (1^3 - 1 has the same result as 0^3 - 0, 
obviously ;) ) and thus I would expect the result to be the same as when using 
the formula for the standard deviation without ties.
Interestigly, this z-score is also different from that reported by wilcox_test 
(exact calculation of p-value).
I have not been able to find this formula in any textbook nearby or at any 
website. Therefore I am wondering where it does actually com from?

Best,

Cedric

On Tuesday, 17. August 2010 13:48:10 Cedric Laczny
wrote:> Hi,
> 
> I became a little bit confused when working with the Wilcoxon test in R.
> As far as I understood, there are mainly two versions:
> 1) wilcox.test{stats}, which is the default and an approximation,
> especially, when ties are involved
> 2) wilcox_test{coin}, which does calculate the distribution _exactly_ even,
> with ties.
> 
> I have the following scenario:
> 
> #---BeginCode---
> # big example
> size = 60
> big1 = rnorm(size, 0, 1)
> big2 = rnorm(size, 0.5, 1
> 
> g1f = rep(1, size)
> g2f = rep(2, size)
> big = c(big1, big2)
> data_frame = data.frame(big, gr=as.factor(c(g1f, g2f)))
> 
> wilcox_approx = wilcox.test(big1, big2)
> wilcox_exact = wilcox_test(big ~ gr, data=data_frame,
distribution="exact")
> #---EndCode---
> 
> I found here http://www-stat.stanford.edu/~susan/courses/s141/hononpara.pdf
> that wilcox.test (at least for the signed rank test) relies on exact
> (p-)values until N = n1 + n2 = 50.
> I can reproduce this, when using e.g. size = 15. The p-values then are the
> same, as I would expect it, having read the info from the link.
> 
> #---BeginCode---
> print(paste("Wilcox approx p-value:", wilcox_approx$p.value),
quote=F)
> print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)),
quote=F)
> #---EndCode---
> 
> That said, if I set e.g. size = 60, then the p-values of wilcox.test and
> wilcox_test differ, as expected.
> 
> What's puzzling me particularly is the differing results when wanting
to
> calculate the p-value manually, for bigger sample sizes.
> 
> So, if we get the W-score from wilcox.test:
> 
> #---BeginCode---
> W_big = wilcox.test(big1, big2))$statistic
> #---EndCode---
> 
> and "convert" it to a Z-score, like this:
> 
> #---BeginCode---
> mu_big = (size^2)/2
> sd_big = sqrt(size*size*(size + size + 1)/12)
> N = size + size
> sd_big_corr = sqrt( (size * size) / (N * (N - 1)) * (N^3 - N) / 12 )
> 
> Z_big = (((W_big - mu_big)/sd_big)
> #---EndCode---
> 
> The Z-Score (Z_big) is equal to the statistic of wilcox_test.
> So far so good. And now comes the main problem.
> When I follow the documentation correctly, the p-value for a given
> W-score/- statistic ist calculated using the normal-approximation with the
> Z-score. However, when I do that, I get a different result than what I
> would expect. Because I would expect the p-value of wilcox.test to be
> equal to
> 2*pnorm(Z_big), which is in fact _not_ equal. Please see:
> 
> #---BeginCode---
> p_value_manual = 2 * pnorm(Z_big)
> 
> print("--- Resulting pvalues --- ", quote=F)
> print(paste("Wilcox approx p-value:", wilcox_approx$p.value),
quote=F)
> print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)),
quote=F)
> print(paste("P-value manual:", p_value_manual), quote=F)
> #---EndCode---
> 
> So how is the calculation of the p-value performed in wilcox.test, when the
> sample sizes are big? Because this might explain why the value differs from
> that being calculated manually.
> 
> Best regards,
> 
> Cedric
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented,
> minimal, self-contained, reproducible code.

peter dalgaard

2010-Aug-18 13:47 UTC

head link

[R] Weird differing results when using the Wilcoxon-test

On Aug 18, 2010, at 11:55 AM, Cedric Laczny wrote:
> I was able to trace down the unexpected behavior to the following line
> SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) - 
>                sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y - 
>                  1))))
> My calculations of the Z-score for the normal approximation where based on 
> using the standard deviation for ranks _without_ ties. The above formula
seems
> to account for ties and thus, yields a slightly different z-score. However,
the
> data seems to include at most 1 tie (based on rnorm), so it would be the
same
> result as if it contained no tie (1^3 - 1 has the same result as 0^3 - 0, 
> obviously ;) ) and thus I would expect the result to be the same as when
using
> the formula for the standard deviation without ties.
Note the definition of NTIES <- table(r), counting the number of observations
tied for a particular rank, so it is all ones if and only if there are NO ties
in data.

(If you are in paper-and-pencil mode, these formulas are fairly easily worked
out once you realize that you only need the mean and variance of the rank of a
single observation -- the covariances are C(R1,R2) = -1/(N-1) V(R1) because of
symmetry and the fact that the sum of all N ranks is fixed.)

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

Seemingly Similar Threads

Search for more maybe matching threads

R help - Aug 2010 - Weird differing results when using the Wilcoxon-test

[R] Weird differing results when using the Wilcoxon-test

[R] Weird differing results when using the Wilcoxon-test

[R] Weird differing results when using the Wilcoxon-test

[R] Weird differing results when using the Wilcoxon-test

Seemingly Similar Threads