Cedric Laczny
2010-Aug-17 11:48 UTC
[R] Weird differing results when using the Wilcoxon-test
Hi, I became a little bit confused when working with the Wilcoxon test in R. As far as I understood, there are mainly two versions: 1) wilcox.test{stats}, which is the default and an approximation, especially, when ties are involved 2) wilcox_test{coin}, which does calculate the distribution _exactly_ even, with ties. I have the following scenario: #---BeginCode--- # big example size = 60 big1 = rnorm(size, 0, 1) big2 = rnorm(size, 0.5, 1 g1f = rep(1, size) g2f = rep(2, size) big = c(big1, big2) data_frame = data.frame(big, gr=as.factor(c(g1f, g2f))) wilcox_approx = wilcox.test(big1, big2) wilcox_exact = wilcox_test(big ~ gr, data=data_frame, distribution="exact") #---EndCode--- I found here http://www-stat.stanford.edu/~susan/courses/s141/hononpara.pdf that wilcox.test (at least for the signed rank test) relies on exact (p-)values until N = n1 + n2 = 50. I can reproduce this, when using e.g. size = 15. The p-values then are the same, as I would expect it, having read the info from the link. #---BeginCode--- print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote=F) print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote=F) #---EndCode--- That said, if I set e.g. size = 60, then the p-values of wilcox.test and wilcox_test differ, as expected. What's puzzling me particularly is the differing results when wanting to calculate the p-value manually, for bigger sample sizes. So, if we get the W-score from wilcox.test: #---BeginCode--- W_big = wilcox.test(big1, big2))$statistic #---EndCode--- and "convert" it to a Z-score, like this: #---BeginCode--- mu_big = (size^2)/2 sd_big = sqrt(size*size*(size + size + 1)/12) N = size + size sd_big_corr = sqrt( (size * size) / (N * (N - 1)) * (N^3 - N) / 12 ) Z_big = (((W_big - mu_big)/sd_big) #---EndCode--- The Z-Score (Z_big) is equal to the statistic of wilcox_test. So far so good. And now comes the main problem. When I follow the documentation correctly, the p-value for a given W-score/- statistic ist calculated using the normal-approximation with the Z-score. However, when I do that, I get a different result than what I would expect. Because I would expect the p-value of wilcox.test to be equal to 2*pnorm(Z_big), which is in fact _not_ equal. Please see: #---BeginCode--- p_value_manual = 2 * pnorm(Z_big) print("--- Resulting pvalues --- ", quote=F) print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote=F) print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote=F) print(paste("P-value manual:", p_value_manual), quote=F) #---EndCode--- So how is the calculation of the p-value performed in wilcox.test, when the sample sizes are big? Because this might explain why the value differs from that being calculated manually. Best regards, Cedric
Thomas Lumley
2010-Aug-17 15:50 UTC
[R] Weird differing results when using the Wilcoxon-test
After fixing the parentheses in your code so it does run, it seems that the difference is that wilcox.test defaults to using a continuity correction and your manual calculation does not. Use wilcox.test(big1, big2, correct=FALSE). -thomas On Tue, 17 Aug 2010, Cedric Laczny wrote:> Hi, > > I became a little bit confused when working with the Wilcoxon test in R. > As far as I understood, there are mainly two versions: > 1) wilcox.test{stats}, which is the default and an approximation, especially, > when ties are involved > 2) wilcox_test{coin}, which does calculate the distribution _exactly_ even, > with ties. > > I have the following scenario: > > #---BeginCode--- > # big example > size = 60 > big1 = rnorm(size, 0, 1) > big2 = rnorm(size, 0.5, 1 > > g1f = rep(1, size) > g2f = rep(2, size) > big = c(big1, big2) > data_frame = data.frame(big, gr=as.factor(c(g1f, g2f))) > > wilcox_approx = wilcox.test(big1, big2) > wilcox_exact = wilcox_test(big ~ gr, data=data_frame, distribution="exact") > #---EndCode--- > > I found here http://www-stat.stanford.edu/~susan/courses/s141/hononpara.pdf > that wilcox.test (at least for the signed rank test) relies on exact > (p-)values until N = n1 + n2 = 50. > I can reproduce this, when using e.g. size = 15. The p-values then are the > same, as I would expect it, having read the info from the link. > > #---BeginCode--- > print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote=F) > print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote=F) > #---EndCode--- > > That said, if I set e.g. size = 60, then the p-values of wilcox.test and > wilcox_test differ, as expected. > > What's puzzling me particularly is the differing results when wanting to > calculate the p-value manually, for bigger sample sizes. > > So, if we get the W-score from wilcox.test: > > #---BeginCode--- > W_big = wilcox.test(big1, big2))$statistic > #---EndCode--- > > and "convert" it to a Z-score, like this: > > #---BeginCode--- > mu_big = (size^2)/2 > sd_big = sqrt(size*size*(size + size + 1)/12) > N = size + size > sd_big_corr = sqrt( (size * size) / (N * (N - 1)) * (N^3 - N) / 12 ) > > Z_big = (((W_big - mu_big)/sd_big) > #---EndCode--- > > The Z-Score (Z_big) is equal to the statistic of wilcox_test. > So far so good. And now comes the main problem. > When I follow the documentation correctly, the p-value for a given W-score/- > statistic ist calculated using the normal-approximation with the Z-score. > However, when I do that, I get a different result than what I would expect. > Because I would expect the p-value of wilcox.test to be equal to > 2*pnorm(Z_big), which is in fact _not_ equal. Please see: > > #---BeginCode--- > p_value_manual = 2 * pnorm(Z_big) > > print("--- Resulting pvalues --- ", quote=F) > print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote=F) > print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote=F) > print(paste("P-value manual:", p_value_manual), quote=F) > #---EndCode--- > > So how is the calculation of the p-value performed in wilcox.test, when the > sample sizes are big? Because this might explain why the value differs from > that being calculated manually. > > Best regards, > > Cedric > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >Thomas Lumley Professor of Biostatistics University of Washington, Seattle
Cedric Laczny
2010-Aug-18 09:55 UTC
[R] Weird differing results when using the Wilcoxon-test
I was able to trace down the unexpected behavior to the following line SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) - sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y - 1)))) My calculations of the Z-score for the normal approximation where based on using the standard deviation for ranks _without_ ties. The above formula seems to account for ties and thus, yields a slightly different z-score. However, the data seems to include at most 1 tie (based on rnorm), so it would be the same result as if it contained no tie (1^3 - 1 has the same result as 0^3 - 0, obviously ;) ) and thus I would expect the result to be the same as when using the formula for the standard deviation without ties. Interestigly, this z-score is also different from that reported by wilcox_test (exact calculation of p-value). I have not been able to find this formula in any textbook nearby or at any website. Therefore I am wondering where it does actually com from? Best, Cedric On Tuesday, 17. August 2010 13:48:10 Cedric Laczny wrote:> Hi, > > I became a little bit confused when working with the Wilcoxon test in R. > As far as I understood, there are mainly two versions: > 1) wilcox.test{stats}, which is the default and an approximation, > especially, when ties are involved > 2) wilcox_test{coin}, which does calculate the distribution _exactly_ even, > with ties. > > I have the following scenario: > > #---BeginCode--- > # big example > size = 60 > big1 = rnorm(size, 0, 1) > big2 = rnorm(size, 0.5, 1 > > g1f = rep(1, size) > g2f = rep(2, size) > big = c(big1, big2) > data_frame = data.frame(big, gr=as.factor(c(g1f, g2f))) > > wilcox_approx = wilcox.test(big1, big2) > wilcox_exact = wilcox_test(big ~ gr, data=data_frame, distribution="exact") > #---EndCode--- > > I found here http://www-stat.stanford.edu/~susan/courses/s141/hononpara.pdf > that wilcox.test (at least for the signed rank test) relies on exact > (p-)values until N = n1 + n2 = 50. > I can reproduce this, when using e.g. size = 15. The p-values then are the > same, as I would expect it, having read the info from the link. > > #---BeginCode--- > print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote=F) > print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote=F) > #---EndCode--- > > That said, if I set e.g. size = 60, then the p-values of wilcox.test and > wilcox_test differ, as expected. > > What's puzzling me particularly is the differing results when wanting to > calculate the p-value manually, for bigger sample sizes. > > So, if we get the W-score from wilcox.test: > > #---BeginCode--- > W_big = wilcox.test(big1, big2))$statistic > #---EndCode--- > > and "convert" it to a Z-score, like this: > > #---BeginCode--- > mu_big = (size^2)/2 > sd_big = sqrt(size*size*(size + size + 1)/12) > N = size + size > sd_big_corr = sqrt( (size * size) / (N * (N - 1)) * (N^3 - N) / 12 ) > > Z_big = (((W_big - mu_big)/sd_big) > #---EndCode--- > > The Z-Score (Z_big) is equal to the statistic of wilcox_test. > So far so good. And now comes the main problem. > When I follow the documentation correctly, the p-value for a given > W-score/- statistic ist calculated using the normal-approximation with the > Z-score. However, when I do that, I get a different result than what I > would expect. Because I would expect the p-value of wilcox.test to be > equal to > 2*pnorm(Z_big), which is in fact _not_ equal. Please see: > > #---BeginCode--- > p_value_manual = 2 * pnorm(Z_big) > > print("--- Resulting pvalues --- ", quote=F) > print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote=F) > print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote=F) > print(paste("P-value manual:", p_value_manual), quote=F) > #---EndCode--- > > So how is the calculation of the p-value performed in wilcox.test, when the > sample sizes are big? Because this might explain why the value differs from > that being calculated manually. > > Best regards, > > Cedric > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html and provide commented, > minimal, self-contained, reproducible code.
peter dalgaard
2010-Aug-18 13:47 UTC
[R] Weird differing results when using the Wilcoxon-test
On Aug 18, 2010, at 11:55 AM, Cedric Laczny wrote:> I was able to trace down the unexpected behavior to the following line > SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) - > sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y - > 1)))) > My calculations of the Z-score for the normal approximation where based on > using the standard deviation for ranks _without_ ties. The above formula seems > to account for ties and thus, yields a slightly different z-score. However, the > data seems to include at most 1 tie (based on rnorm), so it would be the same > result as if it contained no tie (1^3 - 1 has the same result as 0^3 - 0, > obviously ;) ) and thus I would expect the result to be the same as when using > the formula for the standard deviation without ties.Note the definition of NTIES <- table(r), counting the number of observations tied for a particular rank, so it is all ones if and only if there are NO ties in data. (If you are in paper-and-pencil mode, these formulas are fairly easily worked out once you realize that you only need the mean and variance of the rank of a single observation -- the covariances are C(R1,R2) = -1/(N-1) V(R1) because of symmetry and the fact that the sum of all N ranks is fixed.) -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com