thr3ads.net - R help - [R] The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct? [Apr 2011]

If this information is useful, please help other people find it:
Share via:

Michael G Rupert

2011-Apr-12 23:57 UTC

[R] The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct?

I have a question concerning the Wilcoxon signed-rank test, and 
specifically, which R subroutine I should use for my particular dataset. 
There are three different commands in R (that I'm aware of) that calculate 
the Wilcoxon signed-rank test; wilcox.test, wilcox.exact, and 
wilcoxsign_test. When I run the three commands on the same dataset, I get 
different p-values. I'm hoping that someone can give me guidance on the 
strengths and weaknesses of each command, why they produce different 
p-values, and which one is the most appropriate for my particular needs.

First, let me describe the dataset I am working with. The project I am 
working on collected water samples from groups/networks of about 30 water 
wells and analyzed them for nitrate, major ions, and other chemical 
constituents. We revisited those same wells about 10 years later and 
analyzed the water samples for the same chemical constituents. I now have 
a paired dataset, and the question I would like to answer is whether there 
was a "significant" change in concentrations of those chemical 
constituents (such as nitrate or chloride). Concentrations measured in 
water from some wells have increased, some have decreased, and some have 
stayed the same over the ten-year time period. In water from some wells, 
the concentrations were below the laboratory detection limits, so those 
concentrations are "tied" at the reporting level. The following is an 
example of the data I am evaluating. 

x <- c(13.60, 9.10, 22.01, 9.08, 1.97, 2.81, 0.66, 0.97, 0.21, 2.23, 0.08, 
0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 
0.06, 0.06, 3.44, 15.18, 5.25, 4.27, 17.81)
y <- c( 4.32, 3.39, 16.36, 7.10, 0.08, 2.02, 0.19, 0.59, 0.06, 2.15, 0.06, 
0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 
0.06, 0.06, 4.02, 16.13, 7.30, 7.98, 24.37)

The nonparametric Wilcoxon signed-rank test seems to be the most 
appropriate test for these data. There are two different methods to 
calculate the signed-rank test. The first is by Wilcoxon (1945), who 
discards any tied data and then calculates the signed ranks. The second 
method incorporates tied values in the ranking procedure (see J.W. Pratt, 
1959, Remarks on zeros and ties in the Wilcoxon signed rank procedure: 
Journal of the American Statistical Association, Vol. 54, No. 287, pp. 
655-667). There are two commands in R that calculate the original method 
by Wilcoxon (that I know of), wilcox.test and wilcoxsign_test (make sure 
to include the argument "zero.method = c("Wilcoxon")").
There are two
other commands in R that incorporate ties in the signed-rank test, 
wilcox.exact and wilcoxsign_test (make sure to include the 
argument"zero.method = c("Pratt")").

Here's my problem. I get different p-values from each of the 4 signed-rank 
tests in R, and I don't know which one to believe. Wilcox.test and 
wilcoxsign_test(zero.method = c("Wilcoxon") calculate the standard 
Wilcoxon signed-rank test. Even though they are not designed to deal with 
tied data, they should at least calculate the same p-value, but they do 
not. I ran the same datasets in SYSTAT and Minitab to check on the results 
from R. Minitab gives the same results as wilcox.test, and SYSTAT gives 
the same results as wilcoxsign_test(zero.method = c("Wilcoxon"). 
Similarly, wilcox.exact and wilcoxsign_test(zero.method = c("Pratt"))
are
designed to incorporate ties, but they give different p-values from each 
other. The signed-rank test procedure is relatively straightforward, so 
I'm surprised I'm not getting identical results. 

To check on these R commands, I calculated the signed-rank tests using the 
dataset shown on page 658-659 of Pratt (1959). These R routines do not 
produce the same results as that listed in Pratt, which makes me think 
that the R routines are not calculating the statistics correctly. The 
following text shows the commands I use in R to calculate the signed-rank 
test using these different R commands:

Thanks in advance for any assistance.

--Mike

#################################################################################
library(exactRankTests) #this loads the package for calculating the 
modified signed-rank test
library(coin) #this adds additional routines for the wilcoxon signed-rank 
test and the Pratt signed-rank test
#
# Data from Page 658 of Pratt
x <- c(1, 1, 1, 1, 1, 7, 10, 12, 13, 16, 17)
y <- c(1, 1, 3, 4, 6, 1,  1,  1,  1,  1, 1)
#
# STANDARD WILCOXON SIGNED RANK USING WILCOX.TEST
wilcox.test(x, y, alternative='two.sided', paired=TRUE)
# STANDARD WILCOXON SIGNED-RANK USING WILCOXSIGN_TEST. 
wilcoxsign_test (x ~ y, zero.method = c("Wilcoxon"))
#
#
# MODIFIED WILCOXON SIGNED-RANK USING WILCOX.EXACT
wilcox.exact(x, y, alternative='two.sided', paired=TRUE, mu=0)
# PRATT SIGNED-RANK TEST
wilcoxsign_test (x ~ y, zero.method = c("Pratt"))
	[[alternative HTML version deleted]]

Peter Ehlers

2011-Apr-13 10:32 UTC

head link

[R] The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct?

On 2011-04-12 16:57, Michael G Rupert wrote:> I have a question concerning the Wilcoxon signed-rank test, and
> specifically, which R subroutine I should use for my particular dataset.
> There are three different commands in R (that I'm aware of) that
calculate
> the Wilcoxon signed-rank test; wilcox.test, wilcox.exact, and
> wilcoxsign_test. When I run the three commands on the same dataset, I get
> different p-values. I'm hoping that someone can give me guidance on the
> strengths and weaknesses of each command, why they produce different
> p-values, and which one is the most appropriate for my particular needs.
>
> First, let me describe the dataset I am working with. The project I am
> working on collected water samples from groups/networks of about 30 water
> wells and analyzed them for nitrate, major ions, and other chemical
> constituents. We revisited those same wells about 10 years later and
> analyzed the water samples for the same chemical constituents. I now have
> a paired dataset, and the question I would like to answer is whether there
> was a "significant" change in concentrations of those chemical
> constituents (such as nitrate or chloride). Concentrations measured in
> water from some wells have increased, some have decreased, and some have
> stayed the same over the ten-year time period. In water from some wells,
> the concentrations were below the laboratory detection limits, so those
> concentrations are "tied" at the reporting level. The following
is an
> example of the data I am evaluating.
>
> x<- c(13.60, 9.10, 22.01, 9.08, 1.97, 2.81, 0.66, 0.97, 0.21, 2.23,
0.08,
> 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06,
> 0.06, 0.06, 3.44, 15.18, 5.25, 4.27, 17.81)
> y<- c( 4.32, 3.39, 16.36, 7.10, 0.08, 2.02, 0.19, 0.59, 0.06, 2.15,
0.06,
> 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06,
> 0.06, 0.06, 4.02, 16.13, 7.30, 7.98, 24.37)
>
> The nonparametric Wilcoxon signed-rank test seems to be the most
> appropriate test for these data. There are two different methods to
> calculate the signed-rank test. The first is by Wilcoxon (1945), who
> discards any tied data and then calculates the signed ranks. The second
> method incorporates tied values in the ranking procedure (see J.W. Pratt,
> 1959, Remarks on zeros and ties in the Wilcoxon signed rank procedure:
> Journal of the American Statistical Association, Vol. 54, No. 287, pp.
> 655-667). There are two commands in R that calculate the original method
> by Wilcoxon (that I know of), wilcox.test and wilcoxsign_test (make sure
> to include the argument "zero.method = c("Wilcoxon")").
There are two
> other commands in R that incorporate ties in the signed-rank test,
> wilcox.exact and wilcoxsign_test (make sure to include the
> argument"zero.method = c("Pratt")").
>
> Here's my problem. I get different p-values from each of the 4
signed-rank
> tests in R, and I don't know which one to believe. Wilcox.test and
> wilcoxsign_test(zero.method = c("Wilcoxon") calculate the
standard
> Wilcoxon signed-rank test. Even though they are not designed to deal with
> tied data, they should at least calculate the same p-value, but they do
> not. I ran the same datasets in SYSTAT and Minitab to check on the results
> from R. Minitab gives the same results as wilcox.test, and SYSTAT gives
> the same results as wilcoxsign_test(zero.method = c("Wilcoxon").
> Similarly, wilcox.exact and wilcoxsign_test(zero.method =
c("Pratt")) are
> designed to incorporate ties, but they give different p-values from each
> other. The signed-rank test procedure is relatively straightforward, so
> I'm surprised I'm not getting identical results.
>
> To check on these R commands, I calculated the signed-rank tests using the
> dataset shown on page 658-659 of Pratt (1959). These R routines do not
> produce the same results as that listed in Pratt, which makes me think
> that the R routines are not calculating the statistics correctly.
Ahem .... that's a pretty strong claim.

Actually, the problem is user misunderstanding and the relevant
help pages do tell you where the differences lie.

Let's take the 3 functions one at a time, using your
x,y data from Pratt:

1. wilcox.test() in the stats package
This function automatically switches to using a Normal
approximation when there are ties in the data:

  wilcox.test(x, y, paired=TRUE)$p.value
#[1] 0.05802402
(You can suppress the warning (due to ties) by specifying
the argument 'exact=FALSE'.)

This function also uses a continuity correction unless
told not to:

  wilcox.test(x, y, paired=TRUE, correct=FALSE)$p.value
#[1] 0.05061243

2. wilcox.exact() in pkg exactRankTests
This function can handle ties (using the "Wilcoxon" method)
with an 'exact' calculation:

  wilcox.exact(x, y, paired=TRUE)$p.value
#[1] 0.0546875

If you want the Normal approximation:

  wilcox.exact(x, y, paired=TRUE, exact=FALSE)$p.value
#[1] 0.05061243  <-- cf. above

3. wilcoxsign_test() in pkg coin
This is the most comprehensive of these functions.
It is also the only one that offers the "Pratt" method
of handling ties. It will default to this method and
a Normal approximation:

  pvalue(wilcoxsign_test(x ~ y))
#[1] 0.08143996

  pvalue(wilcoxsign_test(x ~ y, zero.method="Pratt",
         distribution="asympt"))
#[1] 0.08143996

You can get the results from wilcox.exact() with

  pvalue(wilcoxsign_test(x ~ y, zero.method="Wilcoxon",
         distribution="asympt"))
#[1] 0.05061243

and

  pvalue(wilcoxsign_test(x ~ y, zero.method="Wilcoxon",
         dist="exact"))
#[1] 0.0546875

As to which method you should use, that's up to you.

Peter Ehlers

The> following text shows the commands I use in R to calculate the signed-rank
> test using these different R commands:
>
> Thanks in advance for any assistance.
>
> --Mike
>
>
#################################################################################
> library(exactRankTests) #this loads the package for calculating the
> modified signed-rank test
> library(coin) #this adds additional routines for the wilcoxon signed-rank
> test and the Pratt signed-rank test
> #
> # Data from Page 658 of Pratt
> x<- c(1, 1, 1, 1, 1, 7, 10, 12, 13, 16, 17)
> y<- c(1, 1, 3, 4, 6, 1,  1,  1,  1,  1, 1)
> #
> # STANDARD WILCOXON SIGNED RANK USING WILCOX.TEST
> wilcox.test(x, y, alternative='two.sided', paired=TRUE)
> # STANDARD WILCOXON SIGNED-RANK USING WILCOXSIGN_TEST.
> wilcoxsign_test (x ~ y, zero.method = c("Wilcoxon"))
> #
> #
> # MODIFIED WILCOXON SIGNED-RANK USING WILCOX.EXACT
> wilcox.exact(x, y, alternative='two.sided', paired=TRUE, mu=0)
> # PRATT SIGNED-RANK TEST
> wilcoxsign_test (x ~ y, zero.method = c("Pratt"))
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

peter dalgaard

2011-Apr-13 11:12 UTC

head link

[R] The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct?

On Apr 13, 2011, at 01:57 , Michael G Rupert wrote:
> I have a question concerning the Wilcoxon signed-rank test, and 
> specifically, which R subroutine I should use for my particular dataset. 
> There are three different commands in R (that I'm aware of) that
calculate
> the Wilcoxon signed-rank test; wilcox.test, wilcox.exact, and 
> wilcoxsign_test. When I run the three commands on the same dataset, I get 
> different p-values. I'm hoping that someone can give me guidance on the
> strengths and weaknesses of each command, why they produce different 
> p-values, and which one is the most appropriate for my particular needs.
Well, there are two version of zero-handling, and for each of these, you can
have exact p values or asymptotic p values with or without continuity
correction, so that's 6 possibilities already.

> 
> First, let me describe the dataset I am working with. The project I am 
> working on collected water samples from groups/networks of about 30 water 
> wells and analyzed them for nitrate, major ions, and other chemical 
> constituents. We revisited those same wells about 10 years later and 
> analyzed the water samples for the same chemical constituents. I now have 
> a paired dataset, and the question I would like to answer is whether there 
> was a "significant" change in concentrations of those chemical 
> constituents (such as nitrate or chloride). Concentrations measured in 
> water from some wells have increased, some have decreased, and some have 
> stayed the same over the ten-year time period. In water from some wells, 
> the concentrations were below the laboratory detection limits, so those 
> concentrations are "tied" at the reporting level. The following
is an
> example of the data I am evaluating. 
> 
> x <- c(13.60, 9.10, 22.01, 9.08, 1.97, 2.81, 0.66, 0.97, 0.21, 2.23,
0.08,
> 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 
> 0.06, 0.06, 3.44, 15.18, 5.25, 4.27, 17.81)
> y <- c( 4.32, 3.39, 16.36, 7.10, 0.08, 2.02, 0.19, 0.59, 0.06, 2.15,
0.06,
> 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 
> 0.06, 0.06, 4.02, 16.13, 7.30, 7.98, 24.37)
> 
> The nonparametric Wilcoxon signed-rank test seems to be the most 
> appropriate test for these data. There are two different methods to 
> calculate the signed-rank test. The first is by Wilcoxon (1945), who 
> discards any tied data and then calculates the signed ranks. The second 
> method incorporates tied values in the ranking procedure (see J.W. Pratt, 
> 1959, Remarks on zeros and ties in the Wilcoxon signed rank procedure: 
> Journal of the American Statistical Association, Vol. 54, No. 287, pp. 
> 655-667). There are two commands in R that calculate the original method 
> by Wilcoxon (that I know of), wilcox.test and wilcoxsign_test (make sure 
> to include the argument "zero.method = c("Wilcoxon")").
There are two
> other commands in R that incorporate ties in the signed-rank test, 
> wilcox.exact and wilcoxsign_test (make sure to include the 
> argument"zero.method = c("Pratt")").
> 
> Here's my problem. I get different p-values from each of the 4
signed-rank
> tests in R, and I don't know which one to believe. Wilcox.test and 
> wilcoxsign_test(zero.method = c("Wilcoxon") calculate the
standard
> Wilcoxon signed-rank test. Even though they are not designed to deal with 
> tied data, they should at least calculate the same p-value, but they do 
> not.
They do if you turn off the continuity correction in wilcox.test:
> wilcox.test(x, y, alternative='two.sided', paired=TRUE, correct=F)
	Wilcoxon signed rank test

data:  x and y 
V = 39, p-value = 0.05061
alternative hypothesis: true location shift is not equal to 0 


> I ran the same datasets in SYSTAT and Minitab to check on the results 
> from R. Minitab gives the same results as wilcox.test, and SYSTAT gives 
> the same results as wilcoxsign_test(zero.method = c("Wilcoxon").
So one does continuity correction and the other not.
>  
> Similarly, wilcox.exact and wilcoxsign_test(zero.method =
c("Pratt")) are
> designed to incorporate ties, but they give different p-values from each 
> other.
They still handle zeros differently. wilcox.exact does not handle the Pratt
ranking.

To get exact p values for Pratt ranks, try
> perm.test(c(-3,-4,-5,6:11))
	1-sample Permutation Test

data:  c(-3, -4, -5, 6:11) 
T = 51, p-value = 0.08984
alternative hypothesis: true mu is not equal to 0 


... and for the asymptotic counterpart:
> perm.test(c(-3,-4,-5,6:11), exact=F)
	Asymptotic 1-sample Permutation Test

data:  c(-3, -4, -5, 6:11) 
T = 51, p-value = 0.08144
alternative hypothesis: true mu is not equal to 0 

> The signed-rank test procedure is relatively straightforward, so 
> I'm surprised I'm not getting identical results. 
> 
> To check on these R commands, I calculated the signed-rank tests using the 
> dataset shown on page 658-659 of Pratt (1959).
Not found. Apparently, you _constructed_ a data set to get the same set of
ranks.
> These R routines do not 
> produce the same results as that listed in Pratt, which makes me think 
> that the R routines are not calculating the statistics correctly.
Apparently, the Pratt paper predates the convention that a p value is the
probability of observing "the test statistic or more extreme" and he
switches back and forth between "less than" and "less than or
equal" (to a negative rank sum of 6 and 12 resp.). Also, his p-values are
one-sided.

Using modern technology, it is pretty easy to generate the enumerations that
Pratt is referring to:
> M <- as.matrix(do.call("expand.grid", rep(list(0:1),9)))
> table(M%*%1:9)
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 
 1  1  1  2  2  3  4  5  6  8  9 10 12 13 15 17 18 19 21 21 22 23 23 
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 
23 23 22 21 21 19 18 17 15 13 12 10  9  8  6  5  4  3  2  2  1  1  1
> table(M%*%3:11)
 0  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 
 1  1  1  1  1  2  2  3  3  4  4  5  6  7  7  8 10 10 11 12 13 13 15 
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 
15 16 16 17 17 18 17 17 18 17 17 16 16 15 15 13 13 12 11 10 10  8  7 
48 49 50 51 52 53 54 55 56 57 58 59 60 63 
 7  6  5  4  4  3  3  2  2  1  1  1  1  1 

From these, you can see that the probability of "6 or less" with the
Wilcoxon ranking is
14/512 and that of "12 or less" with the Pratt procedure is 23/512. To
get two-sided tests, include the (symmetric) opposite tail, i.e., multiply by
two and get
> 14/256
[1] 0.0546875> 23/256[1] 0.08984375

Both of those should be recognizable from the exact tests in
coin/exactRankTests.

> The 
> following text shows the commands I use in R to calculate the signed-rank 
> test using these different R commands:
> 
> Thanks in advance for any assistance.
> 
> --Mike
> 
>
#################################################################################
> library(exactRankTests) #this loads the package for calculating the 
> modified signed-rank test
> library(coin) #this adds additional routines for the wilcoxon signed-rank 
> test and the Pratt signed-rank test
> #
> # Data from Page 658 of Pratt
> x <- c(1, 1, 1, 1, 1, 7, 10, 12, 13, 16, 17)
> y <- c(1, 1, 3, 4, 6, 1,  1,  1,  1,  1, 1)
> #
> # STANDARD WILCOXON SIGNED RANK USING WILCOX.TEST
> wilcox.test(x, y, alternative='two.sided', paired=TRUE)
> # STANDARD WILCOXON SIGNED-RANK USING WILCOXSIGN_TEST. 
> wilcoxsign_test (x ~ y, zero.method = c("Wilcoxon"))
> #
> #
> # MODIFIED WILCOXON SIGNED-RANK USING WILCOX.EXACT
> wilcox.exact(x, y, alternative='two.sided', paired=TRUE, mu=0)
> # PRATT SIGNED-RANK TEST
> wilcoxsign_test (x ~ y, zero.method = c("Pratt"))
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

Seemingly Similar Threads

Search for more seemingly similar threads

R help - Apr 2011 - The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct?

[R] The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct?

[R] The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct?

[R] The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct?

Seemingly Similar Threads