thr3ads.net - R help - [R] wilcox.test returned estimates [Feb 2006]

pmt1rew@leeds.ac.uk

2006-Feb-15 11:36 UTC

[R] wilcox.test returned estimates

Hi all,

I have been using wilcox.test to test for differences between 2 independent
samples.  I had understood the difference in location to be conventionally the
difference in the sample medians; however, this is not the case as implemented
in R.  I have tied ranks, and therefore non-exact p-values and confidence
intervals are calculated using the normal approximation.  But what exactly is
this normal approximation, i.e. how is it involved in estimating the location
difference?

Further, is it then wrong to refer to the difference in location as the
difference between the medians?  Does anyone have a more appropriate
description?

Thanks

Rebecca

Torsten Hothorn

2006-Feb-15 11:59 UTC

[R] wilcox.test returned estimates

On Wed, 15 Feb 2006, pmt1rew at leeds.ac.uk wrote:
> Hi all,
>
> I have been using wilcox.test to test for differences between 2 independent
> samples.  I had understood the difference in location to be conventionally
> the difference in the sample medians; however, this is not the case as
> implemented in R.  I have tied ranks, and therefore non-exact p-values and
> confidence intervals are calculated using the normal approximation.  But
> what exactly is this normal approximation, i.e. how is it involved in
> estimating the location difference?
The reference distribution is not involved in _estimating_ the difference
in location. `wilcox.test' implements the Hodges-Lehmann estimator.

From `stats/R/wilcox.test.R':

                 ## Exact confidence interval for the location parameter
                 ## mean(x) - mean(y) in the two-sample case (cf. the
                 ## one-sample case).
                 alpha <- 1 - conf.level
                 diffs <- sort(outer(x, y, "-"))
                 ...
                 ESTIMATE <- median(diffs)
                 names(ESTIMATE) <- "difference in location"

which is simply the median of all pairwise differences.
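
To make the distinction concrete, here is a minimal sketch with made-up data (the vectors `x` and `y` are illustrative assumptions, not from this thread). It computes the median of all pairwise differences by hand and sets it beside the difference in sample medians and the estimate that wilcox.test reports when conf.int = TRUE. The data contain no ties and the samples are small, so the exact branch quoted above should be the one that runs.

```r
# Illustrative data only: no ties, small samples.
x <- c(1.1, 2.3, 3.8, 4.0, 9.5)
y <- c(0.4, 1.9, 2.2, 8.6)

# Hodges-Lehmann estimate: median of all length(x) * length(y) differences.
diffs <- sort(outer(x, y, "-"))
hl <- median(diffs)

# The quantity it is often confused with: difference of the sample medians.
med_diff <- median(x) - median(y)

# wilcox.test only returns an estimate when conf.int = TRUE.
fit <- wilcox.test(x, y, conf.int = TRUE)

print(c(hodges_lehmann    = hl,
        median_difference = med_diff,
        wilcox_estimate   = unname(fit$estimate)))
```

For these data the hand-computed median of pairwise differences should agree with the value reported by wilcox.test, while the difference in sample medians is a different number.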

However, the usual normal approximation to the exact conditional 
distribution (in case of ties) of the Wilcoxon-Mann-Whitney statistic 
(see Hajek, Sidak, Sen for example) is involved in computing a confidence 
interval for the difference in location.
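
As a rough illustration of that case, the sketch below runs wilcox.test on made-up data with ties (the data are an assumption for this example) and asks explicitly for the normal approximation via exact = FALSE. In this setting the reported estimate may be obtained numerically, so it need not match the hand-computed median of pairwise differences to the last digit.

```r
# Illustrative data with tied values (assumed for this sketch).
x <- c(3, 5, 5, 6, 8, 9, 9, 12)
y <- c(1, 2, 3, 3, 5, 6, 7)

# exact = FALSE requests the normal approximation explicitly.
fit <- wilcox.test(x, y, conf.int = TRUE, conf.level = 0.95, exact = FALSE)

ci  <- as.numeric(fit$conf.int)  # lower and upper confidence limits
est <- unname(fit$estimate)      # reported difference in location

# Median of pairwise differences, for comparison with the reported estimate.
hl <- median(outer(x, y, "-"))

print(list(conf_int = ci, estimate = est, median_pairwise_diff = hl))
```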

Hope that helps,

Torsten
>
> Further, is it then wrong to refer to the difference in location as the
> difference between the medians?  Does anyone have a more appropriate
> description?
>
> Thanks
>
> Rebecca
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Thomas Lumley

2006-Feb-15 20:54 UTC

[R] wilcox.test returned estimates

On Wed, 15 Feb 2006, pmt1rew at leeds.ac.uk wrote:
> Hi all,
>
> I have been using wilcox.test to test for differences between 2 independent
> samples.  I had understood the difference in location to be conventionally
> the difference in the sample medians; however, this is not the case as
> implemented in R.  I have tied ranks, and therefore non-exact p-values and
> confidence intervals are calculated using the normal approximation.  But
> what exactly is this normal approximation, i.e. how is it involved in
> estimating the location difference?
It isn't.  The only assumption is that the distribution is the same apart 
from location in the two groups.
> Further, is it then wrong to refer to the difference in location as the
> difference between the medians?  Does anyone have a more appropriate
> description?
Well, this gets more complicated.  Since the method assumes that the
population distributions differ only by location, the population difference
in medians is the same as the difference in means, or in the 16.34th
percentile, or the 42%-trimmed mean, or whatever. If the assumption is not
true then seriously weird things can happen (consider the distributions
given by http://mathworld.wolfram.com/EfronsDice.html).
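
The pure-shift case can be sketched with an idealised example in which one sample is an exact shifted copy of the other (the data and the shift of 3 are assumptions chosen for illustration). Under that condition every measure of location moves by the same amount:

```r
# Idealised example: x is an exact shifted copy of y.
y <- c(1, 2, 4, 7, 11, 16)
shift <- 3
x <- y + shift

location_differences <- c(
  median     = median(x) - median(y),
  mean       = mean(x) - mean(y),
  pct_16.34  = unname(quantile(x, 0.1634) - quantile(y, 0.1634)),
  trimmed_42 = mean(x, trim = 0.42) - mean(y, trim = 0.42),
  pairwise   = median(outer(x, y, "-"))   # median pairwise difference
)
print(location_differences)
```

Each entry equals the shift (up to floating-point rounding). Real samples are never exact shifted copies, so the sample versions of these quantities will generally differ from one another even when the population assumption holds.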

However, the estimate is not the difference in sample medians. It is the 
median pairwise difference.


 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

Gregory Snow

2006-Feb-15 22:02 UTC

[R] wilcox.test returned estimates

If you really want to look at the difference between 2 medians, then consider
using permutation tests and bootstrapping for the interval.
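
A minimal sketch of that suggestion in base R follows. The data, the number of resamples, the seed, and the choice of a percentile bootstrap interval are all assumptions made for illustration; other bootstrap intervals are possible. Note that the permutation test's null hypothesis is that the group labels are exchangeable, which is stronger than equal medians.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Illustrative data (assumed for this sketch).
x <- c(3, 5, 5, 6, 8, 9, 9, 12)
y <- c(1, 2, 3, 3, 5, 6, 7)

observed <- median(x) - median(y)

# Permutation test: shuffle group labels and recompute the statistic.
pooled <- c(x, y)
n_x <- length(x)
perm_stats <- replicate(5000, {
  idx <- sample(length(pooled), n_x)
  median(pooled[idx]) - median(pooled[-idx])
})
# Two-sided p-value, counting the observed labelling as one permutation.
p_value <- (sum(abs(perm_stats) >= abs(observed)) + 1) /
  (length(perm_stats) + 1)

# Percentile bootstrap: resample each group separately, with replacement.
boot_stats <- replicate(5000, {
  median(sample(x, replace = TRUE)) - median(sample(y, replace = TRUE))
})
ci <- unname(quantile(boot_stats, c(0.025, 0.975)))

print(list(observed = observed, p_value = p_value, conf_int = ci))
```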
