I am sure I am opening myself up to looking stupid, but I have two samples with medians of 613.5 and 189 (difference in location of 424 compared to the difference suggested from the wilcoxon of 291.5)

> wilcox.test(pipwtCount,pipwdCount, conf.int=TRUE, na.rm=TRUE)

	Wilcoxon rank sum test

data:  pipwtCount and pipwdCount
W = 822, p-value = 0.01227
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
  58 639
sample estimates:
difference in location
                 291.5

The data is here

> pipwtCount
 [1]  532  298  215 1588   38  180  284  376 5349 1024  650  605 1307 6147   21
[16]  453   23 1983 1048  464 2183 1028 1361  163  175 5944  569  622  793   70
[31]   67 1188  248 3010   19 2179 1339  408  113  739 2615 4619

> pipwdCount
 [1]   89  384   12  703    2  138  189  383  314  482   96  907   90 1193  154
[16]  305   61  414 4764 1066  121  143  102  174   44 2896   NA 1103  161  199

> median(pipwtCount)
[1] 613.5
> median(pipwdCount,na.rm=T)
[1] 189
> 613.5-189
[1] 424.5

I would appreciate if someone could point out the obvious to me, and explain why there is such a large discrepancy in the differences in location.

Many thanks,

Graham
> I am sure I am opening myself up to looking stupid, but I have two samples
> with medians of 613.5 and 189 (difference in location of 424 compared to
> the difference suggested from the Wilcoxon of 291.5)

After a rather frustrating search, I found this explained in only one of the books I consulted. It seems that the difference in location for the Wilcoxon test is calculated from the pairwise differences between the observations in the two samples; at least, that is how Minitab does it. This would potentially explain the discrepancy I am getting.

Graham
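The pairwise-difference explanation can be checked directly against the posted data. The following is a minimal sketch, not part of the original message: the helper names `wd`, `pairwise`, `diff_of_medians`, `median_of_diffs` and `wilcox_estimate` are introduced here for illustration, and the data vectors are copied from the first post. If the explanation is right, the median of all pairwise differences should reproduce the 291.5 reported by `wilcox.test()`, while the difference of the two medians stays at 424.5.

```r
# Data copied from the original post.
pipwtCount <- c(532, 298, 215, 1588, 38, 180, 284, 376, 5349, 1024, 650, 605,
                1307, 6147, 21, 453, 23, 1983, 1048, 464, 2183, 1028, 1361,
                163, 175, 5944, 569, 622, 793, 70, 67, 1188, 248, 3010, 19,
                2179, 1339, 408, 113, 739, 2615, 4619)
pipwdCount <- c(89, 384, 12, 703, 2, 138, 189, 383, 314, 482, 96, 907, 90,
                1193, 154, 305, 61, 414, 4764, 1066, 121, 143, 102, 174, 44,
                2896, NA, 1103, 161, 199)

# Drop the single missing value so both calculations use the same 29 values.
wd <- pipwdCount[!is.na(pipwdCount)]

# Difference of the two sample medians (what median() gives).
diff_of_medians <- median(pipwtCount) - median(wd)

# Median of all pairwise differences x_i - y_j (42 * 29 values).
pairwise <- outer(pipwtCount, wd, "-")
median_of_diffs <- median(pairwise)

# Location estimate reported by wilcox.test(), for comparison.
wilcox_estimate <- unname(wilcox.test(pipwtCount, wd, conf.int = TRUE)$estimate)

print(c(diff_of_medians = diff_of_medians,
        median_of_diffs = median_of_diffs,
        wilcox_estimate = wilcox_estimate))
```

The two quantities answer different questions: the first compares the centres of the two samples, the second is the centre of the distribution of differences between one observation from each sample.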
Prof Brian Ripley
2011-Jan-30 16:10 UTC
[R] medians in Wilcoxon disagree with median function
Where did you get the idea that the location estimate in a 2-sample Wilcoxon test is the difference in medians? (It is a common misconception, but not I believe to be found in R. The estimate is the median of differences, not the difference of medians: and the test is not of a difference of population medians either, unless the two populations differ only in location.)

On Sun, 30 Jan 2011, Graham Smith wrote:

> I am sure I am opening myself up to looking stupid, but I have two samples
> with medians of 613.5 and 189 (difference in location of 424 compared to
> the difference suggested from the wilcoxon of 291.5)
> [...]
> I would appreciate if someone could point out the obvious to me, and explain
> why there is such a large discrepancy in the differences in location.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
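A small hand-checkable illustration of the distinction drawn above, offered as a sketch rather than anything from the thread: the vectors `x`, `y` and `x_shifted` are hypothetical toy data chosen so that the arithmetic can be verified by eye. The first pair has identical medians but differently shaped samples; the second pair differs only by a constant shift.

```r
# Hypothetical toy samples (not from the thread); both have median 10.
x <- c(8, 9, 10, 20, 30)
y <- c(0, 1, 10, 11, 12)

# Difference of medians versus median of the 25 pairwise differences.
diff_of_medians <- median(x) - median(y)
median_of_diffs <- median(outer(x, y, "-"))

# A pure location shift: x_shifted is x moved up by 5 and nothing else.
x_shifted <- x + 5
shift_diff_of_medians <- median(x_shifted) - median(x)
shift_median_of_diffs <- median(outer(x_shifted, x, "-"))

print(c(diff_of_medians = diff_of_medians,
        median_of_diffs = median_of_diffs,
        shift_diff_of_medians = shift_diff_of_medians,
        shift_median_of_diffs = shift_median_of_diffs))
```

For the differently shaped samples the difference of medians is 0 while the median of differences is 8; for the pure shift both quantities equal 5. That matches the caveat that the two agree only when the samples differ in location alone.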
David Winsemius
2011-Jan-30 16:47 UTC
[R] medians in Wilcoxon disagree with median function
On Jan 30, 2011, at 9:28 AM, Graham Smith wrote:

> I am sure I am opening myself up to looking stupid,

Or exposing a failure to read the help page.

> but I have two samples
> with medians of 613.5 and 189 (difference in location of 424 compared to
> the difference suggested from the wilcoxon of 291.5)
>
>> wilcox.test(pipwtCount,pipwdCount, conf.int=TRUE, na.rm=TRUE)
>
> Wilcoxon rank sum test
>
> data: pipwtCount and pipwdCount
> W = 822, p-value = 0.01227
> alternative hypothesis: true location shift is not equal to 0
> 95 percent confidence interval:
> 58 639
> sample estimates:
> difference in location
> 291.5

The Wilcoxon two sample test is not a test of equality of medians. Read the Details section of the help page. The estimated location is a pseudomedian.

    plot(density(pipwtCount))
    plot(density(pipwdCount, na.rm=TRUE))

Both are highly right-skewed.

-- 
David.

David Winsemius, MD
West Hartford, CT
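As a numeric companion to the density plots, the sketch below compares the mean and median of each sample; a mean well above the median is a rough indicator of a long right tail. This is an illustration added here, not part of the reply, and the names `wd`, `skew_wt` and `skew_wd` are assumptions made for the example. The data are copied from the first post.

```r
# Data copied from the original post.
pipwtCount <- c(532, 298, 215, 1588, 38, 180, 284, 376, 5349, 1024, 650, 605,
                1307, 6147, 21, 453, 23, 1983, 1048, 464, 2183, 1028, 1361,
                163, 175, 5944, 569, 622, 793, 70, 67, 1188, 248, 3010, 19,
                2179, 1339, 408, 113, 739, 2615, 4619)
pipwdCount <- c(89, 384, 12, 703, 2, 138, 189, 383, 314, 482, 96, 907, 90,
                1193, 154, 305, 61, 414, 4764, 1066, 121, 143, 102, 174, 44,
                2896, NA, 1103, 161, 199)
wd <- pipwdCount[!is.na(pipwdCount)]  # drop the single NA

# Mean versus median for each sample; a large gap suggests right skew.
skew_wt <- c(mean = mean(pipwtCount), median = median(pipwtCount))
skew_wd <- c(mean = mean(wd), median = median(wd))

print(rbind(pipwtCount = skew_wt, pipwdCount = skew_wd))
```

With samples this skewed, a handful of very large counts dominates the pairwise differences in one tail, which is one reason the median of differences can sit well away from the difference of medians.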