Hi all,

I have a distribution, and take a sample of it. Then I compare that sample with the mean of the population, as here in the "Wilcoxon signed rank test with continuity correction":

> wilcox.test(Sample,mu=mean(All), alt="two.sided")

        Wilcoxon signed rank test with continuity correction

data:  AlphaNoteOnsetDists
V = 63855, p-value = 0.0002093
alternative hypothesis: true location is not equal to 0.4115136

> wilcox.test(Sample,mu=mean(All), alt = "greater")

        Wilcoxon signed rank test with continuity correction

data:  AlphaNoteOnsetDists
V = 63855, p-value = 0.0001047
alternative hypothesis: true location is greater than 0.4115136

What assumptions are needed for the population?
What can we say according to these results?
p-value for the "less" is 0.999.

Thanks in advance,

Atte

Atte Tenkanen
University of Turku, Finland
Department of Musicology
+35823335278
http://users.utu.fi/attenka/

On Wed, Jun 23, 2010 at 10:27 PM, Atte Tenkanen <attenka at utu.fi> wrote:
> Hi all,
>
> I have a distribution, and take a sample of it. Then I compare that
> sample with the mean of the population like here in "Wilcoxon signed
> rank test with continuity correction":
>
>> wilcox.test(Sample,mu=mean(All), alt="two.sided")
>
>        Wilcoxon signed rank test with continuity correction
>
> data:  AlphaNoteOnsetDists
> V = 63855, p-value = 0.0002093
> alternative hypothesis: true location is not equal to 0.4115136
>
>> wilcox.test(Sample,mu=mean(All), alt = "greater")
>
>        Wilcoxon signed rank test with continuity correction
>
> data:  AlphaNoteOnsetDists
> V = 63855, p-value = 0.0001047
> alternative hypothesis: true location is greater than 0.4115136
>
> What assumptions are needed for the population?

Wikipedia says:
"The Wilcoxon signed-rank test is a _non-parametric_ statistical hypothesis test for... "
It also talks about the assumptions.

> What can we say according to these results?
> p-value for the "less" is 0.999.

That the p-values for "less" and "greater" seem to sum to one, and that the p-value for "greater" is half of that for two-sided. You shouldn't ask what we can say. You should ask yourself "What was the question, and is this test giving me an answer to that question?"

Cheers
Joris

--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
Joris.Meys at Ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

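The relation between the three p-values noted in the reply above can be checked on simulated data. This is only a sketch: `x` and `mu0` are hypothetical stand-ins, not the poster's `Sample` and `All`, and the normal approximation is used without continuity correction so that the relations hold exactly. With the continuity correction used in the original output, the two one-sided values need not sum to exactly one.

```r
# Hypothetical data; 'x' and 'mu0' are illustrative names only.
set.seed(1)
x   <- rnorm(200, mean = 0.3)   # simulated sample
mu0 <- 0                        # reference value to test against

# Normal approximation, no continuity correction, so that all three
# alternatives are computed from the same standardised statistic.
p <- sapply(c("two.sided", "less", "greater"), function(a) {
  wilcox.test(x, mu = mu0, alternative = a,
              exact = FALSE, correct = FALSE)$p.value
})
print(p)

# The one-sided p-values are complementary, and the two-sided p-value
# is twice the smaller of the two.
print(unname(p["less"] + p["greater"]))
print(unname(p["two.sided"] / min(p["less"], p["greater"])))
```

So the three numbers carry the same information; which one is relevant depends on the hypothesis that was fixed before looking at the data.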
There is a potentially useful remark from Peter Dalgaard at http://www.mail-archive.com/r-help at stat.math.ethz.ch/msg86359.html :

Summarising: "[The Wilcoxon paired signed rank test assumes symmetry] ...of differences, and under the null hypothesis. This is usually rather uncontroversial."

My rider to this: It's uncontroversial because differences between random samples from the same asymmetric distribution would form a symmetric distribution of differences, and the null for the Wilcoxon is essentially that the distributions are the same. Symmetry of differences at the null follows.

BUT the corollary is that location might not be the only thing that can cause a Wilcoxon test to show a significant difference.

set.seed(1023)
x <- rlnorm(50)
z <- rlnorm(50, sdlog=3)
z <- z - mean(z) + mean(x)   # shift z so both samples have the same mean
mean(x)
mean(z)            # Same mean..
wilcox.test(x, z)  # Strongly significant test result.

# Not a perfect example, as the test relates to true means, not data set means.
# But very different skew and scale can give a very significant test result,
# just as very different means can.

On Thu, Jun 24, 2010 at 4:16 AM, Atte Tenkanen <attenka at utu.fi> wrote:
> PS.
>
> Maybe I can somehow try to transform data and check it, for example,
> using the skewness-function of timeDate-package?
>
>> Thanks. What I have had to ask is that
>>
>> how do you test that the data is symmetric enough?
>> If it is not, is it ok to use some data transformation?
>>
>> when it is said:
>>
>> "The Wilcoxon signed rank test does not assume that the data are
>> sampled from a Gaussian distribution. However it does assume that the
>> data are distributed symmetrically around the median. If the
>> distribution is asymmetrical, the P value will not tell you much about
>> whether the median is different than the hypothetical value."

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

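The rider above, that differences between two samples from the same asymmetric distribution are themselves symmetric, can be illustrated with a small simulation. This is a rough sketch only: instead of the timeDate skewness function mentioned in the thread it uses a quartile-based (Bowley) skewness written in base R, and the helper `bowley` and the variables `a`, `b`, `d` are hypothetical names. The quartile measure is a swapped-in choice because moment-based skewness is very noisy for log-normal data.

```r
# Quartile (Bowley) skewness: 0 for a symmetric distribution, positive
# for right skew. A robust stand-in for a moment-based skewness function.
bowley <- function(v) {
  q <- unname(quantile(v, c(0.25, 0.50, 0.75)))
  (q[3] + q[1] - 2 * q[2]) / (q[3] - q[1])
}

set.seed(42)
n <- 10000
a <- rlnorm(n)   # right-skewed sample
b <- rlnorm(n)   # independent sample from the same distribution
d <- a - b       # differences when both distributions are the same

skew_raw  <- bowley(a)   # skewness of the raw data
skew_diff <- bowley(d)   # skewness of the differences
print(c(raw = skew_raw, differences = skew_diff))
```

The raw data should come out clearly right-skewed while the differences come out close to symmetric. This is a descriptive check, not a formal test of symmetry, and it says nothing about the one-sample case, where the raw data themselves play the role of the differences.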
On Jun 24, 2010, at 6:09 PM, Joris Meys wrote:

> I do agree that one should not rely solely on sources like Wikipedia
> and Graphpad, although they contain a lot of valuable information.
>
> This said, it is not too difficult to illustrate why, in the case of
> the one-sample signed rank test,

That is a key point. I was assuming that you were using the paired sample version of the WSRT and I may have been misleading the OP. For the one-sample situation, the assumption of symmetry is needed, but for the paired sampling version of the test, the location shift becomes the tested hypothesis, and no assumptions about the form of the distributions are made except that they be the same. Any consideration of median or mean (which will be the same in the case of symmetric distributions) gets lost in the paired test case.

--
David.

> the differences should be not too far
> away from symmetrical. It just needs some reflection on how the
> statistic is calculated. If you have an asymmetrical distribution, you
> have a lot of small differences with a negative sign and a lot of
> large differences with a positive sign if you test against the median
> or mean. Hence the sum of ranks for one side will be higher than for
> the other, leading eventually to a significant result.
>
> An extreme example :
>
>> set.seed(100)
>> y <- rnorm(100,1,2)^2
>> wilcox.test(y,mu=median(y))
>
>        Wilcoxon signed rank test with continuity correction
>
> data:  y
> V = 3240.5, p-value = 0.01396
> alternative hypothesis: true location is not equal to 1.829867
>
>> wilcox.test(y,mu=mean(y))
>
>        Wilcoxon signed rank test with continuity correction
>
> data:  y
> V = 1763, p-value = 0.008837
> alternative hypothesis: true location is not equal to 5.137409
>
> Which brings us to the question of what location is actually tested in
> the Wilcoxon test. For the measure of location to be the mean (or
> median), one has to assume that the distribution of the differences is
> rather symmetrical, which implies your data has to be distributed
> somewhat symmetrically. The test is robust against violations of this
> -implicit- assumption, but in more extreme cases skewness does matter.
>
> Cheers
> Joris
>
> On Thu, Jun 24, 2010 at 7:40 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>>
>> You are being misled. Simply finding a statement on a statistics software
>> website, even one as reputable as Graphpad (???), does not mean that it is
>> necessarily true. My understanding (confirmed by reviewing "Nonparametric
>> statistical methods for complete and censored data" by M. M. Desu and
>> Damaraju Raghavarao) is that the Wilcoxon signed-rank test does not
>> require that the underlying distributions be symmetric. The above
>> quotation is highly inaccurate.
>>
>> --
>> David.
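The remark in the quoted message about "how the statistic is calculated" can be made concrete by computing V by hand. The sketch below reuses the data from the quoted example (`set.seed(100)`, `rnorm(100,1,2)^2`); the names `mu0`, `d`, `r`, `v_plus` and `v_minus` are illustrative additions. It assumes the usual definition of the one-sample signed rank statistic: drop zero differences, rank the absolute differences, and sum the ranks belonging to positive differences.

```r
set.seed(100)
y   <- rnorm(100, 1, 2)^2   # right-skewed data, as in the quoted example
mu0 <- median(y)            # hypothesised location

d <- y - mu0                # differences from the hypothesised value
d <- d[d != 0]              # zero differences are dropped
r <- rank(abs(d))           # ranks of absolute differences (ties averaged)

v_plus  <- sum(r[d > 0])    # sum of ranks of positive differences (V)
v_minus <- sum(r[d < 0])    # sum of ranks of negative differences

# V as reported by wilcox.test for the same data and mu
v_test <- unname(wilcox.test(y, mu = mu0)$statistic)
print(c(v_plus = v_plus, v_minus = v_minus, v_test = v_test))
```

Under symmetry about `mu0` the two rank sums would be roughly equal, each near n(n+1)/4. With skewed data, half the observations lie on each side of the median, but the positive differences tend to be larger and so collect the higher ranks, which is the imbalance described in the quoted message.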