Hello,

I think I am right in saying that a 2 sample wilcox.test is equal to a 2 sample kruskal.test and a 2 sample t.test is equal to a 2 sample anova. This is also stated in the ?kruskal.test man page:

    The Wilcoxon rank sum test (wilcox.test) as the special case for two samples; lm together with anova for performing one-way location analysis under normality assumptions; with Student's t test (t.test) as the special case for two samples.

From this example it seems like it doesn't, but I cannot figure out what I am doing wrong.

    x <- c(10,11,15,8,16,12,20)
    y <- c(10,14,18,25,28,30,35)
    f <- c(rep("a",7), rep("b",7))
    d <- c(x,y)

    wilcox.test(x,y)
    kruskal.test(x,y)
    kruskal.test(x~y)
    kruskal.test(f~d)

    t.test(x,y)
    anova(lm(x~y))
    summary(aov(lm(x~y)))

And why does kruskal.test(x~y) differ from kruskal.test(f~d)??

Cheers

--
View this message in context: http://r.789695.n4.nabble.com/2-sample-wilcox-test-kruskal-test-tp4282888p4282888.html
Sent from the R help mailing list archive at Nabble.com.

2012/1/10 syrvn <mentor_@gmx.net>

> And why does kruskal.test(x~y) differ from kruskal.test(f~d)??

Your formula is wrong, but the function does not detect the error. From the help page:

"formula: a formula of the form lhs ~ rhs where lhs gives the data values and rhs the corresponding groups."

That leads to

    kruskal.test(d~as.factor(f))

which is fine.

--
Miłego dnia

Hi,

thanks for your answer. Unfortunately I cannot reproduce your results. In my example the results still differ when I use your approach:

    > x <- c(10,11,15,8,16,12,20)
    > y <- c(10,14,18,25,28,30,35)
    > f <- as.factor(c(rep("a",7), rep("b",7)))
    > d <- c(x,y)

    > kruskal.test(x,y)

        Kruskal-Wallis rank sum test

    data:  x and y
    Kruskal-Wallis chi-squared = 6, df = 6, p-value = 0.4232

    > kruskal.test(x~y)

        Kruskal-Wallis rank sum test

    data:  x by y
    Kruskal-Wallis chi-squared = 6, df = 6, p-value = 0.4232

    > kruskal.test(d~f)

        Kruskal-Wallis rank sum test

    data:  d by f
    Kruskal-Wallis chi-squared = 3.6816, df = 1, p-value = 0.05502

    > kruskal.test(f~d)

        Kruskal-Wallis rank sum test

    data:  f by d
    Kruskal-Wallis chi-squared = 11.1429, df = 12, p-value = 0.5167

I know the last call, kruskal.test(f~d), is not correct because the grouping factor always goes on the right-hand side of the formula, but I still tried it that way just to be sure...

Cheers
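
[Editorial note] The degrees of freedom in the output above can be read as "number of groups minus one". A minimal sketch, using only the vectors from the thread, of how kruskal.test() interprets its second argument (or the right-hand side of a formula) as the grouping variable, so that every distinct value there becomes its own group:

```r
x <- c(10, 11, 15, 8, 16, 12, 20)
y <- c(10, 14, 18, 25, 28, 30, 35)
f <- factor(rep(c("a", "b"), each = 7))
d <- c(x, y)

# kruskal.test(x, y): y is taken as the grouping vector.
# y has 7 distinct values, so x is split into 7 groups of size one.
n_groups_y <- nlevels(factor(y))
df_xy <- unname(kruskal.test(x, y)$parameter)

# kruskal.test(d ~ f): 14 data values in 2 groups, the intended comparison.
df_df <- unname(kruskal.test(d ~ f)$parameter)

# With d on the right-hand side, its 13 distinct values (10 occurs twice)
# would define 13 groups, which matches the df = 12 reported in the thread.
n_groups_d <- nlevels(factor(d))

print(c(groups_y = n_groups_y, df_xy = df_xy,
        df_df = df_df, groups_d = n_groups_d))
```

Only kruskal.test(d ~ f) has one degree of freedom, which is what a two-sample comparison should report.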

Hi Michael and Miłego dnia,

yes right. I get identical results now! Thanks a lot!
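
[Editorial note] The reply that resolved the remaining discrepancy is not part of this excerpt, so the following is an assumption about the cause, not a record of what was posted. With tied observations (the value 10 occurs in both samples), wilcox.test() uses a normal approximation and applies a continuity correction by default, while kruskal.test() applies none. A sketch that switches the correction off and compares the two p-values:

```r
x <- c(10, 11, 15, 8, 16, 12, 20)
y <- c(10, 14, 18, 25, 28, 30, 35)
d <- c(x, y)
f <- factor(rep(c("a", "b"), each = 7))

# Normal approximation without continuity correction.
# exact = FALSE is set explicitly because of the tie.
w <- wilcox.test(x, y, exact = FALSE, correct = FALSE)

# Two-group Kruskal-Wallis test with data on the left, groups on the right.
k <- kruskal.test(d ~ f)

p_wilcox  <- w$p.value
p_kruskal <- k$p.value
same_p <- isTRUE(all.equal(p_wilcox, p_kruskal))

print(c(wilcox = p_wilcox, kruskal = p_kruskal))
print(same_p)
```

With the default correct = TRUE, the Wilcoxon p-value comes out slightly larger, which would explain a small mismatch even after the formula is fixed.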

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of syrvn
> Sent: Tuesday, January 10, 2012 10:28 AM
> To: r-help at r-project.org
> Subject: [R] 2 sample wilcox.test != kruskal.test
>
> Hello,
>
> I think I am right in saying that a 2 sample wilcox.test is equal to a 2
> sample kruskal.test and a 2 sample t.test is equal to a 2 sample anova.
> This is also stated in the ?kruskal.test man page:
>
> The Wilcoxon rank sum test (wilcox.test) as the special case for two
> samples; lm together with anova for performing one-way location analysis
> under normality assumptions; with Student's t test (t.test) as the special
> case for two samples.
>
> From this example it seems like it doesn't but I cannot figure out what I am
> doing wrong.
>
> x <- c(10,11,15,8,16,12,20)
> y <- c(10,14,18,25,28,30,35)
> f <- c(rep("a",7), rep("b",7))
> d <- c(x,y)
>
> wilcox.test(x,y)
> kruskal.test(x,y)
> kruskal.test(x~y)
> kruskal.test(f~d)
>
> t.test(x,y)
> anova(lm(x~y))
> summary(aov(lm(x~y)))
>
> And why does kruskal.test(x~y) differ from kruskal.test(f~d)??
>

You have received answers about the kruskal.test. But, to make a final point, if your purpose for these statements

> t.test(x,y)
> anova(lm(x~y))
> summary(aov(lm(x~y)))

was to compare the t.test results with anova, you have misspecified the call to lm(): lm(x~y) regresses x on y as a numeric predictor instead of comparing two groups. To get comparable results you should look at

    t.test(x,y)
    summary(lm(d ~ as.factor(f)))

The difference between the two is that t.test() uses the Welch adjustment to the degrees of freedom.

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
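
[Editorial note] A short sketch of the point about the Welch adjustment, using the vectors from the thread. It assumes the goal is an exact match between the t test and the linear model; var.equal = TRUE requests the pooled-variance Student's t test in place of the Welch default.

```r
x <- c(10, 11, 15, 8, 16, 12, 20)
y <- c(10, 14, 18, 25, 28, 30, 35)
d <- c(x, y)
f <- factor(rep(c("a", "b"), each = 7))

# Pooled-variance (Student) t test instead of the Welch default.
tt <- t.test(x, y, var.equal = TRUE)

# One-way layout: data values on the left, two-level factor on the right.
fit <- lm(d ~ f)

# Second row of the coefficient table is the group effect (b versus a);
# column 4 holds its p-value.
p_lm <- summary(fit)$coefficients[2, 4]

# First row of the ANOVA table is the F test for the factor.
F_anova <- anova(fit)[["F value"]][1]

same_p  <- isTRUE(all.equal(tt$p.value, p_lm))
same_F  <- isTRUE(all.equal(unname(tt$statistic)^2, F_anova))

print(c(p_ttest = tt$p.value, p_lm = p_lm))
print(c(t_squared = unname(tt$statistic)^2, F = F_anova))
```

The squared t statistic equals the ANOVA F statistic, and both share 12 residual degrees of freedom. The default t.test(x, y) differs only through the Welch degrees of freedom and the unpooled standard error.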