> On 16 Jul 2015, at 15:13, Ivan Calandra <ivan.calandra at univ-reims.fr> wrote:
>
> Dear useRs,
>
> I am running a wilcox.test() on two subsets of a dataset and get exactly
> the same results although the raw data are different in the subsets.
>
> mydata <- structure(list(
>   cat1 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>     2L, 2L, 2L, 2L), .Label = c("high", "low"), class = "factor"),
>   cat2 = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
>     1L, 2L, 1L, 2L), .Label = c("large", "small"), class = "factor"),
>   var1 = c(2.012743, 1.51272, 1.328453, 1.2609935, 1.617757, 1.8175455,
>     1.890035, 2.3652205, 1.295888, 1.5985145, 1.081813, 1.856733, 2.366358,
>     2.27421, 1.727023, 2.230433, 5.272843, 3.7626355),
>   var2 = c(0.00196, 0.0066545, 0.006188, 0.0058985, 0.004453, 0.005468,
>     0.003773, 0.004742, 0.007525, 0.0081235, 0.004611, 0.0050475, 0.006643,
>     0.0097335, 0.009213, 0.0049525, 0.006243, 0.006021)),
>   .Names = c("cat1", "cat2", "var1", "var2"), row.names = c(NA, 18L),
>   class = "data.frame")
>
> #p-values are identical but W different for the first variable
> wilcox.test(var1~cat1, data=mydata[mydata$cat2=="large",])
> wilcox.test(var1~cat1, data=mydata[mydata$cat2=="small",])
>
> #both p-values and W are identical for the second variable
> wilcox.test(var2~cat1, data=mydata[mydata$cat2=="large",])
> wilcox.test(var2~cat1, data=mydata[mydata$cat2=="small",])
>
> Did I do something wrong or does it just have something to do with my
> dataset? Or is it just a coincidence?

Coincidence, mostly, I think. You have

  > table(mydata[mydata$cat2=="small","cat1"])
  high  low
     4    5
  > table(mydata[mydata$cat2=="large","cat1"])
  high  low
     4    5

and within each subset all values of each response variable are distinct
(no ties).
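In both cases, the null distribution of W (the rank sum of the 4 "high"
observations minus its smallest possible value) is that of
(sum(sample(1:9,4))-sum(1:4)), which is a distribution on 0:20, symmetric
around 10. The two-sided p-value depends only on abs(W-10), so only 11
different p-values are possible, and W and 20-W give the same one. Hence it
is not particularly odd that you get the same p-value twice.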
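
If you want to check this argument yourself, here is a minimal sketch that enumerates the null distribution by brute force. It assumes group sizes 4 and 5 with no ties, as in your subsets. The helper names p_two_sided and w_stat are made up for this illustration and are not part of R, and the var1 values are copied by hand from your posted data.

```r
# All choose(9, 4) = 126 equally likely sets of ranks for the 4 "high" values
rank_sets <- combn(9, 4)

# W as reported by wilcox.test(): rank sum of the first group minus sum(1:4)
W_null <- colSums(rank_sets) - sum(1:4)

range(W_null)                        # support of W under the null
table(W_null) / ncol(rank_sets)      # exact null probabilities

# Hypothetical helper: exact two-sided p-value, using the symmetry around 10
# (probability of a W at least as far from 10 as the observed one)
p_two_sided <- function(w) mean(abs(W_null - 10) >= abs(w - 10))

# p-value for every possible W, and how many distinct values occur
p_all <- sapply(0:20, p_two_sided)
length(unique(p_all))

# var1 from the posted data, split by cat2 (large/small) and cat1 (high/low)
large_high <- c(1.890035, 1.295888, 1.081813, 2.366358)
large_low  <- c(2.012743, 1.328453, 1.617757, 1.727023, 5.272843)
small_high <- c(2.3652205, 1.5985145, 1.856733, 2.27421)
small_low  <- c(1.51272, 1.2609935, 1.8175455, 2.230433, 3.7626355)

# Hypothetical helper: W for first sample x against second sample y (no ties)
w_stat <- function(x, y) sum(rank(c(x, y))[seq_along(x)]) - sum(seq_along(x))

W_large <- w_stat(large_high, large_low)
W_small <- w_stat(small_high, small_low)

# Different W, but the same distance from 10, hence the same p-value
c(W_large = W_large, W_small = W_small)
c(p_large = p_two_sided(W_large), p_small = p_two_sided(W_small))
```

The brute-force p-values should agree with wilcox.test(large_high, large_low)$p.value, since wilcox.test() uses the exact distribution for samples this small without ties.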
>
> Thank you in advance for your help,
> Ivan
>
> --
> Ivan Calandra, ATER
> University of Reims Champagne-Ardenne
> GEGENAA - EA 3795
> CREA - 2 esplanade Roland Garros
> 51100 Reims, France
> +33(0)3 26 77 36 89
> ivan.calandra at univ-reims.fr
> https://www.researchgate.net/profile/Ivan_Calandra
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com