Dear Michael,
Thanks very much for your answers!
The purpose of my analysis is to test whether the contingency table x is
different from the contingency table y.
Or, to put it differently, whether there is a significant difference between
the joint distributions of A&B and A&C.
Based on your answer, I'm wondering whether chisq.test is really the best
way to do this.
Or is there perhaps a different function or package I should use
altogether?
Thanks,
Michael
-----Original Message-----
From: Meyners, Michael [mailto:meyners.m@pg.com]
Sent: Tuesday, September 27, 2011 17:00
To: Michael Haenlein; r-help@r-project.org
Subject: RE: [R] Pearson chi-square test
Just for completeness: the manual calculation you'd want is most likely
sum((x-y)^2 / (x+y))
(that's one you can find on the Wikipedia link you provided). To get the
same from chisq.test, try something like
chisq.test(data.frame(x,y)[,c(3,6)])
(there are surely smarter ways, but at least it works here). Note that
something like
chisq.test(as.vector(x), as.vector(y))
will give a different test, i.e. one based on a contingency table of x cross y.
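As a minimal sketch of the point above, the following uses made-up counts (not the data from this thread) and assumes both samples have the same total, which is what makes the shortcut formula agree with chisq.test. The names manual, res and other are only illustrative.

```r
# Hypothetical counts for two samples over the same four categories
# (made-up numbers with equal totals, not the tables from this thread).
x <- c(20, 30, 25, 25)
y <- c(30, 20, 15, 35)

# Manual two-sample statistic; this shortcut form assumes sum(x) == sum(y).
manual <- sum((x - y)^2 / (x + y))

# Same test via chisq.test: one row per category, one column per sample.
res <- chisq.test(cbind(x, y))

# Passing the two count vectors as separate arguments cross-tabulates their
# values instead, which is a different test (warning suppressed: tiny counts).
other <- suppressWarnings(chisq.test(x, y))

# Compare the manual value with the statistic reported by chisq.test.
print(c(manual = manual, chisq.test = unname(res$statistic)))
print(c(df_two_sample = unname(res$parameter),
        df_cross_tab = unname(other$parameter)))
```

If the two totals differ, the shortcut no longer matches, and the chisq.test call on the two-column table is the one to trust.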
M.
> -----Original Message-----
> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
> On Behalf Of Meyners, Michael
> Sent: Tuesday, September 27, 2011 13:28
> To: Michael Haenlein; r-help@r-project.org
> Subject: Re: [R] Pearson chi-square test
>
> Not sure what you want to test here with two matrices, but reading the
> manual helps here as well:
>
> y a vector; ignored if x is a matrix.
>
> x and y are matrices in your example, so it comes as no surprise that
> you get different results. On top of that, your manual calculation is
> not correct if you want to test whether two samples come from the same
> distribution (so don't be surprised if R still gives a different
> value...).
>
> HTH, Michael
>
> > -----Original Message-----
> > From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
> > On Behalf Of Michael Haenlein
> > Sent: Tuesday, September 27, 2011 12:45
> > To: r-help@r-project.org
> > Subject: [R] Pearson chi-square test
> >
> > Dear all,
> >
> > I have some trouble understanding the chisq.test function.
> > Take the following example:
> >
> > set.seed(1)
> > A <- cut(runif(100),c(0.0, 0.35, 0.50, 0.65, 1.00), labels=FALSE)
> > B <- cut(runif(100),c(0.0, 0.25, 0.40, 0.75, 1.00), labels=FALSE)
> > C <- cut(runif(100),c(0.0, 0.25, 0.50, 0.80, 1.00), labels=FALSE)
> > x <- table(A,B)
> > y <- table(A,C)
> >
> > When I calculate the test statistic by hand I get a value of
> > approximately 75.9:
> > http://en.wikipedia.org/wiki/Pearson's_chi-square_test#Calculating_the_test-statistic
> > sum((x-y)^2/y)
> >
> > But when I do chisq.test(x,y) I get a value of 12.2 while chisq.test(y,x)
> > gives a value of 10.3.
> >
> > I understand that I must be doing something wrong here, but I'm not sure
> > what.
> >
> > Thanks,
> >
> > Michael