dih69530@syd.odn.ne.jp
2005-Oct-30 14:07 UTC
[Rd] Yates' correction for continuity in chisq.test (PR#8265)
Full_Name: foo ba baz
Version: R2.2.0
OS: Mac OS X (10.4)
Submission from: (NULL) (219.66.32.183)


chisq.test(matrix(c(9,10,9,11),2,2))

Chi-square value must be 0, and, P value must be 0
R does over correction

when | a d - b c | < n / 2 ,chi-sq must be 0
P Ehlers
2005-Oct-30 14:49 UTC
[Rd] Yates' correction for continuity in chisq.test (PR#8265)
dih69530 at syd.odn.ne.jp wrote:
> Full_Name: foo ba baz
> Version: R2.2.0
> OS: Mac OS X (10.4)
> Submission from: (NULL) (219.66.32.183)
>
>
> chisq.test(matrix(c(9,10,9,11),2,2))
>
> Chi-square value must be 0, and, P value must be 0
> R does over correction
>
> when | a d - b c | < n / 2 ,chi-sq must be 0
>

(Presumably, you mean P-value = 1.)
If you don't want the correction, set correct=FALSE. (The
results won't differ much.)

A better example is

   chisq.test(matrix(c(9,10,9,10),2,2))

for which R probably should return X-squared = 0.

Peter Ehlers
ripley@stats.ox.ac.uk
2005-Oct-31 13:02 UTC
[Rd] Yates' correction for continuity in chisq.test (PR#8265)
On Sun, 30 Oct 2005, P Ehlers wrote:

> dih69530 at syd.odn.ne.jp wrote:
>> Full_Name: foo ba baz
>> Version: R2.2.0
>> OS: Mac OS X (10.4)
>> Submission from: (NULL) (219.66.32.183)
>>
>>
>> chisq.test(matrix(c(9,10,9,11),2,2))
>>
>> Chi-square value must be 0, and, P value must be 0
>> R does over correction
>>
>> when | a d - b c | < n / 2 ,chi-sq must be 0
>
> (Presumably, you mean P-value = 1.)
> If you don't want the correction, set correct=FALSE. (The
> results won't differ much.)
>
> A better example is
>
>   chisq.test(matrix(c(9,10,9,10),2,2))
>
> for which R probably should return X-squared = 0.

R is using the correction that almost all the sources I looked at
suggest. You can't go around adjusting X^2 for just some values of the
data: the claim is that the adjusted statistic has a more accurate chisq
distribution under the null.

I think at this remove it does not matter what Yates suggested
(although if I were writing a textbook I would find out), especially as
the R documentation does not mention Yates.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
ehlers@math.ucalgary.ca
2005-Oct-31 16:12 UTC
[Rd] Yates' correction for continuity in chisq.test (PR#8265)
Prof Brian Ripley wrote:
> On Sun, 30 Oct 2005, P Ehlers wrote:
>
>> dih69530 at syd.odn.ne.jp wrote:
>>
>>> Full_Name: foo ba baz
>>> Version: R2.2.0
>>> OS: Mac OS X (10.4)
>>> Submission from: (NULL) (219.66.32.183)
>>>
>>>
>>> chisq.test(matrix(c(9,10,9,11),2,2))
>>>
>>> Chi-square value must be 0, and, P value must be 0
>>> R does over correction
>>>
>>> when | a d - b c | < n / 2 ,chi-sq must be 0
>>
>>
>> (Presumably, you mean P-value = 1.)
>> If you don't want the correction, set correct=FALSE. (The
>> results won't differ much.)
>>
>> A better example is
>>
>>   chisq.test(matrix(c(9,10,9,10),2,2))
>>
>> for which R probably should return X-squared = 0.
>
>
> R is using the correction that almost all the sources I looked at
> suggest. You can't go around adjusting X^2 for just some values of the
> data: the claim is that the adjusted statistic has a more accurate chisq
> distribution under the null.
>
> I think at this remove it does not matter what Yates suggested
> (although if I were writing a textbook I would find out), especially as
> the R documentation does not mention Yates.
>

You're quite right that, for consistency, the correction should be
applied even in the silly example I gave. And, of course, one should
not be doing a chi-square test on silly examples.

Peter Ehlers
Prof Brian Ripley
2005-Nov-01 10:58 UTC
[Rd] Yates' correction for continuity in chisq.test (PR#8265)
On Mon, 31 Oct 2005, Prof Brian Ripley wrote:

> On Sun, 30 Oct 2005, P Ehlers wrote:
>
>> dih69530 at syd.odn.ne.jp wrote:
>>> Full_Name: foo ba baz
>>> Version: R2.2.0
>>> OS: Mac OS X (10.4)
>>> Submission from: (NULL) (219.66.32.183)
>>>
>>>
>>> chisq.test(matrix(c(9,10,9,11),2,2))
>>>
>>> Chi-square value must be 0, and, P value must be 0
>>> R does over correction
>>>
>>> when | a d - b c | < n / 2 ,chi-sq must be 0
>>
>> (Presumably, you mean P-value = 1.)
>> If you don't want the correction, set correct=FALSE. (The
>> results won't differ much.)
>>
>> A better example is
>>
>>   chisq.test(matrix(c(9,10,9,10),2,2))
>>
>> for which R probably should return X-squared = 0.
>
> R is using the correction that almost all the sources I looked at suggest.
> You can't go around adjusting X^2 for just some values of the data: the claim
> is that the adjusted statistic has a more accurate chisq distribution under
> the null.
>
> I think at this remove it does not matter what Yates suggested (although if
> I were writing a textbook I would find out), especially as the R
> documentation does not mention Yates.

I have now checked Yates (1934), Fisher's 'Statistical Methods for
Research Workers' and the Encyclopedia of Statistical Sciences. The first
two are vague (and could perhaps be read as not correcting O-E = 0), but
the last agrees with R in giving a formula which always subtracts 1/2.
Also, it mentions that Pearson stated that the formula for the continuity
correction long preceded Yates' publication, so it is perhaps reasonable
not to mention Yates.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
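
For readers following the arithmetic, here is a rough worked sketch of the
two readings discussed in this thread. It is not taken from any of the
posts or from the R sources. The symbols a, b, c, d (cell counts), r_1, r_2
(row totals), c_1, c_2 (column totals) and n (grand total) are notation
assumed here for illustration, and the "capped" variant is only one way of
writing down the behaviour the original report asks for.

```latex
% Statistic that always subtracts 1/2 (the form described above as R's):
X^2_{\mathrm{c}} \;=\; \sum_{i,j} \frac{\bigl(|O_{ij} - E_{ij}| - \tfrac12\bigr)^2}{E_{ij}},
\qquad E_{ij} = \frac{r_i\, c_j}{n}.

% In a 2 x 2 table every cell has |O_{ij} - E_{ij}| = |ad - bc| / n, and
% \sum_{i,j} 1/E_{ij} = n^3 / (r_1 r_2 c_1 c_2), so
X^2_{\mathrm{c}} \;=\; \frac{n\,\bigl(|ad - bc| - \tfrac n2\bigr)^2}{r_1\, r_2\, c_1\, c_2}.

% Reported table (9, 10, 9, 11): n = 39, |ad - bc| = 9,
% r_1 r_2 c_1 c_2 = 18 \cdot 21 \cdot 19 \cdot 20 = 143640.
X^2 = \frac{39 \cdot 9^2}{143640} \approx 0.0220
\quad\text{(uncorrected)},
\qquad
X^2_{\mathrm{c}} = \frac{39 \cdot (9 - 19.5)^2}{143640} \approx 0.0299 .

% Table (9, 10, 9, 10): n = 38, ad - bc = 0,
% r_1 r_2 c_1 c_2 = 18 \cdot 20 \cdot 19 \cdot 19 = 129960.
X^2 = 0
\quad\text{(uncorrected)},
\qquad
X^2_{\mathrm{c}} = \frac{38 \cdot 19^2}{129960} \approx 0.1056 .

% Capped variant (assumed form of the reporter's request): subtract
% \min(\tfrac12, |O_{ij} - E_{ij}|) instead of \tfrac12. The statistic is
% then 0 whenever |ad - bc| < n/2, i.e. whenever |O_{ij} - E_{ij}| < \tfrac12.
```

Whenever |ad - bc| < n/4, as in both tables here, the always-subtract
statistic is larger than the uncorrected one, which is the "over
correction" the report complains about; all of the values involved are far
too small to matter for the test. The current help page for chisq.test
describes the correction as never being bigger than the differences
themselves, so recent versions of R may well report X-squared = 0 for both
tables.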