Is Pearson's chi-square test for contingency tables asymptotically unbiased
for large tables (large degrees of freedom) regardless of the expected
values in each cell? The rule of thumb is that Pearson's chi-square should
not be used when large numbers of cells have expected values < 5. However,
I compared the results on 4x4 contingency tables for R's chisq.test using
the chi-square approximation vs. chisq.test using a large number of Monte
Carlo simulations, and the results agree within a fairly small error. This
is true even when every cell of the table has an expected value < 2. I
tried several tables, but the best example was:

4 1 1 1
1 4 1 1
1 1 4 1
1 1 1 4

As expected, the chi-square approximation appears to be very poor when both
the expected values and degrees of freedom are small. Is there a good
theoretical reason why the chi-square test seems to perform well on large
contingency tables even with small expected values? Are the standard rules
of thumb overly simplistic?
--
View this message in context:
http://www.nabble.com/Validity-of-Pearson%27s-Chi-Square-for-Large-Tables-tp23844791p23844791.html
Sent from the R help mailing list archive at Nabble.com.
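The comparison described in the post can be reproduced roughly as follows.
This is only a sketch: the seed and the number of replicates B are
arbitrary choices, not values taken from the post, and the Monte Carlo
p-value will vary slightly from run to run.

```r
# The 4x4 table from the post. Every margin is 7 and the total is 28,
# so every expected count under independence is 7 * 7 / 28 = 1.75.
tab <- matrix(c(4, 1, 1, 1,
                1, 4, 1, 1,
                1, 1, 4, 1,
                1, 1, 1, 4), nrow = 4, byrow = TRUE)

# Asymptotic test: chi-square reference distribution with (4-1)*(4-1) = 9 df.
# R warns here that the approximation may be incorrect, because the
# expected counts are small; the warning is suppressed for the comparison.
asym <- suppressWarnings(chisq.test(tab))

# Monte Carlo test: the reference distribution is simulated from random
# tables that share the observed row and column totals.
set.seed(1)  # arbitrary seed (assumption), only for reproducibility
mc <- chisq.test(tab, simulate.p.value = TRUE, B = 20000)

asym$expected    # expected counts, all 1.75
asym$statistic   # Pearson X^2 statistic
c(asymptotic = asym$p.value, monte_carlo = mc$p.value)
```

Running the last line for several tables and seeds gives a feel for how
far apart the two p-values are in this regime.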
Gerard M. Keogh
2009-Jun-03 09:28 UTC
[R] Validity of Pearson's Chi-Square for Large Tables
Hi,

I didn't get your name.

For large tables (5 x 5 or bigger), the distribution of the log of the
cross-product ratios tends to normality. Counting one ratio per choice of
two rows and two columns, there are choose(n, 2)^2 of these in an n x n
table (100 in a 5 x 5 table). The chi-square test for independence fits a
main-effects loglinear model to the table, and this can be expressed in
terms of the cross-product ratios (see Discrete Multivariate Analysis by
Bishop, Fienberg and Holland, 1975).
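To make the cross-product ratios concrete, here is a small sketch that
enumerates them for the 4x4 table from the original question. It takes one
ratio per pair of rows and pair of columns; the variable names are
illustrative only.

```r
tab <- matrix(c(4, 1, 1, 1,
                1, 4, 1, 1,
                1, 1, 4, 1,
                1, 1, 1, 4), nrow = 4, byrow = TRUE)

rows <- combn(nrow(tab), 2)  # each column holds one pair of row indices
cols <- combn(ncol(tab), 2)  # each column holds one pair of column indices

# Log cross-product ratio of every 2x2 subtable:
# log( n[i1,j1] * n[i2,j2] / (n[i1,j2] * n[i2,j1]) )
log_cpr <- sapply(seq_len(ncol(rows)), function(r) {
  sapply(seq_len(ncol(cols)), function(k) {
    i <- rows[, r]
    j <- cols[, k]
    log(tab[i[1], j[1]] * tab[i[2], j[2]] /
        (tab[i[1], j[2]] * tab[i[2], j[1]]))
  })
})

length(log_cpr)   # choose(4, 2)^2 = 36 subtables for a 4x4 table
choose(5, 2)^2    # 100 subtables for a 5x5 table
range(log_cpr)    # all finite here, since no cell is zero
```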
Problems only arise when there are a lot of zeros, as this puts weight in
the distribution of the log ratios at +/- infinity.
This in turn results in a poor chi-square approximation.
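The effect of a zero cell on the log cross-product ratios can be seen in a
short sketch. The helper log_cpr() and the modified table tab_zero are
hypothetical names introduced only for this illustration; the zero is
placed in the table by hand and is not data from the thread.

```r
# Hypothetical helper: log cross-product ratio of the 2x2 subtable formed
# by the row pair i and the column pair j.
log_cpr <- function(tab, i, j) {
  log(tab[i[1], j[1]] * tab[i[2], j[2]] /
      (tab[i[1], j[2]] * tab[i[2], j[1]]))
}

tab <- matrix(c(4, 1, 1, 1,
                1, 4, 1, 1,
                1, 1, 4, 1,
                1, 1, 1, 4), nrow = 4, byrow = TRUE)

# Same table with one cell set to zero by hand (illustration only).
tab_zero <- tab
tab_zero[1, 2] <- 0

log_cpr(tab, c(1, 2), c(1, 2))       # finite: log(16)
log_cpr(tab_zero, c(1, 2), c(1, 2))  # zero in the denominator: Inf
log_cpr(tab_zero, c(1, 2), c(2, 3))  # zero in the numerator: -Inf
```

Every 2x2 subtable that contains the zero cell ends up at plus or minus
infinity, so the more zeros a table has, the more of the log ratios are
affected.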
Gerard
In reply to:
From: dsimcha <dsimcha at yahoo.com>
Sent by: r-help-bounces at r-project.org
To: r-help at r-project.org
Date: 03/06/2009 04:32
Subject: [R] Validity of Pearson's Chi-Square for Large Tables