thr3ads.net - R help - [R] Yates correction [Feb 2017]

Jomy Jose

2017-Feb-21 10:47 UTC

[R] Yates correction

I tried to do a chi-square test on the following observed frequencies:
---------------------------------------------------------------------------------------------
   A  B
A  8  4
B 12 10

R gave the following output:
-------------------------------------------------------------------------------------------
        Pearson's Chi-squared test with Yates' continuity correction

data:  M
X-squared = 0.10349, df = 1, p-value = 0.7477

Warning message:
In chisq.test(M) : Chi-squared approximation may be incorrect

---------------------------------------------------------------------------------------------------------------
Can this result be relied upon, or do we have to use Fisher's exact test?

Jose


David L Carlson

2017-Feb-21 13:40 UTC

[R] Yates correction

Use fisher.test(). Yates' correction compensates for the tendency of the
chi-square statistic to be overestimated in a 2x2 table, but it can
overcompensate, reducing chi-square too much. Its main advantage was in the
days when computers were expensive and Fisher's exact test was hard to compute
by hand. You can see from the following that Fisher's exact test gives a
p-value of .717, a bit less than the .7477 from the Yates-corrected test.
> M <- matrix(c(8, 12, 4, 10), 2, 2)
> fisher.test(M)
        Fisher's Exact Test for Count Data

data:  M
p-value = 0.717
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.3160571 9.7976232
sample estimates:
odds ratio 
  1.641969

An alternative would be to let chisq.test() use simulations to estimate the
p-value:
> chisq.test(M, simulate.p.value=TRUE)
        Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)

data:  M
X-squared = 0.471, df = NA, p-value = 0.7141

Which agrees pretty well with fisher.test(). The X-squared value of 0.471 is the
uncorrected value so you can see that the Yates' correction reduced it
substantially (to .1035).
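
For anyone who wants to see where the two X-squared values come from, here is a minimal sketch that recomputes them by hand from the same matrix M. The variable names (E, x2_plain, x2_yates) are my own, and the Yates formula below is the plain textbook version with a fixed 0.5 subtracted from each |observed - expected|, which matches chisq.test() for this table because every deviation is larger than 0.5.

```r
# Same 2x2 table as in the thread (filled column by column)
M <- matrix(c(8, 12, 4, 10), 2, 2)

# Expected counts under independence: row total * column total / grand total
E <- outer(rowSums(M), colSums(M)) / sum(M)

# Uncorrected Pearson statistic
x2_plain <- sum((M - E)^2 / E)

# Yates-corrected statistic: shrink each absolute deviation by 0.5 first
x2_yates <- sum((abs(M - E) - 0.5)^2 / E)

# Upper-tail p-values on 1 degree of freedom
p_plain <- pchisq(x2_plain, df = 1, lower.tail = FALSE)
p_yates <- pchisq(x2_yates, df = 1, lower.tail = FALSE)

round(c(plain = x2_plain, yates = x2_yates), 5)
round(c(plain = p_plain, yates = p_yates), 4)
```

The two statistics should line up with the 0.471 and 0.10349 reported above.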

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

Marc Schwartz

2017-Feb-21 15:15 UTC

[R] Yates correction

Hi,

Statistical questions that are more conceptual in nature, which is the case
here, are generally frowned upon on this list, since they are considered
off-topic.

That being said, both the Fisher Exact test and the Yates correction to the
Chi-Square test are increasingly being challenged as overly conservative in
small sample situations. The same goes for the common recommendation of
switching to the Fisher Exact Test when any expected cell value is <5, which
is what generates the "Chi-squared approximation may be incorrect" warning in
your example, since one of your expected cell values is 4.94.
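
As a quick check of that 4.94, the expected counts can be pulled straight out of the object returned by chisq.test(). This is only a small sketch: M is rebuilt here as in the earlier reply, and the warning is suppressed purely to keep the output tidy.

```r
# Same 2x2 table as in the thread (filled column by column)
M <- matrix(c(8, 12, 4, 10), 2, 2)

# chisq.test() stores the expected counts in the $expected component
res <- suppressWarnings(chisq.test(M))
round(res$expected, 2)

# The smallest expected count is the one that triggers the warning
smallest <- min(res$expected)
smallest < 5
```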

There was a paper by Campbell back in 2007 that discussed this:

Chi-squared and Fisher–Irwin tests of two-by-two tables with small sample
recommendations
http://onlinelibrary.wiley.com/doi/10.1002/sim.2832/abstract

and he has a web site here with additional resources:

http://www.iancampbell.co.uk/twobytwo/twobytwo.htm

Even using his 'n-1' variant of the test with his online calculator on
the above web site, you end up with a p value of 0.5, which is close to the p
value for the uncorrected chi-square (chisq.test() with correct = FALSE) of
0.4925. Thus, none of these cases results in a "statistically
significant" p value of <=0.05. Not that you should be p value hunting
anyway here.
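
For those who prefer to stay in R rather than use the online calculator, here is a rough sketch of the 'n-1' calculation. It assumes, as I understand Campbell's description, that the statistic is simply the uncorrected Pearson chi-square multiplied by (N - 1)/N; the variable names are mine and this is not code from his site.

```r
# Same 2x2 table as in the thread (filled column by column)
M <- matrix(c(8, 12, 4, 10), 2, 2)
n <- sum(M)

# Uncorrected Pearson chi-square and its p-value
plain <- suppressWarnings(chisq.test(M, correct = FALSE))
x2_plain <- unname(plain$statistic)
p_plain <- plain$p.value

# Assumed 'N-1' variant: rescale the statistic by (N - 1)/N
x2_n1 <- x2_plain * (n - 1) / n
p_n1 <- pchisq(x2_n1, df = 1, lower.tail = FALSE)

round(c(uncorrected = p_plain, n_minus_1 = p_n1), 4)
```

Both p values should land close to the 0.4925 and 0.5 quoted above.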

The whole p value discussion is further from being on topic here, but under none
of these hypothesis tests would you reject the null.

Further consultation with a local statistical expert would seem prudent.

Regards,

Marc Schwartz

Rolf Turner

2017-Feb-21 22:14 UTC

[R] [FORGED] Yates correction

On 21/02/17 23:47, Jomy Jose wrote:
> Whether this result can be relied or we have to use Fishers exact test ?

(a) With a p-value of 0.7477 there is no evidence against the null 
hypothesis no matter how you slice it.

(b) To assuage your trepidations, use "simulate.p.value=TRUE".

E.g.

    chisq.test(M,simulate.p.value=TRUE,B=9999)

Note that the value of X-squared that is returned is "of course" the 
same as what you'd get by setting correct=FALSE. I got a p-value of 
0.7178; you will get something slightly different, since a simulated 
p-value is random, but it will be about 0.71 or 0.72.
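
If you want the simulated p-value to be repeatable from run to run, one option is to fix the random seed first. A small sketch, with M rebuilt as in the earlier reply and the seed value 1 chosen arbitrarily:

```r
# Same 2x2 table as in the thread (filled column by column)
M <- matrix(c(8, 12, 4, 10), 2, 2)

# Fixing the seed makes the Monte Carlo p-value reproducible
set.seed(1)
r1 <- chisq.test(M, simulate.p.value = TRUE, B = 9999)

set.seed(1)
r2 <- chisq.test(M, simulate.p.value = TRUE, B = 9999)

# Same seed, same simulated p-value
identical(r1$p.value, r2$p.value)
r1$p.value
```

A different seed will give a slightly different value, but it should still be in the neighbourhood of 0.71 or 0.72.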

Bottom line:  Don't reject H_0!!!

cheers,

Rolf Turner

-- 
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
