thr3ads.net - R help - [R] chisq.test: decreasing p-value [Mar 2009]


soeren.vogel at eawag.ch

2009-Mar-11 10:36 UTC

[R] chisq.test: decreasing p-value

A Likert scale may have produced counts of answers per category.  
According to theory I may expect equality over the categories. A  
statistical test shall reveal the actual equality in my sample.

When applying a chi square test with increasing number of repetitions  
(simulate.p.value) over a fixed sample, the p-value decreases  
dramatically (looks as if converge to zero).

(1) Why?
(2) (If this test is wrong), then which test can check what I want to  
check, that is: are the two distributions of frequencies (observed and  
expected) in principle the same?
(3) By the way, how to deal with low frequency cells?

r <- c(10, 100, 500, 1000, 2000, 5000)
v <- c(35, 40, 45, 45, 40, 35)
sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),  
rescale.p=T, simulate.p.value=T, B=x)$p.value })

Thank you, Sören


-- 
Sören Vogel, PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch

Peter Dalgaard

2009-Mar-11 11:24 UTC

soeren.vogel at eawag.ch wrote:
> A Likert scale may have produced counts of answers per category.
> According to theory I may expect equality over the categories. A
> statistical test shall reveal the actual equality in my sample.
> 
> When applying a chi square test with increasing number of repetitions
> (simulate.p.value) over a fixed sample, the p-value decreases
> dramatically (looks as if converge to zero).
> 
> (1) Why?
> (2) (If this test is wrong), then which test can check what I want to
> check, that is: are the two distributions of frequencies (observed and
> expected) in principle the same?
> (3) By the way, how to deal with low frequency cells?
> 
> r <- c(10, 100, 500, 1000, 2000, 5000)
> v <- c(35, 40, 45, 45, 40, 35)
> sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),
> rescale.p=T, simulate.p.value=T, B=x)$p.value })

This is a combination of user error and an infelicity in chisq.test.

You are sapply'ing over a list with one element, so essentially you are
doing

chisq.test(v, p=c(rep.int(40, 6)),
 rescale.p=T, simulate.p.value=T, B=r)$p.value

Now B is supposed to be a single integer, so the above cannot be
expected to do anything sensible, but you might have hoped for an error
message. Instead, it seems that the simulation is run with r[1]
replications and the resulting count (7 here) is then divided by the
whole vector r+1:
> chisq.test(v, p=c(rep.int(40, 6)), rescale.p=T, simulate.p.value=T, B=r)$p.value
[1] 0.636363636 0.069306931 0.013972056 0.006993007 0.003498251 0.001399720
> 7/(r+1)
[1] 0.636363636 0.069306931 0.013972056 0.006993007 0.003498251 0.001399720
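
For reference, the simulated p-value is a count-based estimate: (1 + the number of simulated statistics at least as large as the observed one) / (B + 1). That is why a vector-valued B leaks into the result as 7/(r+1). A minimal sketch of the calculation done by hand, assuming multinomial sampling under the null via rmultinom (the internals of chisq.test differ in detail; the variable names and the seed are illustrative only):

```r
set.seed(1)                         # arbitrary seed, only for repeatability
r <- c(10, 100, 500, 1000, 2000, 5000)
v <- c(35, 40, 45, 45, 40, 35)

p0   <- rep(1/6, 6)                 # equal probabilities under the null
n    <- sum(v)                      # 240 answers in total
E    <- n * p0                      # expected count per category (40)
stat <- sum((v - E)^2 / E)          # observed chi-squared statistic

B        <- 2000                    # a single number of replicates
sims     <- rmultinom(B, size = n, prob = p0)   # 6 x B matrix of counts
sim_stat <- colSums((sims - E)^2 / E)           # one statistic per replicate

# Monte Carlo p-value: the denominator is B + 1, a scalar when B is a scalar.
p_mc <- (1 + sum(sim_stat >= stat)) / (B + 1)
p_mc

# With a fixed numerator of 7 and a vector in the denominator, the
# "p-values" shrink purely because r grows:
7 / (r + 1)
```

The last line reproduces the decreasing sequence shown above without running any further simulation.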

What you really wanted was
> sapply(r, function (x) { chisq.test(v, p=c(rep.int(40, 6)),
+   rescale.p=T, simulate.p.value=T, B=x)$p.value })
[1] 0.9090909 0.8118812 0.7964072 0.7672328 0.8025987 0.7932414
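
As a cross-check on these simulated values, the same call can be made repeatable and compared with the asymptotic chi-squared p-value. A sketch, assuming an arbitrary seed and using vapply with a length check on B (both are choices made here, not part of the original post):

```r
set.seed(42)  # arbitrary seed, only so the run is repeatable
r <- c(10, 100, 500, 1000, 2000, 5000)
v <- c(35, 40, 45, 45, 40, 35)

# One chisq.test call per element of r; each call receives a single B.
p_sim <- vapply(r, function(b) {
  stopifnot(length(b) == 1)   # guard against passing a whole vector as B
  chisq.test(v, p = rep(40, 6), rescale.p = TRUE,
             simulate.p.value = TRUE, B = b)$p.value
}, numeric(1))

# Asymptotic p-value for comparison (statistic 2.5 on 5 degrees of freedom).
p_asym <- chisq.test(v, p = rep(40, 6), rescale.p = TRUE)$p.value

p_sim
p_asym
```

Here p_asym comes out at roughly 0.78, so the simulated values should scatter around that figure and tighten as B grows, rather than drift towards zero.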


> Thank you, Sören
> 
> 
> -- 
> Sören Vogel, PhD-Student, Eawag, Dept. SIAM
> http://www.eawag.ch, http://sozmod.eawag.ch
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907

David Winsemius

2009-Mar-11 11:32 UTC

On Mar 11, 2009, at 6:36 AM, soeren.vogel at eawag.ch wrote:
> A Likert scale may have produced counts of answers per category.  
> According to theory I may expect equality over the categories. A  
> statistical test shall reveal the actual equality in my sample.
>
> When applying a chi square test with increasing number of  
> repetitions (simulate.p.value) over a fixed sample, the p-value  
> decreases dramatically (looks as if converge to zero).
>
> (1) Why?
With low numbers of repetitions the test has low power, i.e., it may
give you the wrong answer to the question: are those two vectors from
the same distribution? As you increase the number, the simulated value
approaches the "truth".
>
> (2) (If this test is wrong), then which test can check what I want  
> to check, that is: are the two distributions of frequencies  
> (observed and expected) in principle the same?
"In principle" they are not the same. Do you want a test that tells
you they are?
>
> (3) By the way, how to deal with low frequency cells?
>
> r <- c(10, 100, 500, 1000, 2000, 5000)
> v <- c(35, 40, 45, 45, 40, 35)
> sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),  
> rescale.p=T, simulate.p.value=T, B=x)$p.value })
>
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

David Winsemius

2009-Mar-11 13:39 UTC

Thanks to Peter Dalgaard for the correct answer. I misinterpreted what  
R was returning.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT
