thr3ads.net - R help - [R] Fishers exact test at

Søren Faurby

2009-Dec-17 12:11 UTC

[R] Fishers exact test at < 2.2e-16

In an effort to select the most appropriate number of clusters in a
mixture analysis I am comparing the expected and actual membership of
individuals in various clusters using Fisher's exact test. I aim
for the model with the lowest possible p-value, but I frequently get
p-values below 2.2e-16 and therefore do not get exact p-values with
the standard Fisher's exact test in R.

Does anybody know if there is a version of Fisher's exact test in
any package which can handle lower probabilities, or have other
suggestions as to how I can compare the probabilities?

I am for instance comparing the following two:

dat2<-matrix(c(29,0,29,0,12,0,18,0,0,29,0,16,0,19), nrow=2)
fisher.test(dat2, workspace=30000000)

dat3<-matrix(c(29,0,0,29,0,0,12,0,0,17,0,1,0,29,0,0,15,1,0,0,19),
nrow=3)
fisher.test(dat3, workspace=30000000)

Which both result in p-value < 2.2e-16

Kind regards, Søren

Ben Bolker

2009-Dec-17 14:04 UTC

[R] Fishers exact test at < 2.2e-16

Søren Faurby <soren.faurby <at> biology.au.dk> writes:
> 
> In an effort to select the most appropriate number of clusters in a
> mixture analysis I am comparing the expected and actual membership of
> individuals in various clusters using Fisher's exact test. I aim
> for the model with the lowest possible p-value, but I frequently get
> p-values below 2.2e-16 and therefore do not get exact p-values with
> the standard Fisher's exact test in R.
> 
  The p<2.2e-16 is a printing issue, not a precision issue.
> ff = fisher.test(dat3, workspace=30000000)
> ff
	Fisher's Exact Test for Count Data

data:  dat3 
p-value < 2.2e-16
alternative hypothesis: two.sided 
> str(ff)
List of 4
 $ p.value    : num 5.88e-58
 $ alternative: chr "two.sided"
 $ method     : chr "Fisher's Exact Test for Count Data"
 $ data.name  : chr "dat3"
 - attr(*, "class")= chr "htest"

So just use ff$p.value
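
To illustrate the point above: the "< 2.2e-16" in the printout matches
.Machine$double.eps, which is the default `eps` cutoff of format.pval(),
the function the htest print method uses to display p-values. The sketch
below works on a plain number rather than on the fitted object, so the
variable names (p, shown, exact) are only for illustration.

```r
# Machine epsilon for doubles, roughly 2.2e-16; this is the display cutoff.
.Machine$double.eps

# Value reported by str(ff) above.
p <- 5.88e-58

# format.pval() returns a string; anything below its `eps` argument is
# shown as "< eps". The number itself is not changed.
shown <- format.pval(p)

# Formatting the number directly keeps its actual magnitude.
exact <- format(p, digits = 3, scientific = TRUE)

print(shown)
print(exact)
```

With a real result the same applies to ff$p.value, which can be stored,
compared or formatted like any other number.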

Robin Hankin

2009-Dec-17 14:07 UTC

[R] Fishers exact test at < 2.2e-16

The aylmer package has some functionality in this regard
which you may find useful.

In particular, you can use good() to get a feel for the
number of tableaux that are consistent with the
specified marginal totals:




 > good(dat2)
[1] 42285210
 > good(dat3)
[1] 2.756286e+12
 >


HTH

rksh


> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Robin K. S. Hankin
Uncertainty Analyst
University of Cambridge
19 Silver Street
Cambridge CB3 9EP
01223-764877

Peter Dalgaard

2009-Dec-17 14:16 UTC

[R] Fishers exact test at < 2.2e-16

The direct answer is that it is primarily an issue of printing conventions:
> fisher.test(dat2, workspace=30000000)$p.value
[1] 5.384278e-44
> fisher.test(dat3, workspace=30000000)$p.value
[1] 5.883133e-58

However, I'm not sure (a) what is the influence of underflow in the
calculation of such tiny p-values, or (b) whether the p-value is a
sensible metric for comparing clustering models at all.
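
One rough way to look at point (a) is to compute the probability of the
observed table on the log scale and compare its size with the smallest
positive normalised double. This is only a sketch: log_table_prob() is a
helper name made up here, it gives the probability of the single observed
table under fixed margins (a lower bound on the two-sided p-value, not the
p-value itself), and it says nothing about the arithmetic fisher.test()
uses internally.

```r
dat2 <- matrix(c(29,0,29,0,12,0,18,0,0,29,0,16,0,19), nrow = 2)

# Log-probability of one table with fixed row and column totals:
# log P = sum(log r_i!) + sum(log c_j!) - log n! - sum(log x_ij!)
log_table_prob <- function(x) {
  sum(lfactorial(rowSums(x))) + sum(lfactorial(colSums(x))) -
    lfactorial(sum(x)) - sum(lfactorial(x))
}

lp <- log_table_prob(dat2)   # natural log of the table probability
print(lp / log(10))          # the same quantity on the log10 scale

# Smallest positive normalised double, for comparison of magnitudes.
print(.Machine$double.xmin)
```

Working with logs in this way keeps the quantity representable even when
the probability itself would be too small for a double.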

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907

Christian Hennig

2009-Dec-17 14:24 UTC

[R] Fishers exact test at < 2.2e-16

I know that you didn't ask for this but to me this seems to be a very
dodgy method to select a "best number of clusters" with no proper basis
at all. All of these tests are data dependent, so the p-values cannot be
interpreted in the usual way. It is actually not clear how they can be
interpreted, and the freedom in the data to find a clustering depends on
the number of clusters, so there is no reason to expect that comparing
p-values for different numbers tells you anything meaningful. Do you
really think that it is an informative difference if one clustering gives
you p=10^{-58} and another one 10^{-30}?

Christian

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
