Is anyone aware of a fast way of doing Fisher's exact test for a series of 2 x 2 tables in R? fisher.test is really slow if n1 = 1000 and n2 = 1000.
--
Thanks,
Jim.
1. I am not an expert on this.
2. However, my strong prior would be no: because the test is "exact", it has to calculate all the possible configurations, and there are a lot to calculate with the values of n1 and n2 you gave.
-- Bert
On Fri, Apr 8, 2011 at 9:43 AM, Jim Silverton <jim.silverton@gmail.com> wrote:
> Is anyone aware of a fast way of doing fisher's exact test for a series of
> 2 x 2 tables in R? The fisher.test is really slow if n1=1000 and n2 = 1000.
>
> --
> Thanks,
> Jim.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
"Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions." -- Maimonides (1135-1204)
Bert Gunter
Genentech Nonclinical Biostatistics
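As a rough check on how many configurations are involved: for a single 2 x 2 table with fixed margins, the possible tables can be indexed by the top-left cell alone, so the enumeration is over a modest range of integers. The sketch below is only an illustration under that assumption. `fisher_p_enumerate` is a hypothetical helper name, not a base R function, and the tie tolerance `rel_err` is a value chosen for this sketch.

```r
# Hypothetical helper (not part of base R): two-sided p-value for one
# 2 x 2 table, obtained by enumerating every table with the same margins.
fisher_p_enumerate <- function(tab) {
  m <- sum(tab[, 1])  # first column total
  n <- sum(tab[, 2])  # second column total
  k <- sum(tab[1, ])  # first row total
  x <- tab[1, 1]      # observed top-left cell

  # With margins fixed, the top-left cell can only take these values
  support <- max(0, k - n):min(k, m)
  d <- dhyper(support, m, n, k)

  # Sum the probabilities of all tables no more likely than the observed one
  rel_err <- 1 + 1e-7  # assumed tolerance for floating-point ties
  p <- sum(d[d <= dhyper(x, m, n, k) * rel_err])

  list(p.value = p, n.tables = length(support))
}

tab <- matrix(c(502, 498, 490, 510), nrow = 2)
res <- fisher_p_enumerate(tab)
print(res$n.tables)                       # number of tables enumerated
print(res$p.value)                        # enumerated two-sided p-value
print(fisher.test(tab)$p.value)           # for comparison
```

If the two p-values agree on your machine, the enumeration itself is unlikely to be the bottleneck for tables of this size.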
Do you mean a test such as this?
> fisher.test(matrix(c(502,498,490, 510), nrow = 2))
        Fisher's Exact Test for Count Data
data:  matrix(c(502, 498, 490, 510), nrow = 2)
p-value = 0.6228
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.8770113 1.2550998
sample estimates:
odds ratio
  1.049119
This runs quickly on my machine.
> system.time(fisher.test(matrix(c(502,498,490, 510), nrow = 2)))
   user  system elapsed
  0.008   0.001   0.010
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
loaded via a namespace (and not attached):
[1] tools_2.12.2
Can you provide an example that is running slowly for you?
Steven McKinney
________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Jim Silverton [jim.silverton at gmail.com]
Sent: April 8, 2011 9:43 AM
To: r-help at r-project.org
Subject: Re: [R] Fast version of Fisher's Exact Test
Is anyone aware of a fast way of doing fisher's exact test for a series of 2 x 2 tables in R? The fisher.test is really slow if n1=1000 and n2 = 1000.
--
Thanks,
Jim.
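Since the question was about a series of tables rather than one, a reproducible timing example might look like the sketch below. The counts are simulated, and the seed, the number of tables (200) and the success probability (0.5) are arbitrary assumptions made for illustration. `conf.int = FALSE` is a documented argument of fisher.test that skips the confidence interval, which may save some time when only p-values are needed.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n1 <- 1000
n2 <- 1000
n_tables <- 200                  # assumed length of the series
k1 <- rbinom(n_tables, n1, 0.5)  # simulated successes in group 1
k2 <- rbinom(n_tables, n2, 0.5)  # simulated successes in group 2

# Time fisher.test over the whole series, keeping only the p-values
timing <- system.time(
  pvals <- mapply(function(a, b) {
    tab <- matrix(c(a, n1 - a, b, n2 - b), nrow = 2)
    fisher.test(tab, conf.int = FALSE)$p.value
  }, k1, k2)
)

print(timing)
print(summary(pvals))
```

Posting the output of something like this, together with sessionInfo(), would make it much easier to see where the time is going.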
That depends on how many other programs are running, how large they are, and how much RAM you have on your machine. If I repeatedly run the example I used below, my R session shows 170 MB of memory usage, which is not a huge amount relative to total memory, and not a huge amount even for 32-bit R. But if your system has 2 GB of RAM and 1.9 GB is consumed by other processes, then this example will cause swapping and speed will be reduced.
So figuring out a solution requires understanding what is causing the slowdown: not enough RAM, other programs competing for CPU cycles...
You can try switching to 64-bit R, but unless your 32-bit R is loading some large data objects, leaving little RAM, you won't see much of a difference.
If you start R and do rm(list = ls()) to ensure no big data objects are using up RAM, does the example below still take a long time?
You haven't mentioned what operating system you are using, how much RAM you have, or what sessionInfo() reports on your machine. That information will help to figure this out.
Steven McKinney
________________________________________
From: Jim Silverton [jim.silverton at gmail.com]
Sent: April 9, 2011 9:21 AM
To: Steven McKinney
Subject: Re: [R] Fast version of Fisher's Exact Test
I have 32-bit R installed but my machine is 64-bit. Do I need to upgrade R to 64-bit for it to run faster?
On Fri, Apr 8, 2011 at 6:44 PM, Steven McKinney <smckinney at bccrc.ca> wrote:
> Can you provide an example that is running slowly for you?
--
Thanks,
Jim.
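The information asked for above (32-bit versus 64-bit build, session details, and whether large objects are holding RAM) can be collected from within R. This is only a minimal sketch using base R functions; the variable names are made up for illustration, and it reports R's own memory use, not what other programs on the machine are consuming.

```r
# Pointer size in bytes is 4 on a 32-bit build of R and 8 on a 64-bit build
bits <- 8 * .Machine$sizeof.pointer
cat("R build:", bits, "bit\n")

# R version, platform and loaded packages
print(sessionInfo())

# Sizes in bytes of objects in the global workspace, largest first
objs <- ls(envir = globalenv())
sizes <- vapply(
  objs,
  function(nm) as.numeric(object.size(get(nm, envir = globalenv()))),
  numeric(1)
)
print(sort(sizes, decreasing = TRUE))

# Memory currently used by R, after a garbage collection
print(gc())
```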
> Is anyone aware of a fast way of doing fisher's exact test for a series of 2
> x 2 tables in R? The fisher.test is really slow if n1=1000 and n2 = 1000.
If you don't require exact two-sided p-values (determined according to a likelihood criterion as in fisher.test), you can use the vectorised fisher.pval() function from the 'corpora' package. The function is just a wrapper around the hypergeometric distribution; it doesn't compute confidence intervals, which are much more difficult to obtain.
Best,
Stefan
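To illustrate what "a wrapper around the hypergeometric distribution" can mean in practice, here is a minimal base R sketch. It is not the code of fisher.pval() from 'corpora'; `fisher_onesided` is a hypothetical name, and the doubled two-sided value is an assumed approximation that will generally differ from the likelihood-based two-sided p-value of fisher.test.

```r
# Hypothetical helper (not the 'corpora' implementation): vectorised
# one-sided Fisher p-values for tables given as k1 successes out of n1
# and k2 successes out of n2.
fisher_onesided <- function(k1, n1, k2, n2) {
  K <- k1 + k2  # total successes, treated as fixed
  # P(X >= k1) and P(X <= k1) under the hypergeometric null
  greater <- phyper(k1 - 1, n1, n2, K, lower.tail = FALSE)
  less <- phyper(k1, n1, n2, K)
  data.frame(
    less = less,
    greater = greater,
    # Assumed approximation: twice the smaller tail, capped at 1
    two.sided.approx = pmin(1, 2 * pmin(less, greater))
  )
}

k1 <- c(502, 530, 480)
k2 <- c(490, 470, 520)
res <- fisher_onesided(k1, 1000, k2, 1000)
print(res)

# Cross-check the first table against fisher.test
tab <- matrix(c(k1[1], 1000 - k1[1], k2[1], 1000 - k2[1]), nrow = 2)
print(fisher.test(tab, alternative = "greater")$p.value)
```

Because phyper is vectorised, the whole series is handled in one call with no loop over tables.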