Hi Gerrit,

I tried both of your suggestions and got the exact same thing.

        Fisher's Exact Test for Count Data with simulated p-value (based on
        1e+05 replicates)

data:  Trapz
p-value = 1e-05
alternative hypothesis: two.sided

I put in a few changes myself, based on the Details section of the help page on what should be used for a table larger than 2x2, and got the exact same thing as before. I removed or = 1 and conf.int = TRUE, added y = NULL and control = list(30), and changed to simulate.p.value = TRUE.

> fisher.test( Trapz, y = NULL, workspace = 200000, hybrid = TRUE,
+              control = list(30), simulate.p.value = TRUE, B = 1e5)

        Fisher's Exact Test for Count Data with simulated p-value (based on
        1e+05 replicates)

data:  Trapz
p-value = 1e-05
alternative hypothesis: two.sided

> fisher.test( Trapz, y = NULL, workspace = 200000, hybrid = TRUE,
+              control = list(30), simulate.p.value = TRUE, B = 1e7)

        Fisher's Exact Test for Count Data with simulated p-value (based on
        1e+07 replicates)

data:  Trapz
p-value = 1e-07
alternative hypothesis: two.sided

Despite these changes, the changed equations are not giving me the results for the calculations. The changes I have made seem to satisfy what is in the Details section of the R help page, and I don't have the workspace issue in R any more. What do I do to get the results of the Fisher test? Is there something simple that I am missing?

Regards,
Paul

On Fri, Aug 28, 2015 at 3:52 PM, Gerrit Eichner <
Gerrit.Eichner at math.uni-giessen.de> wrote:

> Paul,
>
> as the error messages of your first three attempts (see below) tell you,
> in an admittedly rather cryptic way, your table or its sample size,
> respectively, are too large, so that either the "largest (hash table) key"
> is too large, or your (i.e., R's) workspace is too small, or your
> hardware/OS cannot allocate enough memory to calculate the p-value of
> Fisher's Exact Test exactly by means of the implemented algorithm.
>
> One way out of this is to approximate the exact p-value through
> simulation, but apparently a typo occurred in your (last) attempt to
> do that (Error: unexpected '>' in ">").
>
> So, for me the following works (and it should also work for you) and gives
> the output shown (after a very short while):
>
> > Trapz <- as.matrix( read.table( "w.txt", head = T, row.names = "Traps"))
> >
> > set.seed( 20150828)    # For the sake of reproducibility.
> > fisher.test( Trapz, simulate.p.value = TRUE,
> +              B = 1e5)
>
>         Fisher's Exact Test for Count Data with simulated p-value (based on
>         1e+05 replicates)
>
> data:  Trapz
> p-value = 1e-05
> alternative hypothesis: two.sided
>
> Or with a higher value for B if you are patient enough (with a computing
> time of several seconds):
>
> > set.seed( 20150828)
> > fisher.test( Trapz, simulate.p.value = TRUE, B = 1e7)
>
>         Fisher's Exact Test for Count Data with simulated p-value (based on
>         1e+07 replicates)
>
> data:  Trapz
> p-value = 1e-07
> alternative hypothesis: two.sided
>
>  Hth  --  Gerrit
>
> (BTW, you don't have to specify arguments (in function calls) whose
> default values you don't want to change.)
>
>
> On Fri, 28 Aug 2015, paul brett wrote:
>
>> Hi Gerrit,
>> I spotted that; it was a mistake on my part. It should
>> read 1.trap.2.barrier. I have corrected it in the attached file.
>>
>> So I have done these so far:
>>
>> > fisher.test(Trapz, workspace = 200000, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>> +     simulate.p.value = FALSE, B = 2000)
>> Error in fisher.test(Trapz, workspace = 2e+05, hybrid = FALSE, control = list(),  :
>>   FEXACT error 501.
>> The hash table key cannot be computed because the largest key
>> is larger than the largest representable int.
>> The algorithm cannot proceed.
>> Reduce the workspace size or use another algorithm.
>>
>> > fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>> +     simulate.p.value = FALSE, B = 2000)
>> Error in fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control = list(),  :
>>   FEXACT error 40.
>> Out of workspace.
>>
>> > fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>> +     simulate.p.value = FALSE, B = 2000)
>> Error in fisher.test(Trapz, workspace = 1e+08, hybrid = FALSE, control = list(),  :
>>   FEXACT error 501.
>> The hash table key cannot be computed because the largest key
>> is larger than the largest representable int.
>> The algorithm cannot proceed.
>> Reduce the workspace size or use another algorithm.
>>
>> > fisher.test(Trapz, workspace = 2000000000, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>> +     simulate.p.value = FALSE, B = 2000)
>> Error: cannot allocate vector of size 7.5 Gb
>> In addition: Warning messages:
>> 1: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(),  :
>>   Reached total allocation of 6027Mb: see help(memory.size)
>> 2: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(),  :
>>   Reached total allocation of 6027Mb: see help(memory.size)
>> 3: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(),  :
>>   Reached total allocation of 6027Mb: see help(memory.size)
>> 4: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(),  :
>>   Reached total allocation of 6027Mb: see help(memory.size)
>>
>> > fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>> +     simulate.p.value = TRUE, B = 1e5)
>> Error: unexpected '>' in ">"
>>
>> So the issue could be perhaps that R cannot compute my sample as the
>> workspace needed is too big?
>> Is there a way around this? I think I have
>> everything set out correctly.
>> Is my only other alternative to do a 2x2 Fisher test for each of the
>> variables?
>>
>> I attach as a PDF the Minitab result for the chi-squared test as proof (I
>> know that getting very low p-values is highly unlikely, but sometimes it
>> happens). Seeing is believing, I suppose!
>>
>> Regards,
>> Paul
>>
>>
>> On Fri, Aug 28, 2015 at 8:56 AM, Gerrit Eichner <
>> Gerrit.Eichner at math.uni-giessen.de> wrote:
>>
>>> Dear Paul,
>>>
>>> quoting the email footer: "PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html and provide commented,
>>> minimal, self-contained, reproducible code."
>>>
>>> So, what exactly did you try and what was the actual problem/error
>>> message?
>>>
>>> Besides that, have you noticed that two of your data rows have the same
>>> name?
>>>
>>> Have you read the online help page of fisher.test():
>>>
>>> ?fisher.test
>>>
>>> Have you tried anything like the following?
>>>
>>> W <- as.matrix( read.table( "w.txt", head = T)[-1])
>>>
>>> fisher.test( W, workspace = 1e8)
>>>    # For workspace look at the help page, but it presumably
>>>    # won't work because of your sample size.
>>>
>>> set.seed( 20150828)   # for reproducibility
>>> fisher.test( W, simulate.p.value = TRUE, B = 1e5)
>>>    # For B look at the help page.
>>>
>>> Finally: Did Minitab really report "p > 0.001"? ;-)
>>>
>>>  Hth  --  Gerrit
>>>
>>>
>>>> Dear all,
>>>>
>>>> I am trying to do a Fisher's test on a 5x4 table in R.
>>>> I have already done a chi-squared test using Minitab on this
>>>> data set, getting a result of (1, N = 165.953, DF 12, p>0.001), yet using
>>>> these results (even though they are excellent) may not be suitable for
>>>> publication. I have tried numerous other statistical packages in the hope
>>>> of doing this test, yet each one offers just the 2x2 table.
>>>> I am struggling to edit the template Fisher's test in R to fit
>>>> my table (according to the R book it is possible, yet I cannot get it to
>>>> work). The template given in the R documentation and the R book is for a
>>>> 2x2 Fisher test. What do I need to change to get this to work? I have
>>>> attached the data to the email so one can see what I am on about. Or do I
>>>> have to write my own new code to compute this?
>>>>
>>>> Yours sincerely,
>>>> Paul Brett
> On 30 Aug 2015, at 13:54, paul brett <brettpaul16 at gmail.com> wrote:
>
>         Fisher's Exact Test for Count Data with simulated p-value (based on
>         1e+07 replicates)
>
> data:  Trapz
> p-value = 1e-07
> alternative hypothesis: two.sided
>
> Despite these changes, the changed equations are not giving me the results
> for the calculations. The changes I have made seem to satisfy what is in
> the Details section of the R help page, and I don't have the workspace
> issue in R any more.
> What do I do to get the results of the Fisher test?
> Is there something simple that I am missing?

The theory?

There is nothing more to Fisher's test than the calculation of the probability of obtaining a table as or less (im-)probable as the one observed. This is the p-value. You have done 10 million simulations and not found a single table that is less likely than the one observed. Hence, the p-value is 1/10 000 001 = ca. 1e-7, counting in the observed table.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
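Peter's arithmetic can be checked with a small sketch. The 3x3 table below is made up for illustration (it is not the trap data from this thread) and has such a strong association that, in practice, no simulated table is as improbable as the observed one. Under that assumption the simulated p-value sits at its lower bound of 1/(B + 1), which is why raising B from 1e5 to 1e7 simply moved the reported value from 1e-05 to 1e-07.

```r
# Hypothetical 3x3 table with a very strong association
# (illustration only, not the data from this thread).
tab <- matrix(c(50,  0,  0,
                 0, 50,  0,
                 0,  0, 50), nrow = 3, byrow = TRUE)

B <- 1999
set.seed(20150828)  # for reproducibility
res <- fisher.test(tab, simulate.p.value = TRUE, B = B)

# If no simulated table is as or less probable than the observed one,
# only the observed table itself is counted, giving 1 / (B + 1).
lower_bound <- 1 / (B + 1)

res$p.value
lower_bound
isTRUE(all.equal(res$p.value, lower_bound))
```

A simulated p-value at this bound is best read as "p is at most about 1/(B + 1)" rather than as an exact value.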
Paul,

in addition to Peter's point about the missing theory, you have also completely failed to explain what you mean by "[it] is not giving me the results for the calculations" or "[how] to get the results of the fisher test". They are there in the output of R's fisher.test() (if you have an idea about the theory).

And again:

> fisher.test( Trapz, simulate.p.value = TRUE, B = 1e5)

specifies enough arguments in the case of simulating to approximate the p-value, since workspace (quoting from the help page) is "Only used for ***non-simulated*** p-values [of] larger than 2 by 2 tables." (Similarly, control and hybrid are not needed here either.)

  Regards  --  Gerrit

---------------------------------------------------------------------
Dr. Gerrit Eichner                   Mathematical Institute, Room 212
gerrit.eichner at math.uni-giessen.de   Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/cms/eichner
---------------------------------------------------------------------

On Sun, 30 Aug 2015, paul brett wrote:

> Hi Gerrit,
> I tried both of your suggestions and got the exact same thing.
>
>         Fisher's Exact Test for Count Data with simulated p-value (based on
>         1e+05 replicates)
>
> data:  Trapz
> p-value = 1e-05
> alternative hypothesis: two.sided
>
> I put in a few changes myself, based on the Details section of the help
> page on what should be used for a table larger than 2x2, and got the exact
> same thing as before. I removed or = 1 and conf.int = TRUE, added y = NULL
> and control = list(30), and changed to simulate.p.value = TRUE.
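Gerrit's remark that workspace and hybrid play no role once simulate.p.value = TRUE can be checked with a short sketch. The table below is made up for illustration (it is not the data from this thread); with the same seed, the minimal call and the call with extra arguments should return the same simulated p-value.

```r
# Hypothetical 3x4 table of counts (illustration only).
tab <- matrix(c(12,  5,  9,  7,
                 8, 11,  6, 10,
                 5,  9, 13,  4), nrow = 3, byrow = TRUE)

# Minimal call: only the arguments that matter for simulation.
set.seed(1)
p_minimal <- fisher.test(tab, simulate.p.value = TRUE, B = 1e4)$p.value

# Same seed, plus arguments that only affect the non-simulated computation.
set.seed(1)
p_verbose <- fisher.test(tab, workspace = 2e5, hybrid = TRUE,
                         simulate.p.value = TRUE, B = 1e4)$p.value

p_minimal
p_verbose
identical(p_minimal, p_verbose)
```

Resetting the seed before each call matters here: without it, the two simulated p-values would differ by Monte Carlo noise even though the extra arguments are ignored.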
Hi Peter and Gerrit,

Sorry about my confusion with the results; I was not entirely sure what they were. I was expecting some form of table and didn't realize that, with the results of a Fisher test, one just gets a p-value. I had tried 'estimate' and 'null.value', which gave me a null value (which, upon looking again, I don't need, but I know that now).

Thanks very much for the help. This has been the fifth statistical package I have tried for this test, so I suppose I had a suspicious, this-is-too-good-to-be-true reaction to the result. I wasn't entirely sure what it was. Thanks for clearing this up for me.

Thanks again,
Paul

On Mon, Aug 31, 2015 at 9:00 AM, peter dalgaard <pdalgd at gmail.com> wrote:

> > On 30 Aug 2015, at 13:54, paul brett <brettpaul16 at gmail.com> wrote:
> >
> >         Fisher's Exact Test for Count Data with simulated p-value (based on
> >         1e+07 replicates)
> >
> > data:  Trapz
> > p-value = 1e-07
> > alternative hypothesis: two.sided
> >
> > Despite these changes, the changed equations are not giving me the results
> > for the calculations. The changes I have made seem to satisfy what is in
> > the Details section of the R help page, and I don't have the workspace
> > issue in R any more.
> > What do I do to get the results of the Fisher test?
> > Is there something simple that I am missing?
>
> The theory?
>
> There is nothing more to Fisher's test than the calculation of the
> probability of obtaining a table as or less (im-)probable as the one
> observed. This is the p-value. You have done 10 million simulations and not
> found a single table that is less likely than the one observed. Hence, the
> p-value is 1/10 000 001 = ca. 1e-7, counting in the observed table.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
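Paul's observation about 'estimate' and 'null.value' can be reproduced with a minimal sketch, again on a made-up table rather than the data from this thread. For a table larger than 2x2, the object returned by fisher.test() carries a p-value but no odds-ratio estimate, so those components come back as NULL.

```r
# Hypothetical 3x4 table of counts (illustration only).
tab <- matrix(c(12,  5,  9,  7,
                 8, 11,  6, 10,
                 5,  9, 13,  4), nrow = 3, byrow = TRUE)

set.seed(1)  # for reproducibility
res <- fisher.test(tab, simulate.p.value = TRUE, B = 1e4)

class(res)               # "htest"
names(res)               # the components that are actually present
res$p.value              # the result of the test
is.null(res$estimate)    # odds ratio is only reported for 2x2 tables
is.null(res$null.value)  # likewise only set for 2x2 tables
```

Assigning the result to a variable, as above, is usually more convenient than reading the printed output, since the p-value can then be reused in later code.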