Hi Gerrit,
I tried both of your suggestions and got the exact same thing.
Fisher's Exact Test for Count Data with simulated p-value (based on 1e+05
replicates)
data: Trapz
p-value = 1e-05
alternative hypothesis: two.sided
I put in a few changes myself, based on the Details section of the help page
on what should be used for a table larger than 2x2, and got the exact same
thing as before. I removed or = 1 and conf.int = TRUE, added y = NULL and
control = list(30), and changed to simulate.p.value = TRUE.
> fisher.test( Trapz, y = NULL, workspace = 200000, hybrid = TRUE,
+              control = list(30), simulate.p.value = TRUE, B = 1e5)
Fisher's Exact Test for Count Data with simulated p-value (based on 1e+05
replicates)
data: Trapz
p-value = 1e-05
alternative hypothesis: two.sided
> fisher.test( Trapz, y = NULL, workspace = 200000, hybrid = TRUE,
+              control = list(30), simulate.p.value = TRUE, B = 1e7)
Fisher's Exact Test for Count Data with simulated p-value (based on 1e+07
replicates)
data: Trapz
p-value = 1e-07
alternative hypothesis: two.sided
Despite these changes, the changed call is not giving me the results
for the calculations. The changes I have made seem to satisfy what is in
the details section on R, and I don't have the issue of workspace in R.
What do I do to get the results of the fisher test?
Is there something simple that I am missing?
Regards,
Paul
On Fri, Aug 28, 2015 at 3:52 PM, Gerrit Eichner <
Gerrit.Eichner at math.uni-giessen.de> wrote:
> Paul,
>
> as the error messages of your first three attempts (see below) tell you -
> in an admittedly rather cryptic way - your table or its sample size,
> respectively, are too large, so that either the "largest (hash table) key"
> is too large, or your (i.e., R's) workspace is too small, or your
> hardware/os cannot allocate enough memory to calculate the p-value of
> Fisher Exact Test exactly by means of the implemented algorithm.
>
> One way out of this is to approximate the exact p-value through
> simulation, but apparently there occurred a typo in your (last) attempt to
> do that (Error: unexpected '>' in ">").
>
>
> So, for me the following works (and it should also for you) and gives the
> shown output (after a very short while):
>
> > Trapz <- as.matrix( read.table( "w.txt", head = T, row.names = "Traps"))
> >
> > set.seed( 20150828) # For the sake of reproducibility.
> > fisher.test( Trapz, simulate.p.value = TRUE,
> +              B = 1e5)
>
> Fisher's Exact Test for Count Data with simulated p-value (based on
> 1e+05 replicates)
>
> data: Trapz
> p-value = 1e-05
> alternative hypothesis: two.sided
>
>
>
> Or for a higher value for B if you are patient enough (with a computing
> time of several seconds) :
>
> > set.seed( 20150828)
> > fisher.test( Trapz, simulate.p.value=TRUE, B = 1e7)
>
> Fisher's Exact Test for Count Data with simulated p-value (based on
> 1e+07 replicates)
>
> data: Trapz
> p-value = 1e-07
> alternative hypothesis: two.sided
>
>
> Hth -- Gerrit
>
> (BTW, you don't have to specify arguments (in function calls) whose
> default values you don't want to change.)
>
>
>
>
> On Fri, 28 Aug 2015, paul brett wrote:
>
>> Hi Gerrit,
>> I spotted that, it was a mistake on my own part, it should
>> read 1.trap.2.barrier. I have corrected it on the file attached.
>>
>> So I have done these so far:
>> > fisher.test(Trapz, workspace = 200000, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE,
>> +     conf.level = 0.95, simulate.p.value = FALSE, B = 2000)
>> Error in fisher.test(Trapz, workspace = 2e+05, hybrid = FALSE, control = list(), :
>> FEXACT error 501.
>> The hash table key cannot be computed because the largest key
>> is larger than the largest representable int.
>> The algorithm cannot proceed.
>> Reduce the workspace size or use another algorithm.
>>
>> > fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE,
>> +     conf.level = 0.95, simulate.p.value = FALSE, B = 2000)
>> Error in fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control = list(), :
>> FEXACT error 40.
>> Out of workspace.
>>
>> > fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE,
>> +     conf.level = 0.95, simulate.p.value = FALSE, B = 2000)
>> Error in fisher.test(Trapz, workspace = 1e+08, hybrid = FALSE, control = list(), :
>> FEXACT error 501.
>> The hash table key cannot be computed because the largest key
>> is larger than the largest representable int.
>> The algorithm cannot proceed.
>> Reduce the workspace size or use another algorithm.
>>
>> > fisher.test(Trapz, workspace = 2000000000, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE,
>> +     conf.level = 0.95, simulate.p.value = FALSE, B = 2000)
>> Error: cannot allocate vector of size 7.5 Gb
>> In addition: Warning messages:
>> 1: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(), :
>> Reached total allocation of 6027Mb: see help(memory.size)
>> 2: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(), :
>> Reached total allocation of 6027Mb: see help(memory.size)
>> 3: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(), :
>> Reached total allocation of 6027Mb: see help(memory.size)
>> 4: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(), :
>> Reached total allocation of 6027Mb: see help(memory.size)
>>
>> > fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(),
>> +     or = 1, alternative = "two.sided", conf.int = TRUE,
>> +     conf.level = 0.95, simulate.p.value = TRUE, B = 1e5)
>> Error: unexpected '>' in ">"
>>
>> So the issue could be perhaps that R cannot compute my sample as the
>> workspace needed is too big? Is there a way around this? I think I have
>> everything set out correctly.
>> Is my only other alternative to do a 2x2 Fisher test for each of the
>> variables?
>>
>> I attach in the pdf the Minitab result for the chi-squared test as proof
>> (I know that getting very low p-values is highly unlikely, but sometimes
>> it happens). Seeing is believing, I suppose!
>>
>> Regards,
>> Paul
>>
>>
>>
>> On Fri, Aug 28, 2015 at 8:56 AM, Gerrit Eichner <
>> Gerrit.Eichner at math.uni-giessen.de> wrote:
>>
>>> Dear Paul,
>>>
>>> quoting the email-footer: "PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html and provide commented,
>>> minimal, self-contained, reproducible code."
>>>
>>> So, what exactly did you try and what was the actual problem/error
>>> message?
>>>
>>> Besides that, have you noted that two of your data rows have the same
>>> name?
>>>
>>>
>>> Have you read the online help page of fisher.test():
>>>
>>> ?fisher.test
>>>
>>>
>>> Have you tried anything like the following?
>>>
>>> W <- as.matrix( read.table( "w.txt", head = T)[-1])
>>>
>>> fisher.test( W, workspace = 1e8)
>>> # For workspace look at the help page, but it presumably
>>> # won't work because of your sample size.
>>>
>>>
>>> set.seed( 20150828) # for reproducibility
>>> fisher.test( W, simulate.p.value = TRUE, B = 1e5)
>>> # For B look at the help page.
>>>
>>>
>>> Finally: Did Minitab really report "p > 0.001"? ;-)
>>>
>>> Hth -- Gerrit
>>>
>>>
>>>> Dear all,
>>>>
>>>> I am trying to do a Fisher's test on a 5x4 table in R. I have already
>>>> done a chi-squared test using Minitab on this data set, getting a result
>>>> of (1, N = 165.953, DF 12, p>0.001), yet using these results (even though
>>>> they are excellent) may not be suitable for publication. I have tried
>>>> numerous other statistical packages in the hope of doing this test, yet
>>>> each one has just the 2x2 table.
>>>> I am struggling to edit the template Fisher's test in R to fit my table
>>>> (according to the R book it is possible, yet I cannot get it to work).
>>>> The template given in the R documentation and the R book is for a 2x2
>>>> Fisher test. What do I need to change to get this to work? I have
>>>> attached the data with the email so one can see what I am on about. Or
>>>> do I have to write my own new code to compute this?
>>>>
>>>> Yours Sincerely,
>>>> Paul Brett
>>>>
>>>>
>>>>
> On 30 Aug 2015, at 13:54 , paul brett <brettpaul16 at gmail.com> wrote:
>
> Fisher's Exact Test for Count Data with simulated p-value (based on 1e+07
> replicates)
>
> data: Trapz
> p-value = 1e-07
> alternative hypothesis: two.sided
>
>
> Despite these changes, the changed call is not giving me the results
> for the calculations. The changes I have made seem to satisfy what is in
> the details section on R, and I don't have the issue of workspace in R.
> What do I do to get the results of the fisher test?
> Is there something simple that I am missing?

The theory?

There is nothing more to Fisher's test than the calculation of the
probability of obtaining a table as or less (im-)probable as the one
observed. This is the p-value. You have done 10 million simulations and not
found a single table that is less likely than the one observed. Hence, the
p-value is 1/10 000 001 = ca. 1e-7, counting in the observed table.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
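Peter's arithmetic can be checked directly. The following is only a sketch: it assumes the simulated p-value is formed as (number of simulated tables as or less probable than the observed one, plus 1) divided by (B + 1), which is what his 1/10 000 001 figure implies. The helper name sim_p is made up for illustration and is not part of R.

```r
# Hypothetical helper (not an R function): Monte Carlo p-value when `extreme`
# of the B simulated tables are as or less probable than the observed table.
# The observed table is counted once in the numerator and the denominator.
sim_p <- function(extreme, B) (extreme + 1) / (B + 1)

sim_p(0, 1e5)  # no simulated table as extreme, B = 1e5: roughly 1e-05
sim_p(0, 1e7)  # no simulated table as extreme, B = 1e7: roughly 1e-07
```

Under that assumption, the two p-values reported in the thread are simply the smallest values reachable with the chosen B, so a larger B only lowers that floor.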
Paul,

in addition to Peter's suggestion about the missing theory, you have also
completely failed to explain what you mean by "[it] is not giving me the
results for the calculations" or "[how] to get the results of the fisher
test". They are there in the output of R's fisher.test() (if you have an
idea about the theory).

And again:

> fisher.test( Trapz, simulate.p.value = TRUE, B = 1e5)

specifies enough arguments in the case of simulating to approximate the
p-value since workspace (quoting from the help page) is "Only used for
***non-simulated*** p-values [of] larger than 2 by 2 tables." (Similarly,
control and hybrid are not needed either here.)

Regards -- Gerrit

---------------------------------------------------------------------
Dr. Gerrit Eichner
Mathematical Institute, Room 212
Justus-Liebig-University Giessen
Arndtstr. 2, 35392 Giessen, Germany
gerrit.eichner at math.uni-giessen.de
Tel: +49-(0)641-99-32104
Fax: +49-(0)641-99-32109
http://www.uni-giessen.de/cms/eichner
---------------------------------------------------------------------
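To make concrete where "the results" live in the object that fisher.test() returns, here is a minimal sketch. The 3x3 table tab is invented purely for illustration (the Trapz data is not reproduced in this thread), and the seed and B are arbitrary choices.

```r
# Hypothetical 3x3 table of counts, standing in for the Trapz data.
tab <- matrix(c(12, 3,  5,
                 4, 9,  6,
                 2, 7, 11), nrow = 3, byrow = TRUE)

set.seed(1)  # the simulation is random; fix the seed for reproducibility
res <- fisher.test(tab, simulate.p.value = TRUE, B = 1e4)

names(res)     # components of the returned "htest" object
res$p.value    # the p-value as a plain number
res$estimate   # NULL: an odds ratio is only reported for 2 x 2 tables
res$conf.int   # NULL for the same reason
```

For a table larger than 2 x 2 the p-value is the only numeric result, which is why asking for 'estimate' returns NULL.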
Hi Peter and Gerrit,
Sorry about my confusion with the results; I was not entirely sure what they
were. I was expecting some form of a table and I didn't realize that with
the results of a Fisher test, one just gets a p-value. (I had tried the
'estimate' and 'null.value' components, which gave me a null value; looking
again, I don't need them, but I know that now.)
Thanks very much for the help. This has been my 5th different statistical
package to try and do this test, so I suppose I had a suspicious, "this is
too good to be true" reaction to the result. I wasn't entirely sure what it
was. Thanks for clearing this up for me.
Thanks again,
Paul
On Mon, Aug 31, 2015 at 9:00 AM, peter dalgaard <pdalgd at gmail.com> wrote:
>
> > On 30 Aug 2015, at 13:54 , paul brett <brettpaul16 at gmail.com> wrote:
> >
> > Fisher's Exact Test for Count Data with simulated p-value (based on 1e+07
> > replicates)
> >
> > data: Trapz
> > p-value = 1e-07
> > alternative hypothesis: two.sided
> >
> >
> > Despite these changes, the changed call is not giving me the results
> > for the calculations. The changes I have made seem to satisfy what is in
> > the details section on R, and I don't have the issue of workspace in R.
> > What do I do to get the results of the fisher test?
> > Is there something simple that I am missing?
>
> The theory?
>
> There is nothing more to Fisher's test than the calculation of the
> probability of obtaining a table as or less (im-)probable as the one
> observed. This is the p-value. You have done 10 million simulations and not
> found a single table that is less likely than the one observed. Hence, the
> p-value is 1/10 000 001 = ca. 1e-7, counting in the observed table.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com