thr3ads.net - R help - [R] Fisher's Test 5x4 table [Aug 2015]

paul brett

2015-Aug-30 11:54 UTC

[R] Fisher's Test 5x4 table

Hi Gerrit,
             I tried both of your suggestions and got the exact same thing.
Fisher's Exact Test for Count Data with simulated p-value (based on 1e+05
replicates)

data:  Trapz
p-value = 1e-05
alternative hypothesis: two.sided

I made a few changes myself, based on what the Details section says should
be used for a table larger than 2x2, and got the exact same thing as
before. I removed or = 1 and conf.int = TRUE, added y = NULL and
control = list(30), and changed simulate.p.value to TRUE.

> fisher.test( Trapz, y = NULL, workspace = 200000, hybrid = TRUE, control = list(30), simulate.p.value = TRUE, B = 1e5)
Fisher's Exact Test for Count Data with simulated p-value (based on 1e+05
replicates)

data:  Trapz
p-value = 1e-05
alternative hypothesis: two.sided

> fisher.test( Trapz, y = NULL, workspace = 200000, hybrid = TRUE, control = list(30), simulate.p.value = TRUE, B = 1e7)
Fisher's Exact Test for Count Data with simulated p-value (based on 1e+07
replicates)

data:  Trapz
p-value = 1e-07
alternative hypothesis: two.sided


Despite these changes, the modified call is still not giving me the results
of the calculations. The changes I have made seem to satisfy what is in
the Details section of the R help page, and I no longer have the workspace
issue in R.
What do I do to get the results of the Fisher test?
Is there something simple that I am missing?

Regards,
             Paul

On Fri, Aug 28, 2015 at 3:52 PM, Gerrit Eichner <
Gerrit.Eichner at math.uni-giessen.de> wrote:
> Paul,
>
> as the error messages of your first three attempts (see below) tell you -
> in an admittedly rather cryptic way - your table or its sample size,
> respectively, are too large, so that either the "largest (hash table) key"
> is too large, or your (i.e., R's) workspace is too small, or your
> hardware/os cannot allocate enough memory to calculate the p-value of
> Fisher Exact Test exactly by means of the implemented algorithm.
>
> One way out of this is to approximate the exact p-value through
> simulation, but apparently there occurred a typo in your (last) attempt to
> do that (Error: unexpected '>' in ">").
>
>
> So, for me the following works (and it should also for you) and gives the
> shown output (after a very short while):
>
>   Trapz <- as.matrix( read.table( "w.txt", head = T, row.names = "Traps"))
>
>   set.seed( 20150828)   # For the sake of reproducibility.
>   fisher.test( Trapz, simulate.p.value = TRUE, B = 1e5)
>
>    Fisher's Exact Test for Count Data with simulated p-value (based on
>    1e+05 replicates)
>
> data:  Trapz
> p-value = 1e-05
> alternative hypothesis: two.sided
>
>
>
> Or for a higher value for B if you are patient enough (with a computing
> time of several seconds) :
>
>   set.seed( 20150828)
>   fisher.test( Trapz, simulate.p.value = TRUE, B = 1e7)
>
>    Fisher's Exact Test for Count Data with simulated p-value (based on
>    1e+07 replicates)
>
> data:  Trapz
> p-value = 1e-07
> alternative hypothesis: two.sided
>
>
>  Hth  --  Gerrit
>
> (BTW, you don't have to specify arguments (in function calls) whose
> default values you don't want to change.)
>
>
>
>
> On Fri, 28 Aug 2015, paul brett wrote:
>
>> Hi Gerrit,
>>             I spotted that, it was a mistake on my own part, it should
>> read 1.trap.2.barrier. I have corrected it on the file attached.
>>
>> So I have done these so far:
>> > fisher.test(Trapz, workspace = 200000, hybrid = FALSE, control = list(),
>>     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>>     simulate.p.value = FALSE, B = 2000)
>> Error in fisher.test(Trapz, workspace = 2e+05, hybrid = FALSE, control = list(),  :
>>  FEXACT error 501.
>> The hash table key cannot be computed because the largest key
>> is larger than the largest representable int.
>> The algorithm cannot proceed.
>> Reduce the workspace size or use another algorithm.
>>
>> > fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control = list(),
>>     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>>     simulate.p.value = FALSE, B = 2000)
>> Error in fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control = list(),  :
>>  FEXACT error 40.
>> Out of workspace.
>>
>> > fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(),
>>     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>>     simulate.p.value = FALSE, B = 2000)
>> Error in fisher.test(Trapz, workspace = 1e+08, hybrid = FALSE, control = list(),  :
>>  FEXACT error 501.
>> The hash table key cannot be computed because the largest key
>> is larger than the largest representable int.
>> The algorithm cannot proceed.
>> Reduce the workspace size or use another algorithm.
>>
>> > fisher.test(Trapz, workspace = 2000000000, hybrid = FALSE, control = list(),
>>     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>>     simulate.p.value = FALSE, B = 2000)
>> Error: cannot allocate vector of size 7.5 Gb
>> In addition: Warning messages:
>> 1: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(),  :
>>  Reached total allocation of 6027Mb: see help(memory.size)
>> 2: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(),  :
>>  Reached total allocation of 6027Mb: see help(memory.size)
>> 3: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(),  :
>>  Reached total allocation of 6027Mb: see help(memory.size)
>> 4: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control = list(),  :
>>  Reached total allocation of 6027Mb: see help(memory.size)
>>
>> > fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(),
>>     or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95,
>>     simulate.p.value = TRUE, B = 1e5)
>> Error: unexpected '>' in ">"
>>
>> So the issue could perhaps be that R cannot compute my sample because the
>> workspace needed is too big? Is there a way around this? I think I have
>> everything set out correctly.
>> Is my only other alternative to do a 2x2 Fisher test for each of the
>> variables?
>>
>> I attach as a pdf the Minitab result for the chi-squared test as proof (I
>> know that getting very low p-values is highly unlikely, but sometimes it
>> happens). Seeing is believing, I suppose!
>>
>> Regards,
>>             Paul
>>
>>
>>
>> On Fri, Aug 28, 2015 at 8:56 AM, Gerrit Eichner <
>> Gerrit.Eichner at math.uni-giessen.de> wrote:
>>
>> Dear Paul,
>>>
>>> quoting the email-footer: "PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html and provide commented,
>>> minimal, self-contained, reproducible code."
>>>
>>> So, what exactly did you try and what was the actual problem/error
>>> message?
>>>
>>> Besides that, have you noted that two of your data rows have the same
>>> name?
>>>
>>>
>>> Have you read the online help page of fisher.test():
>>>
>>>  ?fisher.test
>>>
>>>
>>> Have you tried anything like the following?
>>>
>>> W <- as.matrix( read.table( "w.txt", head = T)[-1])
>>>
>>> fisher.test( W, workspace = 1e8)
>>>    # For workspace look at the help page, but it presumably
>>>    # won't work because of your sample size.
>>>
>>>
>>> set.seed( 20150828) # for reproducibility
>>> fisher.test( W, simulate.p.value = TRUE, B = 1e5)
>>>    # For B look at the help page.
>>>
>>>
>>> Finally: Did Minitab really report "p > 0.001"? ;-)
>>>
>>>  Hth  --  Gerrit
>>>
>>>
>>>> Dear all,
>>>>
>>>>            I am trying to do a Fisher's test on a 5x4 table in R.
>>>> I have already done a chi-squared test using Minitab on this
>>>> data set, getting a result of (1, N = 165.953, DF 12, p>0.001), yet using
>>>> these results (even though they are excellent) may not be suitable for
>>>> publication. I have tried numerous other statistical packages in the hope
>>>> of doing this test, yet each one has just the 2x2 table.
>>>>            I am struggling to edit the template Fisher's test in R to fit
>>>> my table (according to the R book it is possible, yet I cannot get it to
>>>> work). The template given in the R documentation and the R book is for a
>>>> 2x2 Fisher test. What do I need to change to get this to work? I have
>>>> attached the data with the email so one can see what I am on about. Or do
>>>> I have to write my own new code to compute this?
>>>>
>>>>             Yours Sincerely,
>>>>                                     Paul Brett
>>>>
>>>>
>>>>

peter dalgaard

2015-Aug-31 07:00 UTC

[R] Fisher's Test 5x4 table

> On 30 Aug 2015, at 13:54 , paul brett <brettpaul16 at gmail.com> wrote:
> 
> Fisher's Exact Test for Count Data with simulated p-value (based on 1e+07
> replicates)
> 
> data:  Trapz
> p-value = 1e-07
> alternative hypothesis: two.sided
> 
> 
> Despite these changes, the modified call is still not giving me the results
> of the calculations. The changes I have made seem to satisfy what is in
> the Details section of the R help page, and I no longer have the workspace
> issue in R.
> What do I do to get the results of the Fisher test?
> Is there something simple that I am missing?

The theory?

There is nothing more to Fisher's test than the calculation of the
probability of obtaining a table that is as probable as, or less probable
than, the one observed. This is the p-value. You have done 10 million
simulations and not found a single table that is less likely than the one
observed. Hence, the p-value is 1/10 000 001 = ca. 1e-7, counting in the
observed table.
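
The arithmetic above can be reproduced by hand. What follows is only a
minimal sketch, not the code that fisher.test() actually runs: the 5x4 table
is made up (the w.txt data are not shown in this thread), and the helper name
sim_p_value is hypothetical. It draws B random tables with the observed
margins using r2dtable() and counts how many are as probable as, or less
probable than, the observed one, counting in the observed table.

```r
# Hypothetical 5x4 table with a strong row/column association
# (made-up counts; NOT the data from the thread).
tab <- matrix(c(30,  1,  1,  1,
                 1, 30,  1,  1,
                 1,  1, 30,  1,
                 1,  1,  1, 30,
                 8,  8,  8,  8),
              nrow = 5, byrow = TRUE)

# Monte Carlo p-value for an r x c table (hypothetical helper).
sim_p_value <- function(x, B = 2000) {
  # For fixed margins, the log-probability of a table equals
  # -sum(log(x!)) plus a constant, so this ranks tables by probability.
  log_prob <- function(m) -sum(lfactorial(m))
  obs  <- log_prob(x)
  sims <- vapply(r2dtable(B, rowSums(x), colSums(x)), log_prob, numeric(1))
  # Count simulated tables as or less probable than the observed one
  # (small tolerance for floating-point ties), then add 1 for the
  # observed table itself.
  (1 + sum(sims <= obs + 1e-7 * abs(obs))) / (B + 1)
}

set.seed(20150828)
B <- 2000
p <- sim_p_value(tab, B)
p            # simulated p-value
1 / (B + 1)  # smallest value this scheme can ever return
```

With a table this extreme, no simulated table is expected to be as improbable
as the observed one, so p should sit at the floor 1/(B + 1). With B = 1e7
that floor is 1/10 000 001, which is the value reported in the thread.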

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

Gerrit Eichner

2015-Aug-31 07:28 UTC

[R] Fisher's Test 5x4 table

Paul,

in addition to Peter's remark about the missing theory, you have also not
explained what you mean by "[it] is not giving me the results for the
calculations" or "[how] to get the results of the fisher test". They are
there in the output of R's fisher.test() (if you have an idea about the
theory).

And again:
> fisher.test( Trapz, simulate.p.value = TRUE, B = 1e5)
specifies enough arguments in the case of simulating to approximate the 
p-value since workspace (quoting from the help page) is "Only used for 
***non-simulated*** p-values [of] larger than 2 by 2 tables." (Similarly, 
control and hybrid are not needed either here.)
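
This can be checked directly. The sketch below assumes a made-up 5x4 table
(the w.txt data are not shown in this thread) and runs the minimal call and a
call with extra workspace and hybrid arguments from the same seed; the object
names minimal and verbose are only illustrative.

```r
# Hypothetical 5x4 table (made-up counts; NOT the data from the thread).
tab <- matrix(c(12,  5,  7,  3,
                 4, 15,  6,  8,
                 9,  6, 14,  5,
                 3,  8,  5, 16,
                 7,  7,  9,  6),
              nrow = 5, byrow = TRUE)

# Minimal call: only the arguments that matter when simulating.
set.seed(20150828)
minimal <- fisher.test(tab, simulate.p.value = TRUE, B = 1e4)

# Same call with arguments that only affect the non-simulated computation.
set.seed(20150828)
verbose <- fisher.test(tab, workspace = 200000, hybrid = TRUE,
                       simulate.p.value = TRUE, B = 1e4)

# Same seed, same simulated tables: compare the two p-values.
identical(minimal$p.value, verbose$p.value)
```

If the two p-values agree, the extra arguments had no effect on the simulated
result, which is the point of the help-page quotation above.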

  Regards  --  Gerrit

---------------------------------------------------------------------
Dr. Gerrit Eichner                   Mathematical Institute, Room 212
gerrit.eichner at math.uni-giessen.de   Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/cms/eichner
---------------------------------------------------------------------



paul brett

2015-Aug-31 14:54 UTC

[R] Fisher's Test 5x4 table

Hi Peter and Gerrit,
                            Sorry about my confusion with the results; I was
not entirely sure what they were. I was expecting some form of a table, and
I didn't realize that with the results of a Fisher test one just gets a
p-value. (I had tried 'estimate' and 'null.value', which gave me a null
value, which upon looking again I don't need, but I know that now.)
                            Thanks very much for the help; this has been my
5th different statistical package to try to do this test. So I suppose I
had a suspicious, "this is too good to be true" reaction to the result. I
wasn't entirely sure what it was. Thanks for clearing this up for me.

                            Thanks again,
                                                Paul
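
For readers with the same question, the pieces of a fisher.test() result can
be inspected directly. This is a small sketch assuming a made-up 5x4 table
(not the data from this thread); per the help page, estimate, conf.int and
null.value are only present in the 2 by 2 case, so for a larger table the
p-value is the result.

```r
# Hypothetical 5x4 table (made-up counts; NOT the data from the thread).
tab <- matrix(c(12,  5,  7,  3,
                 4, 15,  6,  8,
                 9,  6, 14,  5,
                 3,  8,  5, 16,
                 7,  7,  9,  6),
              nrow = 5, byrow = TRUE)

set.seed(20150828)
res <- fisher.test(tab, simulate.p.value = TRUE, B = 1e4)

names(res)              # components returned for this table
res$p.value             # the simulated p-value
is.null(res$estimate)   # odds-ratio estimate exists only for 2x2 tables
is.null(res$null.value) # likewise for the null value of the odds ratio
```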

