Dear R users,

I have a question about the lmPerm package. It provides aovp(), a modified
version of the aov() function that fits an analysis of variance (ANOVA) model
using permutation tests instead of normal-theory tests.

However, when I run the following code for a simple linear model:

library(lmPerm)

e$t_Downtime_per_Intervention_Successful %>%
  aovp(
    formula = `Downtime per Intervention[h]` ~ `Working Hours`,
    data = .
  ) %>%
  summary()

I obtain different p-values on each run!

With a regular ANOVA I obtain a constant F-statistic instead, but my data do
not satisfy the required normality assumptions.

So my questions are:

Would it still be possible to use the regular aov() by generating permutations
in advance (thereby obtaining an approximately normal distribution thanks to
the Central Limit Theorem), and then applying aov() afterwards? Does that make
sense?

Or could this issue be due to unbalanced classes? I also tried weighting
observations based on their proportions, but the function failed.

Is there any alternative way to perform a one-way ANOVA on non-normal data?

Thank you.

Juan
Dear Juan

I do not use the package, but if it performs permutation tests it presumably
uses random numbers, and since you are not setting the seed you will get
different values on each run.

Michael

On 03/09/2018 16:17, Juan Telleria Ruiz de Aguirre wrote:
> I obtain different p-values for each run!

-- 
Michael
http://www.dewey.myzen.co.uk/home.html
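[To illustrate Michael's point, a minimal sketch with toy data (not the
poster's), assuming aovp() draws its permutations from R's random number
generator so that set.seed() controls it:]

    # Toy data, two groups; perm = "Prob" forces the random-sampling method
    library(lmPerm)

    set.seed(42)
    d <- data.frame(
      y = c(rnorm(10, 0), rnorm(10, 1)),
      g = factor(rep(c("A", "B"), each = 10))
    )

    set.seed(123)
    s1 <- summary(aovp(y ~ g, data = d, perm = "Prob"))

    set.seed(123)
    s2 <- summary(aovp(y ~ g, data = d, perm = "Prob"))
    # same seed before each call -> identical p-values;
    # drop the set.seed() calls and the p-values will differ run to run
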
Juan,

Your question might be borderline for this list, as it is ultimately a
statistics question coming in R disguise.

Anyway, the short answer is that you should *expect* to get a different p
value from a permutation test unless you are able to do all possible
permutations and therefore use the so-called systematic reference set. That is
rarely feasible, and only for relatively small problems. Otherwise the
permutation test uses a random subset of all possible permutations, and given
this randomness you will get a different p value each run.

In order to get reproducible results you may specify a seed (?set.seed), yet
that is only reproducible within this environment: someone else with different
software and/or code might come out with a different p. The higher the number
of permutations used, however, the smaller the variation around the p values.
For most applications 1000 seems good enough to me, but sometimes I go higher
(in particular if the p value is borderline and I really need a strict
above/below-alpha decision).

The permutations do not create an implicit normal distribution, but rather a
null distribution that can be (and, depending on the non-normality of your
data, likely is) non-normal. So your respective proposal does not appeal to
me.

I don't think you need an alternative: the permutation test is just fine, and
recognizing the randomness in its execution means the (relatively small)
variability in p values is not a major issue.

You may want to have a look at the textbook by Edgington & Onghena for details
on permutation tests, and there are plenty of papers out there addressing them
in various contexts, which will help you understand *why* you observe what you
observe here.

HTH,
Michael

> -----Original Message-----
> From: R-help <r-help-bounces at r-project.org> On Behalf Of Juan Telleria
> Ruiz de Aguirre
> Sent: Monday, 3 September 2018 17:18
> To: R help Mailing list <r-help at r-project.org>
> Subject: [R] ANOVA Permutation Test
>
> I obtain different p-values for each run!
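[The mechanics Michael describes can be sketched in base R without lmPerm, on
toy data (not the poster's): shuffle the response, recompute the ordinary F
statistic, and take the Monte Carlo p value. Two runs with the same number of
permutations land close together but not identical:]

    # Monte Carlo permutation test for a one-way ANOVA, base R only
    perm_anova_p <- function(y, g, n_perm = 1000) {
      f_obs <- summary(aov(y ~ g))[[1]][["F value"]][1]   # observed F
      f_perm <- replicate(n_perm, {
        y_star <- sample(y)                               # shuffle responses
        summary(aov(y_star ~ g))[[1]][["F value"]][1]
      })
      # Monte Carlo p value, with the +1 correction so p is never exactly 0
      (sum(f_perm >= f_obs) + 1) / (n_perm + 1)
    }

    set.seed(1)
    y <- c(rnorm(8, 0), rnorm(8, 0.8))
    g <- factor(rep(c("A", "B"), each = 8))

    p1 <- perm_anova_p(y, g, 1000)
    p2 <- perm_anova_p(y, g, 1000)   # second run: close to p1, not identical

Increasing n_perm shrinks the run-to-run spread, which is why 1000 is a common
floor and borderline p values call for more.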
> This package uses a modified version of aov() function, which uses
> Permutation Tests
>
> I obtain different p-values for each run!

Could that be because you are defaulting to perm="Prob"? I am not familiar
with the package, but the manual is informative; you may have missed something
when reading it:

"...The Exact method will be used by default when the number of observations
is less than or equal to maxExact, otherwise Prob will be used. Prob:
Iterations terminate when the estimated standard error of the estimated
proportion p is less than p*Ca"

I would assume that probabilistic permutation is random and will change from
run to run. You could use set.seed() to stop that, but it's actually quite
useful to see how much the results change. If you want complete permutation,
you'd need to force Exact (unless that does not mean what it sounds like for
this package). It looks like that requires you to set maxExact to at least
your number of observations. But given that the number of permutations grows
combinatorially, that could take a _long_ time for a run; the example in the
help page does not complete in a useful time when maxExact is set to exceed
the number of data points. So I'd probably run it using Prob and simply note
the range of results over a handful of runs, to give you an indication of how
far to trust the answers.

> Would it still be possible use the regular aov() by generating permutations
> in advance (Obtaining therefore a Normal Distribution thanks to the Central
> Limit Theorem)? And applying the aov() function afterwards? Does it have
> sense?

As a chemist, I'd guess no. And you'd be even more limited in the number of
permutations.

> Or maybe this issue could be due to unbalanced classes? I also tried to
> weight observations based on proportions, but the function failed.

No, it's nothing to do with balance if the results change run to run with no
change in the model. I'd guess imbalance may exacerbate the permutation
variability somewhat, but it won't _cause_ it.

> Any alternative solution for performing a One-Way ANOVA Test over
> Non-Normal Data?

Yes; the traditional nonparametric test for one-way (balanced) data is the
Kruskal-Wallis test: see ?kruskal.test. Classical ANOVA on ranks can also be
defended as a general 'nonparametric' approach, though I gather it can also be
criticised.
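[Both nonparametric routes suggested above are in base R. A toy illustration
(not the poster's data) on deliberately skewed, non-normal groups:]

    set.seed(7)
    # Skewed (exponential) data in three groups; the third has a larger mean
    y <- c(rexp(10, rate = 1), rexp(10, rate = 1), rexp(10, rate = 0.3))
    g <- factor(rep(c("A", "B", "C"), each = 10))

    kw <- kruskal.test(y ~ g)   # rank-based one-way test, no normality needed
    kw$p.value

    ranked <- aov(rank(y) ~ g)  # classical ANOVA applied to the ranks
    summary(ranked)
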
Thank you all for your very good answers.

Using aovp(..., perm="Exact") seems to be the way to go for small datasets,
and I should also definitely try ?kruskal.test.

Juan