Stéphane Adamowicz
2014-Sep-24 12:00 UTC
[R] Rép : ANOVA and permutation tests : beware of traps
Many thanks to J. Wiley and B. Ripley for their quick answers. I suppose that many end users are aware of problems in calculation accuracy with computers. However, I would like to comment that it was not that obvious for me that the data order matters. First, I do not find any clear mention of this particular problem in the == help page, but perhaps I am experiencing difficulties with the English. Second, I do not encounter this problem neither with the piece of code I proposed to replace the dubious one, or in the following experiments :> # experiment 1 : comparing total variances > var(dat1$Y) == var(dat2$Y)[1] TRUE> # experiment 2 : comparing bilateral T tests > abs(t.test(Y~F, dat1)$statistic) == abs(t.test(Y~F, dat2)$statistic)t TRUE> # experiment 3 : applying permutations to T tests > nperm <- 10000 > T <- abs(t.test(Y~F, dat1)$statistic) > Tperm <- replicate(n=nperm, abs(t.test(sample(Y)~F, dat1)$statistic))[1] 0.1018898 # that's nice ! Thus, why a naive end user as I am should expect such pitfalls with F values given by the lm function ? Furthermore, codes similar to the one I criticized can be found in the teaching documents of various Universities and thus are spreading out. I would not be surprised that some scientific papers already rely on it ... In fact, even in R web pages, under Books ("This page gives a partially annotated list of books that are related to S or R and may be useful to the R user community"), I found only one book clearly devoted to randomization methods : "[32] Laura Chihara and Tim Hesterberg. Mathematical Statistics with Resampling and R. Wiley, 1st edition, 2011. ISBN 978-1-1180-2985-5". Looking at the author's profiles, I would say that "Beware of the trap of listening to people with no knowledge of basic numerical methods!" does not apply to them. Here is their recommended R code for one-way anova (chapter 12, but adapted to my example data):> observed <- anova(lm(Y ~ F, dat1))$F[1] > n <- length(dat1$Y) > B <- 10^5 - 1 > results <- numeric(B) > for (i in 1:B)+ { + index <- sample(n) + Y.perm <- dat1$Y[index] + results[i] <- anova(lm(Y.perm ~ dat1$F))$F[1] + }> (sum(results> observed) + 1) / (B + 1) # P value[1] 0.03969 Well, it seems that I am not the only guy who does not find the trap obvious ... Thus, my final question remains : How can we evaluate the reliability of CRAN packages that propose randomization (or bootstrap) methods ? Cheers, St?phane _______________________________________________________ St?phane Adamowicz stephane.adamowicz at avignon.inra.fr _______________________________________________________ UR 1115 Plantes et Syst?mes de Culture Horticoles (PSH) 228, route de l'a?rodrome CS 40509 domaine St Paul, site agroparc 84914 Avignon, cedex 9 France Tel +33 (0)4 32 72 24 35 Fax +33 (0)4 32 72 24 32 http://www.inra.fr/ https://www6.paca.inra.fr/psh _______________________________________________________
Duncan Murdoch
2014-Sep-24 12:24 UTC
[R] Rép : ANOVA and permutation tests : beware of traps
On 24/09/2014 8:00 AM, St?phane Adamowicz wrote:> Many thanks to J. Wiley and B. Ripley for their quick answers. > I suppose that many end users are aware of problems in calculation accuracy with computers. However, I would like to comment that it was not that obvious for me that the data order matters. First, I do not find any clear mention of this particular problem in the == help page, but perhaps I am experiencing difficulties with the English. Second, I do not encounter this problem neither with the piece of code I proposed to replace the dubious one, or in the following experiments :I suspect that there would be situations where your code failed as well. Now that I know about this potential error, I would not trust code that doesn't defend against it, even if I could not find an example where it failed: it depends on tests of exact equality in floating point values, and that is known to be a source of errors in numerical methods.> > > # experiment 1 : comparing total variances > > var(dat1$Y) == var(dat2$Y) > [1] TRUE > > # experiment 2 : comparing bilateral T tests > > abs(t.test(Y~F, dat1)$statistic) == abs(t.test(Y~F, dat2)$statistic) > t > TRUE > > # experiment 3 : applying permutations to T tests > > nperm <- 10000 > > T <- abs(t.test(Y~F, dat1)$statistic) > > Tperm <- replicate(n=nperm, abs(t.test(sample(Y)~F, dat1)$statistic)) > [1] 0.1018898 # that's nice ! > > Thus, why a naive end user as I am should expect such pitfalls with F values given by the lm function ? Furthermore, codes similar to the one I criticized can be found in the teaching documents of various Universities and thus are spreading out. I would not be surprised that some scientific papers already rely on it ... > > In fact, even in R web pages, under Books ("This page gives a partially annotated list of books that are related to S or R and may be useful to the R user community"), I found only one book clearly devoted to randomization methods : "[32] Laura Chihara and Tim Hesterberg. Mathematical Statistics with Resampling and R. Wiley, 1st edition, 2011. ISBN 978-1-1180-2985-5". Looking at the author's profiles, I would say that "Beware of the trap of listening to people with no knowledge of basic numerical methods!" does not apply to them. Here is their recommended R code for one-way anova (chapter 12, but adapted to my example data):You should tell the authors that their code doesn't work. Many books contain errors, but they only get corrected when authors are informed about them.> > > observed <- anova(lm(Y ~ F, dat1))$F[1] > > n <- length(dat1$Y) > > B <- 10^5 - 1 > > results <- numeric(B) > > for (i in 1:B) > + { > + index <- sample(n) > + Y.perm <- dat1$Y[index] > + results[i] <- anova(lm(Y.perm ~ dat1$F))$F[1] > + } > > (sum(results> observed) + 1) / (B + 1) # P value > [1] 0.03969 > > Well, it seems that I am not the only guy who does not find the trap obvious ... > > Thus, my final question remains : How can we evaluate the reliability of CRAN packages that propose randomization (or bootstrap) methods ?The same way as you evaluate the reliability of any software: you examine the code for common errors, you apply tests to problems that you know to be difficult, etc. At least all CRAN packages provide you with source code. Duncan Murdoch