Full_Name: Tanya Logvinenko
Version: 1.7.0
OS: Windows 2000
Submission from: (NULL) (132.183.156.125)


For an unbalanced design, I ran into a problem with ANOVA (aov function). The
sums of squares for only the second factor and the total are computed correctly,
but the sum of squares for the first factor is computed incorrectly. Changing the
order of factors in the formula changes the ANOVA table. For a balanced design,
there is no such problem.

> summary(aov(data[1,]~factor1+factor2))
            Df  Sum Sq Mean Sq F value    Pr(>F)
factor1      5 1524420  304884  6.4529 0.0003229 ***
factor2      7 1447830  206833  4.3776 0.0017808 **
Residuals   31 1464674   47248
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

> summary(aov(data[1,]~factor2+factor1))
            Df  Sum Sq Mean Sq F value    Pr(>F)
factor2      7 1648225  235461  4.9836 0.0007295 ***
factor1      5 1324025  264805  5.6046 0.0008612 ***
Residuals   31 1464674   47248
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

It is not a bug. It is supposed to be that way. It is even a FAQ.

	-thomas

On Thu, 29 Jul 2004 tlogvinenko@partners.org wrote:

> Full_Name: Tanya Logvinenko
> Version: 1.7.0
> OS: Windows 2000
> Submission from: (NULL) (132.183.156.125)
>
>
> For an unbalanced design, I ran into a problem with ANOVA (aov function). The
> sums of squares for only the second factor and the total are computed correctly,
> but the sum of squares for the first factor is computed incorrectly. Changing the
> order of factors in the formula changes the ANOVA table. For a balanced design,
> there is no such problem.
>
> > summary(aov(data[1,]~factor1+factor2))
>             Df  Sum Sq Mean Sq F value    Pr(>F)
> factor1      5 1524420  304884  6.4529 0.0003229 ***
> factor2      7 1447830  206833  4.3776 0.0017808 **
> Residuals   31 1464674   47248
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> > summary(aov(data[1,]~factor2+factor1))
>             Df  Sum Sq Mean Sq F value    Pr(>F)
> factor2      7 1648225  235461  4.9836 0.0007295 ***
> factor1      5 1324025  264805  5.6046 0.0008612 ***
> Residuals   31 1464674   47248
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> ______________________________________________
> R-devel@stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
>

Thomas Lumley                   Assoc. Professor, Biostatistics
tlumley@u.washington.edu        University of Washington, Seattle

What do you think is the correct answer and on what authority? (These are
explicitly sequential aka Type 1 anova tables.)

That the SSqs depend on the order of fitting is a feature of an unbalanced
design. I believe that R is correct and your understanding is not.

On Thu, 29 Jul 2004 tlogvinenko@partners.org wrote:

> Full_Name: Tanya Logvinenko
> Version: 1.7.0

Oh, please! Don't send in bug reports from very old versions -- there
have been 5 releases since then.

> OS: Windows 2000
> Submission from: (NULL) (132.183.156.125)
>
>
> For an unbalanced design, I ran into a problem with ANOVA (aov function). The
> sums of squares for only the second factor and the total are computed correctly,
> but the sum of squares for the first factor is computed incorrectly. Changing the
> order of factors in the formula changes the ANOVA table. For a balanced design,
> there is no such problem.
>
> > summary(aov(data[1,]~factor1+factor2))
>             Df  Sum Sq Mean Sq F value    Pr(>F)
> factor1      5 1524420  304884  6.4529 0.0003229 ***
> factor2      7 1447830  206833  4.3776 0.0017808 **
> Residuals   31 1464674   47248
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> > summary(aov(data[1,]~factor2+factor1))
>             Df  Sum Sq Mean Sq F value    Pr(>F)
> factor2      7 1648225  235461  4.9836 0.0007295 ***
> factor1      5 1324025  264805  5.6046 0.0008612 ***
> Residuals   31 1464674   47248
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

The FAQ has a section on BUGS asking for a *reproducible* example.
This is not.

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
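
The order dependence of sequential (Type 1) tables can be reproduced on a small example. In the two tables from the report, the factor sums of squares add up to 2972250 in either order and the residual sum of squares is 1464674 both times; only the split between the factors changes. The sketch below shows the same pattern on made-up data. The report did not include its data, so the data frame `d` and the names `f1`, `f2`, and `y` are hypothetical. It uses anova() on lm() fits; aov() fits the same linear model through lm(), so summary(aov(...)) should report the same sums of squares.

```r
# Hypothetical unbalanced two-factor data (NOT the data from the report).
# Cell counts differ: a/x = 3, a/y = 1, b/x = 1, b/y = 1, c/x = 1, c/y = 2.
set.seed(1)
d <- data.frame(
  f1 = factor(c("a", "a", "a", "a", "b", "b", "c", "c", "c")),
  f2 = factor(c("x", "x", "x", "y", "x", "y", "y", "y", "x"))
)
d$y <- rnorm(nrow(d)) + as.integer(d$f1) + 2 * as.integer(d$f2)

# Sequential (Type 1) tables: each term is adjusted only for the terms
# that come before it in the formula.
fit12 <- lm(y ~ f1 + f2, data = d)
fit21 <- lm(y ~ f2 + f1, data = d)
tab12 <- anova(fit12)
tab21 <- anova(fit21)
print(tab12)
print(tab21)

# Pull out the sums of squares by row name for comparison.
ss12 <- setNames(tab12[["Sum Sq"]], rownames(tab12))
ss21 <- setNames(tab21[["Sum Sq"]], rownames(tab21))

# The residual SS and the combined SS of the two factors are the same in
# both orders; only the split between f1 and f2 changes.
print(all.equal(ss12[["Residuals"]], ss21[["Residuals"]]))
print(all.equal(ss12[["f1"]] + ss12[["f2"]], ss21[["f1"]] + ss21[["f2"]]))
print(ss12[["f1"]] - ss21[["f1"]])

# An order-independent view: each term adjusted for the other one.
adj <- drop1(fit12, test = "F")
print(adj)
```

In each sequential table, the line for the term entered last matches the corresponding drop1() line, since that term is already adjusted for the other one. If a single table that does not depend on the order of the formula is wanted, drop1() is one way to get it for a main-effects model like this.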