Suman Sundaresh
2009-Jun-03  22:54 UTC
[R] Need help understanding output from aov and from anova
Hi all,

I noticed something strange when I ran aov and anova.

vtot=c(7.29917, 7.29917, 7.29917) #identical values
fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has the 3rd element

When I run:

> anova(lm(vtot~fac))
Analysis of Variance Table

Response: vtot
          Df     Sum Sq    Mean Sq F value Pr(>F)
fac        1 1.6818e-30 1.6818e-30  0.3333 0.6667
Residuals  1 5.0455e-30 5.0455e-30

I get a p-value of 0.667. This seems strange to me. I would have expected the p-value to be NaN.

Again, when I run:

> summary(aov(vtot~fac))
          Df     Sum Sq    Mean Sq F value Pr(>F)
fac        1 1.6818e-30 1.6818e-30  0.3333 0.6667
Residuals  1 5.0455e-30 5.0455e-30

Again same p-value.

Now, if I set fac to c(1,2,2), which is essentially just switching the groups:

fac=as.factor(c(1,2,2))

And run,

> anova(lm(vtot~fac))
Analysis of Variance Table

Response: vtot
          Df     Sum Sq    Mean Sq    F value    Pr(>F)
fac        1 6.7274e-30 6.7274e-30 1.3340e+32 < 2.2e-16 ***
Residuals  1  5.043e-62  5.043e-62
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value is really significant, which again looks very strange.

Please could someone shed some light on what I may be missing here?

Thanks very much.
Suman.
Steven McKinney
2009-Jun-04  04:24 UTC
[R] Need help understanding output from aov and from anova
Hi Suman,

What version of R are you running?

In R 2.9.0 running your first example yields a warning

Warning message:
In anova.lm(lm(vtot ~ fac)) :
  ANOVA F-tests on an essentially perfect fit are unreliable

so some adept R developer has taken the time to figure out how to warn you about such a problem. Perhaps someone will add this to aov() at some point as well.

The only variability in this problem is that introduced by machine precision rounding errors.

The exercise of submitting data with no variability to a program designed to assess variability cannot be expected to produce meaningful output, so there's nothing to understand except the issue of machine precision.

Machine roundoff error is an important topic, so I'd recommend learning about that issue, which will do most to help understand these examples.

Best
SteveM

R version 2.9.0 (2009-04-17)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> vtot=c(7.29917, 7.29917, 7.29917) #identical values
> fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has
> anova(lm(vtot~fac))
Analysis of Variance Table

Response: vtot
          Df     Sum Sq    Mean Sq F value Pr(>F)
fac        1 1.6818e-30 1.6818e-30  0.3333 0.6667
Residuals  1 5.0455e-30 5.0455e-30
Warning message:
In anova.lm(lm(vtot ~ fac)) :
  ANOVA F-tests on an essentially perfect fit are unreliable
>
> summary(aov(vtot~fac))
          Df     Sum Sq    Mean Sq F value Pr(>F)
fac        1 1.6818e-30 1.6818e-30  0.3333 0.6667
Residuals  1 5.0455e-30 5.0455e-30
>
> fac=as.factor(c(1,2,2))
> anova(lm(vtot~fac))
Analysis of Variance Table

Response: vtot
          Df     Sum Sq    Mean Sq    F value    Pr(>F)
fac        1 6.7274e-30 6.7274e-30 1.3340e+32 < 2.2e-16 ***
Residuals  1  5.043e-62  5.043e-62
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Warning message:
In anova.lm(lm(vtot ~ fac)) :
  ANOVA F-tests on an essentially perfect fit are unreliable
>

Steven McKinney, Ph.D.
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney at bccrc.ca
tel: 604-675-8000 x7561

BCCRC Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. V5Z 1L3
Canada
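
The machine-precision point in the reply can be checked directly. What follows is only a minimal sketch, assuming the data from the original post: it compares the size of the fitted residuals with the rounding-error scale of the data, and skips the F-test when the response is effectively constant. The helper is_constant() and its default tolerance are hypothetical choices made for illustration; they are not part of base R and not something either poster proposed.

```r
# Data from the original post: three identical values, two groups.
vtot <- c(7.29917, 7.29917, 7.29917)
fac  <- factor(c(1, 1, 2))

# Rounding-error scale for numbers of this magnitude:
# relative machine epsilon times the largest absolute value.
eps_scale <- .Machine$double.eps * max(abs(vtot))

fit <- lm(vtot ~ fac)
rss <- sum(residuals(fit)^2)   # residual sum of squares

# If sqrt(rss) is comparable to eps_scale, the "variability"
# being tested is nothing but floating-point roundoff.
cat("sqrt(RSS):      ", sqrt(rss), "\n")
cat("roundoff scale: ", eps_scale, "\n")

# Hypothetical guard (an assumption, not a base R function):
# treat the response as constant when its range is within a
# small multiple of rounding error.
is_constant <- function(y, tol = 100 * .Machine$double.eps) {
  diff(range(y)) <= tol * max(abs(y))
}

if (is_constant(vtot)) {
  message("Response has no variability; the F-test is not meaningful.")
} else {
  print(anova(fit))
}
```

With a check like this, both factor codings in the thread are handled the same way, because the decision depends only on the response and not on how roundoff happens to be split between the fac and Residuals rows.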