Sergii Ivakhno
2009-Feb-20 11:30 UTC
[R] lm and aov produce different results for nested fixed-factor anova
Dear R users,

I have trouble obtaining the same results for a nested ANOVA with two fixed factors when using the lm and aov functions. The formulas are:

> e1 = aov(y ~ x/z)
> e2 = lm(y ~ x/z)

> summary(e1)
               Df Sum Sq Mean Sq F value    Pr(>F)
x              47  260.0     5.5 18.0088 < 2.2e-16 ***
x:z           195  169.6     0.9  2.8318 < 2.2e-16 ***
Residuals   14425 4430.3     0.3
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
2 observations deleted due to missingness

For e2:

Residual standard error: 0.5542 on 14425 degrees of freedom
  (2 observations deleted due to missingness)
Multiple R-squared: 0.08839, Adjusted R-squared: 0.07309
F-statistic: 5.779 on 242 and 14425 DF, p-value: < 2.2e-16

I prefer to use lm, as in my case I want to know the difference between the first control group and all the other factors through the regression coefficients. The same is true for levels of the nested factor within each level of the main factor.

Since I am fairly new to running linear models in R, I am not sure what can cause this problem; it also seems that lm does not provide the decomposition of MS into MS(x) and MS(z) and the corresponding F-test statistics. (Is it possible to estimate them from the lm output?)

Finally, a few words about the dataset: the main factor x has 48 levels, each repeated from 60 to 540 times, and represents different patients. The nested factor z has 9 levels, but not all of them occur within every level of factor x. Although the nested factor levels are independent between levels of the main factor (i.e. they are samples taken from different tissues of each patient), considering the large size of the dataset I was advised on this forum to use the same encoding of the levels of nested factor z at each level of factor x. I am not sure if this influences the QR decomposition and leads to the differences that I observe.

I would most appreciate your help, as after reading the help pages I still cannot understand the cause of the lm vs aov discrepancy.
The dataset with three factors can be downloaded from
http://www.compbio.group.cam.ac.uk/Resources/Sergii_temp/example.RData

Thank you,
Sergii

----------------------------------------------
Sergii Ivakhno
PhD student
Computational Biology Group
Cancer Research UK Cambridge Research Institute
http://www.compbio.group.cam.ac.uk
Mark Difford
2009-Feb-20 17:35 UTC
[R] lm and aov produce different results for nested fixed-factor anova
Hi Sergii,

>> I have trouble obtaining the same results for nested Anova with two fixed
>> factors when using lm and aov functions.

There is no difference between the two if you treat them equally, i.e. if you summarize them in the same way.

## Try:
anova(e2)
summary(e1)

## Or:
summary.lm(e1)
summary(e2)

Your nested model is also unusual; the customary/correct form is:

e2 <- lm(y ~ z/x - 1)

The nesting introduces an intercept for each level of the nesting factor. Also, x presently is a factor (with 48 levels). Is that what you really want?

Regards, Mark.
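The point made in the reply can be checked in two ways. The sketch below first redoes the arithmetic on the numbers posted in the original message, showing that the overall F-statistic, R-squared and residual standard error printed for e2 follow from the sums of squares printed for e1. It then fits both models to a small simulated data set and compares them directly. The simulated data frame d, its factor levels, and the seed are assumptions made purely for illustration; they are not the poster's data.

```r
## Part 1: the posted outputs already agree.
## Sums of squares and degrees of freedom copied from summary(e1) in the post.
ss_model <- 260.0 + 169.6   # SS(x) + SS(x:z)
df_model <- 47 + 195        # 242, the numerator DF reported for e2
ss_resid <- 4430.3
df_resid <- 14425

f_overall <- (ss_model / df_model) / (ss_resid / df_resid)
r_squared <- ss_model / (ss_model + ss_resid)
sigma_hat <- sqrt(ss_resid / df_resid)
## Compare with 5.779, 0.08839 and 0.5542 in the post; small differences
## come from the rounding of the printed sums of squares.
print(round(c(F = f_overall, R2 = r_squared, sigma = sigma_hat), 4))

## Part 2: same comparison on simulated data (hypothetical, balanced design).
set.seed(1)
d <- data.frame(
  x = factor(rep(paste0("p", 1:4), each = 30)),                 # 4 "patients"
  z = factor(rep(rep(paste0("t", 1:3), each = 10), times = 4))  # 3 "tissues" each
)
d$y <- rnorm(nrow(d))

e1 <- aov(y ~ x/z, data = d)
e2 <- lm(y ~ x/z, data = d)

tab_aov <- summary(e1)[[1]]  # sequential ANOVA table from aov
tab_lm  <- anova(e2)         # the same table, computed from the lm fit

same_ss   <- all.equal(tab_aov[["Sum Sq"]], tab_lm[["Sum Sq"]])
same_f    <- all.equal(tab_aov[["F value"]], tab_lm[["F value"]])
same_coef <- all.equal(coef(e1), coef(e2))

## The overall F in summary(e2) pools the x and x:z rows of the table.
f_pooled <- (sum(tab_lm[["Sum Sq"]][1:2]) / sum(tab_lm[["Df"]][1:2])) /
  tab_lm[["Mean Sq"]][3]
f_summary <- unname(summary(e2)$fstatistic["value"])

print(c(same_ss = isTRUE(same_ss), same_f = isTRUE(same_f),
        same_coef = isTRUE(same_coef)))
print(all.equal(f_pooled, f_summary))
```

So anova() applied to the lm fit gives the per-term decomposition asked about in the original question, while summary() on the same fit gives the coefficient table and a single pooled F-test.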