I am using anova.lm to compare 3 linear models. Model 1 has 1 variable, model 2 has 2 variables and model 3 has 3 variables. All models are fitted to the same data set.

anova.lm(model1,model2) gives me:

  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1    135 245.38
2    134 184.36  1    61.022 44.354 6.467e-10 ***

anova.lm(model1,model2,model3) gives me:

  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1    135 245.38
2    134 184.36  1    61.022 50.182 7.355e-11 ***
3    133 161.73  1    22.628 18.609 3.105e-05 ***

Why aren't the 2nd row F values from each of the anova tables the same? I thought in each case the 2nd row is comparing model 2 to model 1.

I figured out that for anova.lm(model1,model2)
F(row2) = Sum of Sq(row2) / MSE of Model 2

and for anova.lm(model1,model2,model3)
F(row2) = Sum of Sq(row2) / MSE of Model 3  <-- I don't get why the MSE of model 3 is being included if we're comparing Model 2 to Model 2

Any help/explanations would be appreciated!

--
View this message in context: http://r.789695.n4.nabble.com/anova-lm-F-test-confusion-tp4490211p4490211.html
Sent from the R help mailing list archive at Nabble.com.

Sorry, typo. That last line should read:

<-- I don't get why the MSE of model 3 is being included if we're comparing Model 2 to Model 1

msteane <michellesteane <at> hotmail.com> writes:

> I am using anova.lm to compare 3 linear models. Model 1 has 1 variable,
> model 2 has 2 variables and model 3 has 3 variables. All models are fitted
> to the same data set.

(I assume these are nested models, otherwise the analysis doesn't make sense ...)

> anova.lm(model1,model2) gives me:
>
>   Res.Df    RSS Df Sum of Sq      F    Pr(>F)
> 1    135 245.38
> 2    134 184.36  1    61.022 44.354 6.467e-10 ***
>
> anova.lm(model1,model2,model3) gives me:
>
>   Res.Df    RSS Df Sum of Sq      F    Pr(>F)
> 1    135 245.38
> 2    134 184.36  1    61.022 50.182 7.355e-11 ***
> 3    133 161.73  1    22.628 18.609 3.105e-05 ***
>
> Why aren't the 2nd row F values from each of the anova tables the same??? I
> thought in each case the 2nd row is comparing model 2 to model 1?

From ?anova.lm:

  Normally the F statistic is most appropriate, which compares the mean
  square for a row to the residual sum of squares for the largest model
  considered.

> I figured out that for anova.lm(model1,model2)
> F(row2)=Sum of Sq(row2)/MSE of Model 2
>
> and for anova.lm(model1,model2,model3)
> F(row2)=Sum of Sq(row 2)/MSE of Model 3 <-- I don't get why the MSE of
> model 3 is being included if we're comparing Model 2 to Model 2

See above ...
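
To make the quoted rule concrete, here is a small arithmetic check using only the numbers printed in the two tables above. It is a sketch in Python rather than a call into R, and the helper name f_stat is made up for illustration. The only thing that changes between the two calls is which model counts as "the largest model considered", and therefore which residual mean square (RSS / Res.Df) goes in the denominator.

```python
# Values copied from the anova tables in the question (rounded as printed).
rss = {2: 184.36, 3: 161.73}        # residual sum of squares of models 2 and 3
res_df = {2: 134, 3: 133}           # residual degrees of freedom of models 2 and 3
sum_sq = {2: 61.022, 3: 22.628}     # "Sum of Sq" for table rows 2 and 3
df = {2: 1, 3: 1}                   # "Df" for table rows 2 and 3


def f_stat(row, largest):
    """F for a table row, scaled by the residual mean square of the largest model.

    Hypothetical helper: it mirrors the rule quoted from ?anova.lm,
    it is not part of R or of any library.
    """
    mean_sq_row = sum_sq[row] / df[row]
    mse_largest = rss[largest] / res_df[largest]
    return mean_sq_row / mse_largest


# anova.lm(model1, model2): the largest model considered is model 2.
f_two_models = f_stat(row=2, largest=2)

# anova.lm(model1, model2, model3): the largest model considered is model 3,
# so every row, including row 2, is divided by model 3's residual mean square.
f_three_row2 = f_stat(row=2, largest=3)
f_three_row3 = f_stat(row=3, largest=3)

print(round(f_two_models, 2), round(f_three_row2, 2), round(f_three_row3, 2))
```

The three values agree with the F column of the tables up to the rounding of the printed RSS and Sum of Sq figures. Row 2's numerator (61.022 on 1 Df) is identical in both calls. Adding model 3 lowers the residual mean square in the denominator from about 1.376 to about 1.216, which is why the row 2 F rises from 44.354 to 50.182.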