thr3ads.net - R help - [R] testing two-factor anova effects using model comparison approach with lm() and anova() [Feb 2009]

Paul Gribble

2009-Feb-27 18:00 UTC

[R] testing two-factor anova effects using model comparison approach with lm() and anova()

I wonder if someone could explain the behavior of the anova() and lm()
functions in the following situation:

I have a standard 3x2 factorial design: factorA has 3 levels, factorB has 2
levels, and they are fully crossed. I have a dependent variable DV.

Of course I can do the following to get the usual anova table:
> anova(lm(DV~factorA+factorB+factorA:factorB))
Analysis of Variance Table

Response: DV
                Df  Sum Sq Mean Sq F value   Pr(>F)
factorA          2  7.4667  3.7333  4.9778 0.015546 *
factorB          1  2.1333  2.1333  2.8444 0.104648
factorA:factorB  2  9.8667  4.9333  6.5778 0.005275 **
Residuals       24 18.0000  0.7500

This is perfectly satisfactory for my situation, but as a pedagogical
exercise, I wanted to demonstrate the model comparison approach to analysis
of variance by using anova() to compare a full model that contains all
effects, to restricted models that contain all effects save for the effect
of interest.

The test of the interaction effect seems to be as I expected:
> fullmodel<-lm(DV~factorA+factorB+factorA:factorB)
> restmodel<-lm(DV~factorA+factorB)
> anova(fullmodel,restmodel)
Analysis of Variance Table

Model 1: DV ~ factorA + factorB + factorA:factorB
Model 2: DV ~ factorA + factorB
  Res.Df     RSS Df Sum of Sq      F   Pr(>F)
1     24 18.0000
2     26 27.8667 -2   -9.8667 6.5778 0.005275 **

As you can see, the value of F (6.5778) is the same as in the factorA:factorB
row of the anova table above. All is well.
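
For reference, the F statistic in this comparison can be recomputed by hand from the two residual sums of squares. The sketch below uses only the numbers printed in the table above; the variable names are illustrative and not part of the original post.

```r
# Values copied from the anova(fullmodel, restmodel) table above
rss_full <- 18.0000; df_full <- 24   # full model: residual SS and residual df
rss_rest <- 27.8667; df_rest <- 26   # restricted model (no interaction)

# Extra sum of squares per extra degree of freedom,
# divided by the residual mean square of the full model
F_stat <- ((rss_rest - rss_full) / (df_rest - df_full)) / (rss_full / df_full)

# Upper-tail probability of the F distribution with (2, 24) df
p_val <- pf(F_stat, df_rest - df_full, df_full, lower.tail = FALSE)

round(F_stat, 4)
p_val
```

This should reproduce, up to rounding, the F value and p-value reported in the comparison table.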

However, if I try to test a main effect, e.g. factorA, by testing the full
model against a restricted model that doesn't contain the main effect
factorA, I get something strange:
> restmodel<-lm(DV~factorB+factorA:factorB)
> anova(fullmodel,restmodel)
Analysis of Variance Table

Model 1: DV ~ factorA + factorB + factorA:factorB
Model 2: DV ~ factorB + factorA:factorB
  Res.Df RSS Df Sum of Sq F Pr(>F)
1     24  18
2     24  18  0         0

Upon inspection of each model, I see that the residuals are identical, which
is not what I was expecting:
> anova(fullmodel)
Analysis of Variance Table

Response: DV
                Df  Sum Sq Mean Sq F value   Pr(>F)
factorA          2  7.4667  3.7333  4.9778 0.015546 *
factorB          1  2.1333  2.1333  2.8444 0.104648
factorA:factorB  2  9.8667  4.9333  6.5778 0.005275 **
Residuals       24 18.0000  0.7500

This looks fine, but then the restricted model is where things are not as I
expected:
> anova(restmodel)
Analysis of Variance Table

Response: DV
                Df  Sum Sq Mean Sq F value   Pr(>F)
factorB          1  2.1333  2.1333  2.8444 0.104648
factorB:factorA  4 17.3333  4.3333  5.7778 0.002104 **
Residuals       24 18.0000  0.7500

I was expecting the Residuals in the restricted model (the one not
containing main effect of factorA) to be larger than in the full model
containing all three effects. In other words, the variance accounted for by
the main effect factorA should be added to the Residuals. Instead, it looks
like the variance accounted for by the main effect of factorA is being
soaked up by the factorA:factorB interaction term. Strangely, the degrees of
freedom are also affected.
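
The bookkeeping behind this observation can be checked from the two tables above. In the sketch below (values copied from the printed output, variable names illustrative), the 4-df factorB:factorA term of the restricted model carries the sums of squares and degrees of freedom of both the factorA and the factorA:factorB rows of the full model, up to rounding in the printed tables.

```r
# Rows of anova(fullmodel)
ss_A  <- 7.4667; df_A  <- 2   # factorA
ss_AB <- 9.8667; df_AB <- 2   # factorA:factorB

# factorB:factorA row of anova(restmodel)
ss_BA_rest <- 17.3333; df_BA_rest <- 4

# Sum of squares and df of the two full-model rows combined
ss_A + ss_AB
df_A + df_AB

# Difference from the restricted-model row (rounding error only)
abs(ss_A + ss_AB - ss_BA_rest)
```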

I must be misunderstanding something here. Can someone point out what is
happening?

Thanks,

-Paul

-- 
Paul L. Gribble, Ph.D.
Associate Professor
Dept. Psychology
The University of Western Ontario
London, Ontario
Canada N6A 5C2
Tel. +1 519 661 2111 x82237
Fax. +1 519 661 3961
pgribble@uwo.ca
http://gribblelab.org

Greg Snow

2009-Feb-27 18:30 UTC

Notice the degrees of freedom in the different models as well.

With factors A and B, the two models

A + B + A:B

and

A + A:B

are actually the same overall model, just with different parameterizations (you
can also see this by using x=TRUE in the call to lm() and looking at the x
matrix used). The same holds for B + A:B, the restricted model in your example.

Testing whether the main effect A belongs in the model, given that the
interaction is already in it, does not make sense in most cases. The formula
notation therefore gives a different parameterization rather than the generally
uninteresting test.
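
A minimal sketch of the point about parameterizations, assuming simulated data: the original DV values were not posted, so the data frame `d` below is hypothetical (a balanced 3x2 layout with 5 observations per cell, which gives the same 24 residual degrees of freedom as in the question). It fits the full model, the restricted model from the question (B + A:B) and the A + A:B form mentioned above, then compares the design matrices obtained with x=TRUE.

```r
# Hypothetical balanced 3x2 data set (not the poster's data)
set.seed(1)
d <- expand.grid(rep = 1:5, factorA = factor(1:3), factorB = factor(1:2))
d$DV <- rnorm(nrow(d))

full  <- lm(DV ~ factorA + factorB + factorA:factorB, data = d, x = TRUE)
restB <- lm(DV ~ factorB + factorA:factorB, data = d, x = TRUE)
restA <- lm(DV ~ factorA + factorA:factorB, data = d, x = TRUE)

# The columns are named and coded differently, but there are 6 in each case
colnames(full$x)
colnames(restB$x)

# Same rank, same fitted values, same residual sum of squares
c(full$rank, restB$rank, restA$rank)
all.equal(fitted(full), fitted(restB))
all.equal(deviance(full), deviance(restA))

# Dropping the interaction, by contrast, gives a genuinely smaller model
additive <- lm(DV ~ factorA + factorB, data = d)
additive$rank
```

Because all three design matrices span the same column space, anova() has nothing to compare between them, which is consistent with the 0 degrees of freedom and 0 sum of squares seen in the question.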

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
