thr3ads.net - R help - [R] Sums of sq in car package Anova function [Dec 2004]

Karla Sartor

2004-Dec-18 23:43 UTC

[R] Sums of sq in car package Anova function

Hello R users,

I am trying to run a three factor ANOVA on a data set with unequal 
sample sizes.

I fit the data to an 'lm' object and used the Anova function from the
'car' package with the 'type=III' option to get type III sums of
squares.  I also set the contrast coding option to
'options(contrasts = c("contr.sum", "contr.poly"))' as cautioned in John
Fox's book "An R and S-PLUS Companion to Applied Regression".

Is there anything else that I need to consider when using the type III 
option with the Anova function?

When I run the same data set in SPSS with General Linear Model and type 
III  sums of squares, the sums of squares are different enough that one 
of the main effect terms is significant in the R table and not in the 
SPSS table.  I found a similar discrepancy with a different data set,
except that there SPSS showed a significant interaction effect while the
'Anova' function did not.

I also compared the results from SPSS to those from the 'anova' function
in the base package, and the results are nearly identical.  I would expect
the two methods with type III sums of squares to be more similar.  Does
anyone have any ideas as to why that was not the case?  I am hoping to
not go back to SPSS at this point, so am trying to decide which of the 
two R functions is most appropriate for me (and defensible, considering 
the unequal sample sizes).

Thank you in advance for any ideas you may have!

Karla

Karla Sartor
Montana State University - LRES
ksartor at montana.edu
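
A note on the comparison with the base 'anova' function: for an 'lm' fit, anova() reports sequential sums of squares, so with unequal cell sizes the sum of squares for a main effect depends on the order in which terms enter the formula. The sketch below illustrates this on a small made-up data set; the cell counts, seed, and effect sizes are assumptions chosen only for illustration and are not the data discussed in this thread.

```r
# Hypothetical unbalanced 3 x 2 design (cell counts are made up)
cells <- expand.grid(A = c("a1", "a2", "a3"), B = c("b1", "b2"))
n_per_cell <- c(3, 5, 8, 6, 4, 10)
d <- cells[rep(seq_len(nrow(cells)), n_per_cell), ]

set.seed(2)  # arbitrary seed, for reproducibility of the illustration
d$y <- as.numeric(d$A) + 2 * as.numeric(d$B) + rnorm(nrow(d))

# Same model, two term orders; anova() tests each term sequentially
tab_AB <- anova(lm(y ~ A * B, data = d))  # A entered first
tab_BA <- anova(lm(y ~ B * A, data = d))  # A entered after B

ss_A_first  <- tab_AB["A", "Sum Sq"]
ss_A_second <- tab_BA["A", "Sum Sq"]

# The two sums of squares for A differ because the design is unbalanced
print(c(A_first = ss_A_first, A_after_B = ss_A_second))
```

The residual and highest-order interaction rows are the same in both tables; only the lower-order terms move with the ordering, which is one reason sequential tables should not be expected to match a type III table.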

John Fox

2004-Dec-19 15:12 UTC


Dear Karla,

I suggested last night that you send me further information, but decided
this morning to try out a reproducible example of my own:

> set.seed(12345)
> A <- factor(sample(c("a1", "a2", "a3"), 100, replace=TRUE))
> B <- factor(sample(c("b1", "b2"), 100, replace=TRUE))
> C <- factor(sample(c("c1", "c2", "c3"), 100, replace=TRUE))
> mu <- array(1:18, c(3,2,3))
> a <- as.numeric(A)
> b <- as.numeric(B)
> c <- as.numeric(C)
> y <- mu[cbind(a,b,c)] + rnorm(100)
> options(contrasts=c("contr.sum", "contr.poly"))
> mod <- lm(y ~ A*B*C)
> library(car)
> Anova(mod, type="II")
Anova Table (Type II tests)

Response: y
           Sum Sq Df   F value    Pr(>F)    
A           65.88  2   38.4098 1.696e-12 ***
B          196.47  1  229.0775 < 2.2e-16 ***
C         2441.00  2 1423.0809 < 2.2e-16 ***
A:B          0.22  2    0.1259    0.8819    
A:C          6.92  4    2.0174    0.0996 .  
B:C          0.87  2    0.5095    0.6027    
A:B:C        2.89  4    0.8432    0.5018    
Residuals   70.33 82                        
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> Anova(mod, type="III")
Anova Table (Type III tests)

Response: y
            Sum Sq Df   F value    Pr(>F)    
(Intercept) 7830.2  1 9129.8959 < 2.2e-16 ***
A             55.7  2   32.4913 4.059e-11 ***
B            189.5  1  221.0076 < 2.2e-16 ***
C           2124.0  2 1238.2549 < 2.2e-16 ***
A:B            0.2  2    0.0942    0.9102    
A:C            5.9  4    1.7323    0.1507    
B:C            0.6  2    0.3417    0.7115    
A:B:C          2.9  4    0.8432    0.5018    
Residuals     70.3 82                        
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1



I don't have a working copy of SPSS anymore, but here's what SAS does with
this example:

      Source    DF      Type II SS     Mean Square    F Value    Pr > F

      A          2       65.884048       32.942024      38.41    <.0001
      B          1      196.467384      196.467384     229.08    <.0001
      A*B        2        0.215883        0.107942       0.13    0.8819
      C          2     2440.998718     1220.499359    1423.08    <.0001
      A*C        4        6.920872        1.730218       2.02    0.0996
      B*C        2        0.873945        0.436973       0.51    0.6027
      A*B*C      4        2.892820        0.723205       0.84    0.5018


      Source    DF     Type III SS     Mean Square    F Value    Pr > F

      A          2       55.732128       27.866064      32.49    <.0001
      B          1      189.546201      189.546201     221.01    <.0001
      A*B        2        0.161608        0.080804       0.09    0.9102
      C          2     2123.968177     1061.984089    1238.25    <.0001
      A*C        4        5.942845        1.485711       1.73    0.1507
      B*C        2        0.586168        0.293084       0.34    0.7115
      A*B*C      4        2.892820        0.723205       0.84    0.5018

So, as you can see, the results check.

It's hard to know what to make of this without more information about what
you did. Much as I'm not an admirer of SPSS, I doubt whether it computes
type-III sums of squares incorrectly, so I suspect something wrong with
either your SPSS commands or your R commands.

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 
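
One R-side cause worth ruling out when type III results disagree across packages is the contrast coding in effect when the model was fit, since the fitted 'lm' object stores that coding. The following is a minimal sketch on made-up data (the cell counts, seed, and effect sizes are assumptions, not the data from this thread). It fits the same model twice, passing the coding through lm()'s 'contrasts' argument instead of options(), and compares the type III sums of squares that Anova() reports for the main effects.

```r
library(car)  # provides Anova()

# Hypothetical unbalanced 3 x 2 design (cell counts are made up)
cells <- expand.grid(A = c("a1", "a2", "a3"), B = c("b1", "b2"))
n_per_cell <- c(3, 5, 8, 6, 4, 10)
d <- cells[rep(seq_len(nrow(cells)), n_per_cell), ]

set.seed(1)  # arbitrary seed, for reproducibility of the illustration
d$y <- as.numeric(d$A) + 2 * as.numeric(d$B) + rnorm(nrow(d))

# Same model formula, two different codings fixed at fit time
mod_trt <- lm(y ~ A * B, data = d,
              contrasts = list(A = "contr.treatment", B = "contr.treatment"))
mod_sum <- lm(y ~ A * B, data = d,
              contrasts = list(A = "contr.sum", B = "contr.sum"))

tab_trt <- Anova(mod_trt, type = "III")
tab_sum <- Anova(mod_sum, type = "III")

# Main-effect rows depend on the coding; the highest-order term does not
print(tab_trt[c("A", "B", "A:B"), "Sum Sq"])
print(tab_sum[c("A", "B", "A:B"), "Sum Sq"])
```

Under treatment coding the "main effect" rows test each factor at the reference level of the other, which is not the hypothesis usually meant by a type III main effect; the sum-to-zero fit is the one comparable to the SAS type III table above.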