thr3ads.net - R help - [R] Discrepancy between R and SPSS in 2-way, repeated measures ANOVA [Sep 2005]

Larry A Sonna

2005-Sep-09 14:10 UTC

[R] Discrepancy between R and SPSS in 2-way, repeated measures ANOVA

Dear R community,

I am trying to resolve a discrepancy between the way SPSS and R handle 
2-way, repeated measures ANOVA.

An experiment was performed in which samples were drawn before and after 
treatment of four groups of subjects (control and disease states 1, 2 and 
3).  Each group contained five subjects.  An experimental measurement was 
performed on each sample to yield a "signal".  The before and after 
treatment signals for each subject were treated as repeated measures.  We 
desire to obtain P values for disease state ("CONDITION"), and the 
interaction between signal over time and disease state
("CONDITION*TIME").

Using SPSS, the following output was obtained:
              DF    SumSq (Type 3)    Mean Sq    F value    P
COND           3         42861          14287      3.645    0.0355
TIME           1           473            473      0.175    0.681
COND*TIME      3           975            325      0.120    0.947
Error         16         43219           2701



By contrast, using the following R command:

summary(aov(SIGNAL~(COND+TIME+COND*TIME)+Error(EXPNO/COND), Type="III"))

the output was as follows:

            Df   Sum Sq   Mean Sq   F value   Pr(>F)
COND         3    26516      8839    3.2517   0.03651 *
TIME         1      473       473    0.1739   0.67986
COND:TIME    3      975       325    0.1195   0.94785
Residuals   28    76107      2718



I don't understand why the two results are discrepant.  In particular,
I'm not sure why R is yielding 28 DF for the residuals whereas SPSS only
yields 16.  Can anyone help?



E-mail replies would be much appreciated.  I can be reached at 
larry_sonna at yahoo.com and at larry_sonna at hotmail.com





Thanks in advance,



Larry Sonna

John Maindonald

2005-Sep-10 12:17 UTC


There are 20 distinct individuals, right? expno breaks the 20
individuals into five groups of 4, right? Is this a blocking factor?

If expno is treated as a blocking factor, the following is what you get:

 > xy <- expand.grid(expno=letters[1:5],cond=letters[1:4],
+                                    time=factor(paste(1:2)))
 > xy$subj <- factor(paste(xy$expno, xy$cond, sep=":"))
 > xy$cond <- factor(xy$cond)
 > xy$expno <- factor(xy$expno)
 > xy$y <- rnorm(40)
 > summary(aov(y~cond*time+Error(expno/cond), data=xy))

Error: expno
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals  4   3.59    0.90

Error: expno:cond
           Df Sum Sq Mean Sq F value Pr(>F)
cond       3   1.06    0.35    0.36   0.78
Residuals 12  11.86    0.99

Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
time       1   2.27    2.27    1.38   0.26
cond:time  3   3.27    1.09    0.67   0.59
Residuals 16  26.19    1.64


If, on the other hand, this is analyzed as a completely
randomized design, the following is the output:

 > summary(aov(y~cond*time+Error(subj), data=xy))

Error: subj
           Df Sum Sq Mean Sq F value Pr(>F)
cond       3   1.06    0.35    0.37   0.78
Residuals 16  15.46    0.97

Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
time       1   2.27    2.27    1.38   0.26
cond:time  3   3.27    1.09    0.67   0.59
Residuals 16  26.19    1.64
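
The degrees of freedom in the two strata above can be checked by hand.
The following is only a bookkeeping sketch for the layout described in
the original post (4 conditions, 5 subjects per condition, 2 time
points); the variable names are made up for illustration.

```r
# Degrees-of-freedom bookkeeping for 4 conditions x 5 subjects x 2 times.
n_cond     <- 4
n_per_cond <- 5
n_time     <- 2
n_subj <- n_cond * n_per_cond      # 20 subjects
n_obs  <- n_subj * n_time          # 40 observations

# Between-subject stratum ("Error: subj"): 19 df in total
df_cond       <- n_cond - 1                  # 3
df_subj_resid <- (n_subj - 1) - df_cond      # 16

# Within-subject stratum ("Error: Within"): 20 df in total
df_time         <- n_time - 1                                  # 1
df_cond_time    <- df_cond * df_time                           # 3
df_within_resid <- (n_obs - n_subj) - df_time - df_cond_time   # 16

# The pieces must add up to n_obs - 1
total <- df_cond + df_subj_resid + df_time + df_cond_time + df_within_resid
stopifnot(total == n_obs - 1)
```

Both residual lines come out at 16 df, which agrees with the Error(subj)
output above.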



John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

John Maindonald

2005-Sep-13 00:48 UTC


For the record, it turns out that EXPNO ran from 1 to 20, i.e., it
identified subject.

Thus EXPNO/COND parsed into the two error terms (additional to residual)
EXPNO and EXPNO:COND.  This second error term accounts for all
variation between levels of COND; so there is no COND sum of squares.
(In SPSS the fixed effect COND may have taken precedence; I do not
know for sure.)

In R, if this was a completely randomized design, the term Error(EXPNO),
or in the mock-up example I gave Error(subj), would be enough on its
own.
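
A minimal sketch of that simpler call with the original poster's
variable names, run on simulated data.  The data frame `dat`, the
condition labels and the seed are made up for illustration; only the
model formula follows the text above.  EXPNO is converted to a factor,
on the assumption that it is stored as the integers 1 to 20.

```r
set.seed(1)  # arbitrary seed, only so the simulated data are reproducible

# 20 subjects (EXPNO), each measured before and after treatment
dat <- expand.grid(EXPNO = 1:20, TIME = c("before", "after"))

# Hypothetical condition labels: subjects 1-5 control, 6-10 d1, and so on
dat$COND <- factor(rep(rep(c("control", "d1", "d2", "d3"), each = 5),
                       times = 2))
dat$EXPNO  <- factor(dat$EXPNO)   # subject id must be a factor, not numeric
dat$TIME   <- factor(dat$TIME)
dat$SIGNAL <- rnorm(40)           # simulated response, not the real data

# Subject is the only error stratum besides the within-subject residual
fit <- aov(SIGNAL ~ COND * TIME + Error(EXPNO), data = dat)
summary(fit)
```

With this layout COND is tested in the EXPNO stratum, and TIME and
COND:TIME in the Within stratum.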

The R implementation can handle error terms akin to Error(REPNO/subj),
but because there are redundant model matrix columns generated by the
REPNO:subj term, complains that the Error() model is singular.

In general, terms of the form a/b should be used only if b is nested
within a, i.e.,
REPNO/IndividualWithinBlock
(where IndividualWithinBlock runs from 1 to 4)
not REPNO/subj.
(Either of these causes REPNO to be treated as a blocking factor.)
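
How a term of the form a/b is expanded can be inspected with terms().
A small sketch (the object names `nested` and `posted` are made up for
illustration):

```r
# a/b is shorthand for a + a:b; terms() shows the expansion
nested <- attr(terms(~ REPNO / IndividualWithinBlock), "term.labels")
print(nested)

# The same expansion applied to the error term from the original post
posted <- attr(terms(~ EXPNO / COND), "term.labels")
print(posted)
```

The second call gives the EXPNO and EXPNO:COND terms mentioned above.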

 > xy <- expand.grid(REPNO=letters[1:5], COND=letters[1:4],
+                                    TIME=factor(paste(1:2)))
 > xy$subj <- factor(paste(xy$REPNO, xy$COND, sep=":"))
 > ## Below subj becomes EXPNO
 > xy$COND <- factor(xy$COND)
 > xy$REPNO <- factor(xy$REPNO)
 > xy$y <- rnorm(40)
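
A sketch that carries the mock-up through to the two fits discussed
above.  The column IndividualWithinBlock is constructed here for
illustration (in this mock-up it simply numbers the four individuals in
each block), and the seed is arbitrary.  The second call is wrapped in
tryCatch() only to capture whatever warning aov() issues; the exact
wording of the message is not assumed.

```r
set.seed(42)  # arbitrary seed for the simulated response
xy <- expand.grid(REPNO = letters[1:5], COND = letters[1:4],
                  TIME = factor(paste(1:2)))
xy$subj <- factor(paste(xy$REPNO, xy$COND, sep = ":"))
# Individuals numbered 1 to 4 within each block (illustrative construction)
xy$IndividualWithinBlock <- factor(as.integer(xy$COND))
xy$y <- rnorm(40)

# b nested within a: REPNO acts as the blocking factor
fit_nested <- aov(y ~ COND * TIME + Error(REPNO / IndividualWithinBlock),
                  data = xy)
summary(fit_nested)

# subj already identifies all 20 individuals, so REPNO:subj is redundant;
# capture the warning message, if any, instead of printing it
singular_msg <- tryCatch({
  aov(y ~ COND * TIME + Error(REPNO / subj), data = xy)
  NA_character_
}, warning = function(w) conditionMessage(w))
print(singular_msg)
```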

Plea to those who post such questions to the list:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please include either a toy data set or, if the actual data set is
small, lists of factor values.  If you are happy to make the information
public, give the result vector also (this is less important!).  Or you
can put the data and, where relevant, your code, on a web site.

Be careful about the use of the word "groups" in an experimental
design context; speak of "treatment groups" if that is the meaning,
or "blocks" if that is what is intended.  I suspect that confusion
between these two contexts in which the word groups is wont to
be used lay behind the use of the EXPNO/COND form of
model formula.

John Maindonald.

