Gilbert G
2007-Nov-04 16:34 UTC
[R] Why can repeated measures anova with within & between subjects design not be done if group sizes are unbalanced?
Dear R people:

I wish to switch from SPSS to R, but there is one particular type of ANOVA design that cannot be done in R. Or, more likely, it can be done but is nowhere documented.

The problem is typical for psychologists: you have a repeated measures design with different groups of subjects. This can be done with the aov command, but the number of subjects in both groups must be equal (i.e., a balanced design). SPSS allows for unbalanced designs as well.

If you are still with me, let me give an example of what R can and cannot do so far. Imagine I have a 2x2 within-subjects design and 2 groups (e.g., healthy subjects and patients, stored in MyGroup). And imagine I measure reaction time RT in four conditions, crossing a color factor (red vs green) with a shape factor (square vs circle).

In R you would have something like this, as anybody who does balanced repeated measures ANOVAs might know:

    aov( RT ~ color * shape * MyGroup + Error( Subjects/(color*shape) ) )

In SPSS you would have something like this (of course with the data organized slightly differently):

    GLM
      x1 x2 x3 x4 BY MyGroup
      /WSFACTOR = color 2 Polynomial shape 2 Polynomial
      /METHOD = SSTYPE(3)
      /CRITERIA = ALPHA(.05)
      /WSDESIGN = color shape color*shape
      /DESIGN = VAR00001 .

OK, the question is this: if the group sizes are different (say 10 people in one group and 12 in the other), R is going to give the wrong answer. Of course that is not R's fault.

BUT MY QUESTION IS: HOW TO GET THE UNBALANCED REPEATED MEASURES ANOVA RIGHT?

Thanks for the answer!
Charles C. Berry
2007-Nov-04 17:28 UTC
[R] Why can repeated measures anova with within & between subjects design not be done if group sizes are unbalanced?
'nowhere documented' ??

As the posting guide suggests, you could perform a search using

    > RSiteSearch("repeated measures", restrict="functions")

say.

That generates an inventory of functions that pertain to repeated measures designs.

And once you have browsed through that you might focus on

    > library(nlme)   # load a library with nifty mixed model stuff
    > ?lme            # find out how to fit linear mixed models
    > example(lme)    # run some examples

If that is not enough to get you started, the book

    J. C. Pinheiro and D. M. Bates (2000), Mixed-Effects Models in S and S-PLUS, Springer, ISBN 0-387-98957-0

documents all the stops and whistles of the nlme library.

Chuck

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
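As a rough illustration of the lme() route suggested above, here is a minimal sketch for the design in the original question (10 healthy subjects, 12 patients, 2x2 within-subject factors). The data are simulated purely for illustration, the effect sizes are invented, and the random-intercept structure `~ 1 | Subjects` is an assumption made for the sketch; it is not claimed to reproduce the error strata of the aov() call or the SPSS output.

```r
library(nlme)  # recommended package, shipped with standard R installations

set.seed(42)
n_healthy <- 10   # unbalanced group sizes, as in the question
n_patient <- 12
n_subj <- n_healthy + n_patient

# Long format: one row per subject and condition (4 rows per subject)
design <- expand.grid(
  color    = c("red", "green"),
  shape    = c("square", "circle"),
  Subjects = factor(seq_len(n_subj))
)
design$MyGroup <- factor(
  ifelse(as.integer(design$Subjects) <= n_healthy, "healthy", "patient")
)

# Simulated reaction times (hypothetical effects, illustration only)
subj_effect <- rnorm(n_subj, sd = 30)
design$RT <- 500 +
  subj_effect[as.integer(design$Subjects)] +
  20 * (design$MyGroup == "patient") +
  15 * (design$color == "green") +
  rnorm(nrow(design), sd = 20)

# Linear mixed model with a random intercept per subject (assumed structure)
fit <- lme(RT ~ color * shape * MyGroup,
           random = ~ 1 | Subjects,
           data = design)

anova(fit)  # sequential F tests; anova(fit, type = "marginal") is the alternative
```

Unequal group sizes need no special handling here: the model is fitted to whatever rows are present in the long-format data.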
Jonathan Baron
2007-Nov-05 12:37 UTC
[R] Why can repeated measures anova with within & between subjects design not be done if group sizes are unbalanced?
On 11/04/07 16:34, Gilbert G wrote:
> [... snip ...]
> If you are still with me, let me just give you an example of what R
> can and cannot do so far. Imagine I have a 2x2 within subjects design
> and I have 2 groups (e.g., group healthy and patients, which is stored
> in MyGroup). And imagine I measure reaction time RT in four
> conditions, say, in a color condition (red vs green) and in a shape
> condition (square vs circle).

At the risk of getting in trouble, let me suggest another approach. Compute the relevant terms for each subject, then do a t test comparing your two groups. The t test does not assume equal sized groups.

Yuelin Li, in our "Notes on the use of R ..." shows how to use a t test to check a design very similar to what you suggest: http://www.psych.upenn.edu/~baron/rpsych/rpsych.html (section 6.10, I think, perhaps elsewhere too).

You can do this either with a loop or with the lmList() function in the lme4 package (which is not discussed yet in our "Notes..."). For example, with a loop, you would compute

    CS[i] <- RTredsquare[i] - RTbluesquare[i] - RTredcircle[i] + RTbluecircle[i]

and then do

    t.test(CS ~ Group)

to see if your two groups differ in the interaction effect. It is easier with lmList. You don't need the loop.

I do think that nlme is going to replace a lot of standard approaches in psychology. (I am almost to the point of understanding it.) But I don't think it is necessary for the kind of design you describe.
Jon -- Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron
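The per-subject contrast approach described in this message can be sketched as follows. The vector names (RTredsquare and so on) follow the message; the data are simulated only for illustration, and the computation is vectorized rather than written as the loop mentioned above, which gives the same CS values.

```r
set.seed(7)

# Unequal group sizes: 10 healthy subjects, 12 patients
Group <- factor(rep(c("healthy", "patient"), times = c(10, 12)))
n <- length(Group)

# One mean RT per subject and condition (simulated, illustration only)
RTredsquare  <- rnorm(n, mean = 500, sd = 40)
RTbluesquare <- rnorm(n, mean = 510, sd = 40)
RTredcircle  <- rnorm(n, mean = 505, sd = 40)
RTbluecircle <- rnorm(n, mean = 520, sd = 40)

# Color-by-shape interaction contrast, one value per subject
CS <- RTredsquare - RTbluesquare - RTredcircle + RTbluecircle

# Do the two groups differ in the interaction effect?
# t.test() uses the Welch test by default, so equal group sizes
# (and equal variances) are not required.
res <- t.test(CS ~ Group)
res
```

The same pattern applies to the main effects: replace CS with the per-subject contrast for color or for shape.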
Dieter Menne
2007-Nov-06 17:48 UTC
[R] Why can repeated measures anova with within & between subjects design not be done if group sizes are unbalanced?
Charles C. Berry <cberry <at> tajo.ucsd.edu> writes:
> As the posting guide suggests, you could perform a search using
>
> > RSiteSearch("repeated measures", restrict="functions")
>
> say.
>
> That generates an inventory of functions that pertain to repeated
> measures designs.

The problem I have noted in practice with researchers is that nlme is not among the packages associated with "repeated measurements", with the exception of the citation of

    Davidian, M. and Giltinan, D.M. (1995) "Nonlinear Mixed Effects Models for Repeated Measurement Data", Chapman and Hall.

somewhere in the nlme documentation. And as far as I remember, the words "repeated measurements" cannot be found in the text of Pinheiro & Bates, which otherwise is the most worn-out book on my shelf.

I may be slightly off with both statements, but the association (mixed model) <=> (repeated measurements) is not at all obvious.

For the record: forget repeated measurements. It definitely pays to be fluent in mixed models. Once you have your data in the "long" form, most of the work is done. And if lme tells you it cannot solve your problem, it has damned good reasons for it.

Dieter
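To make the "long" form concrete, here is a small sketch using base R's reshape(). The wide data frame, its column names (red.square etc.) and its values are hypothetical, chosen to mirror an SPSS-style layout with one row per subject and one column per condition.

```r
# Wide layout: one row per subject, one column per condition (made-up values)
wide <- data.frame(
  Subjects     = factor(1:4),
  MyGroup      = factor(c("healthy", "healthy", "patient", "patient")),
  red.square   = c(510, 495, 530, 540),
  red.circle   = c(505, 500, 525, 550),
  green.square = c(520, 498, 545, 560),
  green.circle = c(515, 502, 548, 565)
)

conds <- c("red.square", "red.circle", "green.square", "green.circle")

# Long layout: one row per subject and condition, all RTs in a single column
long <- reshape(wide,
                direction = "long",
                varying   = conds,
                v.names   = "RT",
                timevar   = "condition",
                times     = conds,
                idvar     = "Subjects")

# Split the condition label into the two within-subject factors
parts <- do.call(rbind, strsplit(long$condition, ".", fixed = TRUE))
long$color <- factor(parts[, 1])
long$shape <- factor(parts[, 2])

head(long)
```

A data frame in this shape can be passed directly to aov() or lme() with a formula such as RT ~ color * shape * MyGroup.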
Yuelin Li
2007-Nov-08 15:52 UTC
[R] Why can repeated measures anova with within & between subjects design not be done if group sizes are unbalanced?
Hope I am not too late joining this thread. I believe the difference between R and SPSS arises because SPSS adjusts the Type III SS by the harmonic mean of the unbalanced cell sizes. This calculation is discussed in Maxwell and Delaney (1990, pp. 271-297).

In short, the best explanation I can offer (see details below) is that SPSS and R produce the same output if you tell SPSS to do SSTYPE(1) or SSTYPE(2) instead of the default SSTYPE(3). As discussed in Maxwell and Delaney, the calculations of SS1 and SS2 do not involve the harmonic mean. Maxwell and Delaney discuss the pros and cons of each type of Sums of Squares. Apparently SPSS thinks that the harmonic mean SS3 is the *right* analysis.

Like the people who responded before me, I'd also suggest the use of lme() in unbalanced designs.

Yuelin.

---- details -------

I used the Hays.df data: http://www.psych.upenn.edu/~baron/rpsych/rpsych.html

And I added one between-subject variable:

    Hays.df$grpuneven <- c(1,1,1,1,1,1,1,1,2,2,2,2) # n=8 in grp 1; 4 in grp 2

I ran

    aov(rt ~ grpuneven*color*shape + Error(subj/shape+color), data=Hays.df)

which gives you the same output as SSTYPE(1) and SSTYPE(2) using this syntax in SPSS:

    GLM
      Sh1Col1 Sh2Col1 Sh1Col2 Sh2Col2 BY grpuneven
      /WSFACTOR = color 2 Polynomial shape 2 Polynomial
      /METHOD = SSTYPE(2)
      /CRITERIA = ALPHA(.05)
      /WSDESIGN = color shape color*shape
      /DESIGN = grpuneven .

-- Gilbert G wrote --|Sun (Nov/04/2007)[04:34]|--:

Dear R people: I wish to switch from SPSS to R, but there is one particular type of ANOVA design that cannot be done in R. Or more likely, it can be done, but it is nowhere documented.

[... snip ...]
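The point about the type of Sums of Squares mattering only when cell sizes are unequal can be seen in a much simpler setting. The sketch below is not the repeated measures analysis from this thread: it uses two hypothetical between-subject factors A and B with made-up, unbalanced cell counts and simulated responses, and shows that the sequential (Type I) sum of squares that anova() reports for A depends on whether A enters the model before or after B.

```r
set.seed(1)

# Unbalanced 2x2 layout: cell counts a1b1 = 5, a1b2 = 3, a2b1 = 1, a2b2 = 3
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(8, 4))),
  B = factor(c("b1", "b1", "b1", "b1", "b1", "b2", "b2", "b2",
               "b1", "b2", "b2", "b2"))
)
d$y <- rnorm(12) + 1.0 * (d$A == "a2") + 0.5 * (d$B == "b2")

# anova() on an lm fit reports sequential sums of squares,
# so the order of terms in the formula matters
ss_AB <- anova(lm(y ~ A + B, data = d))  # A first: A ignoring B
ss_BA <- anova(lm(y ~ B + A, data = d))  # A last: A adjusted for B

ss_AB["A", "Sum Sq"]
ss_BA["A", "Sum Sq"]
```

With equal cell sizes the two values would coincide, which is why the choice of SS type never comes up in balanced designs. The residual sum of squares is the same in both fits, because both formulas describe the same model.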