Dear list members:

I have the following data:

group <- rep(rep(1:2, c(5,5)), 3)
time <- rep(1:3, rep(10,3))
subject <- rep(1:10, 3)
p.pa <- c(92, 44, 49, 52, 41, 34, 32, 65, 47, 58, 94, 82, 48, 60, 47,
          46, 41, 73, 60, 69, 95, 53, 44, 66, 62, 46, 53, 73, 84, 79)
P.PA <- data.frame(subject, group, time, p.pa)

The ten subjects were randomly assigned to one of two groups and measured
three times. (The treatment changes after the second time point.)

Now I am trying to find out the most adequate way to analyse the main
effects and the interaction. Most social scientists would call this
analysis a repeated-measures ANOVA, but I understand that "mixed-effects
model" is a more general term for the same kind of analysis. I did the
analysis in four ways (one in SPSS, three in R):

1. In SPSS I used "general linear model, repeated measures", defining a
"within-subject factor" for the three time points. (The data frame is
structured differently in SPSS: there is one line per subject, and each
time point is a separate variable.) Time was significant.

2. Following what is recommended in the first chapter of Pinheiro &
Bates' "Mixed-Effects Models" book, I used

library(nlme)
summary(lme(p.pa ~ time * group, random = ~ 1 | subject))

Here, time was NOT significant. This was surprising not only in
comparison with the SPSS result, but also when looking at the graph:

interaction.plot(time, group, p.pa)

3. I then tried the lme4 package, as described by Douglas Bates in
R News 5(1), 2005 (pp. 27-30). The result was the same as in 2.

library(lme4)
summary(lmer(p.pa ~ time * group + (time*group | subject), P.PA))

4. Then I also tried what Jonathan Baron suggests in his "Notes on the
use of R for psychology experiments and questionnaires" (on CRAN):

summary(aov(p.pa ~ time * group + Error(subject/(time * group))))

This gives me yet another result.

So I am confused. Which one should I use?

Thanks

Christian

--
____________________________
Dr. Christian Gold, PhD
http://www.hisf.no/~chrisgol
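[For readers following along without R at hand: the six cell means that
interaction.plot(time, group, p.pa) draws can be checked directly from the
posted data. A sketch in plain Python, used here only because the
arithmetic is language-agnostic:]

```python
# Cell means underlying interaction.plot(time, group, p.pa).
# Data as posted: listed time-by-time; within each block of 10,
# subjects 1-5 are group 1 and subjects 6-10 are group 2.
p_pa = [92, 44, 49, 52, 41, 34, 32, 65, 47, 58,
        94, 82, 48, 60, 47, 46, 41, 73, 60, 69,
        95, 53, 44, 66, 62, 46, 53, 73, 84, 79]
group = [1]*5 + [2]*5                   # group labels within each time block

means = {}
for t in range(3):
    block = p_pa[10*t:10*(t+1)]         # the 10 observations at time t+1
    for g in (1, 2):
        vals = [x for x, gr in zip(block, group) if gr == g]
        means[(g, t+1)] = sum(vals) / len(vals)

for (g, t), m in sorted(means.items()):
    print(f"group {g}, time {t}: mean = {m}")
```

Group 1 runs 55.6, 66.2, 64.0 and group 2 runs 47.2, 57.8, 67.0: both
groups rise overall, which is why a non-significant time term from lme()
looks surprising against the plot.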
Christian,

One thing that may help with the data you provide is to make sure that
group, time, and subject are indeed factors:

group <- factor(group)
time <- factor(time)
subject <- factor(subject)

Running your analyses in both SPSS 13.0 and R 2.2.1 (the R sessions were
run under Windows XP and Ubuntu Linux) gave the following results:

1) SPSS time: F(2,16) = 7.623, p < .005.

2) When I ran your code, the aov piece gave a singularity warning, while
the lmer bit gave a false convergence message. I believe that in your
case the code should be:

aov(p.pa ~ time*group + Error(subject))

or

aov(p.pa ~ time*group + Error(subject + subject:time))

They both give identical results.

When following the "nlme way", your code is correct and should give the
same results as in SPSS or aov. I was also stuck in the "lmer way", even
when I changed the code to:

lmer(p.pa ~ time*group + (time|subject))

Perhaps another list member, or Prof. Bates, could provide more info on
this one?

IKD

On Mon, February 27, 2006 17:15, Christian Gold wrote:
> [original message trimmed]

--
Ioannis C. Dimakos, Ph.D.
University of Patras
Department of Elementary Education
Patras, GR-26500 GREECE
http://www.elemedu.upatras.gr/dimakos/
http://yannishome.port5.com/
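[The factor() point matters because with time left numeric, lme() fits a
single linear trend (1 df), whereas factor(time) gets 2 df. A
back-of-envelope check, sketched in plain Python rather than R simply to
spell the arithmetic out, splits the 2-df time sum of squares into its
linear and quadratic orthogonal-contrast parts:]

```python
# Decompose SS(time) into linear + quadratic contrasts.
# With time numeric, the model captures only the linear (1-df) part.
p_pa = [92, 44, 49, 52, 41, 34, 32, 65, 47, 58,
        94, 82, 48, 60, 47, 46, 41, 73, 60, 69,
        95, 53, 44, 66, 62, 46, 53, 73, 84, 79]
n = 10                                              # subjects per time point
time_means = [sum(p_pa[10*t:10*(t+1)]) / n for t in range(3)]  # 51.4, 62.0, 65.5

def contrast_ss(coefs):
    # Sum of squares for an orthogonal contrast applied to the time means.
    L = sum(c * m for c, m in zip(coefs, time_means))
    return n * L**2 / sum(c*c for c in coefs)

ss_linear = contrast_ss((-1, 0, 1))   # the part a numeric "time" sees
ss_quad = contrast_ss((1, -2, 1))     # the part only factor(time) sees
print(round(ss_linear, 2), round(ss_quad, 2), round(ss_linear + ss_quad, 2))
```

The two pieces sum to 1078.07, the 2-df time SS in the aov table; most of
it (994.05) is the linear trend, but the tests and degrees of freedom in
the numeric-time and factor-time models are not the same.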
Christian,

You need, first, to factor() your factors in the data frame P.PA, and
then denote the error terms in aov correctly, as follows:

> group <- rep(rep(1:2, c(5,5)), 3)
> time <- rep(1:3, rep(10,3))
> subject <- rep(1:10, 3)
> p.pa <- c(92, 44, 49, 52, 41, 34, 32, 65, 47, 58, 94, 82, 48, 60, 47,
+ 46, 41, 73, 60, 69, 95, 53, 44, 66, 62, 46, 53, 73, 84, 79)
> P.PA <- data.frame(subject, group, time, p.pa)
> # added code:
> P.PA$group=factor(P.PA$group)
> P.PA$time=factor(P.PA$time)
> P.PA$subject=factor(P.PA$subject)
> summary(aov(p.pa~group*time+Error(subject/time),data=P.PA))

Error: subject
          Df Sum Sq Mean Sq F value Pr(>F)
group      1  158.7   158.7  0.1931  0.672
Residuals  8 6576.3   822.0

Error: subject:time
           Df  Sum Sq Mean Sq F value   Pr(>F)
time        2 1078.07  539.03  7.6233 0.004726 **
group:time  2  216.60  108.30  1.5316 0.246251
Residuals  16 1131.33   70.71
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

On 28-Feb-06, at 4:00 AM, r-help-request at stat.math.ethz.ch wrote:
> [original message trimmed]

--
Please avoid sending me Word or PowerPoint attachments.
See <http://www.gnu.org/philosophy/no-word-attachments.html>
-Dr. John R. Vokey
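[The table above can be reproduced from first principles. The following
sketch recomputes the aov() decomposition in plain Python, chosen only so
that the sums of squares are spelled out with no modelling library
involved:]

```python
# Repeated-measures ANOVA by hand for the posted data.
# Layout: index = 10*time + (subject-1); subjects 1-5 are group 1.
p_pa = [92, 44, 49, 52, 41, 34, 32, 65, 47, 58,
        94, 82, 48, 60, 47, 46, 41, 73, 60, 69,
        95, 53, 44, 66, 62, 46, 53, 73, 84, 79]
N = 30
G = sum(p_pa)                                   # grand total (1789)
C = G * G / N                                   # correction term

subj = [sum(p_pa[s + 10*t] for t in range(3)) for s in range(10)]
time = [sum(p_pa[10*t:10*(t+1)]) for t in range(3)]
grp = [sum(subj[:5]), sum(subj[5:])]
cell = [[sum(p_pa[10*t + 5*g : 10*t + 5*g + 5]) for t in range(3)]
        for g in range(2)]

ss_total = sum(x*x for x in p_pa) - C
ss_subj = sum(s*s for s in subj) / 3 - C        # all between-subject variation
ss_group = sum(g*g for g in grp) / 15 - C
ss_time = sum(t*t for t in time) / 10 - C
ss_cells = sum(c*c for row in cell for c in row) / 5 - C
ss_gxt = ss_cells - ss_group - ss_time
ss_err_between = ss_subj - ss_group             # "Error: subject" residuals
ss_err_within = ss_total - ss_subj - ss_time - ss_gxt  # "Error: subject:time"

f_group = (ss_group / 1) / (ss_err_between / 8)
f_time = (ss_time / 2) / (ss_err_within / 16)
f_gxt = (ss_gxt / 2) / (ss_err_within / 16)
print(round(f_group, 3), round(f_time, 3), round(f_gxt, 3))
```

The F ratios match the aov() table (and the SPSS univariate test):
F(1,8) = 0.193 for group, F(2,16) = 7.623 for time, and F(2,16) = 1.532
for group:time.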
There seem to be several issues here:

1) In the analysis that has a (1|subject) error term, there is a large
negative correlation between the parameter estimates for time and
time:group. Overall, the effect of time is significant, as can be seen
from:

> time.lme <- lme(p.pa ~ time * group, random = ~ 1 | subject, method="ML")
> notime.lme <- lme(p.pa ~ group, random = ~ 1 | subject, method="ML")
> anova(time.lme, notime.lme)
           Model df   AIC   BIC logLik   Test L.Ratio p-value
time.lme       1  6 245.0 253.4 -116.5
notime.lme     2  4 254.0 259.6 -123.0 1 vs 2   12.95  0.0015

What is uncertain is how this time effect should be divided up between a
main effect of slope and the interaction.

2) What the interaction plot makes clear, and what the change in
treatment (for group 1 only?) at time point 3 should have suggested, is
that the above analysis is not really appropriate. There are two
comparisons: (i) at time points 1 and 2; and (ii) at time point 3.

3) The above does not allow for a random group-to-group change in slope,
additional to the change that can be expected from random variation about
the line. Models 3 and 4 in your account do this, and also allow for
group:subject and group:time random effects that make matters more
complicated still. Fitting such a model has the consequence that
between-group differences in slope are entirely explained by this random
effect. Contrary to what the lmer() output might suggest, no degrees of
freedom are left with which to estimate the time:group interaction. (Or
you can estimate the interaction, and no degrees of freedom are left for
either the time or time:group random effect.) All you can talk about is
the average and the difference of the time effects for these two specific
groups.

Thus, following on from 3), I do not understand how lmer() is able to
calculate a t-statistic. There seems to me to be double dipping.
Certainly, I noted a convergence problem.
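[The likelihood-ratio line above can be checked by hand. With the printed
log-likelihoods of -116.5 and -123.0 the statistic is 2*(123.0 - 116.5) =
13.0, close to the 12.95 computed from the unrounded values, and for a
chi-square with 2 df the upper-tail probability is simply exp(-x/2). A
quick sketch in plain Python:]

```python
import math

# Check the anova(time.lme, notime.lme) line by hand.
logLik_full, logLik_reduced = -116.5, -123.0
lratio = 2 * (logLik_full - logLik_reduced)   # 13.0 from the rounded logLiks;
                                              # the printout's 12.95 uses the
                                              # unrounded values
# For a chi-square with df = 2, the survival function is exp(-x/2).
p_value = math.exp(-12.95 / 2)
print(round(p_value, 4))                      # matches the printed 0.0015
```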
John Maindonald            email: john.maindonald at anu.edu.au
phone: +61 2 (6125)3473    fax: +61 2 (6125)5549
Mathematical Sciences Institute, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 28 Feb 2006, at 10:00 PM, Christian Gold wrote:
> [original message trimmed]
There was a mistake in my earlier note, that I should correct:

"(Or you can estimate the interaction, and no degrees of freedom are
left for either the *time* or time:group random effect.) All you can
talk about is the average and the difference of the time effects for
these two specific groups."

There is no problem with the time random effect; that can be estimated
from the within-group variation in slopes, between subjects.

John Maindonald