Tal Galili
2009-Feb-18 21:04 UTC
[R] [package-car:Anova] extracting residuals from Anova for Type II/III Repeated Measures ?
Hello dear R members. I have been learning the Anova syntax in order to perform an SS type III Anova with repeated measures designs (thank you Prof. John Fox!) And another question came up: where/what are the (between/within) residuals for my model? ############ Play code: phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)), levels=c("pretest", "posttest", "followup")) hour <- ordered(rep(1:5, 3)) idata <- data.frame(phase, hour) idata mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5, fup.1, fup.2, fup.3, fup.4, fup.5) ~ treatment*gender, data=OBrienKaiser) av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour) summary(av.ok, multivariate=FALSE) ## Univariate Type II Repeated-Measures ANOVA Assuming Sphericity ## ## SS num Df Error SS den Df F Pr(>F) ## treatment 211.286 2 228.056 10 4.6323 0.037687 ## gender 58.286 1 228.056 10 2.5558 0.140974 ## treatment:gender 130.241 2 228.056 10 2.8555 0.104469 ## phase 167.500 2 80.278 20 20.8651 1.274e-05 ## treatment:phase 78.668 4 80.278 20 4.8997 0.006426 ## gender:phase 1.668 2 80.278 20 0.2078 0.814130 ## treatment:gender:phase 10.221 4 80.278 20 0.6366 0.642369 ## hour 106.292 4 62.500 40 17.0067 3.191e-08 ## treatment:hour 1.161 8 62.500 40 0.0929 0.999257 ## gender:hour 2.559 4 62.500 40 0.4094 0.800772 ## treatment:gender:hour 7.755 8 62.500 40 0.6204 0.755484 ## phase:hour 11.083 8 96.167 80 1.1525 0.338317 ## treatment:phase:hour 6.262 16 96.167 80 0.3256 0.992814 ## gender:phase:hour 6.636 8 96.167 80 0.6900 0.699124 ## treatment:gender:phase:hour 14.155 16 96.167 80 0.7359 0.749562 -- ---------------------------------------------- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: www.talgalili.com www.biostatistics.co.il [[alternative HTML version deleted]]
John Fox
2009-Feb-18 22:24 UTC
[R] [package-car:Anova] extracting residuals from Anova for Type II/III Repeated Measures ?
Dear Tal, I suppose that the "between" residuals would be obtained, for your example, by residuals(mod.ok). I'm not sure what the "within" residuals are. You could apply the transformation for each within-subject effect to the matrix of residuals to get residuals for that effect -- is that what you had in mind? A list of transformations is in the element $P of the Anova.mlm object. Regards, John ------------------------------ John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox> -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]On> Behalf Of Tal Galili > Sent: February-18-09 4:04 PM > To: r-help at r-project.org > Subject: [R] [package-car:Anova] extracting residuals from Anova for Type > II/III Repeated Measures ? > > Hello dear R members. > I have been learning the Anova syntax in order to perform an SS type III > Anova with repeated measures designs (thank you Prof. John Fox!) > And another question came up: where/what are the (between/within)residuals> for my model? > > > > ############ Play code: > > > phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)), > levels=c("pretest", "posttest", "followup")) > hour <- ordered(rep(1:5, 3)) > idata <- data.frame(phase, hour) > idata > > mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, > post.1, post.2, post.3, post.4, post.5, > fup.1, fup.2, fup.3, fup.4, fup.5) ~ > treatment*gender, > data=OBrienKaiser) > av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour) > > > summary(av.ok, multivariate=FALSE) > > ## Univariate Type II Repeated-Measures ANOVA Assuming Sphericity > ## > ## SS num Df Error SS den Df F > Pr(>F) > ## treatment 211.286 2 228.056 10 4.6323 > 0.037687 > ## gender 58.286 1 228.056 10 2.5558 > 0.140974 > ## treatment:gender 130.241 2 228.056 10 2.8555 > 0.104469 > ## phase 167.500 2 80.278 20 20.8651 > 1.274e-05 > ## treatment:phase 78.668 4 80.278 20 4.8997 > 0.006426 > ## gender:phase 1.668 2 80.278 20 0.2078 > 0.814130 > ## treatment:gender:phase 10.221 4 80.278 20 0.6366 > 0.642369 > ## hour 106.292 4 62.500 40 17.0067 > 3.191e-08 > ## treatment:hour 1.161 8 62.500 40 0.0929 > 0.999257 > ## gender:hour 2.559 4 62.500 40 0.4094 > 0.800772 > ## treatment:gender:hour 7.755 8 62.500 40 0.6204 > 0.755484 > ## phase:hour 11.083 8 96.167 80 1.1525 > 0.338317 > ## treatment:phase:hour 6.262 16 96.167 80 0.3256 > 0.992814 > ## gender:phase:hour 6.636 8 96.167 80 0.6900 > 0.699124 > ## treatment:gender:phase:hour 14.155 16 96.167 80 0.7359 > 0.749562 > > > > > > > > > > -- > ---------------------------------------------- > > > My contact information: > Tal Galili > Phone number: 972-50-3373767 > FaceBook: Tal Galili > My Blogs: > www.talgalili.com > www.biostatistics.co.il > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code.
Tal Galili
2009-Feb-21 09:27 UTC
[R] [package-car:Anova] extracting residuals from Anova for Type II/III Repeated Measures ?
Dear John. Thank you so much for the patience and time, you have just answered my question! I was so used to the "mean ss" term from the aov table, that I didn't stop to look and realize that the values you added in the "Error SS" and "den Df" columns in the summary tables are actually the residuals sum of squares Errors and there respective degrees of freedom (and to that confusion also contributed, I guess, the fact that until your last e-mail I couldn't reproduce the aov results with Anova for me to experiment and compare). I hope that the clarification evoked by this discussion might have some use for future updates to the Anova (or aov) help files. And even if not - I personally feel to have gained a lot from this weeks correspondents. So again - thanks a lot John, for your patience and help. (p.s: I am adding " r-help at r-project.org" cc'd, so that other people encountering my question could benefit from your help as well) With regards, Tal On Sat, Feb 21, 2009 at 4:25 AM, John Fox <jfox at mcmaster.ca> wrote:> Dear Tal, > >> -----Original Message----- >> From: Tal Galili [mailto:tal.galili at gmail.com] >> Sent: February-20-09 6:15 PM >> To: John Fox >> Subject: Re: [R] [package-car:Anova] extracting residuals from Anova for > Type >> II/III Repeated Measures ? >> >> Hello John, thanks for your reply and correction. >> >> I apologies for my crude mistake in applying the aov (now I have learned >> better). I hope to get a hold of "Statistical Models in S", but I don't >> predict it could easily happen in the near future. >> >> Also, I would be very happy if you could supply me with some more > directions >> as to how to obtain the "within" residuals (such as reported from the aov >> summary), since I am not sure how to proceed with that. > > I'm not sure what it is that you're looking for. First, I'm pretty sure that > you mean a residual sum of squares not residuals. But in any event, the sums > of squares reported by aov(), when you formulate the model correctly, are > exactly the same as those reported by Anova(). The "Error SS" given by > Anova() correspond to the "Residuals" sums of square given by aov(); these > are for the appropriate error term for testing each term. > > John > >> >> With regards, >> Tal >> >> >> >> >> >> >> >> >> >> On Sat, Feb 21, 2009 at 12:41 AM, John Fox <jfox at mcmaster.ca> wrote: >> >> >> Dear Tal, >> >> I didn't have time to look at all this yesterday. >> >> Since aov() doesn't do what I typically want to do, I guess I've not >> paid >> much attention to it recently. I can see, however, that you appear > to >> have >> specified the error strata incorrectly, since (given your desire to >> compare >> to Anova) the within-block factors are nested within blocks. > Something >> like >> >> > npk.aovE <- aov(value ~ N*P*K + Error(block/N*P*K), npk.long) >> >> should be closer to what you want, and in fact produces all of the > sums >> of >> squares, but doesn't put all of the error terms together with the >> corresponding terms; thus, you get, e.g., the test for N but not for > P >> and >> K, even though the SSs and error SSs for the latter are in the > table. >> By >> permuting N, P, and K, you can get the other F tests. I suspect that >> this >> has to do with the sequential approach taken by aov() but someone > else >> more >> familiar with how it works will have to fill in the details. I > wonder, >> though, whether you've read the sections in Statistical Models in S > and >> MASS >> referenced in the help file for aov. >> >> > summary(npk.aovE) >> >> >> Error: block >> Df Sum Sq Mean Sq F value Pr(>F) >> Residuals 5 153.147 30.629 >> >> >> Error: P >> >> Df Sum Sq Mean Sq >> >> P 1 16.803 16.803 >> >> Error: K >> >> Df Sum Sq Mean Sq >> >> K 1 190.40 190.40 >> >> >> Error: block:N >> >> Df Sum Sq Mean Sq F value Pr(>F) >> >> N 1 378.56 378.56 38.614 0.001577 ** >> Residuals 5 49.02 9.80 >> >> --- >> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 >> >> >> Error: block:P >> >> Df Sum Sq Mean Sq F value Pr(>F) >> >> Residuals 5 19.1317 3.8263 >> >> Error: block:K >> >> Df Sum Sq Mean Sq F value Pr(>F) >> >> Residuals 5 24.4933 4.8987 >> >> Error: P:K >> >> Df Sum Sq Mean Sq >> >> P:K 1 0.96333 0.96333 >> >> Error: block:N:P >> >> Df Sum Sq Mean Sq F value Pr(>F) >> >> N:P 1 42.563 42.563 8.6888 0.03197 * >> Residuals 5 24.493 4.899 >> >> --- >> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 >> >> >> Error: block:N:K >> >> Df Sum Sq Mean Sq F value Pr(>F) >> >> N:K 1 66.270 66.270 17.320 0.00881 ** >> Residuals 5 19.132 3.826 >> >> --- >> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 >> >> >> Error: block:P:K >> >> Df Sum Sq Mean Sq F value Pr(>F) >> >> Residuals 5 49.018 9.804 >> >> Error: block:N:P:K >> >> Df Sum Sq Mean Sq F value Pr(>F) >> >> N:P:K 1 74.003 74.003 2.4161 0.1808 >> >> Residuals 5 153.147 30.629 >> >> >> >> John >> >> >> > -----Original Message----- >> > From: Tal Galili [mailto:tal.galili at gmail.com] >> > Sent: February-19-09 2:23 AM >> > To: John Fox >> > Cc: r-help at r-project.org >> > Subject: Re: [R] [package-car:Anova] extracting residuals from > Anova >> for >> Type >> > II/III Repeated Measures ? >> > >> > Hello John, thank you for the fast reply. >> > >> > Thanks to your answer I was able to reproduce the "between" > residuals >> by >> > simply writing: >> > sum(residuals(mod.ok)^2) >> > >> > But I must admit that how to obtain the "within" was beyond me, so >> some >> > advice here would be of great help. >> > (Also, it might be worth adding these residuals to the standard > Anova >> output, >> > since I believe some researchers using the package could find it >> useful >> when >> > reporting there results) >> > >> > >> > By working with the Anova package, I stumbled into a conflict > between >> the >> > Anova and the aov outputs. >> > The toy data used was reassembled from the (?aov) example (and > filled >> in >> for >> > balance, and contr.sum contrasts where assigned). >> > When I compared the Anova (on SS type III) and the aov F (And >> p.value) >> > results - I found significant differences between them (though the > SS >> where >> > identical). >> > If I understood correctly, for balanced designs both test should > have >> yielded >> > the same results. any ideas on what the source of the difference >> might be? >> > >> > Here is the code: >> > >> > >> > >> > >> > ## get the data: >> > utils::data(npk, package="MASS") >> > library(reshape) # for data manipulation >> > colnames(npk)[5] <- "value" # so that the data set will read > proparly >> in >> > "reshape" >> > >> > npk.wide = cast(block ~ N +P+ K, data = npk) >> > >> > # now we fill in the NA's, to get a balanced design >> > is.na(npk.wide) #it has na's. we'll fill them with that factor's >> combination >> > mean, so to keep some significance around >> > combination.mean <- apply(npk.wide, 2, function(x){mean(x,na.rm >> T)}) >> > for(i in c(2:9)) >> > { npk.wide[is.na(npk.wide)[,i],i] = combination.mean[i-1] } >> > is.na(npk.wide) #no na's anymore >> > >> > # recreating a balanced npk for the aov >> > npk.long <- melt(npk.wide) >> > >> > head(npk.long,4) >> > # block value N P K >> > #X0_0_0 1 46.80000 0 0 0 >> > #X0_0_0.1 2 51.43333 0 0 0 >> > #X0_0_0.2 3 51.43333 0 0 0 >> > #X0_0_0.3 4 51.43333 0 0 0 >> > head(npk.wide,4) >> > # block 0_0_0 0_0_1 0_1_0 0_1_1 1_0_0 1_0_1 1_1_0 >> 1_1_1 >> > #1 1 46.80000 52.0 54.33333 49.5 63.76667 57.00000 62.80000 >> 54.36667 >> > #2 2 51.43333 55.5 56.00000 50.5 59.80000 54.66667 57.93333 >> 58.50000 >> > #3 3 51.43333 55.0 62.80000 50.5 69.50000 54.66667 57.93333 >> 55.80000 >> > #4 4 51.43333 45.5 44.20000 50.5 62.00000 54.66667 57.93333 >> 48.80000 >> > >> > >> > # npk.wide$block # it is a factor - good. >> > >> > >> > # change into contr.sum >> > for(i in c(1,3:5)) npk.long[,i] <- C(npk.long[,i] , "contr.sum") >> > npk.wide[,1] <- C(npk.wide[,1] , "contr.sum") >> > >> > >> > >> > ## as a test, not particularly sensible statistically >> > npk.aovE <- aov(value~ N*P*K + Error(block), npk.long) >> > npk.aovE >> > summary(npk.aovE) >> > >> > ## output >> > Error: block >> > Df Sum Sq Mean Sq F value Pr(>F) >> > Residuals 5 153.147 30.629 >> > >> > Error: Within >> > Df Sum Sq Mean Sq F value Pr(>F) >> > N 1 378.56 378.56 39.1502 3.546e-07 *** >> > P 1 16.80 16.80 1.7378 0.19599 >> > K 1 190.40 190.40 19.6911 8.657e-05 *** >> > N:P 1 42.56 42.56 4.4018 0.04319 * >> > N:K 1 66.27 66.27 6.8535 0.01298 * >> > P:K 1 0.96 0.96 0.0996 0.75415 >> > N:P:K 1 74.00 74.00 7.6533 0.00899 ** >> > Residuals 35 338.43 9.67 >> > --- >> > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 >> > ## END output >> > >> > # the residuals SS >> > # 338.43 + 153.147 >> > # =>> > sum(residuals(lm(value~ N*P*K , npk.long))^2) >> > # 491.58 >> > >> > >> > >> > idata <- cast(N+P+K ~ . , data = npk.long)[,c(1:3)] >> > idata >> > for(i in 1:3) idata[,i] <- C(idata[,i] , "contr.sum") >> > >> > mod.ok <- lm(as.matrix(npk.wide[,-1]) ~ 1, data = npk.wide) >> > av.ok <- Anova(mod.ok, idata=idata, idesign=~ N*P*K, type = "III") >> > summary(av.ok, multivariate=FALSE) >> > >> > >> > ## output - with different F and p.value results >> > Univariate Type III Repeated-Measures ANOVA Assuming Sphericity >> > >> > SS num Df Error SS den Df F Pr(>F) >> > (Intercept) 144541 1 153 5 4719.0302 1.238e-08 *** >> > N 379 1 49 5 38.6145 0.001577 ** >> > P 17 1 19 5 4.3915 0.090257 . >> > K 190 1 24 5 38.8684 0.001554 ** >> > N:P 43 1 24 5 8.6888 0.031971 * >> > N:K 66 1 19 5 17.3195 0.008810 ** >> > P:K 1 1 49 5 0.0983 0.766580 >> > N:P:K 74 1 153 5 2.4161 0.180810 >> > --- >> > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 >> > ## End output >> > >> > >> > # the SS residuals are the same thou >> > sum(residuals(mod.ok)^2) >> > # 491.58 >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > On Thu, Feb 19, 2009 at 12:24 AM, John Fox <jfox at mcmaster.ca> > wrote: >> > >> > >> > Dear Tal, >> > >> > I suppose that the "between" residuals would be obtained, > for >> your >> > example, >> > by residuals(mod.ok). I'm not sure what the "within" > residuals >> are. >> You >> > could apply the transformation for each within-subject > effect >> to the >> > matrix >> > of residuals to get residuals for that effect -- is that > what >> you >> had >> > in >> > mind? A list of transformations is in the element $P of the >> Anova.mlm >> > object. >> > >> > Regards, >> > John >> > >> > ------------------------------ >> > John Fox, Professor >> > Department of Sociology >> > McMaster University >> > Hamilton, Ontario, Canada >> > web: socserv.mcmaster.ca/jfox >> > >> > >> > >> > > -----Original Message----- >> > > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r- >> > project.org] >> > On >> > > Behalf Of Tal Galili >> > > Sent: February-18-09 4:04 PM >> > > To: r-help at r-project.org >> > > Subject: [R] [package-car:Anova] extracting residuals from >> Anova >> for >> > Type >> > > II/III Repeated Measures ? >> > > >> > > Hello dear R members. >> > > I have been learning the Anova syntax in order to perform > an >> SS >> type >> > III >> > > Anova with repeated measures designs (thank you Prof. John >> Fox!) >> > > And another question came up: where/what are the >> (between/within) >> > residuals >> > > for my model? >> > > >> > > >> > > >> > > ############ Play code: >> > > >> > > >> > > phase <- factor(rep(c("pretest", "posttest", "followup"), >> c(5, 5, >> > 5)), >> > > levels=c("pretest", "posttest", "followup")) >> > > hour <- ordered(rep(1:5, 3)) >> > > idata <- data.frame(phase, hour) >> > > idata >> > > >> > > mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, >> > > post.1, post.2, post.3, post.4, > post.5, >> > > fup.1, fup.2, fup.3, fup.4, fup.5) ~ >> > > treatment*gender, >> > > data=OBrienKaiser) >> > > av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour) >> > > >> > > >> > > summary(av.ok, multivariate=FALSE) >> > > >> > > ## Univariate Type II Repeated-Measures ANOVA Assuming >> Sphericity >> > > ## >> > > ## SS num Df Error SS den > Df >> F >> > > Pr(>F) >> > > ## treatment 211.286 2 228.056 > 10 >> 4.6323 >> > > 0.037687 >> > > ## gender 58.286 1 228.056 > 10 >> 2.5558 >> > > 0.140974 >> > > ## treatment:gender 130.241 2 228.056 > 10 >> 2.8555 >> > > 0.104469 >> > > ## phase 167.500 2 80.278 > 20 >> 20.8651 >> > > 1.274e-05 >> > > ## treatment:phase 78.668 4 80.278 > 20 >> 4.8997 >> > > 0.006426 >> > > ## gender:phase 1.668 2 80.278 > 20 >> 0.2078 >> > > 0.814130 >> > > ## treatment:gender:phase 10.221 4 80.278 > 20 >> 0.6366 >> > > 0.642369 >> > > ## hour 106.292 4 62.500 > 40 >> 17.0067 >> > > 3.191e-08 >> > > ## treatment:hour 1.161 8 62.500 > 40 >> 0.0929 >> > > 0.999257 >> > > ## gender:hour 2.559 4 62.500 > 40 >> 0.4094 >> > > 0.800772 >> > > ## treatment:gender:hour 7.755 8 62.500 > 40 >> 0.6204 >> > > 0.755484 >> > > ## phase:hour 11.083 8 96.167 > 80 >> 1.1525 >> > > 0.338317 >> > > ## treatment:phase:hour 6.262 16 96.167 > 80 >> 0.3256 >> > > 0.992814 >> > > ## gender:phase:hour 6.636 8 96.167 > 80 >> 0.6900 >> > > 0.699124 >> > > ## treatment:gender:phase:hour 14.155 16 96.167 > 80 >> 0.7359 >> > > 0.749562 >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > -- >> > > ---------------------------------------------- >> > > >> > > >> > > My contact information: >> > > Tal Galili >> > > Phone number: 972-50-3373767 >> > > FaceBook: Tal Galili >> > > My Blogs: >> > > www.talgalili.com >> > > www.biostatistics.co.il >> > > >> > >> > > [[alternative HTML version deleted]] >> > > >> > > ______________________________________________ >> > > R-help at r-project.org mailing list >> > > https://stat.ethz.ch/mailman/listinfo/r-help >> > > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > > and provide commented, minimal, self-contained, > reproducible >> code. >> > >> > >> > >> > >> > >> > >> > >> > -- >> > ---------------------------------------------- >> > >> > >> > My contact information: >> > Tal Galili >> > Phone number: 972-50-3373767 >> > FaceBook: Tal Galili >> > My Blogs: >> > www.talgalili.com >> > www.biostatistics.co.il >> > >> > >> >> >> >> >> >> >> >> >> -- >> ---------------------------------------------- >> >> >> My contact information: >> Tal Galili >> Phone number: 972-50-3373767 >> FaceBook: Tal Galili >> My Blogs: >> www.talgalili.com >> www.biostatistics.co.il >> >> > > > >-- ---------------------------------------------- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: www.talgalili.com www.biostatistics.co.il
Apparently Analagous Threads
- how to access the elements of a univariate results table with Anova (library car)
- car::Anova - Can it be used for ANCOVA with repeated-measures factors.
- Wrapper of linearHypothesis (car) for post-hoc of repeated measures ANOVA
- new to repeated measures anova in R
- model simplification in repeated anova