Hi,

Thanks for this information. Is there any way to force R to use Type-1 SS? I think most textbooks use this only.

Thanks and regards,

On Wed, 7 Aug 2024 at 17:00, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> On 2024-08-07 6:06 a.m., Brian Smith wrote:
> > Hi,
> >
> > I have performed ANOVA as below
> >
> > dat = data.frame(
> > 'A' = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873,
> > -0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704),
> > 'B' = c(2,1,2,2,1,2,2,2,2,2),
> > 'C' = c(0,1,1,1,1,1,1,0,1,1))
> >
> > summary(aov(A ~ B * C, dat))
> >
> > However now I also tried to calculate SSE for factor C
> >
> > Mean = sapply(split(dat, dat$C), function(x) mean(x$A))
> > N = sapply(split(dat, dat$C), function(x) dim(x)[1])
> >
> > N[1] * (Mean[1] - mean(dat$A))^2 + N[2] * (Mean[2] - mean(dat$A))^2
> > #1.691
> >
> > But in ANOVA table the sum-square for C is reported as 0.77.
> >
> > Could you please help how exactly this C = 0.77 is obtained from aov()
>
> Your design isn't balanced, so there are several ways to calculate the
> SS for C. What you have calculated looks like the "Type I SS" in SAS
> notation, if I remember correctly, assuming that C enters the model
> before B. That's not what R uses; I think it is Type II SS.
>
> For some details about this, see
> https://mcfromnz.wordpress.com/2011/03/02/anova-type-iiiiii-ss-explained/
Sure, summary(aov(A ~ C, dat)) will give it to you.

Duncan Murdoch

On 2024-08-07 8:27 a.m., Brian Smith wrote:
> Hi,
>
> Thanks for this information. Is there any way to force R to use Type-1
> SS? I think most textbooks use this only.
>
> Thanks and regards,
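Both numbers in this exchange (the manual 1.691 and the 0.77 in the table) can be reproduced in base R. The following is a minimal sketch, assuming the data frame from the original post; the names ss_c_first, rss_b, rss_bc, and ss_c_after_b are illustrative and do not come from the thread.

```r
# Data from the original post
dat <- data.frame(
  A = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873,
        -0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704),
  B = c(2, 1, 2, 2, 1, 2, 2, 2, 2, 2),
  C = c(0, 1, 1, 1, 1, 1, 1, 0, 1, 1)
)

# SS for C when C is the only (and therefore first) term in the model.
# summary.aov() returns a list; its first element is the ANOVA table.
ss_c_first <- summary(aov(A ~ C, dat))[[1]][["Sum Sq"]][1]

# SS for C when it enters after B: the drop in residual sum of squares
# from the model with B alone to the model with B and C.
rss_b  <- deviance(lm(A ~ B, dat))
rss_bc <- deviance(lm(A ~ B + C, dat))
ss_c_after_b <- rss_b - rss_bc

round(c(C_first = ss_c_first, C_after_B = ss_c_after_b), 3)
```

On this data the first value is about 1.691 and the second about 0.766, which suggests the 0.77 reported for C is the sum of squares for C after B has already been fitted, since B precedes C in A ~ B * C.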
Ebert,Timothy Aaron
2024-Aug-07 13:09 UTC
[R] Manually calculating values from aov() result
In proc glm, SAS will give both type I and type III sums of squares. You can get type II if you ask. Other procedures in SAS have different output as defaults.

Introductory textbooks may only cover type I SS. It is easy to calculate and gets the idea across. In application one of the problems is that a researcher could reanalyze the data to get an outcome of their choice because the order in which variables appear in the model influences their type I SS.

A + B + A*B + C + A*C + B*C + A*B*C
B + C + A + A*B + A*C + B*C + A*B*C

Each variable (or interaction) can have a different type I SS and that can change the perception of which elements of the model are significant or not. Type III SS will give the same SS values for either model.

Tim

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
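The order dependence of sequential (type I) SS can be checked on the data from the original post. This is a sketch under stated assumptions: it uses the additive model with the thread's predictors B and C and response A rather than the three-factor formulas above, and the object names seq_bc, seq_cb, ss_bc, and ss_cb are illustrative.

```r
# Data from the original post
dat <- data.frame(
  A = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873,
        -0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704),
  B = c(2, 1, 2, 2, 1, 2, 2, 2, 2, 2),
  C = c(0, 1, 1, 1, 1, 1, 1, 0, 1, 1)
)

# Same model, two term orders; anova() reports sequential SS
seq_bc <- anova(lm(A ~ B + C, dat))  # B first, then C
seq_cb <- anova(lm(A ~ C + B, dat))  # C first, then B

# Collect the sums of squares by term name so the rows line up
ss_bc <- setNames(seq_bc[["Sum Sq"]], rownames(seq_bc))
ss_cb <- setNames(seq_cb[["Sum Sq"]], rownames(seq_cb))

terms <- c("B", "C", "Residuals")
round(rbind(B_then_C = ss_bc[terms], C_then_B = ss_cb[terms]), 3)
```

The residual SS and the total explained SS are the same in both rows; only the split between B and C changes with the order, because the design is unbalanced.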
Dear Brian,

As Duncan mentioned, the terms type-I, II, and III sums of squares originated in SAS. The type-II and III SSs computed by the Anova() function in the car package take a different computational approach than in SAS, but in almost all cases produce the same results. (I slightly regret using the "type-*" terminology for car::Anova() because of the lack of exact correspondence to SAS.) The standard R anova() function computes type-I (sequential) SSs.

The focus, however, shouldn't be on the SSs, or how they're computed, but on the hypotheses that are tested. Briefly, the hypotheses for type-I tests assume that all terms later in the sequence are 0 in the population; type-II tests assume that interactions to which main effects are marginal (and higher-order interactions to which lower-order interactions are marginal) are 0. Type-III tests don't, e.g., assume that interactions to which a main effect is marginal are 0 in testing the main effect, which represents an average over levels of the factor(s) with which the factor in the main effect interacts. The description of the hypotheses for type-III tests is even more complex if there are covariates. In my opinion, researchers are usually interested in the hypotheses for type-II tests.

These matters are described in detail, for example, in my applied regression text <https://www.john-fox.ca/AppliedRegression/index.html>.

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
--

On 2024-08-07 8:27 a.m., Brian Smith wrote:
> Hi,
>
> Thanks for this information. Is there any way to force R to use Type-1
> SS? I think most textbooks use this only.
>
> Thanks and regards,
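For an additive model, each term's sum of squares adjusted for the other term can be obtained in base R with drop1(). The sketch below assumes the data from the original post and the model A ~ B + C; it is an illustration of the idea of adjusting each main effect for the other, not the computation car::Anova() performs internally, and the names fit and adj are illustrative.

```r
# Data from the original post
dat <- data.frame(
  A = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873,
        -0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704),
  B = c(2, 1, 2, 2, 1, 2, 2, 2, 2, 2),
  C = c(0, 1, 1, 1, 1, 1, 1, 0, 1, 1)
)

fit <- lm(A ~ B + C, dat)

# drop1() removes each droppable term in turn and reports the resulting
# increase in residual SS, i.e. SS(B | C) and SS(C | B), with F tests.
adj <- drop1(fit, test = "F")
adj

# Sequential (type-I) table for comparison: B first, then C after B
anova(fit)

# With the car package installed, car::Anova(fit) is the route mentioned
# in the message above; it is not run here.
```

Unlike the sequential table, the drop1() values do not depend on the order in which B and C are written in the formula. On this data the adjusted SS for C is about 0.766, the same as its sequential value when it enters last.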