Dear R users,

This is probably a very stupid question, but I am obviously not qualified enough to cope with it. I do not understand what the coefficients are that are output by running summary.lm on an aov object. I thought they should be the differential effects for the levels of the factor and the overall mean, but they are obviously not, as illustrated by the following simple example:

x <- c(1:15); y <- factor(c(rep("a", 5), rep("b", 10)))
> tapply(X=x, INDEX=y, FUN=mean)
   a    b
 3.0 10.5
> mean(x)
[1] 8
> a <- aov(x ~ y)
> summary.lm(a)$coef
            Estimate Std. Error  t value     Pr(>|t|)
(Intercept)      3.0   1.192928 2.514821 0.0258555905
yb               7.5   1.461032 5.133357 0.0001921826
> model.tables(a)
Tables of effects

 y
     a    b
    -5  2.5
rep  5 10.0

Besides, I fit a factor with two levels, "a" and "b", but there is only the "yb" coefficient for the "b" level, no "ya" coefficient for the "a" factor level. I read a lot of material on ANOVA with R, but I could not find out what these coefficients are. I would be grateful if someone could give me a clue. And what is the intercept term? I thought it should be the overall mean, but it is obviously not.

Regards and best wishes,
Martin Ivanov
On Feb 4, 2010, at 1:02 PM, Martin Ivanov wrote:

> Besides, I fit a factor with two levels, "a" and "b", but there is
> only the "yb" coefficient for the "b" level, no "ya" coefficient for
> the "a" factor level.

R reports treatment contrasts (at least by default), so the mean of the base level, "a" in your case, is reported as the "Intercept". The "yb" coefficient is the difference between the mean of level "b" and that baseline. So your estimated mean for an observation with y = "b" would be "Intercept" + yb = 3.0 + 7.5 = 10.5, which matches the tapply() output.

So all is right in stats-land.

> And what is the intercept term? I thought it should be the
> overall mean, but it is obviously not.

-- 
David.
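The reading of the coefficients given above can be checked directly on the data from the original post. The following is only a sketch: the names fit, cf and means are introduced here for illustration and do not appear in the thread.

```r
# Data from the original post
x <- 1:15
y <- factor(c(rep("a", 5), rep("b", 10)))

fit   <- aov(x ~ y)          # default treatment contrasts for an unordered factor
cf    <- coef(fit)           # named vector: "(Intercept)" and "yb"
means <- tapply(x, y, mean)  # group means: a = 3, b = 10.5

# The intercept is the mean of the baseline level "a"
cf[["(Intercept)"]]

# Intercept + yb recovers the mean of level "b"
cf[["(Intercept)"]] + cf[["yb"]]

# The effects printed by model.tables() are the group means
# expressed as deviations from the overall mean of x
means - mean(x)
```

The last line gives -5 for "a" and 2.5 for "b", the same values as the "Tables of effects" output, so the two summaries describe the same fit in different parameterisations.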
Hi Martin,

See ?contrasts and the associated help pages ?contr.sum, ?contr.treatment, etc. Also note that you can set contrasts "manually":

D <- data.frame(y = rnorm(20),
                group = factor(c(rep("A", 5), rep("B", 5), rep("C", 5), rep("D", 5))))
# One row per level (A to D), one column per dummy code; A is the reference level
group.dumcodes <- matrix(c(0, 0, 0,
                           1, 0, 0,
                           0, 1, 0,
                           0, 0, 1), ncol=3, byrow=TRUE)
colnames(group.dumcodes) <- c("dum1", "dum2", "dum3")
contrasts(D$group) <- group.dumcodes
summary(lm(y ~ group, data=D))

-Ista

On Thu, Feb 4, 2010 at 6:02 PM, Martin Ivanov <tramni at abv.bg> wrote:
> I do not understand what the coefficients are that are output by running
> summary.lm on an aov object. I thought they should be the differential
> effects for the levels of the factor and the overall mean, but they are
> obviously not.

-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
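For the original two-level example, the sum-to-zero contrasts from ?contr.sum give coefficients that are deviations from a central value rather than from a baseline level. A minimal sketch follows, with fit_sum and cf_sum as illustrative names. Note one caveat: because the groups have unequal sizes (5 and 10), the intercept under contr.sum is the unweighted mean of the two group means, not the overall mean of x.

```r
# Data from the original post
x <- 1:15
y <- factor(c(rep("a", 5), rep("b", 10)))

# Request sum-to-zero contrasts for the factor y in this fit only
fit_sum <- lm(x ~ y, contrasts = list(y = "contr.sum"))
cf_sum  <- coef(fit_sum)   # named vector: "(Intercept)" and "y1"

# Intercept: unweighted mean of the group means, (3 + 10.5) / 2
cf_sum[["(Intercept)"]]

# y1: deviation of level "a" from that value
cf_sum[["y1"]]

# The deviation of level "b" is not printed; it is minus the one for "a"
-cf_sum[["y1"]]
```

Here the intercept is 6.75 and y1 is -3.75, so the fitted group means are still 3 and 10.5. Only the parameterisation changes, which is why the overall mean of 8 does not appear as a coefficient under either contrast scheme for these unbalanced data.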