Dear R users,
This is probably a very stupid question, nevertheless I obviously am not
qualified enough to cope with it. I do not understand what the coefficients
are that are output by running summary.lm on an aov object. I thought they
should be the differential effects for the levels of the factor and the
overall mean, but they are obviously not, as illustrated by the following
simple example:
> x <- c(1:15); y <- factor(c(rep("a", 5), rep("b", 10)))
> tapply(X=x, INDEX=y, FUN=mean)
   a    b
 3.0 10.5
> mean(x)
[1] 8
> a <- aov(x ~ y)
> summary.lm(a)$coef
            Estimate Std. Error  t value     Pr(>|t|)
(Intercept)      3.0   1.192928 2.514821 0.0258555905
yb               7.5   1.461032 5.133357 0.0001921826
> model.tables(a)
Tables of effects

 y
     a    b
    -5  2.5
rep  5 10.0
Besides, I fit a factor with two levels, "a" and "b", but there is only the
"yb" coefficient for the "b" level, no "ya" coefficient for the "a" factor
level. I read a lot of materials on anova with R, but I could not find what
these coefficients are. I would be grateful if someone could give me a clue.
And what is the intercept term? I thought it should be the overall mean, but
it is obviously not.
Regards and best wishes,
Martin Ivanov
On Feb 4, 2010, at 1:02 PM, Martin Ivanov wrote:

> Besides, I fit a factor with two levels, "a" and "b", but there is
> only the "yb" coefficient for the "b" level, no "ya" coefficient for
> the "a" factor level.

R reports treatment contrasts (at least by default), so the mean of the base
level, "a" in your case, is reported as the "Intercept". The "yb" effect is
the difference between the estimated mean for level "b" and that baseline.
So your estimated mean for a subject with y = "b" would be
"Intercept" + beta(yb) = 3.0 + 7.5 = 10.5. So all is right in stats-land.

-- 
David.
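The treatment-contrast reading can be checked directly on Martin's data. The following is only a small sketch; the names fit, b and group_means are introduced here for illustration and do not appear in the original posts.

```r
x <- 1:15
y <- factor(c(rep("a", 5), rep("b", 10)))
fit <- aov(x ~ y)

contrasts(y)              # treatment coding: "a" is the all-zero baseline row
head(model.matrix(fit))   # intercept column plus a 0/1 indicator for "b"

b <- coef(fit)
group_means <- tapply(x, y, mean)

# The intercept is the mean of the baseline level "a"
b[["(Intercept)"]]                  # 3
# Intercept + yb recovers the mean of level "b"
b[["(Intercept)"]] + b[["yb"]]      # 10.5

# The effects shown by model.tables() are group means minus the grand mean
group_means - mean(x)               # a: -5, b: 2.5
```

So the "missing" ya coefficient is not missing: under this coding level "a" is absorbed into the intercept, and model.tables() simply reports the same fit on a different scale (deviations from the overall mean of 8).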
Hi Martin,
See ?contrasts and associated help pages ?contr.sum, ?contr.treatment
etc. Also note that you can set contrasts "manually":
D <- data.frame(y = rnorm(20),
                group = factor(c(rep("A", 5), rep("B", 5),
                                 rep("C", 5), rep("D", 5))))
# One row per level (A to D), one column per dummy variable; A is the baseline
group.dumcodes <- matrix(c(0, 0, 0,
                           1, 0, 0,
                           0, 1, 0,
                           0, 0, 1), ncol=3, byrow=TRUE)
colnames(group.dumcodes) <- c("dum1", "dum2", "dum3")
contrasts(D$group) <- group.dumcodes
summary(lm(y ~ group, data=D))
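As a rough illustration of what switching contrasts does to Martin's original example, here is a sketch using sum-to-zero coding from ?contr.sum. The names y_sum and fit_sum are made up for this example. Note that with unequal group sizes the intercept becomes the unweighted mean of the group means, which is still not the overall mean(x).

```r
x <- 1:15
y <- factor(c(rep("a", 5), rep("b", 10)))

y_sum <- y
contrasts(y_sum) <- contr.sum(2)   # sum-to-zero coding: a = +1, b = -1
fit_sum <- lm(x ~ y_sum)

coef(fit_sum)
# (Intercept) = (3.0 + 10.5) / 2 = 6.75, the unweighted mean of the group means
# y_sum1      = 3.0 - 6.75      = -3.75, the deviation of level "a" from it

mean(tapply(x, y, mean))   # 6.75, matches the intercept
mean(x)                    # 8, the overall mean, which differs because n differs
```

The fitted group means are the same under either coding; only the meaning of the individual coefficients changes.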
-Ista
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org