Hi Professor Howell,
I think the issue here is simply the assumption that the regression
coefficients will always equal the product of the group means and the
contrast codes. I tend to think of a regression coefficient as the
covariance of x and y divided by the variance of x, and that definition
agrees with the coefficients calculated by lm(). See below for a
long-winded example.
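To see that definition in action first, here is a toy illustration
(made-up data, nothing from your example assumed):

set.seed(42)                 # arbitrary seed; toy data only
x <- rnorm(20)
y <- 2 * x + rnorm(20)
coef(lm(y ~ x))["x"]         # slope as estimated by lm()
cov(x, y) / var(x)           # the same value, from the cov/var definition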
On Wed, Sep 29, 2010 at 3:42 PM, David Howell <David.Howell at uvm.edu> wrote:
> # I am trying to understand how R fits models for contrasts in a
> # simple one-way anova. This is an example; I am not stupid enough to want
> # to simultaneously apply all of these contrasts to real data. With a few
> # exceptions, the tests that I would compute by hand (or by other software)
> # will give the same t or F statistics. It is the contrast estimates that R
> # produces that I can't seem to understand.
> #
> # In searching for answers to this problem, I found a great PowerPoint
> # slide (I think by John Fox). The slide pointed to the coefficients, said
> # something like "these are coeff. that no one could love," and then
> # suggested looking at the means to understand where they came from. I have
> # stared and stared at his means and then my means, but can't find a
> # relationship.
>
> # The following code and output illustrates the problem.
>
> # Various examples of Anova using R
>
> dv <- c( 1.28,  1.35,  3.31,  3.06,  2.59,  3.25,  2.98,  1.53, -2.68,  2.64,  1.26,  1.06,
>         -1.18,  0.15,  1.36,  2.61,  0.66,  1.32,  0.73, -1.06,  0.24,  0.27,  0.72,  2.28,
>         -0.41, -1.25, -1.33, -0.47, -0.60, -1.72, -1.74, -0.77, -0.41, -1.20, -0.31, -0.74,
>         -0.45,  0.54, -0.98,  1.68,  2.25, -0.19, -0.90,  0.78,  0.05,  2.69,  0.15,  0.91,
>          2.01,  0.40,  2.34, -1.80,  5.00,  2.27,  6.47,  2.94,  0.47,  3.22,  0.01, -0.66)
>
> group <- factor(rep(1:5, each = 12))
>
>
> # Use treatment contrasts to compare each group to the first group.
> options(contrasts = c("contr.treatment","contr.poly"))
> # The default
> model2 <- lm(dv ~ group)
> summary(model2)
> # Summary table is the same--as it should be.
> # Intercept is Group 1 mean and other coeff. are deviations from that.
> # This is what I would expect.
> # summary(model1)
> #             Df Sum Sq Mean Sq F value    Pr(>F)
> # group        4  62.46 15.6151  6.9005 0.0001415 ***
> # Residuals   55 124.46  2.2629
> #
> # Coefficients:
> #             Estimate Std. Error t value Pr(>|t|)
> # (Intercept)  1.80250    0.43425   4.151 0.000116 ***
> # group2      -1.12750    0.61412  -1.836 0.071772 .
> # group3      -2.71500    0.61412  -4.421 4.67e-05 ***
> # group4      -1.25833    0.61412  -2.049 0.045245 *
> # group5       0.08667    0.61412   0.141 0.888288
>
>
> # Use sum contrasts to compare each group against grand mean.
> options(contrasts = c("contr.sum","contr.poly"))
> model3 <- lm(dv ~ group)
> summary(model3)
>
> # Again, this is as expected. Intercept is the grand mean and the others
> # are deviations from that.
> # Coefficients:
> #             Estimate Std. Error t value Pr(>|t|)
> # (Intercept)   0.7997     0.1942   4.118 0.000130 ***
> # group1        1.0028     0.3884   2.582 0.012519 *
> # group2       -0.1247     0.3884  -0.321 0.749449
> # group3       -1.7122     0.3884  -4.408 4.88e-05 ***
> # group4       -0.2555     0.3884  -0.658 0.513399
>
> #SO FAR, SO GOOD
>
> # IF I wanted polynomial contrasts BY HAND I would use
> #    a(i) = -2  -1  0  1  2    for the linear contrast (or some linear
> #           function of this)
> #    Effect = Sum(a(i)*M(i))             # where M(i) = group mean
> #    Effect(linear) = -2(1.805) - 1(0.675) + 0(-0.912) + 1(0.544) + 2(1.889)
> #                   = 0.043
> #    SS(linear) = n*Effect(linear)^2/Sum(a(i)^2) = 12*(0.043^2)/10 = 0.002
> #    F(linear)  = SS(linear)/MS(error) = 0.002/2.263 = 0.001
> #    t(linear)  = sqrt(0.001) = 0.031
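## Your hand calculation, transcribed directly into R (a sketch -- the
## object names are mine, and MS(error) is taken from your anova table
## above):
a <- c(-2, -1, 0, 1, 2)                  # integer linear weights
grp.means <- tapply(dv, group, mean)
effect.lin <- sum(a * grp.means)         # ~0.0425
ss.lin <- 12 * effect.lin^2 / sum(a^2)   # n = 12 per group
sqrt(ss.lin / 2.2629)                    # t for the linear trend, ~0.031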
>
> # To do this in R I would use
> order.group <- ordered(group)
> model4 <- lm(dv~order.group)
> summary(model4)
> # This gives:
> # Coefficients:
> #                Estimate Std. Error t value Pr(>|t|)
> # (Intercept)     0.79967    0.19420   4.118 0.000130 ***
> # order.group.L   0.01344    0.43425   0.031 0.975422
> # order.group.Q   2.13519    0.43425   4.917 8.32e-06 ***
> # order.group.C   0.11015    0.43425   0.254 0.800703
> # order.group^4  -0.79602    0.43425  -1.833 0.072202 .
>
> # The t value for linear is the same as I got (as are the others), but I
> # don't understand the estimates. The intercept is the grand mean, but I
> # don't see the relationship of the other estimates to that or to the ones
> # I get by hand. My estimates are the sum of (coeff times means), i.e.
> # 0 (intercept), .0425, 7.989, .3483, -6.66, and these are not a linear
> # (or other nice pretty) function of the estimates from R.
# OK, let's break it down
Means <- tapply(dv, order.group, mean)
coefs.by.hand.m4 <- c(mean(Means),
                      sum(Means * contrasts(order.group)[, 1]),
                      sum(Means * contrasts(order.group)[, 2]),
                      sum(Means * contrasts(order.group)[, 3]),
                      sum(Means * contrasts(order.group)[, 4]))
(coefs.check.m4 <- rbind(coef(model4), coefs.by.hand.m4))
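## Side note (easy to verify): the contr.poly codes are orthogonal and
## scaled to unit length, which is why the plain sum of code-times-mean
## matches here:
colSums(contrasts(order.group)^2)   # each column has sum of squares 1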
## OK, so these coefficients are in fact the sum of the products of the
## contrast codes and the group means. The only difference between this
## and what you got by hand calculation should be the actual contrast
## codes used. You don't say what they are, but I'll assume you did the
## standard thing, like this:
contrasts.by.hand <- matrix(c(-2, -1,  0,  1,  2,
                               2, -1, -2, -1,  2,
                              -1,  2,  0, -2,  1,
                               1, -4,  6, -4,  1),
                            byrow = FALSE, ncol = 4,
                            dimnames = list(1:5, c("L", "Q", "C", "x4")))
contrasts(order.group) <- contrasts.by.hand
model5 <- lm(dv~order.group)
summary(model5)
coefs.by.hand.m5 <- c(mean(Means),
                      sum(Means * contrasts(order.group)[, 1]),
                      sum(Means * contrasts(order.group)[, 2]),
                      sum(Means * contrasts(order.group)[, 3]),
                      sum(Means * contrasts(order.group)[, 4]))
(coefs.check.m5 <- rbind(coef(model5), coefs.by.hand.m5))
## Not the same, and not even a single scalar multiple of one another
## (each column of these integer codes has a different sum of squares).
## OK, let's see what's going on here. We can define the regression
## coefficient b_i as cov(x_i, y)/var(x_i). (This simple formula applies
## here because the contrast columns are mutually orthogonal and the
## design is balanced.) First check model 4:
mm4 <- cbind(dv, model.matrix(model4)[, -1])   # keep as a matrix so that
                                               # column names such as
                                               # "order.group^4" survive
covs.m4 <- apply(mm4[, -1], 2, function(x) cov(mm4[, "dv"], x))
vars.m4 <- apply(mm4[, -1], 2, var)
coefs.by.hand.m4.2 <- c(mean(dv), covs.m4 / vars.m4)
(coefs.check.m4.2 <- rbind(coef(model4), coefs.by.hand.m4.2))
## OK those are equal. Now try for model 5
mm5 <- cbind(dv, model.matrix(model5)[, -1])
colnames(mm5) <- colnames(mm4)   # align names with model 4's columns
covs.m5 <- apply(mm5[, -1], 2, function(x) cov(mm5[, "dv"], x))
vars.m5 <- apply(mm5[, -1], 2, var)
coefs.by.hand.m5.2 <- c(mean(dv), covs.m5 / vars.m5)
(coefs.check.m5.2 <- rbind(coef(model5), coefs.by.hand.m5.2))
## Those are equal. OK, so we see that the coefficients are consistently
## equal to cov(x, y)/var(x), but only equal to sum(contrasts(x)*Means)
## under certain circumstances. What are those circumstances? Let's try
## scaling our contrasts so that every column has mean 0 and variance 1:
contrasts.by.hand.scaled <- scale(contrasts.by.hand)
contrasts(order.group) <- contrasts.by.hand.scaled
model7 <- lm(dv~order.group)
summary(model7)
coefs.by.hand.m7 <- c(mean(Means),
                      sum(Means * contrasts(order.group)[, 1]),
                      sum(Means * contrasts(order.group)[, 2]),
                      sum(Means * contrasts(order.group)[, 3]),
                      sum(Means * contrasts(order.group)[, 4]))
(coefs.check.m7 <- rbind(coef(model7), coefs.by.hand.m7))
## Not the same, but now they are a simple linear function of one another:
## dividing by 4 does it. (The divisor 4 is each scaled column's sum of
## squares: a mean-zero column with variance 1 has sum of squares
## n - 1 = 4 when there are n = 5 levels.)
coefs.by.hand.m7 <- c(mean(Means),
                      sum(Means * contrasts(order.group)[, 1]) / 4,
                      sum(Means * contrasts(order.group)[, 2]) / 4,
                      sum(Means * contrasts(order.group)[, 3]) / 4,
                      sum(Means * contrasts(order.group)[, 4]) / 4)
(coefs.check.m7 <- rbind(coef(model7), coefs.by.hand.m7))
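## More generally, with a balanced design and mean-zero, orthogonal
## contrast columns, each coefficient reduces to
## sum(codes * Means) / sum(codes^2). A compact check on model7 (same
## objects as above, nothing new assumed):
coefs.general.m7 <- c(mean(Means),
                      colSums(contrasts(order.group) * Means) /
                        colSums(contrasts(order.group)^2))
rbind(coef(model7), coefs.general.m7)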
## Hopefully this clarifies what the coefficients calculated by the
## lm() function represent, i.e., cov(x, y)/var(x).
>
> # Just as a check, I forced the intercept through zero with deviation
> # scores or -1 in the model.
> # Now force the intercept to 0 by using deviation scores
>
> devdv <- dv-mean(dv)
> model5 <- lm(devdv~order.group)
> summary(model5)
> #Same as above except intercept = 0
>
> # Now do it by removing the intercept
> model6 <- lm(dv~order.group -1)
> summary(model6)
> # Weird because coeff = cell means
> #              Estimate Std. Error t value Pr(>|t|)
> # order.group1   1.8025     0.4342   4.151 0.000116 ***
> # order.group2   0.6750     0.4342   1.554 0.125824
> # order.group3  -0.9125     0.4342  -2.101 0.040208 *
> # order.group4   0.5442     0.4342   1.253 0.215464
> # order.group5   1.8892     0.4342   4.350 5.94e-05 ***
>
> # BUT note that the row labels would suggest that we were not solving for
> # polynomials, even though the contrast matrix is made up of polynomial
> # coefficients.
## I think you're correct: when the intercept is removed, lm() no longer
## uses the contrast coding at all. Instead it expands the factor into one
## 0/1 indicator column per level, which is why the coefficients are
## simply the cell means.
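## A quick way to see the indicator coding directly (a sketch):
head(model.matrix(~ order.group - 1), 13)   # one 0/1 column per group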
>
> # contrasts(order.group)
> #                 .L         .Q             .C         ^4
> # [1,] -6.324555e-01  0.5345225 -3.162278e-01  0.1195229
> # [2,] -3.162278e-01 -0.2672612  6.324555e-01 -0.4780914
> # [3,] -3.287978e-17 -0.5345225  1.595204e-16  0.7171372
> # [4,]  3.162278e-01 -0.2672612 -6.324555e-01 -0.4780914
> # [5,]  6.324555e-01  0.5345225  3.162278e-01  0.1195229
>
> # So, when I use polynomials (either through contr.poly or by supplying
> # a matrix of coefficients), what do the estimates represent?
I hope my examples have clarified this. They represent the increase in y
for a one-unit increase in x. What that one-unit increase means is, of
course, a function of the contrast codes used.
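For instance, the linear contr.poly code spans only about 1.26 units from
the lowest group to the highest, so its coefficient is scaled accordingly:

diff(range(contr.poly(5)[, 1]))   # ~1.265, the full span of the linear code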
>
> # My problem is not restricted to polynomials. If I try a set of
> # orthogonal linear contrasts on the group means I get
> # contrasts(group) <- cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0),
> #                           c(0,0,0,1,-1), c(1,1,4,-1,1))
# These are not orthogonal (assigning your codes first, to check):
contrasts(group) <- cbind(c(1, 1, 0, -1, -1), c(1, -1, 0, 0, 0),
                          c(0, 0, 0, 1, -1), c(1, 1, 4, -1, 1))
cor(contrasts(group))
## The last contrast is the problem; fixing it gives us
contrasts(group) <- cbind(c(1, 1, 0, -1, -1), c(1, -1, 0, 0, 0),
                          c(0, 0, 0, 1, -1), c(1, 1, -4, 1, 1))
model3 <- lm(dv ~ group)
summary(model3)
## These coefficients are functions of the specified contrasts:
coefs.by.hand.m3 <- c(mean(Means),
                      (mean(Means[1:2]) - mean(Means[4:5])) / 2,
                      (Means[1] - Means[2]) / 2,
                      (Means[4] - Means[5]) / 2,
                      (mean(Means[c(1, 2, 4, 5)]) - Means[3]) / 5)
## Note that each coefficient is sum(contrast * Means) divided by
## sum(contrast^2); the expressions above are just that ratio simplified.
(coef.check.m3 <- rbind(coef(model3), coefs.by.hand.m3))
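## The same rule written compactly, as one check over all four contrasts:
coefs.by.hand.m3.2 <- c(mean(Means),
                        colSums(contrasts(group) * Means) /
                          colSums(contrasts(group)^2))
rbind(coef(model3), coefs.by.hand.m3.2)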
Hope it helps,
Ista
> # model3 <- lm(dv ~ group)
> # summary(model3)
> # Coefficients:
> #             Estimate Std. Error t value Pr(>|t|)
> # (Intercept)   1.5335     0.2558   5.995 1.64e-07 ***
> # group1        0.3168     0.2279   1.390 0.170185
> # group2        0.5638     0.3071   1.836 0.071772 .
> # group3       -1.2840     0.3369  -3.811 0.000352 ***
> # group4       -0.6115     0.1387  -4.408 4.88e-05 ***
> # These are not even close to what I would expect. By hand I would
> # compute the contrasts as .0442, 1.1275, 1.3450, and 8.5608, with
> # different t values.
>
> # Any help would be appreciated.
>
> Thanks,
> Dave Howell
>
--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org