The following is a data frame

> jjd <- structure(list(
+     Observations = c(6.8, 6.6, 5.3, 6.1, 7.5, 7.4,
+                      7.2, 6.5, 7.8, 9.1, 8.8, 9.1),
+     LevelA = structure(c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
+                        .Label = c("A1", "A2", "A3"), class = "factor"),
+     LevelB = structure(c(1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2),
+                        .Label = c("B1", "B2"), class = "factor")),
+     .Names = c("Observations", "LevelA", "LevelB"),
+     row.names = c("1", "2", "3", "4", "5", "6",
+                   "7", "8", "9", "10", "11", "12"),
+     class = "data.frame")
representing data from

@BOOK{Dobson02,
  author    = {Annette J. Dobson},
  year      = 2002,
  title     = {An Introduction to Generalized Linear Models},
  edition   = {2nd},
  publisher = {Chapman \& Hall/CRC},
  address   = {Boca Raton, Florida}
}
page 101. To reproduce the estimates c(6.7, 0.75, 1.75, -1.0, 0.4, 1.5)
given on page 103 for a two-factor ANOVA, I enter
> jja1 <- lm(Observations~LevelA*LevelB,data=jjd)
> summary(jja1)
and get
Call:
lm(formula = Observations ~ LevelA * LevelB, data = jjd)
Residuals:
       Min         1Q     Median         3Q        Max
-6.500e-01 -2.000e-01 -3.469e-17  2.000e-01  6.500e-01
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.7000 0.3512 19.078 1.34e-06 ***
LevelAA2 0.7500 0.4967 1.510 0.1818
LevelAA3 1.7500 0.4967 3.524 0.0125 *
LevelBB2 -1.0000 0.4967 -2.013 0.0907 .
LevelAA2:LevelBB2 0.4000 0.7024 0.569 0.5897
LevelAA3:LevelBB2 1.5000 0.7024 2.136 0.0766 .
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error: 0.4967 on 6 degrees of freedom
Multiple R-Squared: 0.9065, Adjusted R-squared: 0.8286
F-statistic: 11.64 on 5 and 6 DF, p-value: 0.00481
This is fine. But why do I get these estimates?
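Numerically I can trace them; the coefficients are differences of the
treatment-cell means (my own check):

> with(jjd, tapply(Observations, list(LevelA, LevelB), mean))
     B1   B2
A1 6.70 5.70
A2 7.45 6.85
A3 8.45 8.95

so 6.7 is the A1:B1 mean, 0.75 = 7.45 - 6.7, -1.0 = 5.70 - 6.7, and so
on. What I am after is the parameterization behind this.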
Entering
> model.matrix(jja1)
delivers
   (Intercept) LevelAA2 LevelAA3 LevelBB2 LevelAA2:LevelBB2 LevelAA3:LevelBB2
1            1        0        0        0                 0                 0
2            1        0        0        0                 0                 0
3            1        0        0        1                 0                 0
4            1        0        0        1                 0                 0
5            1        1        0        0                 0                 0
6            1        1        0        0                 0                 0
7            1        1        0        1                 1                 0
8            1        1        0        1                 1                 0
9            1        0        1        0                 0                 0
10           1        0        1        0                 0                 0
11           1        0        1        1                 0                 1
12           1        0        1        1                 0                 1
attr(,"assign")
[1] 0 1 1 2 3 3
attr(,"contrasts")
attr(,"contrasts")$LevelA
[1] "contr.treatment"
attr(,"contrasts")$LevelB
[1] "contr.treatment"
which shows that internally lm() seems to use corner point constraints
of the form
\[\alpha_1=\beta_1=(\alpha\beta)_{11}=(\alpha\beta)_{12}=(\alpha\beta)_{21}=(\alpha\beta)_{31}=0\]
in the model $E[Y_{jkl}]=\mu+\alpha_j+\beta_k+(\alpha\beta)_{jk}$,
$j=1,2,3$, $k=1,2$, $l=1,2$ (Dobson, page 102).
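Indeed, the session defaults point the same way (output from my
session; I assume this may be version dependent):

> getOption("contrasts")
[1] "contr.treatment" "contr.poly"

i.e. unordered factors are coded with contr.treatment unless told
otherwise.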
My question is: how can I incorporate restrictions like
$\alpha_1+\alpha_2+\alpha_3=0$, $\beta_1+\beta_2=0$,
$(\alpha\beta)_{21}+(\alpha\beta)_{22}=0$,
$(\alpha\beta)_{31}+(\alpha\beta)_{32}=0$ and
$(\alpha\beta)_{11}+(\alpha\beta)_{21}+(\alpha\beta)_{31}=0$ from the
outset? Or, put another way: why does lm() use the corner point
constraints by default? Where can I find documentation for this
behavior?
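For what it is worth, contr.sum looks like it encodes exactly this
kind of sum-to-zero coding,

> contr.sum(3)
  [,1] [,2]
1    1    0
2    0    1
3   -1   -1

and lm() accepts a contrasts argument, so perhaps

> jja2 <- lm(Observations ~ LevelA*LevelB, data = jjd,
+            contrasts = list(LevelA = "contr.sum", LevelB = "contr.sum"))

is the direct way; I have not checked that the resulting coefficients
match the sum-to-zero estimates in Dobson, though.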
I know that I can use something like lm(y ~ X), where
y <- jjd$Observations and X is an appropriate design matrix. But I
wonder if there is a more direct way.
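To make precise what I mean by an appropriate design matrix, here is
the hand-built version I have in mind (a sketch; the column names a1,
a2, b1, ab11, ab21 are my own):

> y <- jjd$Observations
> A <- contr.sum(3)[jjd$LevelA, ]       # sum-to-zero columns for LevelA
> B <- contr.sum(2)[jjd$LevelB, ]       # sum-to-zero column for LevelB
> X <- cbind(a1 = A[, 1], a2 = A[, 2],  # alpha_1, alpha_2; alpha_3 = -a1-a2
+            b1 = B,                    # beta_1; beta_2 = -b1
+            ab11 = A[, 1] * B,         # (alpha beta)_{11}
+            ab21 = A[, 2] * B)         # (alpha beta)_{21}
> lm(y ~ X)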
Many thanks in advance.
D. Trenkler
--
Dietrich Trenkler    Universität Osnabrück
FB Wirtschaftswissenschaften
Rolandstr. 8, D-49069 Osnabrück
dtrenkler at nts6.oec.uni-osnabrueck.de