Fernando Henrique Ferraz P. da Rosa
2004-Sep-07 03:01 UTC
[R] Contrast matrices for nested factors
Hi, I'd like to know if it's possible to specify different contrast matrices in lm() for a factor that is nested within another one. This is useful when we have a model where the nested factor has a different number of levels, depending on the main factor. Let me illustrate with an example to make it clearer. Consider the following data set: set.seed(1) y <- rnorm(14) a <- factor(c(rep(1,7),rep(2,3),rep(3,4))) b <- factor(c(1,1,1,2,2,3,3,1,1,2,1,1,2,2)) k <- factor(c(1,2,3,1,2,1,2,1,2,1,1,2,1,2)) internal <- data.frame(y,a,b,k) Where y is an arbitrary response, a is a main factor, b is a factor nested within a, and k is the replicate number. It is easy to see that depending on the level of a, b has different numbers of levels. For instance, when a = 1, we have that b might assume values 1, 2 or 3, while a = 2 or 3, b might assume only 1 or 2. I'd like then to use contrasts summing to 0, so I issue: z <- lm(y ~ a + a/b,data=internal,contrasts=list(a=contr.sum, b=contr.sum)) The problem is, the design matrix is not quite what I expected. What happens is, instead of using a different contrast matrix for each level of a where b is nested, it's using the same contrast matrix for every b, namely: > contr.sum(3) [,1] [,2] 1 1 0 2 0 1 3 -1 -1 So, when a=1, the columns of the design matrix are as expected. It sums to 0, because there are levels of b 1, 2 and 3, when a=1. But, when a=2 or a=3, the same contrast matrix is being used, and then, the factor effects do not sum to 0. That's obviously because there are no values for b equal 3, when a != 1, and then the coding that gets done is '0' or '1'. The design matrix lm() is creating is:> model.matrix(z)(Intercept) a1 a2 a1:b1 a2:b1 a3:b1 a1:b2 a2:b2 a3:b2 1 1 1 0 1 0 0 0 0 0 2 1 1 0 1 0 0 0 0 0 3 1 1 0 1 0 0 0 0 0 4 1 1 0 0 0 0 1 0 0 5 1 1 0 0 0 0 1 0 0 6 1 1 0 -1 0 0 -1 0 0 7 1 1 0 -1 0 0 -1 0 0 8 1 0 1 0 1 0 0 0 0 9 1 0 1 0 1 0 0 0 0 10 1 0 1 0 0 0 0 1 0 11 1 -1 -1 0 0 1 0 0 0 12 1 -1 -1 0 0 1 0 0 0 13 1 -1 -1 0 0 0 0 0 1 14 1 -1 -1 0 0 0 0 0 1 What I would like to use is: (Intercept) a1 a2 a1:b1 a2:b1 a3:b1 a1:b2 1 1 1 0 1 0 0 0 0 0 2 1 1 0 1 0 0 0 0 0 3 1 1 0 1 0 0 0 0 0 4 1 1 0 0 0 0 1 0 0 5 1 1 0 0 0 0 1 0 0 6 1 1 0 -1 0 0 -1 0 0 7 1 1 0 -1 0 0 -1 0 0 8 1 0 1 0 1 0 0 0 0 9 1 0 1 0 1 0 0 0 0 10 1 0 1 0 -1 0 0 0 0 11 1 -1 -1 0 0 1 0 0 0 12 1 -1 -1 0 0 1 0 0 0 13 1 -1 -1 0 0 -1 0 0 0 14 1 -1 -1 0 0 -1 0 0 0 (notice that in the second matrix all collumns sum to 0, in the first they don't). Thank you, -- Fernando Henrique Ferraz P. da Rosa http://www.ime.usp.br/~feferraz
You should be able to get the behavior you want using the fit.glh() (short for fit general linear hypothesis) function from the gregmisc/gmodels package. -G> -----Original Message----- > From: r-help-bounces at stat.math.ethz.ch > [mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of > Fernando Henrique > Ferraz P. da Rosa > Sent: Monday, September 06, 2004 11:02 PM > To: r-help > Subject: [R] Contrast matrices for nested factors > > > Hi, I'd like to know if it's possible to specify different > contrast matrices in lm() for a factor that is nested within > another one. This > is useful when we have a model where the nested factor has a different > number of levels, depending on the main factor. > > Let me illustrate with an example to make it clearer. Consider > the following data set: > > set.seed(1) > y <- rnorm(14) > a <- factor(c(rep(1,7),rep(2,3),rep(3,4))) > b <- factor(c(1,1,1,2,2,3,3,1,1,2,1,1,2,2)) > k <- factor(c(1,2,3,1,2,1,2,1,2,1,1,2,1,2)) > internal <- data.frame(y,a,b,k) > > Where y is an arbitrary response, a is a main factor, b is a > factor nested within a, and k is the replicate number. It is > easy to see > that depending on the level of a, b has different numbers of > levels. For > instance, when a = 1, we have that b might assume values 1, 2 or 3, > while a = 2 or 3, b might assume only 1 or 2. > > I'd like then to use contrasts summing to 0, so I issue: > > z <- lm(y ~ a + a/b,data=internal,contrasts=list(a=contr.sum, > b=contr.sum)) > > The problem is, the design matrix is not quite what I > expected. > What happens is, instead of using a different contrast matrix for each > level of a where b is nested, it's using the same contrast matrix for > every b, namely: > > > contr.sum(3) > [,1] [,2] > 1 1 0 > 2 0 1 > 3 -1 -1 > > So, when a=1, the columns of the design matrix are as > expected. > It sums to 0, because there are levels of b 1, 2 and 3, when a=1. But, > when a=2 or a=3, the same contrast matrix is being used, and then, the > factor effects do not sum to 0. That's obviously because there are no > values for b equal 3, when a != 1, and then the coding that > gets done is > '0' or '1'. > > The design matrix lm() is creating is: > > > model.matrix(z) > (Intercept) a1 a2 a1:b1 a2:b1 a3:b1 a1:b2 a2:b2 a3:b2 > 1 1 1 0 1 0 0 0 0 0 > 2 1 1 0 1 0 0 0 0 0 > 3 1 1 0 1 0 0 0 0 0 > 4 1 1 0 0 0 0 1 0 0 > 5 1 1 0 0 0 0 1 0 0 > 6 1 1 0 -1 0 0 -1 0 0 > 7 1 1 0 -1 0 0 -1 0 0 > 8 1 0 1 0 1 0 0 0 0 > 9 1 0 1 0 1 0 0 0 0 > 10 1 0 1 0 0 0 0 1 0 > 11 1 -1 -1 0 0 1 0 0 0 > 12 1 -1 -1 0 0 1 0 0 0 > 13 1 -1 -1 0 0 0 0 0 1 > 14 1 -1 -1 0 0 0 0 0 1 > > > What I would like to use is: > > (Intercept) a1 a2 a1:b1 a2:b1 a3:b1 a1:b2 > 1 1 1 0 1 0 0 0 0 0 > 2 1 1 0 1 0 0 0 0 0 > 3 1 1 0 1 0 0 0 0 0 > 4 1 1 0 0 0 0 1 0 0 > 5 1 1 0 0 0 0 1 0 0 > 6 1 1 0 -1 0 0 -1 0 0 > 7 1 1 0 -1 0 0 -1 0 0 > 8 1 0 1 0 1 0 0 0 0 > 9 1 0 1 0 1 0 0 0 0 > 10 1 0 1 0 -1 0 0 0 0 > 11 1 -1 -1 0 0 1 0 0 0 > 12 1 -1 -1 0 0 1 0 0 0 > 13 1 -1 -1 0 0 -1 0 0 0 > 14 1 -1 -1 0 0 -1 0 0 0 > > (notice that in the second matrix all collumns sum to > 0, in the > first they don't). > > > Thank you, > > -- > Fernando Henrique Ferraz P. da Rosa > http://www.ime.usp.br/~feferraz > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html >LEGAL NOTICE\ Unless expressly stated otherwise, this messag...{{dropped}}