Suppose I have longitudinal data and want to use the econometric strategy of
"de-meaning" a model matrix by time. For the sake of illustration,
'mat' is a model matrix for 3 individuals, each with 3 observations, where
"1" denotes that individual i was in group j at time t and
"0" otherwise.
mat <- matrix(c(1,1,0,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,1,0,0,0,1,1,0),
ncol=3)
mat <- data.frame(mat, id=gl(3,3))
I can conceive of two ways of de-meaning: either use an explicit loop or use
mapply, both of which are below.
# put this in a loop over each column to create the de-meaned X matrix
mat2 <- matrix(0, 9,3)
for(i in 1:3){
mat2[,i] <- mat[,i] - ave(mat[,i], mat$id)
}
# Or use mapply as follows
mat[,1:3]-mapply(ave, mat[,1:3], MoreArgs=list(mat$id))
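As a quick sanity check, the two approaches can be compared directly. This is
only a sketch on the toy data above; the names dm_loop and dm_mapply are
introduced here for illustration and are not part of the original code.

```r
# Rebuild the toy data so this block runs on its own
X <- matrix(c(1,1,0,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,1,0,0,0,1,1,0), ncol = 3)
mat <- data.frame(X, id = gl(3, 3))

# Loop version: subtract each column's within-id mean
dm_loop <- matrix(0, 9, 3)
for (i in 1:3) {
  dm_loop[, i] <- mat[, i] - ave(mat[, i], mat$id)
}

# mapply version, converted to a bare matrix for comparison
dm_mapply <- unname(as.matrix(
  mat[, 1:3] - mapply(ave, mat[, 1:3], MoreArgs = list(mat$id))
))

# Both routes should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(dm_loop, dm_mapply)))

# Within each id, the de-meaned columns should sum to (numerically) zero
within_sums <- rowsum(dm_loop, mat$id)
```

If the de-meaning is correct, every entry of within_sums is zero up to rounding
error.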
Both work, but they require that the model matrix be explicitly created and then
used in the regression. For example, assume I am using the star data in the
mlmRev package:
data(star, package='mlmRev')
I would first need to explicitly create the model matrix for the fixed effects as
follows and then use the strategy above to de-mean this matrix.
mat <- model.matrix(lm(math ~ -1 + sch, star))
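The "build the matrix, then de-mean it" strategy might be sketched as below on a
small made-up data frame, so that it runs without mlmRev. The data frame toy,
its columns id and grp, and the helper demean_by() are all hypothetical names
introduced for illustration. The sketch assumes the dummy matrix can be built
from a one-sided formula with model.matrix() rather than from a fitted lm()
object.

```r
# Hypothetical toy data: 3 individuals (id), 3 observations each,
# with a group factor 'grp' recorded at each observation
toy <- data.frame(
  id  = gl(3, 3),
  grp = factor(c("a", "a", "b",  "b", "b", "b",  "c", "a", "c"))
)

# Build the full set of group dummies directly from a one-sided formula
X <- model.matrix(~ -1 + grp, data = toy)

# Hypothetical helper: subtract from each column its mean within levels of g.
# apply() passes g on to ave() as the grouping variable.
demean_by <- function(X, g) X - apply(X, 2, ave, g)

X_dm <- demean_by(X, toy$id)
```

The helper handles all columns in one call, but it still works on an explicit
matrix, so it tidies the loop rather than avoiding the matrix construction.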
Of course, in R this is rather inefficient, since one generally only needs to
supply a factor for each independent variable and the model matrix is created
for you when using lm(). So, my question is: is there a more efficient way of
creating the time de-meaned model matrix? Or is the solution above the kind of
strategy that must be used in this situation?
Harold