J.R. Lockwood
2003-Jun-19 19:58 UTC
[R] Fitting particular repeated measures model with lme()
Hello,
I have a simulated data structure in which students are nested within
teachers, and with each student are associated two test scores. There
are 20 classrooms and 25 students per classroom, for a total of 500
students and two scores per student. Here are the first 10 lines of
my dataframe "d":
studid tchid Y time
1 1 1 -1.0833222 0
2 1 1 -0.7656281 1
3 2 1 -1.0198641 0
4 2 1 0.7808148 1
5 3 1 -1.1381721 0
6 3 1 -0.4395021 1
7 4 1 -2.0944685 0
8 4 1 -1.8746840 1
9 5 1 -0.7784412 0
10 5 1 1.9952170 1
...
I am trying to use lme() to fit a relatively basic repeated measures
model where there are random teacher intercepts, and an unstructured
residual covariance matrix within students. The following call to
lme() seems to fit the model:
lme.t5<-lme(fixed=Y~time,data=d,random=~1|tchid,weights=varIdent(form=~1|time),\
correlation=corSymm(form = ~1|tchid/studid))
Now, I would like to try to alter this model to one in which the
"teacher effect" applies to only one year. One can think of the first
score on the student as a score from a prior year (for which I have no
teacher links), and the second score is from the current year and is
linked to the teacher. The model for student j in class i is:
Y_{ij0} = a_0 + e_{ij0}
Y_{ij1} = a_1 + b_i + e_{ij1}
with Var(b_i) the teacher variance component and Cov(e_{ij0},e_{ij1})
unstructured. That is, if the data are organized by student, the "Z"
matrix in the usual linear mixed model notation has every other row
equal to a row of zeros.
I am wondering whether there is some way to fit this model using
lme(). Thanks in advance for your help and patience.
best regards,
J.R. Lockwood
412-683-2300 x4941
lockwood at rand.org
http://www.rand.org/methodology/stat/members/lockwood/
J.R. Lockwood
2003-Jun-20 15:06 UTC
[R] Fitting particular repeated measures model with lme()
Hello,
I have a simulated data structure in which students are nested within
teachers, and with each student are associated two test scores. There
are 20 classrooms and 25 students per classroom, for a total of 500
students and two scores per student. Here are the first 10 lines of
my dataframe "d":
studid tchid Y time
1 1 1 -1.0833222 0
2 1 1 -0.7656281 1
3 2 1 -1.0198641 0
4 2 1 0.7808148 1
5 3 1 -1.1381721 0
6 3 1 -0.4395021 1
7 4 1 -2.0944685 0
8 4 1 -1.8746840 1
9 5 1 -0.7784412 0
10 5 1 1.9952170 1
...
I am trying to use lme() to fit a relatively basic repeated measures
model where there are random teacher intercepts, and an unstructured
residual covariance matrix within students. The following call to
lme() seems to fit the model:
lme.t5<-lme(fixed=Y~time,data=d,random=~1|tchid,weights=varIdent(form=~1|time),\
correlation=corSymm(form = ~1|tchid/studid))
Now, I would like to try to alter this model to one in which the
"teacher effect" applies to only one year. One can think of the first
score on the student as a score from a prior year (for which I have no
teacher links), and the second score is from the current year and is
linked to the teacher. The model for student j in class i is:
Y_{ij0} = a_0 + e_{ij0}
Y_{ij1} = a_1 + b_i + e_{ij1}
with Var(b_i) the teacher variance component and Cov(e_{ij0},e_{ij1})
unstructured. That is, if the data are organized by student, the "Z"
matrix in the usual linear mixed model notation has every other row
equal to a row of zeros.
I am wondering whether there is some way to fit this model using
lme(). Thanks in advance for your help and patience.
best regards,
J.R. Lockwood
412-683-2300 x4941
lockwood at rand.org
http://www.rand.org/methodology/stat/members/lockwood/
J.R. Lockwood
2003-Jul-02 18:07 UTC
[R] Re: Fitting particular repeated measures model with lme()
Hello,
A couple weeks ago I posted a message about using lme() to fit a model
based on the following: Students are tested in two years, and are
linked to teachers in the second year only. Thus students are not
nested within teachers in the traditional sense. The model for
student j in class i is:
Y_{ij0} = a_0 + e_{ij0}
Y_{ij1} = a_1 + b_i + e_{ij1}
with Var(b_i) the teacher variance component and Cov(e_{ij0},e_{ij1})
unstructured. Thus if the data are organized by student, the "Z"
matrix in the usual linear mixed model notation has every other row
equal to a row of zeros. For reference I include at the end of this
message a function "generate.data()" that simulates (balanced) data
according to this model.
I think I figured out a way to fit this model in lme() but I have some
questions about it. My strategy was essentially to brute-force create
the Z matrix with columns containing indicators of the second year
teacher interleaved with zeros for the first year scores, and then
force the covariance matrix of the random effects to be a constant
times the identity. Here is the code I am using:
############
library(nlme)
library(MASS)
ntch<-30
d<-generate.data(k=ntch)
varnames<-paste("tchid",1:ntch,sep="")
fmla<-as.formula(paste("~",paste(varnames,collapse="+"),"-1"))
lme.u2a<-lme(fixed=Y~time,data=d,random=list(tchid=pdIdent(form=fmla)),weights=varIdent(form=~1|time),correlation=corSymm(form
= ~1|tchid/studid))
lme.u2b<-lme(fixed=Y~time,data=d,random=list(dummy.group=pdIdent(form=fmla)),weights=varIdent(form=~1|time),correlation=corSymm(form
= ~1|dummy.group/studid))
############
I have one set of questions and one observation:
1) The two model fits provide (essentially) the same estimates, and
these estimates seem reasonable. However, they provide different DF
in the table of fixed effects, reflecting the fact that the nesting
structures specified in the two models are different. In the first
case, I specify that students are nested within teachers which is not
exactly true. In the second, I use a dummy grouping variable
"dummy.group" which is identically equal to 1 for all records.
My major questions are why/how these specifications fit the same
model, whether it is possible for me to fit the model without
specifying names to the list components in the random statement, and
more generally, whether lme() *requires* a nesting structure at all.
I guess what it comes down to is that I am having trouble
understanding how exactly specifying the random statement as a list of
formulas works.
2) intervals() on either of these objects fails with the following
error message:
Error in structure(exp(as.vector(object)), names = c(paste("sd(",
deparse(formula(object)[[2]]), :
names attribute must be the same length as the vector
Any insights would be greatly appreciated.
best regards,
J.R. Lockwood
412-683-2300 x4941
lockwood at rand.org
http://www.rand.org/methodology/stat/members/lockwood/
## Function for generating data
generate.data<-function(k=50,n=25,mu=c(0,10),tau=sqrt(2),esd=sqrt(c(2,4)),corr=0.8){
## k=number of teachers
## n=number of students per teacher
Sigma<-diag(esd)%*%matrix(c(1,corr,corr,1),ncol=2)%*%diag(esd)
theta<-rnorm(k,sd=tau)
theta<-cbind(0,rep(theta,each=n))
e<-mvrnorm(n*k,mu=mu,Sigma=Sigma)
Y<-c(t(theta))+c(t(e))
tchid<-gl(k,2*n)
z<-model.matrix(~tchid-1)
s<-seq(from=1,to=(2*k*n-1),by=2)
z[s,]<-rep(0,k)
studid<-gl(n*k,2)
time<-rep(c(0,1),times=n*k)
dummy.group<-factor(rep(1,2*n*k))
return(data.frame(studid,Y,time,dummy.group,tchid,z))
}