J.R. Lockwood
2003-Jun-19 19:58 UTC
[R] Fitting particular repeated measures model with lme()
Hello, I have a simulated data structure in which students are nested within teachers, and with each student are associated two test scores. There are 20 classrooms and 25 students per classroom, for a total of 500 students and two scores per student. Here are the first 10 lines of my dataframe "d": studid tchid Y time 1 1 1 -1.0833222 0 2 1 1 -0.7656281 1 3 2 1 -1.0198641 0 4 2 1 0.7808148 1 5 3 1 -1.1381721 0 6 3 1 -0.4395021 1 7 4 1 -2.0944685 0 8 4 1 -1.8746840 1 9 5 1 -0.7784412 0 10 5 1 1.9952170 1 ... I am trying to use lme() to fit a relatively basic repeated measures model where there are random teacher intercepts, and an unstructured residual covariance matrix within students. The following call to lme() seems to fit the model: lme.t5<-lme(fixed=Y~time,data=d,random=~1|tchid,weights=varIdent(form=~1|time),\ correlation=corSymm(form = ~1|tchid/studid)) Now, I would like to try to alter this model to one in which the "teacher effect" applies to only one year. One can think of the first score on the student as a score from a prior year (for which I have no teacher links), and the second score is from the current year and is linked to the teacher. The model for student j in class i is: Y_{ij0} = a_0 + e_{ij0} Y_{ij1} = a_1 + b_i + e_{ij1} with Var(b_i) the teacher variance component and Cov(e_{ij0},e_{ij1}) unstructured. That is, if the data are organized by student, the "Z" matrix in the usual linear mixed model notation has every other row equal to a row of zeros. I am wondering whether there is some way to fit this model using lme(). Thanks in advance for your help and patience. best regards, J.R. Lockwood 412-683-2300 x4941 lockwood at rand.org http://www.rand.org/methodology/stat/members/lockwood/
J.R. Lockwood
2003-Jun-20 15:06 UTC
[R] Fitting particular repeated measures model with lme()
Hello, I have a simulated data structure in which students are nested within teachers, and with each student are associated two test scores. There are 20 classrooms and 25 students per classroom, for a total of 500 students and two scores per student. Here are the first 10 lines of my dataframe "d": studid tchid Y time 1 1 1 -1.0833222 0 2 1 1 -0.7656281 1 3 2 1 -1.0198641 0 4 2 1 0.7808148 1 5 3 1 -1.1381721 0 6 3 1 -0.4395021 1 7 4 1 -2.0944685 0 8 4 1 -1.8746840 1 9 5 1 -0.7784412 0 10 5 1 1.9952170 1 ... I am trying to use lme() to fit a relatively basic repeated measures model where there are random teacher intercepts, and an unstructured residual covariance matrix within students. The following call to lme() seems to fit the model: lme.t5<-lme(fixed=Y~time,data=d,random=~1|tchid,weights=varIdent(form=~1|time),\ correlation=corSymm(form = ~1|tchid/studid)) Now, I would like to try to alter this model to one in which the "teacher effect" applies to only one year. One can think of the first score on the student as a score from a prior year (for which I have no teacher links), and the second score is from the current year and is linked to the teacher. The model for student j in class i is: Y_{ij0} = a_0 + e_{ij0} Y_{ij1} = a_1 + b_i + e_{ij1} with Var(b_i) the teacher variance component and Cov(e_{ij0},e_{ij1}) unstructured. That is, if the data are organized by student, the "Z" matrix in the usual linear mixed model notation has every other row equal to a row of zeros. I am wondering whether there is some way to fit this model using lme(). Thanks in advance for your help and patience. best regards, J.R. Lockwood 412-683-2300 x4941 lockwood at rand.org http://www.rand.org/methodology/stat/members/lockwood/
J.R. Lockwood
2003-Jul-02 18:07 UTC
[R] Re: Fitting particular repeated measures model with lme()
Hello, A couple weeks ago I posted a message about using lme() to fit a model based on the following: Students are tested in two years, and are linked to teachers in the second year only. Thus students are not nested within teachers in the traditional sense. The model for student j in class i is: Y_{ij0} = a_0 + e_{ij0} Y_{ij1} = a_1 + b_i + e_{ij1} with Var(b_i) the teacher variance component and Cov(e_{ij0},e_{ij1}) unstructured. Thus if the data are organized by student, the "Z" matrix in the usual linear mixed model notation has every other row equal to a row of zeros. For reference I include at the end of this message a function "generate.data()" that simulates (balanced) data according to this model. I think I figured out a way to fit this model in lme() but I have some questions about it. My strategy was essentially to brute-force create the Z matrix with columns containing indicators of the second year teacher interleaved with zeros for the first year scores, and then force the covariance matrix of the random effects to be a constant times the identity. Here is the code I am using: ############ library(nlme) library(MASS) ntch<-30 d<-generate.data(k=ntch) varnames<-paste("tchid",1:ntch,sep="") fmla<-as.formula(paste("~",paste(varnames,collapse="+"),"-1")) lme.u2a<-lme(fixed=Y~time,data=d,random=list(tchid=pdIdent(form=fmla)),weights=varIdent(form=~1|time),correlation=corSymm(form = ~1|tchid/studid)) lme.u2b<-lme(fixed=Y~time,data=d,random=list(dummy.group=pdIdent(form=fmla)),weights=varIdent(form=~1|time),correlation=corSymm(form = ~1|dummy.group/studid)) ############ I have one set of questions and one observation: 1) The two model fits provide (essentially) the same estimates, and these estimates seem reasonable. However, they provide different DF in the table of fixed effects, reflecting the fact that the nesting structures specified in the two models are different. In the first case, I specify that students are nested within teachers which is not exactly true. In the second, I use a dummy grouping variable "dummy.group" which is identically equal to 1 for all records. My major questions are why/how these specifications fit the same model, whether it is possible for me to fit the model without specifying names to the list components in the random statement, and more generally, whether lme() *requires* a nesting structure at all. I guess what it comes down to is that I am having trouble understanding how exactly specifying the random statement as a list of formulas works. 2) intervals() on either of these objects fails with the following error message: Error in structure(exp(as.vector(object)), names = c(paste("sd(", deparse(formula(object)[[2]]), : names attribute must be the same length as the vector Any insights would be greatly appreciated. best regards, J.R. Lockwood 412-683-2300 x4941 lockwood at rand.org http://www.rand.org/methodology/stat/members/lockwood/ ## Function for generating data generate.data<-function(k=50,n=25,mu=c(0,10),tau=sqrt(2),esd=sqrt(c(2,4)),corr=0.8){ ## k=number of teachers ## n=number of students per teacher Sigma<-diag(esd)%*%matrix(c(1,corr,corr,1),ncol=2)%*%diag(esd) theta<-rnorm(k,sd=tau) theta<-cbind(0,rep(theta,each=n)) e<-mvrnorm(n*k,mu=mu,Sigma=Sigma) Y<-c(t(theta))+c(t(e)) tchid<-gl(k,2*n) z<-model.matrix(~tchid-1) s<-seq(from=1,to=(2*k*n-1),by=2) z[s,]<-rep(0,k) studid<-gl(n*k,2) time<-rep(c(0,1),times=n*k) dummy.group<-factor(rep(1,2*n*k)) return(data.frame(studid,Y,time,dummy.group,tchid,z)) }