(Ted Harding)
2008-Mar-12 14:36 UTC
[R] [follow-up] "Longitudinal" with binary covariates and outcome
Hi again!
Following up my previous posting below (to which no response as yet), I have located a report which situates this type of question in a longitudinal modelling context.

http://www4.stat.ncsu.edu/~dzhang2/paper/glm.ps
Generalized Linear Models with Longitudinal Covariates
Daowen Zhang & Xihong Lin

(This work seems to originally date from around 1999).

They consider an outcome Y, with a fixed covariate [vector] Z and a longitudinal covariate [vector] X observed at n time points t1,...,tn; the outcome Y is observed only at the end of the sequence. They model Y with a GLM in which Z and subject-specific random effects U are predictors in the GLM, where U satisfies a linear mixed model X = T'*U + error and is normally distributed.

However, in view of the fact that the longitudinal covariates X in my query below are binary, there cannot be a linear mixed model for them; there would have to be a generalised linear mixed model.

I have had a good poke around in the R resources, and have failed to find anything which directly addresses this question (nor which addresses Zhang & Lin's original question).

So, if anyone has done R work in this kind of context, I'd be most grateful for any suggestions (including worked examples of datasets) arising from it!

With thanks again, and best wishes to all,
Ted.

-----FW: <XFMail.080311001718.Ted.Harding at manchester.ac.uk>-----
Date: Tue, 11 Mar 2008 00:17:18 -0000 (GMT)
From: (Ted Harding) <Ted.Harding at manchester.ac.uk>
To: r-help at stat.math.ethz.ch
Subject: "Longitudinal" with binary covariates and outcome

Hi Folks,
I'd be grateful for suggestions about approaching the following kind of data. I'm not sure what general class of models it is best situated in (that's just my ignorance), and in particular if anyone could point me to case studies associated with an R approach that would be most useful.

Suppose I have data of the following kind.
Each "subject" is observed at say 4 time-points T1, T2, T3, T4, yielding values of binary (0/1) variables X1, X2, X3, X4. At time T4 is also observed a binary variable Y. The objective is to study the predictive power of (X1, X2, X3, X4) for the outcome "Y=1".

A useful model should take account of the possibility that more "recent" X's are likely to be better predictors than less "recent" ones so that, say, P(Y=1|X4=1) is likely to be larger than P(Y=1|X1=1), and also that the more X's are 1, the more likely it is that Y=1.

Any suggestions or comments and, as I say, pointers to an R treatment of similar problems would be most welcome.

With thanks,
Ted.
--------------End of forwarded message-------------------------

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 12-Mar-08 Time: 14:35:59
------------------------------ XFMail ------------------------------
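To make the data layout concrete, here is a minimal R sketch of the kind of data described above. Everything in it is an assumption made for illustration: the sample size, the seed, the 0.5 prevalence of each X, and the recency-weighted logistic rule used to generate Y are all hypothetical and are not taken from the posting. It only computes the two empirical quantities mentioned in the question, P(Y=1|Xk=1) for each k and P(Y=1) by the number of X's equal to 1.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 500                          # hypothetical number of subjects

# Hypothetical binary covariates X1..X4, one column per time point
X <- matrix(rbinom(n * 4, size = 1, prob = 0.5), ncol = 4,
            dimnames = list(NULL, paste0("X", 1:4)))

# Hypothetical outcome: weights increase with recency, purely for illustration
eta <- -2 + drop(X %*% c(0.2, 0.4, 0.8, 1.6))
Y   <- rbinom(n, size = 1, prob = plogis(eta))

dat <- data.frame(X, Y = Y)

# Empirical P(Y=1 | Xk=1) for k = 1..4
p_given_x <- sapply(paste0("X", 1:4),
                    function(v) mean(dat$Y[dat[[v]] == 1]))

# Empirical P(Y=1) by the number of X's that equal 1
p_by_count <- tapply(dat$Y, rowSums(X), mean)

print(round(p_given_x, 3))
print(round(p_by_count, 3))
```

With real data, `dat` would simply be replaced by a data frame holding the observed X1, X2, X3, X4 and Y, one row per subject.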
Charles C. Berry
2008-Mar-12 15:03 UTC
[R] [follow-up] "Longitudinal" with binary covariates and outcome
Ted,

What you have can be rendered as a 2^5 (X1 by X2 by X3 by X4 by Y) table of counts, right?

Why isn't this a vanilla log-linear modelling (as in loglin()) problem?

It seems to me that the temporal aspect you describe suggests a sequence of margins that could be studied, viz.

    list( 1:4, c(4,5) )
    list( 1:4, c(3,5), c(4,5) )
    list( 1:4, c(2,5), c(3,5), c(4,5) )
    list( 1:4, c(1,5), c(2,5), c(3,5), c(4,5) )

(taking X1 as the first and Y as the last slice in the table), and perhaps intercalating higher-order effects involving slice 5 amongst those.

??

Chuck
Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
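The sequence of margins suggested in the reply above can be tried out roughly as follows. This is only a sketch under stated assumptions: the data are simulated (the sample size, seed and generating rule are hypothetical), and the names `margins`, `fits` and `summary_tab` are introduced here for illustration. It fits each model with `loglin()` from the stats package and records the likelihood-ratio statistic and degrees of freedom, so that the drop in deviance as each X-by-Y margin is added can be inspected.

```r
set.seed(2)                       # arbitrary seed, for reproducibility only
n <- 2000                         # hypothetical number of subjects

# Simulated stand-in for the real data (hypothetical generating rule)
X <- matrix(rbinom(n * 4, size = 1, prob = 0.5), ncol = 4,
            dimnames = list(NULL, paste0("X", 1:4)))
Y <- rbinom(n, size = 1,
            prob = plogis(-2 + drop(X %*% c(0.2, 0.4, 0.8, 1.6))))

dat   <- data.frame(X, Y = Y)
dat[] <- lapply(dat, factor, levels = 0:1)  # keep both levels in every slice
tab   <- table(dat)                         # 2^5 table: X1, X2, X3, X4, Y

# The nested sequence of margins; slice 5 is Y
margins <- list(
  m4    = list(1:4, c(4, 5)),
  m34   = list(1:4, c(3, 5), c(4, 5)),
  m234  = list(1:4, c(2, 5), c(3, 5), c(4, 5)),
  m1234 = list(1:4, c(1, 5), c(2, 5), c(3, 5), c(4, 5))
)

# Fit each model by iterative proportional fitting
fits <- lapply(margins, function(m) loglin(tab, margin = m, print = FALSE))

summary_tab <- data.frame(
  lrt = sapply(fits, `[[`, "lrt"),   # likelihood-ratio statistic
  df  = sapply(fits, `[[`, "df")     # residual degrees of freedom
)
# Change in LRT when each further X-by-Y margin is added
summary_tab$drop_in_lrt <- c(NA, -diff(summary_tab$lrt))

print(summary_tab)
```

Each step adds one two-way margin with Y, so successive models differ by one degree of freedom. The margins here are added from the most recent X backwards, as in the reply; a different order of entry would give different step-by-step drops.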