array chip
2010-Sep-07 17:05 UTC
[R] some questions about longitudinal study with baseline
Hi all,

I asked this before the holiday and didn't get any response, so I would like to resend the message in the hope of fresh attention. Since this is not purely a technical lme question, I have also cc-ed the R general mailing list, hoping to get some suggestions from there as well.

I previously asked some questions on how to analyze a longitudinal study with only 2 time points (baseline and a follow-up). I appreciate the many useful comments from some members, especially Dennis Murphy and Marc Schwartz, who referred me to the following paper addressing specifically this type of study with only 2 time points:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1121605/

Basically, with only 2 time points (baseline and one follow-up), ANCOVA with follow-up as the dependent variable and baseline as a covariate should be used:

follow-up = a + b*baseline + treatment

Now I have a regular longitudinal study with 6 time points and 7 treatments (vehicle, A, B, C, D, F, G), measuring a response variable "y". The dataset is attached. I have some questions, and would appreciate any suggestions on how to analyze the dataset.

dat<-read.table("dat.txt",sep='\t',header=TRUE,row.names=NULL)
library(lattice)  # xyplot()
library(lme4)     # lmer()
dat$trt<-relevel(factor(dat$trt),'vehicle')  # vehicle as reference level

xyplot(y~time, groups=trt, data=dat,
  ylim=c(3,10),col=c(1:6,8),lwd=2,type=c('g','a'),xlab='Days',ylab="response",
  key = list(lines=list(col=c(1:6,8),lty=1,lwd=2),
    text = list(lab = levels(dat$trt)),
    columns = 3, title = "Treatment"))

As you can see, there is some curvature in the relationship between glucose level and time, so a quadratic fit might be needed.

dat$time2<-dat$time*dat$time

A straightforward fit like the one below seems reasonable:

fit<-lmer(y~trt*time+trt*time2+(time|id),dat)

Checking the random effects, it appears that the variance component for the random slope is very small, so a simpler model with a random intercept only may be sufficient:

fit<-lmer(y~trt*time+trt*time2+(1|id),dat)

Now I want to incorporate the baseline response into the model in order to account for any baseline imbalance. I need to generate a new variable "baseline" based on glucose levels at time=0:

dat<-merge(dat, dat[dat$time==0,c('id','y')], by.x='id',by.y='id',all.x=TRUE)
colnames(dat)[c(4,6)]<-c('y','baseline')

So the new fit, adding baseline to the mixed model, is:

fit<-lmer(y~baseline+trt*time+trt*time2+(1|id),dat)

Now my questions are:

1) Is the above model a reasonable thing to do?
2) When baseline is included as a covariate, should I remove the data points at baseline from the dataset? I am unsure whether it is reasonable to use the baseline both as a covariate and as part of the dependent variable values.

The next thing I want to do with this dataset is multiple comparisons between each treatment (A, B, C, D, F, G) and vehicle at a given time point, say time=56 (the last time point), after adjusting for the baseline imbalance. This seems to be done using Dunnett's test. When I say "after adjusting for baseline imbalance", I mean the comparisons should be based on the difference between time=56 and time=0 (baseline), i.e. is there any difference in the change from baseline for treatment A (or B, C, D, F, G) vs. vehicle? How can we test this? Will glht() in multcomp work for an lmer fit? If yes, how do I specify the syntax?

Finally, with the above model, how can I estimate the difference (and its standard error) between time=56 and time=0 (baseline) for each treatment group?

Thank you all for your attention.

John

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dat.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100907/156ddd31/attachment.txt>
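One possible way to set up the treatment-versus-vehicle comparisons at day 56 described above is to build the contrast matrix by hand. The following is only a sketch under stated assumptions: the fixed-effect part of the model is baseline + trt*time + trt*time2 as in the post, vehicle is the reference level, and a placeholder design frame (hypothetical time points and baseline values) is used solely to obtain the coefficient names. The object names design, X, K and t_star are illustrative and not from the original post.

```r
# Hypothetical design frame with the seven treatments from the post; the
# time points and baseline values are placeholders, used only to obtain
# the fixed-effect column names.
trt_levels <- c("vehicle", "A", "B", "C", "D", "F", "G")
design <- expand.grid(trt = factor(trt_levels, levels = trt_levels),
                      time = c(7, 14, 28, 42, 56),
                      baseline = c(5, 7))
design$time2 <- design$time^2
X <- model.matrix(~ baseline + trt * time + trt * time2, data = design)

# Under this model, the difference between treatment a and vehicle at day t
# (at any fixed baseline value) is
#   trt_a + t * trt_a:time + t^2 * trt_a:time2
t_star <- 56
active <- setdiff(trt_levels, "vehicle")
K <- matrix(0, nrow = length(active), ncol = ncol(X),
            dimnames = list(paste(active, "- vehicle"), colnames(X)))
for (a in active) {
  r <- paste(a, "- vehicle")
  K[r, paste0("trt", a)] <- 1
  K[r, paste0("trt", a, ":time")] <- t_star
  K[r, paste0("trt", a, ":time2")] <- t_star^2
}

# With a fitted lmer model called fit (as in the post), the contrasts could
# then be passed to multcomp, reordering columns to match the fit:
#   library(multcomp)
#   summary(glht(fit, linfct = K[, names(lme4::fixef(fit)), drop = FALSE]))
```

Because baseline is held fixed as a covariate, a difference between two treatments at day 56 is also the difference in their adjusted change from baseline, since the same baseline value is subtracted from both. The default summary() of a glht object applies a single-step multiplicity adjustment across the rows of K, which for these many-to-one contrasts is in the spirit of Dunnett's procedure.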
You may be interested in the tutorial on repeated-measures ANOVA on the UCLA computing page:

http://www.ats.ucla.edu/stat/R/seminars/Repeated_Measures/repeated_measures.htm
Frank Harrell
2010-Sep-07 22:30 UTC
[R] some questions about longitudinal study with baseline
Baseline should appear only as a baseline and should be removed from the set of longitudinal responses. This is often done with a merge() operation.

Frank

Frank E Harrell Jr
Professor and Chairman
School of Medicine
Department of Biostatistics
Vanderbilt University
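The advice above, keeping baseline only as a covariate and dropping it from the responses, could be sketched in R roughly as follows. This is a minimal sketch on made-up data, not the poster's dat.txt: the time points other than 0 and 56, the two-group layout, and the object names base, dat2 and followup are assumptions for illustration.

```r
# Toy long-format data (hypothetical values; the real dat.txt is not
# reproduced here).
set.seed(1)
dat <- expand.grid(id = 1:6, time = c(0, 7, 14, 28, 42, 56))
dat$trt <- factor(ifelse(dat$id <= 3, "vehicle", "A"),
                  levels = c("vehicle", "A"))
dat$y <- round(rnorm(nrow(dat), mean = 6, sd = 1), 2)

# One row per subject holding the time-0 measurement, renamed before the
# merge so that no column positions need to be hard-coded afterwards.
base <- dat[dat$time == 0, c("id", "y")]
names(base)[names(base) == "y"] <- "baseline"

# Attach the baseline value to every row of the same subject, then drop the
# time-0 rows so that baseline appears only as a covariate, never as a
# response.
dat2 <- merge(dat, base, by = "id", all.x = TRUE)
followup <- dat2[dat2$time > 0, ]
followup <- followup[order(followup$id, followup$time), ]
```

The model from the original post would then be fitted to followup rather than to the full data, for example lmer(y ~ baseline + trt*time + trt*time2 + (1|id), followup) after adding a time2 column.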