Steffen Fleischer
2010-Dec-03 08:12 UTC
[R] treatment effects with lme (repeated measurements)
Dear all,

I want to analyze an outcome in an RCT using lme, but I am not sure that I have chosen the right model. We measured the outcome three times in the same patient: once before the intervention and twice after it. I want to account for the correlation between the repeated measurements and to adjust for baseline differences in the outcome in order to estimate the treatment effect.

Here is the model:

    lme(outcome ~ treatment*time + baseline, random = ~1|id)

for the data structure:

    id  time  outcome  baseline  treatment
    1   1     10       5         1
    1   2     12       5         1
    2   1     ............
    .
    .

Alternatively, I could use three rows per participant and omit baseline as a variable, since it would then be included in "outcome" and "time". The model would then be:

    lme(outcome ~ treatment*time, random = ~1|id)

I am not sure which way is better/right, or whether there is a third alternative for this problem.

Thanks in advance,
Steffen Fleischer
Steffen Fleischer wrote:
>
> ..
> We measured the outcome three times repeatedly in the same patient. One
> time before intervention and two times after intervention. I wanted to
> adjust for the correlated data in the repeated measurement and baseline
> differences in the variable in order to get the treatment effect.
>
> Here the model:
> lme(outcome~treatment*time+baseline, random=~1|id)
>
> for the data structure:
>
> id time outcome baseline treatment
> 1 1 10 5 1
> 1 2 12 5 1
> 2 1............
> .
> .
>
> alternatively I could use 3 rows per participant, omitting baseline as a
> variable as it would be included in "outcome" and "time" then.
> The model then would be:
> lme(outcome~treatment*time, random=~1|id)
>

You should be aware that by coding the treatments as 1 and 2 and not calling as.factor() on the variable, treatment is treated as a continuous variable. With only two treatment values, the results look similar to what you expect, but you are on dangerous ground here. I prefer to always name my levels "Placebo" and "Antibiotics", which forces the variable to be a factor. But old habits of coding 1/2 die hard.

Time, however, is definitely a continuous variable. When you use the second version, you effectively fit a straight line through the three time points, and the interaction term tells you how different the slopes are for the two treatments. This approach has considerable power when the model assumption is reasonable, but you must check that by visual inspection of the residuals. I often use it, but it is always hard work to convince medical researchers and reviewers to at least consider the idea. The usual reply is "this is not linear over time", and my usual answer is that linear-over-time is the next step up from the "is constant" assumption, which they would accept immediately without noticing that being constant is itself an assumption.

In this approach, the slope is an indicator of the trend. The more points in time you have, the more useful the model is.
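A minimal sketch of the two formulations discussed above, using simulated data rather than the poster's trial. The data, the effect sizes, and the names long, post, fit1 and fit2 are all assumptions made for illustration; only lme() from the nlme package and the two model formulas come from the thread. Treatment is stored as a labelled factor and time as a number, as suggested.

```r
library(nlme)

set.seed(1)
n <- 40                                    # hypothetical number of patients
id <- factor(seq_len(n))
treatment <- factor(rep(c("Placebo", "Antibiotics"), each = n / 2),
                    levels = c("Placebo", "Antibiotics"))
subj <- rnorm(n, sd = 2)                   # simulated patient-level intercepts

# Long format, three rows per patient: time 0 = baseline, 1 and 2 = follow-up
long <- data.frame(
  id        = rep(id, each = 3),
  treatment = rep(treatment, each = 3),
  time      = rep(0:2, times = n)
)
slope <- ifelse(long$treatment == "Antibiotics", 1.5, 0.5)
long$outcome <- 10 + rep(subj, each = 3) + slope * long$time + rnorm(nrow(long))

# Version 2: all three measurements as outcome, time continuous.
# The treatment:time coefficient is the difference in slopes between arms.
fit2 <- lme(outcome ~ treatment * time, random = ~ 1 | id, data = long)

# Version 1: only the two follow-up rows, with the baseline value as a covariate
base <- long[long$time == 0, c("id", "outcome")]
names(base)[2] <- "baseline"
post <- merge(long[long$time > 0, ], base, by = "id")
fit1 <- lme(outcome ~ treatment * time + baseline, random = ~ 1 | id, data = post)

fixef(fit2)
fixef(fit1)

# Numeric companion to the visual residual check: mean residual per time point.
# Values far from zero would hint that a straight line in time fits poorly.
tapply(resid(fit2), long$time, mean)
```

For the visual check itself, plot(fit2) draws standardized residuals against fitted values. Because the simulated outcome is linear in time by construction, the residuals here say nothing about whether linearity holds for real trial data.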
The alternative (essentially your version 1) is to test all values against baseline (or, better, all differences against zero). This is acceptable for two post-treatment points in time, but I remember the many cases where I was asked: "We have ten points in time and would like to know after how many time points the treatment effect is significantly different from zero." Or, even more fun: "We would like to test every time point against every other" to find out (what? That it is significant after 3, not after 5, and again after 7?). I tend to apply a rude Bonferroni correction in that case, which often brings people down to earth, and then we can consider a linear or transformed-linear continuous-in-time model.

Summary: Both approaches are possible. Check your model assumptions. And don't say "it's not linear" too easily. It might really be non-linear, but with the large errors we have in medical research, the linear assumption might be quite good.

Dieter
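To illustrate the Bonferroni point, here is a rough sketch of the "one test per time point" approach with ten follow-up times. Everything in it is an assumption for illustration: the data are simulated, the arm labels and effect size are made up, and a two-sample t-test on the change from baseline stands in for whatever per-time-point test would actually be used.

```r
set.seed(2)
n <- 40                                    # hypothetical number of patients
arm <- factor(rep(c("Placebo", "Antibiotics"), each = n / 2),
              levels = c("Placebo", "Antibiotics"))
baseline <- rnorm(n, mean = 10, sd = 2)
times <- 1:10                              # ten follow-up time points

# Simulated outcomes: one column per time point, one row per patient
outcome <- sapply(times, function(t)
  baseline + ifelse(arm == "Antibiotics", 0.3, 0) * t + rnorm(n))

# Change from baseline; the baseline vector is recycled down each column
change <- outcome - baseline

# One test per time point: change from baseline compared between the arms
p_raw <- apply(change, 2, function(d)
  t.test(d[arm == "Antibiotics"], d[arm == "Placebo"])$p.value)

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p_bonf <- p.adjust(p_raw, method = "bonferroni")

data.frame(time = times, p_raw = signif(p_raw, 3), p_bonf = signif(p_bonf, 3))
```

With ten tests every raw p-value is multiplied by ten, so borderline time points stop being significant. Testing every time point against every other would mean 45 comparisons for ten time points, and a correspondingly harsher correction.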