Hi, assume that I have a repeated-measures dataset with 3 time points: baseline,
day 5 and day 10. There are 4 treatment groups (vehicle, treatment 1, treatment
2 and treatment 3), with 20 subjects per treatment group. A simple,
straightforward way to analyze the data is to use a mixed model:
model 1:
obj <- lmer(y ~ treatment * time + (time | subject)) where time is numeric with
values 0, 5 and 10.
The problem with this approach is that the model does not account for baseline
imbalance between treatment groups. If I want to include the baseline value of
the response variable as a covariate, then I think I have to remove the baseline
rows from the dataset, so that baseline becomes a variable, i.e. one column of
the dataset (correct me if I am wrong). After this transformation, only 2 time
points are left in the dataset (day 5 and day 10), and a linear term on the
numeric time variable is then not possible in lmer().
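To make the transformation concrete, here is a minimal sketch of what I have in mind, in base R. The data are simulated, and the names dat, base, post and baseline are just placeholders I made up for illustration.

```r
# Hypothetical long-format data: one row per subject per time point.
set.seed(1)
dat <- expand.grid(subject = factor(1:80), time = c(0, 5, 10))
groups <- c("vehicle", "trt1", "trt2", "trt3")
# Subjects 1-20 are vehicle, 21-40 trt1, and so on (20 per group).
dat$treatment <- factor(groups[(as.integer(dat$subject) - 1) %/% 20 + 1],
                        levels = groups)
dat$y <- rnorm(nrow(dat), mean = 10)  # simulated response, no real effects

# Split off the baseline rows and rename y so it can serve as a covariate.
base <- dat[dat$time == 0, c("subject", "y")]
names(base)[2] <- "baseline"

# Keep only the post-baseline rows and attach each subject's baseline value.
post <- merge(dat[dat$time > 0, ], base, by = "subject")
post$time.f <- factor(post$time)  # factor with levels "5" and "10"

str(post)
```

The result has two rows per subject (day 5 and day 10), with the baseline value repeated on both rows.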
In this situation, what I can think of is to treat the time variable as a
factor (say, named "time.f") and run the following model:
model 2:
obj <- lmer(y ~ treatment * time.f + (1 | subject)) where time.f is a factor with
levels 5 and 10.
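Model 2 as written does not yet contain the baseline covariate. A sketch of the version I would actually fit is below. It assumes the lme4 package is installed, the data are simulated, and the column name baseline and the object names post, obj and obj10 are my own placeholders.

```r
library(lme4)  # assumed to be installed

# Hypothetical post-baseline data: 80 subjects, rows for day 5 and day 10 only.
set.seed(2)
groups <- c("vehicle", "trt1", "trt2", "trt3")
subj <- data.frame(subject   = factor(1:80),
                   treatment = factor(rep(groups, each = 20), levels = groups),
                   baseline  = rnorm(80, mean = 10),
                   u         = rnorm(80))  # simulated subject-level intercept
# merge() with no shared column names returns the Cartesian product,
# i.e. every subject crossed with both time points.
post <- merge(subj, data.frame(time.f = factor(c(5, 10))))
post$y <- 0.5 * post$baseline + post$u + rnorm(nrow(post))

# Model 2 with the baseline value added as a covariate.
obj <- lmer(y ~ baseline + treatment * time.f + (1 | subject), data = post)

# With day 10 as the reference level of time.f, the treatment main-effect
# coefficients are the day-10 differences from vehicle.
post$time.f <- relevel(post$time.f, ref = "10")
obj10 <- lmer(y ~ baseline + treatment * time.f + (1 | subject), data = post)
fixef(obj10)[c("treatmenttrt1", "treatmenttrt2", "treatmenttrt3")]
```

As far as I know, lme4 itself does not report p-values for these coefficients, so this only gives the estimates.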
A couple of questions:
1. Should we really be concerned about baseline imbalance and include baseline
as a variable in the model? What is the advantage of doing so versus not doing
so?
2. If the objective of the study is to determine which treatment groups differ
significantly from the vehicle group at the end of the study (day 10), is
model 2 a reasonable model for that?
3. In general, with repeated measures at only 2 to 3 time points, are mixed models
really necessary? On the mixed-models mailing list, I noticed that there are
concerns about fitting mixed models to just a few time points. But I feel
uncomfortable running a simple ANOVA (or ANCOVA) while completely ignoring the
fact that the data are correlated across time points.
4. What are better alternatives for analyzing such datasets?
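For reference, the simple ANCOVA I mention in question 3 would look roughly like the sketch below if only the day-10 response is analyzed. The data are simulated, and the names wide, y10 and baseline are hypothetical.

```r
# Hypothetical wide-format data: one row per subject, with the baseline and
# day-10 responses in separate columns (all values simulated).
set.seed(3)
groups <- c("vehicle", "trt1", "trt2", "trt3")
wide <- data.frame(treatment = factor(rep(groups, each = 20), levels = groups),
                   baseline  = rnorm(80, mean = 10))
wide$y10 <- 0.5 * wide$baseline + rnorm(80)

# ANCOVA: day-10 response adjusted for baseline. With vehicle as the
# reference level, each treatment coefficient is a comparison with vehicle.
fit <- lm(y10 ~ baseline + treatment, data = wide)
summary(fit)$coefficients
```

This uses one row per subject, so there is no within-subject correlation left to model, but it also discards the day-5 data.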
Thanks
John