My question concerns the logic behind hypothesis tests for fixed-effect
terms in models fitted with lme. Suppose the levels of Subj indicate a
grouping structure (k subjects) and Trt is a two-level factor (two
treatments) for which there are several (n) responses y from each
treatment and subject combination. If one suspects a subject by
treatment interaction, either of the following models seems natural

> fm1 <- lme(y ~ Trt, random = list(Subj = pdDiag(~ Trt)))
> fm2 <- lme(y ~ Trt, random = ~ 1 | Subj/Trt)

These models seem to correspond to the same situation. Both have two
variance components (subject and treatment within subject). However,
they result in different denominator degrees of freedom (denDF) for the
F-statistic of a (fixed-effect) test for treatment. For the case of few
subjects and many observations per subject-treatment combination, denDF
will be much larger for fm1 (denDF = k*2*n - k - 1) than for fm2
(denDF = k - 1).

What is the essential difference in the nature of the random effects
for the situations modelled by fm1 and fm2? On the other hand, if there
is no essential difference between fm1 and fm2, why are the tests
different? I realize that fm2 corresponds to the classical analysis

> fm3 <- aov(y ~ Trt + Error(Subj/Trt))

but what is the logic behind fm1? I have looked in Venables & Ripley
(2002) and Pinheiro & Bates (2000), but neither of these excellent
books seems to explain how or whether models like fm1 and fm2 differ.

Olof Leimar, Professor
Department of Zoology
Stockholm University
SE-106 91 Stockholm
Sweden
Olof.Leimar at zoologi.su.se
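A minimal sketch of the design described above may help; all parameter
values (k = 4, n = 10, the standard deviations) and the object name dat
are purely illustrative, not taken from the original post. It simulates
a subject and a subject-by-treatment variance component and fits both
formulations so that the differing denDF can be inspected with anova().

library(nlme)

set.seed(1)
k <- 4; n <- 10                      # few subjects, many replicates per cell
dat <- expand.grid(rep = 1:n,
                   Trt = factor(c("A", "B")),
                   Subj = factor(1:k))
subj.eff <- rnorm(k, sd = 1)         # subject component
cell.eff <- rnorm(2 * k, sd = 1)     # subject-by-treatment component
dat$y <- subj.eff[as.integer(dat$Subj)] +
         cell.eff[as.integer(interaction(dat$Subj, dat$Trt))] +
         rnorm(nrow(dat), sd = 0.5)

fm1 <- lme(y ~ Trt, data = dat, random = list(Subj = pdDiag(~ Trt)))
fm2 <- lme(y ~ Trt, data = dat, random = ~ 1 | Subj/Trt)
anova(fm1)   # denDF for Trt reported as k*2*n - k - 1 in the question
anova(fm2)   # denDF for Trt reported as k - 1 in the question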
Douglas Bates
2002-Dec-15 20:33 UTC
[R] Interpretation of hypothesis tests for mixed models
Olof Leimar <Olof.Leimar at zoologi.su.se> writes:

> My question concerns the logic behind hypothesis tests for fixed-effect
> terms in models fitted with lme. Suppose the levels of Subj indicate a
> grouping structure (k subjects) and Trt is a two-level factor (two
> treatments) for which there are several (n) responses y from each
> treatment and subject combination. If one suspects a subject by
> treatment interaction, either of the following models seems natural
>
> > fm1 <- lme(y ~ Trt, random = list(Subj = pdDiag(~ Trt)))
> > fm2 <- lme(y ~ Trt, random = ~ 1 | Subj/Trt)

I don't think those models are equivalent.  To get equivalent models I
think you need

  random = list(Subj = pdCompSymm(~ Trt - 1))

in the first model.  For example

> library(nlme)
Loading required package: nls
Loading required package: lattice
> data(Machines)
> fm1 <- lme(score ~ Machine, data = Machines, random = ~ 1 | Worker/Machine)
> logLik(fm1)
`log Lik.' -107.8438 (df=6)
> fm2 <- lme(score ~ Machine, data = Machines, random=list(Worker=pdCompSymm(~Machine - 1)))
> logLik(fm2)
`log Lik.' -107.8438 (df=6)
> summary(fm1)
Linear mixed-effects model fit by REML
 Data: Machines
       AIC      BIC    logLik
  227.6876 239.2785 -107.8438

Random effects:
 Formula: ~1 | Worker
        (Intercept)
StdDev:    4.781049

 Formula: ~1 | Machine %in% Worker
        (Intercept)  Residual
StdDev:    3.729536 0.9615768

Fixed effects: score ~ Machine
               Value Std.Error DF   t-value p-value
(Intercept) 52.35556  2.485829 36 21.061606  <.0001
MachineB     7.96667  2.176974 10  3.659514  0.0044
MachineC    13.91667  2.176974 10  6.392665  0.0001
 Correlation:
         (Intr) MachnB
MachineB -0.438
MachineC -0.438  0.500

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.26958756 -0.54846582 -0.01070588  0.43936575  2.54005852

Number of Observations: 54
Number of Groups:
             Worker Machine %in% Worker
                  6                  18
> summary(fm2)
Linear mixed-effects model fit by REML
 Data: Machines
       AIC      BIC    logLik
  227.6876 239.2785 -107.8438

Random effects:
 Formula: ~Machine - 1 | Worker
 Structure: Compound Symmetry
            StdDev  Corr
MachineA 6.0626364
MachineB 6.0626364 0.621
MachineC 6.0626364 0.621 0.621
Residual 0.9615618

Fixed effects: score ~ Machine
               Value Std.Error DF   t-value p-value
(Intercept) 52.35556  2.485416 46 21.065106  <.0001
MachineB     7.96667  2.177416 46  3.658770   7e-04
MachineC    13.91667  2.177416 46  6.391367  <.0001
 Correlation:
         (Intr) MachnB
MachineB -0.438
MachineC -0.438  0.500

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.26962147 -0.54844560 -0.01068651  0.43936587  2.54004419

Number of Observations: 54
Number of Groups: 6

This still doesn't get around the problem of the different definitions
of the degrees of freedom in the denominator of the F tests.  That
occurs because the definition of the degrees of freedom is not
invariant with respect to the different formulations of the models.

I prefer to formulate such a model as nested random effects rather than
trying to decide how the model matrices should be defined in a model
with a compound symmetry structure for the variance-covariance term.  I
think the definition of the degrees of freedom is cleaner in that case
too.
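To see the non-invariance of the denominator degrees of freedom
directly, one can refit the two equivalent models from the session
above (same object names) and compare their F tables with anova(); the
denDF values in the comments are taken from the DF columns of the
summaries shown.

library(nlme)
data(Machines)

## nested random effects: Worker, and Machine within Worker
fm1 <- lme(score ~ Machine, data = Machines,
           random = ~ 1 | Worker/Machine)

## equivalent fit with a compound-symmetry structure at the Worker level
fm2 <- lme(score ~ Machine, data = Machines,
           random = list(Worker = pdCompSymm(~ Machine - 1)))

anova(fm1)   # Machine tested on numDF = 2, denDF = 10
anova(fm2)   # Machine tested on numDF = 2, denDF = 46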
Thanks for setting me straight about the model

> fm1 <- lme(y ~ Trt, random = list(Subj = pdCompSymm(~ Trt - 1)))

being the one that is equivalent to

> fm2 <- lme(y ~ Trt, random = ~ 1 | Subj/Trt)

It seems that the denDF of a fixed-effect test for treatment should
then also be the same for fm1 and fm2. Is it possible to modify the
method of computing denDF in nlme to achieve this? Meanwhile, my
understanding is that fm2 is to be preferred over fm1. I did
simulations for fm1/fm2 with no true (fixed) difference between
treatments, which seemed to show that a test with the fm1 formulation
can sometimes produce considerably more statistical significances than
would be warranted.

I then have another question. How should I go about formulating a model
corresponding to the nesting in fm2 if, instead of a treatment factor,
I have a covariate? Since in my example Trt was a two-level factor, one
could for instance let the levels be zero and one and regard the
treatment as a covariate. If I express the treatment as a covariate x
and fit

> fm4 <- lme(y ~ x, random = ~ 1 | Subj/x)

I get the same denDF as for fm2, but for a general covariate (with more
than two values) denDF depends on the number of distinct values taken
by the covariate (but it should not, should it?). It seems that
random = ~ 1 | Subj/x treats x as a factor. Is there another model
formulation that takes care of this problem? More generally, if I have
complex terms, like a treatment by covariate interaction, for which I
suspect random subject components, how can I formulate a mixed model so
that denDF properly takes into account the nested random effects?

--
Olof Leimar, Professor
Department of Zoology
Stockholm University
SE-106 91 Stockholm
Sweden
Olof.Leimar at zoologi.su.se
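The null simulation mentioned in the message above could look roughly
like the following sketch; it is not the original simulation code, and
all settings (k, n, nsim, the standard deviations, the helper sim.once)
are hypothetical. Data are simulated with subject and
subject-by-treatment components but no fixed treatment effect, both
formulations are fitted, and the empirical rejection rates are compared.

library(nlme)

set.seed(2)
k <- 4; n <- 10; nsim <- 200
dat <- expand.grid(rep = 1:n,
                   Trt = factor(c("A", "B")),
                   Subj = factor(1:k))

sim.once <- function() {
  subj.eff <- rnorm(k, sd = 1)        # subject component
  cell.eff <- rnorm(2 * k, sd = 1)    # subject-by-treatment component
  dat$y <- subj.eff[as.integer(dat$Subj)] +
           cell.eff[as.integer(interaction(dat$Subj, dat$Trt))] +
           rnorm(nrow(dat), sd = 0.5) # no true treatment effect
  c(fm1 = anova(lme(y ~ Trt, data = dat,
                    random = list(Subj = pdCompSymm(~ Trt - 1))))["Trt", "p-value"],
    fm2 = anova(lme(y ~ Trt, data = dat,
                    random = ~ 1 | Subj/Trt))["Trt", "p-value"])
}

## in a longer run one might also need to guard against occasional
## convergence failures of the compound-symmetry fit
pvals <- replicate(nsim, sim.once())
rowMeans(pvals < 0.05)   # empirical rejection rates at the nominal 5% level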