There has been some recent discussion on this list about the value of using
intervals() with lme() to check whether a model is ill-defined. My question
is, what else can drive very large confidence intervals for the variance
components (or cause the error message "Error in intervals.lme(Object) :
Cannot get confidence intervals on var-cov components: Non-positive definite
approximate variance-covariance")? To illustrate my question, I use examples
from the book "Mixed-Effects Models in S and S-PLUS" by Pinheiro and Bates,
and from an analysis of my own data.
In chapter 1, Pinheiro and Bates show that if you use a model with an
interaction in the random effects term without appropriate replication in
the data, the model will appear to fit but the confidence intervals for the
variance components will be very large. They suggest using intervals() as a
check that the model is appropriately defined:
> test.1 <- lme(effort~Type, data=ergoStool, random=~1|Subject)
> test.2 <- lme(effort~Type, data=ergoStool, random=~1|Subject/Type)
> intervals(test.2)
Random Effects:
Within-group standard error:
lower est. upper
1.054760e-07 4.599834e-01 2.005999e+06
In fact, using anova() to compare these two models shows that nothing is
gained by adding the interaction:
> anova(test.1,test.2)
       Model df      AIC      BIC    logLik   Test L.Ratio p-value
test.1     1  6 133.1308 141.9252 -60.56539
test.2     2  7 135.1308 145.3909 -60.56539 1 vs 2       0       1
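(A related check: with only one observation per Subject-by-Type cell, the
Type %in% Subject standard deviation and the within-group standard error
cannot be separated, which is presumably why their intervals blow up. The
point estimates for the split can be seen with VarCorr():)
> VarCorr(test.2)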
HOWEVER, for the example in section 5.3 of the book, in which an
autoregressive structure is used for the within-group errors, I get the
following error:
> test <- lme(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), data = Ovary,
+             random = pdDiag(~ sin(2*pi*Time)))
> test.ar1 <- lme(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), data = Ovary,
+             random = pdDiag(~ sin(2*pi*Time)), correlation = corAR1())
> intervals(test.ar1)
Error in intervals.lme(test.ar1) : Cannot get confidence intervals on
var-cov components: Non-positive definite approximate variance-covariance
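(My understanding is that intervals.lme() gets the intervals on the
variance-covariance components from the apVar component of the fitted
object, which holds either the approximate variance-covariance matrix of
those estimates or, as here, a character string describing the failure. It
can be inspected directly:)
> test.ar1$apVar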
BUT, anova() appropriately selects the autoregressive model as best:
> anova(test, test.ar1)
         Model df      AIC      BIC    logLik   Test L.Ratio p-value
test         1  6 1638.082 1660.404 -813.0409
test.ar1     2  7 1564.445 1590.487 -775.2224 1 vs 2  75.637  <.0001
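(As an aside, I believe the fixed-effects intervals do not depend on the
apVar matrix, so they can still be requested through the which argument:)
> intervals(test.ar1, which = "fixed")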
In the book, intervals() DOES appear to work, but the authors are using
S-PLUS. My concern is that when I try to fit the following two models to my
own data, I get very large confidence intervals for the within-subject error
even though AIC selects the autoregressive model as best:
> result <- lme(log(T1+1) ~ factor(trt1)*factor(trt2)*factor(Census),
+               data = data, random = ~ 1 | Block/Subject)
> result.ar1 <- lme(log(T1+1) ~ factor(trt1)*factor(trt2)*factor(Census),
+               data = data, random = ~ 1 | Block/Subject,
+               correlation = corAR1())
> intervals(result.ar1)
Random Effects:
lower est. upper
sd((Intercept)) 3.491934e-13 0.01461032 611299013
Correlation structure:
lower est. upper
Phi 0.6543028 0.7574806 0.8329729
> anova(result,result.ar1)
           Model df      AIC      BIC    logLik   Test  L.Ratio p-value
result         1 19 518.1501 581.5633 -240.0750
result.ar1     2 20 475.9776 542.7283 -217.9888 1 vs 2 44.17249  <.0001
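(I suppose the component with the enormous interval could also be checked by
dropping that grouping level and refitting; a rough sketch, where the
simplified random term is only illustrative:)
> result.sub <- update(result.ar1, random = ~ 1 | Block)
> anova(result.sub, result.ar1)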
Why are the confidence intervals on my within-subject error so large, and
why doesn't intervals() work for the autoregressive-errors example in the
book? I am using R version 1.8 on Windows 2000. Any insight would be greatly
appreciated!
Manuel
Manuel A. Morales
Assistant Professor, Biology
Williams College
Williamstown, MA 01267
ph: 413-597-2983 | fax: 413-597-3495
http://mutualism.williams.edu