Dear R list members,
I've turned up a strange discrepancy between results obtained from the lme
function in the nlme package in R and results obtained with lme in S-PLUS.
I'm using version 3.1-24 of nlme in R 1.4.1 under Windows 2000, and both
S-PLUS 2000 and 6.0, again under Windows 2000.
I've noticed discrepancies in a couple of instances. Here's one, using
data
from Bryk and Raudenbush's Hierarchical Linear models:
From R:
> attach(Bryk)
> cses <- meanses <- numeric(length(ses)) # initialize
> for (sc in unique(school)) {
+ meanses[school==sc] <- mean(ses[school==sc])
+ cses[school==sc]<-ses[school==sc]- meanses[school==sc]
+ }
> Bryk$cses <- cses
> Bryk$meanses <- meanses
> Bryk[1:10,]
school ses mathach sector cses meanses
1 1224 0.022 4.583 public 0.456383 -0.43438
2 1224 0.332 20.349 public 0.766383 -0.43438
3 1224 0.372 6.714 public 0.806383 -0.43438
4 1224 -0.078 16.405 public 0.356383 -0.43438
5 1224 -0.158 17.898 public 0.276383 -0.43438
6 1224 -0.298 19.338 public 0.136383 -0.43438
7 1224 -0.458 21.521 public -0.023617 -0.43438
8 1224 -0.468 3.154 public -0.033617 -0.43438
9 1224 -0.468 21.178 public -0.033617 -0.43438
10 1224 -0.528 20.349 public -0.093617 -0.43438
> remove(cses, meanses)
> Bryk$sector <- factor(Bryk$sector, levels=c('public',
'Catholic'))
> contrasts(Bryk$sector)
Catholic
public 0
Catholic 1
>
> bryk.lme.1<-lme(mathach ~ meanses*cses + sector*cses,
+ random = ~ cses | school,
+ data=Bryk)
> summary(bryk.lme.1)
Linear mixed-effects model fit by REML
Data: Bryk
AIC BIC logLik
46525 46594 -23252
Random effects:
Formula: ~cses | school
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 1.541177 (Intr)
cses 0.018174 0.006
Residual 6.063492
Fixed effects: mathach ~ meanses * cses + sector * cses
Value Std.Error DF t-value p-value
(Intercept) 12.1282 0.19920 7022 60.886 <.0001
meanses 5.3367 0.36898 157 14.463 <.0001
cses 2.9421 0.15122 7022 19.456 <.0001
sectorCatholic 1.2245 0.30611 157 4.000 1e-04
meanses:cses 1.0444 0.29107 7022 3.588 3e-04
cses:sectorCatholic -1.6421 0.23312 7022 -7.044 <.0001
Correlation:
(Intr) meanss cses sctrCt mnss:c
meanses 0.256
cses 0.000 0.000
sectorCatholic -0.699 -0.356 0.000
meanses:cses 0.000 0.000 0.295 0.000
cses:sectorCatholic 0.000 0.000 -0.696 0.000 -0.351
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.170106 -0.724877 0.014892 0.754263 2.965498
Number of Observations: 7185
Number of Groups: 160
>
From S-PLUS 6.0 (identical results from S-PLUS 2000):
> attach(Bryk)
> cses <- meanses <- numeric(length(ses)) # initialize
> for (sc in unique(school)) {
+ meanses[school==sc] <- mean(ses[school==sc])
+ cses[school==sc]<-ses[school==sc]- meanses[school==sc]
+ }
> Bryk$cses <- cses
> Bryk$meanses <- meanses
> Bryk[1:10,]
school ses mathach sector cses meanses
1 1224 0.022 4.583 public 0.456383 -0.43438
2 1224 0.332 20.349 public 0.766383 -0.43438
3 1224 0.372 6.714 public 0.806383 -0.43438
4 1224 -0.078 16.405 public 0.356383 -0.43438
5 1224 -0.158 17.898 public 0.276383 -0.43438
6 1224 -0.298 19.338 public 0.136383 -0.43438
7 1224 -0.458 21.521 public -0.023617 -0.43438
8 1224 -0.468 3.154 public -0.033617 -0.43438
9 1224 -0.468 21.178 public -0.033617 -0.43438
10 1224 -0.528 20.349 public -0.093617 -0.43438
> remove(c('cses', 'meanses'))
> options(contrasts=c('contr.treatment', 'contr.poly'))
>
> Bryk$sector <- factor(Bryk$sector, levels=c('public',
'Catholic'))
> contrasts(Bryk$sector)
Catholic
public 0
Catholic 1
>
> bryk.lme.1<-lme(mathach ~ meanses*cses + sector*cses,
+ random = ~ cses | school,
+ data=Bryk)
> summary(bryk.lme.1)
Linear mixed-effects model fit by REML
Data: Bryk
AIC BIC logLik
46524 46592 -23252
Random effects:
Formula: ~ cses | school
Structure: General positive-definite
StdDev Corr
(Intercept) 1.54278 (Inter
cses 0.32004 0.395
Residual 6.05976
Fixed effects: mathach ~ meanses * cses + sector * cses
Value Std.Error DF t-value p-value
(Intercept) 12.128 0.19931 7022 60.851 <.0001
meanses 5.333 0.36920 157 14.444 <.0001
cses 2.945 0.15565 7022 18.921 <.0001
sector 1.227 0.30630 157 4.005 0.0001
meanses:cses 1.039 0.29899 7022 3.476 0.0005
sector:cses -1.643 0.23986 7022 -6.849 <.0001
Correlation:
(Intr) meanss cses sector mnss:c
meanses 0.256
cses 0.076 0.020
sector -0.699 -0.356 -0.053
meanses:cses 0.019 0.076 0.293 -0.027
sector:cses -0.053 -0.028 -0.696 0.078 -0.351
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.1591 -0.72314 0.01723 0.75461 2.9581
Number of Observations: 7185
Number of Groups: 160
Notice that the results are very close, except for the standard deviation
of the random cses effect and the correlation between the random effects
for cses and the intercept. Published results and SAS PROC MIXED agree with
the S-PLUS output.
I don't think that I've done anything wrong in setting up the problem in
R.
In any event, any suggestions would be greatly appreciated. I could make
the data available, if that would help.
Thanks,
John
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
John, I'm travelling today and won't be able to check this discrepancy. I will try to look at it tomorrow. A couple of things to consider: - the underlying optimization software is different in S-PLUS and R. It may be that the REML estimates are poorly defined and convergence is being declared in different places by different software. Have you checked the intervals on those variance components using intervals(bryk.lme.1) - I believe the data are already in the nlme package as the MathAchieve and MathAchSchool data sets. - In Bryk's analysis the meanses is calculated incorrectly. - Some analysis of these data is given in the slides at http://www.stat.wisc.edu/~st850-1/MEcourse6_4up.pdf - It is much easier (and faster) to calculate the meanses gsummary. I think it would be gsummary(MathAchieve, which = "ses") -- Douglas Bates bates at stat.wisc.edu Statistics Department 608/262-2598 University of Wisconsin - Madison http://www.stat.wisc.edu/~bates/ -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Dear Doug, First, thanks for getting back to me. At 08:18 AM 5/2/2002 -0500, Douglas Bates wrote:>John, > >I'm travelling today and won't be able to check this discrepancy. I >will try to look at it tomorrow.There's no rush.>A couple of things to consider: > > - the underlying optimization software is different in S-PLUS and R. > It may be that the REML estimates are poorly defined and > convergence is being declared in different places by different software. > Have you checked the intervals on those variance components using > intervals(bryk.lme.1)It occurred to me that the problem may be ill-conditioned, but I didn't check the confidence intervals for the variance components. I did look at the log-likelihoods, though. To more decimal places than in my previous note, they are: R: -23252.3919 S-PLUS: -23253.2177 Getting the confidence intervals on the variance components produces: R: > intervals(bryk.lme.1) Approximate 95% confidence intervals Fixed effects: lower est. upper (Intercept) 11.7377216 12.128207 12.518692 meanses 4.6078631 5.336665 6.065467 cses 2.6457000 2.942145 3.238589 sectorCatholic 0.6198986 1.224531 1.829164 meanses:cses 0.4738118 1.044406 1.615000 cses:sectorCatholic -2.0991261 -1.642148 -1.185170 Random Effects: Level: school lower est. upper sd((Intercept)) 1.321746e+00 1.541176850 1.797037e+00 sd(cses) 6.686558e-09 0.018173637 4.939478e+04 cor((Intercept),cses) -8.630434e-02 0.005809486 9.782482e-02 Within-group standard error: lower est. upper 5.964008 6.063492 6.164636 S-PLUS > intervals(bryk.lme.1) Problem in intervals.lme(bryk.lme.1): Cannot get confidence inter vals on var-cov components: Non-positive definite approximate var iance-covariance Use traceback() to see the call stack Obviously, there's a problem here, as you suspected: the interval for sd(cses) in the R solution is very broad, and the covariance matrix in the S-PLUS solution isn't positive definite. (Does the S-PLUS version of the software use the log-Cholesky parametrization of the covariance matrix for the random effects?)> - I believe the data are already in the nlme package as the > MathAchieve and MathAchSchool data sets.Yes, they are -- I didn't know that. Thanks.> - In Bryk's analysis the meanses is calculated incorrectly.I was aware of this -- it's why I recalculated the means rather than using Bryk and Raudenbush's -- but the results don't change much.> - Some analysis of these data is given in the slides at > http://www.stat.wisc.edu/~st850-1/MEcourse6_4up.pdfAgain, thanks. I'll take a look. I'm working up the example for use in my appendix on mixed models, a draft of which is at <http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix-mixed-models.pdf>.> - It is much easier (and faster) to calculate the meanses gsummary. > I think it would be gsummary(MathAchieve, which = "ses")The versions of gsummary that I have don't seem to take a which argument, but tapply works (and is faster than my previous code, though in a data set this size both are essentially instantaneous): > mses <- tapply(ses, school, mean) > meanses <- mses[as.character(school)] I expect that the simple answer to my basic question is that the problem is ill-conditioned. Interesting, since Bryk and Raudenbush's data have been used a lot. Thanks for the suggestions and for your help. John ----------------------------------------------------- John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada L8S 4M4 email: jfox at mcmaster.ca phone: 905-525-9140x23604 web: www.socsci.mcmaster.ca/jfox ----------------------------------------------------- -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._