Hans Julius Skaug
2005-Dec-14 08:32 UTC
[R] glmmADMB: Generalized Linear Mixed Models using AD Model Builder
Dear R-users, Half a year ago we put out the R package "glmmADMB" for fitting overdispersed count data. http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html Several people who used this package have requested additional features. We now have a new version ready. The major new feature is that glmmADMB allows Bernoulli responses with logistic and probit links. In addition there is a "ranef.glmm.admb()" function for getting the random effects. The download site is still: http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html The package is based on the software ADMB-RE, but the full unrestricted R-package is made freely available by Otter Research Ltd and does not require ADMB-RE to run. Versions for Linux and Windows are available. We are still happy to get feedback for users, and to get suggestions for improvement. We have set up a forum at http://www.otter-rsch.ca/phpbb/ for discussions about the software. Regards, Hans _____________________________ Hans Julius Skaug Department of Mathematics University of Bergen Johannes Brunsgate 12 5008 Bergen Norway ph. (+47) 55 58 48 61
Roel de Jong
2005-Dec-15 12:11 UTC
[R] glmmADMB: Generalized Linear Mixed Models using AD Model Builder
Dear R-users, because lme(r) & glmmpql, which are based on Penalized Quasi Likelihood, are not very robust with Bernoulli responses, I wanted to test glmmADMB. I run the following simulation study: 500 samples are drawn with the model specification: y = (intercept*f1+pred2*f2+pred3*f3)+(intercept*ri+pred2*rs) where pred2 and pred3 are predictors distributed N(0,1) f1..f3 are fixed effects, f1=-1, f2=1.5, f3=0.5 ri is random intercept with associated variance var_ri=0.2 rs is random slope with associated variance var_rs=0.4 the covariance between ri and rs "covr"=0.1 1500 units/dataset, class size=30 convergence: ~~~~~~~~~~~~ No crashes. 5/500 Datasets had on exit a gradient of the log-likelihood > 0.001 though. Removing the datasets with questionable convergence doesn't seem to effect the simulation analysis. bias: ~~~~~~ f1=-1.00531376 f2= 1.49891060 f3= 0.50211520 ri= 0.20075947 covr=0.09886267 rs= 0.38948382 Only the random slope "rs" is somewhat low, but i don't think it is of significance coverage alpha=.95: (using asymmetric confidence intervals) ~~~~~~~~~~~~~~~~~~~~~~~~ f1=0.950 f2=0.950 f3=0.966 ri=0.974 covr=0.970 rs=0.970 While some coverages are somewhat high, confidence intervals based on asymptotic theory will not have exactly the nominal coverage level, but with simulations (parametric bootstrap) that can be corrected for. I can highly recommend this excellent package to anyone fitting these kinds of models, and want to thank Hans Skaug & Dave Fournier for their hard work! Roel de Jong. Hans Julius Skaug wrote:> Dear R-users, > > Half a year ago we put out the R package "glmmADMB" for fitting > overdispersed count data. > > http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html > > Several people who used this package have requested > additional features. We now have a new version ready. > The major new feature is that glmmADMB allows Bernoulli responses > with logistic and probit links. In addition there is > a "ranef.glmm.admb()" function for getting the random effects. > > The download site is still: > > http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html > > The package is based on the software ADMB-RE, but the full > unrestricted R-package is made freely available by Otter Research Ltd > and does not require ADMB-RE to run. Versions for Linux and Windows > are available. > > We are still happy to get feedback for users, and to get suggestions > for improvement. > > We have set up a forum at http://www.otter-rsch.ca/phpbb/ for discussions > about the software. > > Regards, > > Hans > > _____________________________ > Hans Julius Skaug > > Department of Mathematics > University of Bergen > Johannes Brunsgate 12 > 5008 Bergen > Norway > ph. (+47) 55 58 48 61 > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >
Hans Julius Skaug
2005-Dec-19 18:19 UTC
[R] glmmADMB: Generalized Linear Mixed Models using AD Model Builder
Douglas Bates wrote:> >The "Laplace" method in lmer and the default method in glmm.admb, >which according to the documentation is the Laplace approximation, >produce essentially the same model fit. One difference is the >reported value of the log-likelihood, which we should cross-check, and >another difference is in the execution time >Yes, glmmADMB has sqrt(2*pi) constants missing. Thanks for pointing that out. Execution time: As pointed out by Roel de Jong, the underlying software AD Model Builder does not use hand-coded derivatives for the Hessian involved in the Laplace approximation, but calculates these by automatic differentiation. There is a cost in terms of execution speed, but on the other hand it is very quick to develop new models, as you do not have to worry about derivatives. I hope to exploit this beyond standard GLMMs, and provide other R packages. Comparison of glmmADMB with lmer: I find that the two packages do not give the same result on one of the standard datasets in the literature (Lesaffre et. al., Appl. Statist. (2001) 50, Part3, pp 325-335). The full set of R commands used to download data and fit the model is given at the end of this email.> fit_lmer_lapl <- lmer(y~ treat + time + (1|subject),data=lesaffre,family=binomial,method="Laplace")Warning message: optim or nlminb returned message ERROR: ABNORMAL_TERMINATION_IN_LNSRCH in: LMEopt(x = mer, value = cv) PART OF OUTPUT: Fixed effects: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.626214 0.264996 -2.3631 0.01812 * treat -0.304660 0.360866 -0.8442 0.39853 time -0.346605 0.026666 -12.9979 < 2e-16 *** The corresponding estimates with glmmADMB is:> fit_glmmADMB <- glmm.admb(y~ treat + time,random=~1,group="subject",data=lesaffre,family="binomial",link="logit")PART OF OUTPUT: Fixed effects: Log-likelihood: -359.649 Formula: y ~ treat + time (Intercept) treat time -2.33210 -0.68795 -0.46134 So, the estimates of the fixed effects differ. lmer() does infact produces a warning, and it appears that it method="Laplace" and method="PQL" produce the same results. Best regards, hans # Load data source("http://www.mi.uib.no/~skaug/cash/lesaffre_dat.s") # Run lmer library(lme4) fit_lmer <- lmer(y~ treat + time + (1|subject),data=lesaffre,family=binomial) fit_lmer_lapl <- lmer(y~ treat + time + (1|subject),data=lesaffre,family=binomial,method="Laplace") # Run glmmADMB library(glmmADMB) example(glmm.admb) # Must be run once in each new directory (this feature will be removed in future version of glmmADMB). fit_glmmADMB <- glmm.admb(y~ treat + time,random=~1,group="subject",data=lesaffre,family="binomial",link="logit")
Hans Julius Skaug
2005-Dec-20 11:25 UTC
[R] glmmADMB: Generalized Linear Mixed Models using AD Model Builder
I agree that the model is not fitting the Lesaffre data well, but my point was to show that glmmADMB is numerically stable. Numerical stability is obviously a nice property, but becomes particularly important when one wants to do parametric bootstrappin, which I think is needed for these kinds of models to assess bias in parameter estimates. glmmADMB produces the exact parameter values that maximizes the Laplace approximation for this dataset. Another story is that the Laplace approximation is inaccurate here, as can be shown by using other likelihood approximations. hans Douglas Bates wrote:>Ah yes, that example. It is also given as the 'toenail' data set in >the 'mlmus' package of data sets from the book "Multilevel and >Longitudinal Modeling Using Stata" by Sophia Rabe-Hesketh and Anders >Skrondal (Stata Press, 2005). > >It is not surprising that it is difficult to fit such a model to these >data because the data do not look like they come from such a model. >You did not include the estimates of the variance of the random >effects in your output. It is very large and very poorly determined. >If you check the distribution of the posterior modes of the random >effects (for linear mixed models these are called the BLUPs - Best >Linear Unbiased Predictors - and you could call them BLUPs here too >except for the fact that they are not linear and they are not unbiased >and there isn't a clear sense in which they are "best") it is clearly >not a Gaussian distribution with mean zero. I enclose a density plot. > You can see that it is bimodal and the larger of the two peaks is for >a negative value. These are the random effects for those subjects >that had no positive responses - 163 out of the 294 subjects. > >> sum(with(lesaffre, tapply(y, subject, mean)) == 0) >[1] 163 > >There is no information to estimate the random effects for these >subjects other than "make it as large and negative as possible". It >is pointless to estimate the fixed effects for such a clearly >inappropriate model. > lattice package. >