Paul Johnson
2006-Mar-08 18:17 UTC
[R] Want to fit random intercept in logistic regression (testing lmer and glmmML)
Greetings. Here is sample code, with some comments. It shows how I can simulate data and estimate glm with binomial family when there is no individual level random error, but when I add random error into the linear predictor, I have a difficult time getting reasonable estimates of the model parameters or the variance component. There are no clusters here, just individual level responses, so perhaps I am misunderstanding the translation from my simple mixed model to the syntax of glmmML and lmer. I get roughly the same (inaccurate) fixed effect parameter estimates from glmmML and lmer, but the information they give on the variance components is quite different. Thanks in advance. Now I paste in the example code ### Paul Johnson <pauljohn at ku.edu> ### 2006-03-08 N <- 1000 A <- -1 B <- 0.3 x <- 1 + 10 * rnorm(N) eta <- A + B * x pi <- exp(eta)/(1+exp(eta)) myunif <- runif(N) y <- ifelse(myunif < pi, 1, 0) plot(x,y, main=bquote( eta[i] == .(A) + .(B) * x[i] )) text ( 0.5*max(x), 0.5, expression( Prob( y[i] == 1) == frac( 1 , 1 + exp(-eta[i] )))) myglm1 <- glm ( y ~ x, family=binomial(link="logit") ) summary(myglm1) ## Just for fun.... myglm2 <- glm(y~x, family=quasibinomial) summary(myglm2) ### Mixed model: random intercept with large variance eta <- A + B * x + 5 * rnorm(N) pi <- exp(eta)/(1+exp(eta)) myunif <- runif(N) y <- ifelse(myunif < pi, 1, 0) plot(x,y, main=bquote( eta[i] == .(A) + .(B) * x[i] + u[i])) text ( 0.5*max(x), 0.5, expression( Prob( y[i] == 1) == frac( 1 , 1 + exp(-eta[i] )))) ### Parameter estimates go to hell, as expected myglm3 <- glm ( y ~ x, family=binomial(link="logit") ) summary(myglm3) ### Why doesn't quasibinomial show more evidence of the random intercept? myglm4 <- glm(y~x, family=quasibinomial) summary(myglm4) # Can I estimate with lmer? library(lme4) ### With no repeated observations, what does lmer want? ### Clusters of size 1 ? ### In lme, I'd need random= ~ 1 idnumber <- 1: length(y) mylmer1 <- lmer( y ~ x + ( 1 | idnumber ), family=binomial, method="Laplace" ) summary(mylmer1) ### Glmm wants clusters, and I don't have any clusters with more than 1 observation ### library(glmmML) myglmm1 <- glmmML(y~x, family=binomial, cluster = idnumber ) summary(myglmm1) -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas
Spencer Graves
2006-Mar-12 22:29 UTC
[R] Want to fit random intercept in logistic regression (testing lmer and glmmML)
The problem you report is related to the fact that a variance component can NOT be estimated from individual binary observations. The data do not contain enough information. I don't know the minimum requirements, but you could get estimates if you had some 0's and some 1's for multiple levels of x for at least two levels for "idnumber". Otherwise, the likelihood surface has no finite minimum: If you have all 0's or all 1's, the intercept wants to go to +/-Inf. If you have complete separation between the 0's and the 1's, the slope wants to go to +/-Inf. If you don't have complete separation and you have a random component with one binary outcome per group, you could use "offset" to fix the the slope at some large number and estimate the variance component of the group effect (I think; I haven't tried it). Currently, neither lme nor lmer contain good checks for overparameterization. I've gotten estimates for variance components that were not estimable. After I did it and tried to interpret the results, I realized the algoritm was too polite to tell me my model was stupid; it just tried to estimate it anyway -- often taking much longer to compute than if the model were actually estimable. ("lme" and "lmer" could, of course, be modified to include better checks for models that can't be estimated. However, I think Doug Bates' current development efforts are more important than these checks, and I don't have the time to develop code that checks for these issues.) Hope this helps, spencer graves Paul Johnson wrote:> Greetings. Here is sample code, with some comments. It shows how I > can simulate data and estimate glm with binomial family when there is > no individual level random error, but when I add random error into the > linear predictor, I have a difficult time getting reasonable estimates > of the model parameters or the variance component. > > There are no clusters here, just individual level responses, so > perhaps I am misunderstanding the translation from my simple mixed > model to the syntax of glmmML and lmer. I get roughly the same > (inaccurate) fixed effect parameter estimates from glmmML and lmer, > but the information they give on the variance components is quite > different. > > Thanks in advance. > > Now I paste in the example code > > ### Paul Johnson <pauljohn at ku.edu> > ### 2006-03-08 > > N <- 1000 > A <- -1 > B <- 0.3 > > > x <- 1 + 10 * rnorm(N) > eta <- A + B * x > > pi <- exp(eta)/(1+exp(eta)) > > myunif <- runif(N) > > y <- ifelse(myunif < pi, 1, 0) > > plot(x,y, main=bquote( eta[i] == .(A) + .(B) * x[i] )) > > text ( 0.5*max(x), 0.5, expression( Prob( y[i] == 1) == frac( 1 , 1 + > exp(-eta[i] )))) > > > myglm1 <- glm ( y ~ x, family=binomial(link="logit") ) > summary(myglm1) > > ## Just for fun.... > myglm2 <- glm(y~x, family=quasibinomial) > summary(myglm2) > > > > ### Mixed model: random intercept with large variance > > eta <- A + B * x + 5 * rnorm(N) > pi <- exp(eta)/(1+exp(eta)) > myunif <- runif(N) > y <- ifelse(myunif < pi, 1, 0) > > plot(x,y, main=bquote( eta[i] == .(A) + .(B) * x[i] + u[i])) > > text ( 0.5*max(x), 0.5, expression( Prob( y[i] == 1) == frac( 1 , 1 + > exp(-eta[i] )))) > > ### Parameter estimates go to hell, as expected > myglm3 <- glm ( y ~ x, family=binomial(link="logit") ) > summary(myglm3) > > ### Why doesn't quasibinomial show more evidence of the random intercept? > myglm4 <- glm(y~x, family=quasibinomial) > summary(myglm4) > > > > # Can I estimate with lmer? > > > library(lme4) > > ### With no repeated observations, what does lmer want? > ### Clusters of size 1 ? > ### In lme, I'd need random= ~ 1 > > idnumber <- 1: length(y) > > mylmer1 <- lmer( y ~ x + ( 1 | idnumber ), family=binomial, method="Laplace" ) > summary(mylmer1) > > > ### Glmm wants clusters, and I don't have any clusters with more than > 1 observation > ### > > library(glmmML) > > > myglmm1 <- glmmML(y~x, family=binomial, cluster = idnumber ) > > summary(myglmm1) > > -- > Paul E. Johnson > Professor, Political Science > 1541 Lilac Lane, Room 504 > University of Kansas > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html