johnson4 at babel.ling.upenn.edu
2008-Mar-13 23:52 UTC
[R] strange results from binomial lmer?
I'm running lmer repeatedly on artificial data with two fixed factors (called 'gender' and 'stress') and one random factor ('speaker'). Gender is a between-speaker variable, stress is a within-speaker variable, if that matters. Each dataset has 100 rows from each of 20 speakers, 2000 rows in all.

About 5% of the time I get a strange result, where the lmer() model with BOTH fixed factors and the random factor ('gs_s') comes out MUCH worse compared to the models with ONE fixed factor and the random factor ('g_s' and 's_s'), and also compared to the glm() model with both fixed factors and no random factor ('gs').

This doesn't make much sense to me.

I've placed a dataset on the Web that exhibits this behavior, as follows:

dat <- read.csv("http://www.ling.upenn.edu/~johnson4/strange.csv")

gs <- glm(outcome~gender+stress,binomial,dat)
g_s <- lmer(outcome~gender+(1|speaker),dat,binomial)
s_s <- lmer(outcome~stress+(1|speaker),dat,binomial)
gs_s <- lmer(outcome~gender+stress+(1|speaker),dat,binomial)

logLik(gs) # -1344 (df=3)
logLik(g_s) # -1342 (df=3)
logLik(s_s) # -1314 (df=3)
logLik(gs_s) # -11823 (df=4)

This seems like an error of some kind. The glm() model with both fixed effects is well-behaved, but lmer() seems to be going haywire when confronted with the same situation plus the random effect.

Could anyone advise me how to stop this from happening, and/or explain why it is?

Thanks very much,
Daniel
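Since the fits are being run repeatedly, the "strange" cases can be flagged automatically. The model with both fixed effects contains each one-factor model as a special case, so its log-likelihood should not come out far below theirs. Below is a minimal sketch of such a check in base R. The helper name `check_nested` and the tolerance `tol` are assumptions for illustration, not part of lme4, and because the mixed-model likelihood is itself an approximation, the comparison is only a rough guide.

```r
# Hypothetical helper (not part of lme4): return the nested models whose
# log-likelihood exceeds that of the fuller model by more than `tol`.
# `tol` is an arbitrary allowance for approximation error.
check_nested <- function(ll_full, ll_nested, tol = 1) {
  ll_nested[ll_nested - ll_full > tol]
}

# Log-likelihoods of the one-factor models, as reported in the post
ll <- c(g_s = -1342, s_s = -1314)

# gs_s as reported in the post: both nested models are flagged
check_nested(ll_full = -11823, ll_nested = ll)
```

With real fits, the inputs could be taken from `as.numeric(logLik(fit))` for each model.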
johnson4 at babel.ling.upenn.edu wrote:
> I'm running lmer repeatedly on artificial data with two fixed factors (called
> 'gender' and 'stress') and one random factor ('speaker'). Gender is a
> between-speaker variable, stress is a within-speaker variable, if that matters.
> Each dataset has 100 rows from each of 20 speakers, 2000 rows in all.
>
> About 5% of the time I get a strange result, where the lmer() model with BOTH
> fixed factors and the random factor ('gs_s') comes out MUCH worse compared to
> the models with ONE fixed factor and the random factor ('g_s' and 's_s'), and
> also compared to the glm() model with both fixed factors and no random factor
> ('gs').
>
> This doesn't make much sense to me.
>
> I've placed a dataset on the Web that exhibits this behavior, as follows:
>
> dat <- read.csv("http://www.ling.upenn.edu/~johnson4/strange.csv")
>
> gs <- glm(outcome~gender+stress,binomial,dat)
> g_s <- lmer(outcome~gender+(1|speaker),dat,binomial)
> s_s <- lmer(outcome~stress+(1|speaker),dat,binomial)
> gs_s <- lmer(outcome~gender+stress+(1|speaker),dat,binomial)
>
> logLik(gs) # -1344 (df=3)
> logLik(g_s) # -1342 (df=3)
> logLik(s_s) # -1314 (df=3)
> logLik(gs_s) # -11823 (df=4)
>
> This seems like an error of some kind. The glm() model with both fixed effects
> is well-behaved, but lmer() seems to be going haywire when confronted with the
> same situation plus the random effect.

What version of the `lme4' package are you using? (Including the output of `sessionInfo()' would have helped here, as suggested by the posting guide.)

Using version 0.999375-8 of the `lme4' package, I get

> logLik(gs) # -1344 (df=3)
'log Lik.' -1344.320 (df=3)
> logLik(g_s) # -1342 (df=3)
'log Lik.' -1341.794 (df=3)
> logLik(s_s) # -1314 (df=3)
'log Lik.' -1314.395 (df=3)
> logLik(gs_s) # -11823 (df=4)
'log Lik.' -1312.270 (df=4)

> Could anyone advise me how to stop this from happening, and/or explain why it
> is?

You're likely using some obsolete version of `lme4'. Try downloading a development snapshot from R-Forge (http://r-forge.r-project.org/projects/lme4/).

HTH,
Henric

> Thanks very much,
> Daniel
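The version check suggested above can be sketched as follows. This is a hedged illustration rather than part of the original exchange. The wrapper name `refit_full` is hypothetical. It uses `glmer()`, which is how later lme4 releases fit binomial mixed models, instead of the `lmer(..., binomial)` call from the thread. It assumes a data frame with the columns `outcome`, `gender`, `stress` and `speaker` described in the post.

```r
# Print the R version and loaded package versions, as the posting guide asks
print(sessionInfo())

# Print the installed lme4 version, if the package is available
if (requireNamespace("lme4", quietly = TRUE)) {
  print(packageVersion("lme4"))
}

# Hypothetical wrapper: refit the full model with glmer(), the interface
# later lme4 releases provide for generalized linear mixed models.
# Assumes `dat` has columns outcome, gender, stress and speaker.
refit_full <- function(dat) {
  lme4::glmer(outcome ~ gender + stress + (1 | speaker),
              data = dat, family = binomial)
}
```

Assuming the CSV from the original post is still reachable, `logLik(refit_full(read.csv("http://www.ling.upenn.edu/~johnson4/strange.csv")))` would give the value to compare against the -1312.270 reported in the reply.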