Thank you for the thorough report.
On 8/8/06, Daniel Farewell <farewelld at cardiff.ac.uk> wrote:
> I'm fitting a GLMM to some questionnaire data. The structure is J
> individuals, nested within I areas, all of whom answer the same K
> (ordinal) questions. The model I'm using is based on so-called
> continuation ratios, so that it can be fitted using the lme4 package.
>
> The lmer function fits the model just fine, but using mcmcsamp to judge the
> variability of the parameter estimates produces some strange results. The
> posterior sample is constant for the fixed effects, and the estimates of
> the variance components are way out in the tails of their posterior
> samples.
>
> The model I'm using says (for l = 1, ..., L - 1)
>
> logit P(X[ijk] = l | X[ijk] >= l, U[i], V[j], W[k]) = U[i] + V[j] + W[k] + a[l]
>
> where X[ijk] is the ordinal response to question k for individual j in
> area i. The U, V, and W are random effects and the a's are fixed
> effects. Here's a function to simulate data which mimics this setup
> (with a sequence of binary responses Y[ijkl] = 1 iff X[ijk] = l):
>
> sim <- function(n = c(10, 10, 10), sd = c(0.5, 2, 0.5), a = seq(-5, 1, 2)) {
>
>     u <- rnorm(n[1], 0, sd[1])
>     v <- rnorm(prod(n[1:2]), 0, sd[2])
>     w <- rnorm(n[3], 0, sd[3])
>
>     i <- factor(rep(1:n[1], each = prod(n[2:3])))
>     j <- factor(rep(1:prod(n[1:2]), each = n[3]))
>     k <- factor(rep(1:n[3], prod(n[1:2])))
>
>     df <- NULL
>
>     for(l in 1:length(a)) {
>
>         y <- rbinom(length(i), 1, plogis(u[i] + v[j] + w[k] + a[l]))
>         df <- rbind(df, data.frame(i, j, k, l, y))
>
>         i <- i[!y]
>         j <- j[!y]
>         k <- k[!y]
>
>     }
>
>     df$l <- factor(df$l)
>     return(df)
>
> }
>
> And here's a function which shows the difficulties I've been having:
>
> test <- function(seed = 10, ...) {
>
>     require(lme4)
>     set.seed(seed)
>
>     df <- sim(...)
>     df.lmer <- lmer(y ~ 0 + l + (1 | i) + (1 | j) + (1 | k),
>                     family = binomial, data = df)
>     df.mcmc <- mcmcsamp(df.lmer, 1000, trans = FALSE)
>
>     print(summary(df.lmer))
>     print(summary(df.mcmc))
>     densityplot(~ as.numeric(df.mcmc) | factor(rep(colnames(df.mcmc), each = 1000)),
>                 scales = list(relation = "free"))
>
> }
>
> Running
>
> test()
>
> gives the following:
>
> Generalized linear mixed model fit using PQL
> Formula: y ~ 0 + l + (1 | i) + (1 | j) + (1 | k)
> Data: df
> Family: binomial(logit link)
> AIC BIC logLik deviance
> 2316.133 2359.166 -1151.066 2302.133
> Random effects:
> Groups Name Variance Std.Dev.
> j (Intercept) 4.15914 2.03940
> k (Intercept) 0.25587 0.50584
> i (Intercept) 0.56962 0.75473
> number of obs: 3455, groups: j, 100; k, 10; i, 10
>
> Estimated scale (compare to 1) 0.8747598
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> l1 -4.50234 0.40697 -11.0632 < 2.2e-16 ***
> l2 -3.27643 0.38177 -8.5821 < 2.2e-16 ***
> l3 -1.05277 0.36566 -2.8791 0.003988 **
> l4 0.76538 0.36832 2.0780 0.037706 *
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Correlation of Fixed Effects:
> l1 l2 l3
> l2 0.843
> l3 0.841 0.900
> l4 0.805 0.865 0.921
>
> l1 l2 l3 l4
> Min. :-4.502 Min. :-3.276 Min. :-1.053 Min. :0.7654
> 1st Qu.:-4.502 1st Qu.:-3.276 1st Qu.:-1.053 1st Qu.:0.7654
> Median :-4.502 Median :-3.276 Median :-1.053 Median :0.7654
> Mean :-4.502 Mean :-3.276 Mean :-1.053 Mean :0.7654
> 3rd Qu.:-4.502 3rd Qu.:-3.276 3rd Qu.:-1.053 3rd Qu.:0.7654
> Max. :-4.502 Max. :-3.276 Max. :-1.053 Max. :0.7654
> j.(In) k.(In) i.(In) deviance
> Min. :1.911 Min. :0.0509 Min. :0.06223 Min. :2011
> 1st Qu.:2.549 1st Qu.:0.1310 1st Qu.:0.19550 1st Qu.:2044
> Median :2.789 Median :0.1756 Median :0.25581 Median :2054
> Mean :2.824 Mean :0.2085 Mean :0.29948 Mean :2054
> 3rd Qu.:3.070 3rd Qu.:0.2463 3rd Qu.:0.34640 3rd Qu.:2064
> Max. :4.615 Max. :0.8804 Max. :3.62168 Max. :2107
>
> As you can see, the posterior samples for the fixed effects are constant
> (at the initial estimates), and the estimates of the variance components
> aren't within the interquartile ranges of their posterior samples.
>
> I understand from various posts that mcmcsamp is still a work in progress,
> and may not work on every model. Is this one of those cases? I'm using
> R 2.3.1 and lme4 0.995-2 on Windows XP.
In recent versions of the lme4/Matrix packages, setting verbose = TRUE
in a call to mcmcsamp for a generalized linear mixed model causes the
Metropolis-Hastings ratio for the proposed change in the fixed effects
and random effects to be printed at each iteration. When I ran your
example those ratios were always 0, which is either extremely bad luck
or a bug. My guess is that it's a bug.
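
If it helps to reproduce what I saw, the only change needed to your
test() function is the sampler call (a sketch, assuming a current
lme4/Matrix in which mcmcsamp accepts a verbose argument, and treating
the returned sample as a matrix, as your densityplot call already does):

    ## as in test(), but asking mcmcsamp to print the Metropolis-Hastings
    ## acceptance ratios at each iteration
    df.mcmc <- mcmcsamp(df.lmer, 1000, trans = FALSE, verbose = TRUE)

    ## the constancy of the fixed-effects columns can also be checked
    ## numerically: if the chain never moves, their column-wise standard
    ## deviations should come out as (essentially) zero
    apply(as.matrix(df.mcmc), 2, sd)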
Thanks for the report with the reproducible example.