I am fitting models to the responses to a questionnaire that has seven
yes/no questions (Item). For each combination of Subject and Item, the
variable Response is coded as 0 or 1. I want to include random effects
for both Subject and Item.

While I understand that the datasets are fairly small, and there are a
lot of invariant subjects, I do not understand something that is
happening here (and in comparing other subsets of the data). In the data
below, which has been adjusted to show this phenomenon clearly, the
Subject random effect variance is comparable for A (1.63) and B (1.712),
but the Item random effect variance comes out as 0.109 for B and
essentially zero for A (5.00e-10). Note that the only difference between
data set A and data set B occurs on row 19, where a single instance of
Response is changed.

Item   avg. Response in A   avg. Response in B
 1            9%                    9%
 2            9%                    9%
 3            9%                    9%
 4           17%                   17%
 5            4%                    4%
 6           22%                   26%
 7           17%                   17%

Why does the Item random effect sometimes "crash" to zero? Surely there
is some more reasonable estimate of the Item effect than zero. The items
still have clearly different Response behavior. If one compares the
random effect estimates, in fact, one sees that they are in the correct
proportion, with the expected signs. They are just approximately eight
orders of magnitude too small. Is this a bug?

More broadly, is it hopeless to analyze this data in this manner, or
else, what should I try doing differently? It would be very useful to be
able to have reliable estimates of random effect sizes, even when they
are rather small.

I've included replicable code below; sorry that I did not know how to
make it more compact!

library(lme4)

a1  <- c(0,0,0,0,0,0,0)
a2  <- c(0,0,0,0,0,0,0)
a3  <- c(0,0,0,0,0,0,0)
a4  <- c(0,0,0,0,0,0,0)
a5  <- c(0,0,0,0,0,0,0)
a6  <- c(0,0,0,0,0,0,0)
a7  <- c(0,0,0,0,0,0,0)
a8  <- c(0,0,0,0,0,0,0)
a9  <- c(0,0,0,0,0,0,0)
a10 <- c(0,0,0,0,0,0,0)
a11 <- c(0,0,0,0,0,0,0)
a12 <- c(0,0,0,0,0,0,0)
a13 <- c(0,0,0,0,0,0,1)
a14 <- c(0,0,0,0,0,0,1)
a15 <- c(0,0,0,0,0,1,0)
a16 <- c(0,0,0,0,1,0,0)
a17 <- c(0,0,0,1,0,0,0)
a18 <- c(0,0,1,0,0,0,0)
a19 <- c(0,1,0,0,0,0,0)   # differs from b19
a20 <- c(0,1,0,0,0,1,0)
a21 <- c(0,0,0,1,0,1,1)
a22 <- c(1,0,0,1,0,1,1)
a23 <- c(1,0,1,1,0,1,0)

aa <- rbind(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,
            a17,a18,a19,a20,a21,a22,a23)

b1  <- c(0,0,0,0,0,0,0)
b2  <- c(0,0,0,0,0,0,0)
b3  <- c(0,0,0,0,0,0,0)
b4  <- c(0,0,0,0,0,0,0)
b5  <- c(0,0,0,0,0,0,0)
b6  <- c(0,0,0,0,0,0,0)
b7  <- c(0,0,0,0,0,0,0)
b8  <- c(0,0,0,0,0,0,0)
b9  <- c(0,0,0,0,0,0,0)
b10 <- c(0,0,0,0,0,0,0)
b11 <- c(0,0,0,0,0,0,0)
b12 <- c(0,0,0,0,0,0,0)
b13 <- c(0,0,0,0,0,0,1)
b14 <- c(0,0,0,0,0,0,1)
b15 <- c(0,0,0,0,0,1,0)
b16 <- c(0,0,0,0,1,0,0)
b17 <- c(0,0,0,1,0,0,0)
b18 <- c(0,0,1,0,0,0,0)
b19 <- c(0,1,0,0,0,1,0)   # differs from a19
b20 <- c(0,1,0,0,0,1,0)
b21 <- c(0,0,0,1,0,1,1)
b22 <- c(1,0,0,1,0,1,1)
b23 <- c(1,0,1,1,0,1,0)

bb <- rbind(b1,b2,b3,b4,b5,b6,b7,b8,b9,b10,b11,b12,b13,b14,b15,b16,
            b17,b18,b19,b20,b21,b22,b23)

a <- array(0, c(161,3), list(NULL, c("Subject","Item","Response")))
for (s in c(1:23)) for (i in c(1:7)) a[7*(s-1)+i,] <- c(s, i, aa[s,i])
A <- data.frame(a)

b <- array(0, c(161,3), list(NULL, c("Subject","Item","Response")))
for (s in c(1:23)) for (i in c(1:7)) b[7*(s-1)+i,] <- c(s, i, bb[s,i])
B <- data.frame(b)

A.fit <- lmer(Response ~ (1 | Subject) + (1 | Item), A, binomial)
B.fit <- lmer(Response ~ (1 | Subject) + (1 | Item), B, binomial)

A.fit
B.fit
ranef(A.fit)$Item
ranef(B.fit)$Item

Generalized linear mixed model fit using Laplace
Formula: Response ~ (1 | Subject) + (1 | Item)
   Data: A
 Family: binomial(logit link)
 AIC BIC logLik deviance
 120 129  -56.8      114
Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 1.63e+00 1.28e+00
 Item    (Intercept) 5.00e-10 2.24e-05
number of obs: 161, groups: Subject, 23; Item, 7

Estimated scale (compare to  1 )  0.83326

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -2.517      0.395   -6.38  1.8e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> B.fit
Generalized linear mixed model fit using Laplace
Formula: Response ~ (1 | Subject) + (1 | Item)
   Data: B
 Family: binomial(logit link)
 AIC BIC logLik deviance
 123 133  -58.6      117
Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 1.712    1.308
 Item    (Intercept) 0.109    0.330
number of obs: 161, groups: Subject, 23; Item, 7

Estimated scale (compare to  1 )  0.81445

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -2.498      0.415   -6.02  1.8e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> ranef(A.fit)$Item
  (Intercept)
1 -2.8011e-10
2 -2.8011e-10
3 -2.8011e-10
4  7.1989e-10
5 -7.8011e-10
6  1.2199e-09   # 5 1's, 18 0's
7  7.1989e-10

> ranef(B.fit)$Item
  (Intercept)
1   -0.056937
2   -0.056937
3   -0.056937
4    0.120293
5   -0.146925
6    0.293893   # 6 1's, 17 0's
7    0.120293
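[As an aside on the "more compact" remark above: the same long-format
data frames can be built without the explicit loops. This is only a
sketch, and it assumes the aa and bb matrices from the code above; the
row ordering (Items 1-7 within each Subject) matches the loop-based
construction.]

A <- data.frame(Subject  = rep(1:23, each = 7),
                Item     = rep(1:7, times = 23),
                Response = as.vector(t(aa)))   # t(aa) read column-wise = subject by subject
B <- data.frame(Subject  = rep(1:23, each = 7),
                Item     = rep(1:7, times = 23),
                Response = as.vector(t(bb)))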
Daniel Ezra Johnson <johnson4 <at> babel.ling.upenn.edu> writes:
...
> If one compares the random effect estimates, in fact, one sees that
> they are in the correct proportion, with the expected signs. They are
> just approximately eight orders of magnitude too small. Is this a bug?
...

BLUPs are essentially shrinkage estimates, where the amount of shrinkage
is determined by the magnitude of the variance component. The lower the
variance, the more the estimates are shrunk towards the mean, which is
zero in this case. So this is not a bug.

Gregor
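[To make the shrinkage point concrete, here is a minimal sketch of the
classical Gaussian one-way case; the logistic/Laplace case has no closed
form but behaves analogously. The BLUP for group i is the group's
observed deviation times lambda_i = s2_b / (s2_b + s2 / n_i). The numbers
plugged in below simply reuse the Item variances reported in the thread,
with an illustrative residual variance of 1 and 23 observations per item.]

# Shrinkage factor for a Gaussian one-way random-intercept model:
#   BLUP_i = lambda_i * (observed group deviation),
#   lambda_i = s2_b / (s2_b + s2 / n_i)
# s2_b = between-group variance, s2 = residual variance, n = group size.
shrink <- function(s2_b, s2 = 1, n = 23) s2_b / (s2_b + s2 / n)

shrink(0.109)      # Item variance from fit B: clear shrinkage, but nonzero BLUPs
shrink(5.00e-10)   # Item variance from fit A: factor ~1e-8, so the BLUPs are
                   # squashed to essentially zero (roughly the "eight orders of
                   # magnitude" noted above) while keeping their relative pattern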
> BLUPs are essentially shrinkage estimates, where the amount of
> shrinkage is determined by the magnitude of the variance component.
> The lower the variance, the more the estimates are shrunk towards the
> mean, which is zero in this case. So this is not a bug.
>
> Gregor

I doubled each data set by duplicating each Subject. There are now 46
subjects in each group instead of 23, so A and B now differ in 2
observations out of 322. The lmer results are sensible.

I know BLUPs are not like regular parameters, but doubling (or cutting in
half) the data should not, in my opinion, cause this behavior. There is a
lot of room for the original A variance estimate to be lower than B's
(maybe it should be 0.05, 0.01, or whatever), but not < 0.0000000005.

"DOUBLED" (Laplace)

A
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 1.759    1.326
 Item    (Intercept) 0.178    0.422
number of obs: 322, groups: Subject, 46; Item, 7

B
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 1.892    1.376
 Item    (Intercept) 0.319    0.564
number of obs: 322, groups: Subject, 46; Item, 7

"ORIGINAL" (Laplace)

A
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 1.63     1.28
 Item    (Intercept) 5.00e-10 2.24e-05
number of obs: 161, groups: Subject, 23; Item, 7

B
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 1.712    1.308
 Item    (Intercept) 0.109    0.330
number of obs: 161, groups: Subject, 23; Item, 7

By the way, using the PQL method instead of Laplace "fails" even more
badly with the original data (and gives very different estimates with the
doubled data).

"DOUBLED" (PQL)

A
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 2.997    1.731
 Item    (Intercept) 0.509    0.713
number of obs: 322, groups: Subject, 46; Item, 7

B
 Subject (Intercept) 3.317    1.821
 Item    (Intercept) 0.725    0.852
number of obs: 322, groups: Subject, 46; Item, 7

"ORIGINAL" (PQL)

A
1: Estimated variance for factor Item is effectively zero in:
   LMEopt(x = mer, value = cv)
2: Estimated variance for factors Subject, Item is effectively zero in:
   LMEopt(x = mer, value = cv)

B
Estimated variance for factors Subject, Item is effectively zero in:
   LMEopt(x = mer, value = cv)

I understand that the between-Item variance is low, and probably no
greater than what you would expect to occur by chance, but isn't that
what hypothesis testing is for (anova, etc.)?

Is my best way around the algorithm's returning zero to do what I have
done above with my real data, i.e. duplicate (or triplicate) Subjects to
increase the data size and thereby get a reasonably comparable (if wrong)
estimate of the Item variance? Zero is not a reasonable estimate in any
of these data sets.

Thanks,
Daniel

> A.fit    # "DOUBLED" DATA A
Generalized linear mixed model fit using Laplace
Formula: Response ~ (1 | Subject) + (1 | Item)
   Data: A
 Family: binomial(logit link)
 AIC BIC logLik deviance
 232 243   -113      226
Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 1.759    1.326
 Item    (Intercept) 0.178    0.422
number of obs: 322, groups: Subject, 46; Item, 7

Estimated scale (compare to  1 )  0.81576

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -2.601      0.327   -7.95  1.9e-15 ***
---

> B.fit    # "DOUBLED" DATA B
Generalized linear mixed model fit using Laplace
Formula: Response ~ (1 | Subject) + (1 | Item)
   Data: B
 Family: binomial(logit link)
 AIC BIC logLik deviance
 237 249   -116      231
Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 1.892    1.376
 Item    (Intercept) 0.319    0.564
number of obs: 322, groups: Subject, 46; Item, 7

Estimated scale (compare to  1 )  0.8052

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    -2.61       0.36   -7.25  4.1e-13 ***

> ranef(A.fit)$Item    # "DOUBLED" DATA A
  (Intercept)
1   -0.084969
2   -0.084969
3   -0.084969
4    0.325780
5   -0.308044
6    0.515590   # 10 1's, 36 0's
7    0.325780

> ranef(B.fit)$Item    # "DOUBLED" DATA B
  (Intercept)
1    -0.11962
2    -0.11962
3    -0.11962
4     0.42389
5    -0.43555
6     0.88322   # 12 1's, 34 0's
7     0.42389
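[A minimal sketch of the kind of "doubling" described above, assuming the
A data frame built earlier in the thread. How the second copy was
relabelled is not shown in the original post; shifting the Subject IDs by
23 is one way to get the 46 distinct subjects reported. The fit is shown
with current lme4 syntax (glmer); the thread itself used the older
lmer(..., binomial) interface.]

library(lme4)

# Stack two copies of A and give the second copy new Subject IDs (24-46),
# so the doubled data has 46 subjects rather than 23 subjects seen twice.
A2 <- rbind(A, transform(A, Subject = Subject + 23))

A2.fit <- glmer(Response ~ (1 | Subject) + (1 | Item),
                data = A2, family = binomial)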
I've found a way to make this problem, if it's not a bug, clearer. I've
taken my original data set A and simply doubled it with AA <- rbind(A,A).
Doing so, instead of this:

Random effects:   # A
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 1.63e+00 1.28e+00
 Item    (Intercept) 5.00e-10 2.24e-05
number of obs: 161, groups: Subject, 23; Item, 7

I get this:

Random effects:   # AA
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 3.728    1.931
 Item    (Intercept) 0.319    0.565
number of obs: 322, groups: Subject, 23; Item, 7

I think I understand why the Subject effect increases: on a per-Response
basis, there is more variation between Subjects. However, the probability
(hence the log-odds, etc.) of the Response seems to be constant, so I am
not sure. In any case, the ranef variance increases by a factor of 2.3 in
going from A to AA:

Response 1:0   Dataset A   Dataset AA
Subject 1        0 : 7       0 : 14
Subject 2        0 : 7       0 : 14
Subject 3        0 : 7       0 : 14
Subject 4        0 : 7       0 : 14
Subject 5        0 : 7       0 : 14
Subject 6        0 : 7       0 : 14
Subject 7        0 : 7       0 : 14
Subject 8        0 : 7       0 : 14
Subject 9        0 : 7       0 : 14
Subject 10       0 : 7       0 : 14
Subject 11       0 : 7       0 : 14
Subject 12       0 : 7       0 : 14
Subject 13       1 : 6       2 : 12
Subject 14       1 : 6       2 : 12
Subject 15       1 : 6       2 : 12
Subject 16       1 : 6       2 : 12
Subject 17       1 : 6       2 : 12
Subject 18       1 : 6       2 : 12
Subject 19       1 : 6       2 : 12
Subject 20       2 : 5       4 : 10
Subject 21       3 : 4       6 : 8
Subject 22       4 : 3       8 : 6
Subject 23       4 : 3       8 : 6
Ranef s^2        1.630       3.728

Comparing the Items, though, we have:

Response 1:0   Dataset A   Dataset AA
Item 1           2 : 21      4 : 42
Item 2           2 : 21      4 : 42
Item 3           2 : 21      4 : 42
Item 4           4 : 19      8 : 38
Item 5           1 : 22      2 : 44
Item 6           5 : 18     10 : 36
Item 7           4 : 19      8 : 38
Ranef s^2        5.00e-10    0.319

Looking at this completely naively, I don't understand why whatever
statistical difference doubling the data makes for each Item shouldn't be
similar to what happened above for Subject. Instead, we have an estimate
of zero for the smaller data set.

I realize that I don't understand the actual mathematics by which the
BLUPs (and thus the variances) are calculated, but something seems wrong
here. Obviously, between A and AA nothing changes as far as the
interaction of Subject and Item in particular combinations.

How should I best understand this behavior of lmer, and what is my best
advice for obtaining reliable random effect size estimates from data like
this? Are the estimates at least reliable when they are NOT zero?

Thanks,
Daniel

P.S. I'll point out again that by multiplying the "zero" ranef estimates
for A by an enormous constant (see below), they come out almost identical
to those for AA. Obviously these coefficients derive from the marginal
proportions in the data. And I understand that with small N's, the random
effect may not be significant. But that shouldn't mean that the estimate
is zero, right? (Compare fixed effect coefficients, which are often
non-zero yet not significant...)
library(lme4)

a1  <- c(0,0,0,0,0,0,0)
a2  <- c(0,0,0,0,0,0,0)
a3  <- c(0,0,0,0,0,0,0)
a4  <- c(0,0,0,0,0,0,0)
a5  <- c(0,0,0,0,0,0,0)
a6  <- c(0,0,0,0,0,0,0)
a7  <- c(0,0,0,0,0,0,0)
a8  <- c(0,0,0,0,0,0,0)
a9  <- c(0,0,0,0,0,0,0)
a10 <- c(0,0,0,0,0,0,0)
a11 <- c(0,0,0,0,0,0,0)
a12 <- c(0,0,0,0,0,0,0)
a13 <- c(0,0,0,0,0,0,1)
a14 <- c(0,0,0,0,0,0,1)
a15 <- c(0,0,0,0,0,1,0)
a16 <- c(0,0,0,0,1,0,0)
a17 <- c(0,0,0,1,0,0,0)
a18 <- c(0,0,1,0,0,0,0)
a19 <- c(0,1,0,0,0,0,0)
a20 <- c(0,1,0,0,0,1,0)
a21 <- c(0,0,0,1,0,1,1)
a22 <- c(1,0,0,1,0,1,1)
a23 <- c(1,0,1,1,0,1,0)

aa <- rbind(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,
            a17,a18,a19,a20,a21,a22,a23)

a <- array(0, c(161,3), list(NULL, c("Subject","Item","Response")))
for (s in c(1:23)) for (i in c(1:7)) a[7*(s-1)+i,] <- c(s, i, aa[s,i])
A <- data.frame(a)

AA <- rbind(A, A)

A.fit  <- lmer(Response ~ (1 | Subject) + (1 | Item), A,  binomial)
AA.fit <- lmer(Response ~ (1 | Subject) + (1 | Item), AA, binomial)

A.fit
AA.fit
ranef(A.fit)$Item
(ranef(A.fit)$Item) * 578500000
ranef(AA.fit)$Item
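[The per-Subject and per-Item tallies in the tables above can be
reproduced directly from the data frames; a minimal sketch, assuming the
A and AA objects defined in the code just above.]

# Cross-tabulate Response (0/1) by Subject and by Item; the counts match
# the 1:0 tallies listed in the tables above (columns are Response 0 and 1).
xtabs(~ Subject + Response, data = A)
xtabs(~ Item + Response, data = A)
xtabs(~ Item + Response, data = AA)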
Daniel Ezra Johnson <johnson4 <at> babel.ling.upenn.edu> writes:
...
> More broadly, is it hopeless to analyze this data in this manner, or
> else, what should I try doing differently? It would be very useful to
> be able to have reliable estimates of random effect sizes, even when
> they are rather small.
...

You might try mcmcsamp() to get a better view of the posterior
distributions of your parameters. It might be the case that the MLE for
the item variance is near 0, while its posterior distribution also covers
values that are higher than 0.

Gregor
Gregor,

Thanks for your replies.

1) Yes, I have tweaked the data to show as clearly as I can that this is
a bug: a tiny change in the initial conditions causes the collapse of a
reasonable 'parameter' estimate.

2) mcmcsamp() does not work (currently) for binomial fitted models.

3) This is an issue of what happens when the sample is too small. For all
larger data sets I have gotten a ranef variance between 0.05 and 1.00 or
so.

It makes no sense to say that as the data set gets smaller, the
systematic variation between Items goes away. It doesn't, as I've shown:
in the data above, certain Items were still 10+ times as likely
(log-odds-wise) to have Response == 1 as others.

It may make sense to say that the effect becomes unestimable due to its
small size, but my understanding is not that this should make the
algorithm return zero as an estimated value.

D
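[For reference, the raw per-Item proportions, odds, and log-odds behind
the claim in point 3 can be computed directly, ignoring the Subject
structure entirely; a sketch, assuming the A data frame built at the top
of the thread.]

# Raw per-Item response proportions and (log-)odds in data set A,
# with no adjustment for the Subject random effect.
p <- with(A, tapply(Response, Item, mean))
round(cbind(proportion = p,
            odds       = p / (1 - p),
            log.odds   = log(p / (1 - p))), 3)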
> From: Andrew Robinson <A.Robinson_at_ms.unimelb.edu.au>
> Date: Mon 01 Jan 2007 - 19:19:29 GMT
>
> I tried an earlier version of R, on a different platform, and got quite
> different results. Sadly, the *earlier* results are the ones that make
> sense.

Andrew,

I tried installing R 2.3.1, and it seemed to work as you said, but I
realized that was because the default method was PQL instead of Laplace.
Switching the method to Laplace, the zero estimate came back.

It seems like there's no easy way around this. I would like to compare
random effect sizes between different subsets of my data, but this may
not be possible using the current functions...

Daniel
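[For readers trying to reproduce this exchange: in the lme4 versions of
that era the approximation was chosen through lmer's method argument, as
the messages above imply. The exact argument names below are an
assumption about that old interface; current lme4 has dropped it, and
glmer() uses the Laplace approximation (or adaptive Gauss-Hermite via
nAGQ, for a single grouping factor only).]

# Sketch of switching approximations in the old lme4 interface
# (assumed arguments; does not apply to current lme4):
A.fit.pql     <- lmer(Response ~ (1 | Subject) + (1 | Item), A, binomial,
                      method = "PQL")
A.fit.laplace <- lmer(Response ~ (1 | Subject) + (1 | Item), A, binomial,
                      method = "Laplace")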
On Sun, 31 Dec 2006, at 05:50 PM, Daniel Ezra Johnson
<johnson4 at babel.ling.upenn.edu> wrote:
...
> It may make sense to say that the effect becomes unestimable due to its
> small size, but my understanding is not that this should make the
> algorithm return zero as an estimated value.

There is always the possibility that the Laplace approximation is proving
too inaccurate for this problem, but that seems unlikely, as there should
be enough observations. The only way to check is to use a package that
uses adaptive Gauss-Hermite quadrature for the integration; gllamm in
Stata or NLMIXED in SAS may do it. PQL is even worse, so it is not an
option.

The real problem is that there is not enough variation across your items,
and so the estimate of the random effect is close to zero. The only
difference between your datasets is that one results in an estimate
closer to zero. Fit the model without (1|Item) and the fit hardly worsens
in either case, with better resulting AIC and BIC.

The responses by item can be checked with

> apply(aa, 2, sum)
[1] 2 2 2 4 1 5 4
> apply(bb, 2, sum)
[1] 2 2 2 4 1 6 4

This is not perfect, as it ignores the subject variation, but they don't
have a lot of variation.
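[A sketch of the model comparison suggested above: drop the Item
intercept and compare the two fits. It is written with current lme4's
glmer(); the thread's original lmer(..., binomial) calls would be
compared the same way, and the A data frame is the one constructed at the
top of the thread.]

library(lme4)

# Full model (Subject and Item intercepts) vs. reduced model (Subject only).
A.full    <- glmer(Response ~ (1 | Subject) + (1 | Item),
                   data = A, family = binomial)
A.reduced <- glmer(Response ~ (1 | Subject),
                   data = A, family = binomial)

# Likelihood-ratio test and information criteria.  Note that the null
# hypothesis (Item variance = 0) sits on the boundary of the parameter
# space, so the nominal chi-square p-value is conservative.
anova(A.reduced, A.full)
AIC(A.reduced, A.full)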