thr3ads.net - R help - [R] lme4/lmer: P-Values from mcmc samples or chi2-tests? [Feb 2007]

If this information is useful, please help other people find it:
Share via:

Christoph Scherber

2007-Feb-13 13:45 UTC

[R] lme4/lmer: P-Values from mcmc samples or chi2-tests?

Dear R users,

I have now tried out several options of obtaining p-values for 
(quasi)poisson lmer models, including Markov-chain Monte Carlo sampling 
and single-term deletions with subsequent chi-square tests (although I 
am aware that the latter may be problematic).

However, I encountered several problems that can be classified as
(1) the quasipoisson lmer model does not give p-values when summary() is 
called (see below)
(2) dependence on the size of the mcmc sample
(3) lack of correspondence between different p-value estimates.

How can I proceed, left with these uncertainties in the estimations of 
the p-values?

Below is the corresponding R code with some output so that you can see 
it all for your own:

##
m1<-lmer(number_pollinators~logpatch+loghab+landscape_diversity+(1|site),primula,poisson,method="ML")
m2<-lmer(number_pollinators~logpatch+loghab+landscape_diversity+(1|site),primula,quasipoisson,method="ML")
summary(m1);summary(m2)

#m1: [...]
Fixed effects:
                     Estimate Std. Error z value Pr(>|z|)
(Intercept)         -0.40302    0.57403 -0.7021  0.48262
logpatch             0.10915    0.04111  2.6552  0.00793 **
loghab               0.08750    0.06128  1.4279  0.15331
landscape_diversity  1.02338    0.40604  2.5204  0.01172 *

#m2: [...] #for the quasipoisson model, no p values are shown?!
Fixed effects:
                     Estimate Std. Error t value
(Intercept)          -0.4030     0.6857 -0.5877
logpatch              0.1091     0.0491  2.2228
loghab                0.0875     0.0732  1.1954
landscape_diversity   1.0234     0.4850  2.1099

##

anova(m1)
#here, no p-values appear (only in the current version of lme4)

Analysis of Variance Table
                     Df  Sum Sq Mean Sq
logpatch             1 11.6246 11.6246
loghab               1  6.0585  6.0585
landscape_diversity  1  6.3024  6.3024

anova(m2)
Analysis of Variance Table
                     Df  Sum Sq Mean Sq
logpatch             1 11.6244 11.6244
loghab               1  6.0581  6.0581
landscape_diversity  1  6.3023  6.3023

So I am left with the p-values estimated from the poisson model; 
single-term deletion tests for each of the individual terms lead to 
different p-values; I restrict this here to m2:

##
m2a=update(m2,~.-loghab)
anova(m2,m2a,test="Chi")

Data: primula
Models:
m2a: number_pollinators ~ logpatch + landscape_diversity + (1 | site)
m2: number_pollinators ~ logpatch + loghab + landscape_diversity + (1 | 
site)
     Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
m2a  4  84.713  91.850 -38.357
m2   5  84.834  93.755 -37.417 1.8793      1     0.1704

##
m2b=update(m2,~.-logpatch)
anova(m2,m2b,test="Chi")

Data: primula
Models:
m2b: number_pollinators ~ loghab + landscape_diversity + (1 | site)
m2: number_pollinators ~ logpatch + loghab + landscape_diversity +
m2b:     (1 | site)
     Df     AIC     BIC  logLik Chisq Chi Df Pr(>Chisq)
m2b  4  90.080  97.217 -41.040
m2   5  84.834  93.755 -37.417 7.246      1   0.007106 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05
'.' 0.1 ' ' 1

##
m2c=update(m2,~.-landscape_diversity)
anova(m2,m2c,test="Chi")

Data: primula
Models:
m2c: number_pollinators ~ logpatch + loghab + (1 | site)
m2: number_pollinators ~ logpatch + loghab + landscape_diversity +
m2c:     (1 | site)
     Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
m2c  4  89.036  96.173 -40.518
m2   5  84.834  93.755 -37.417 6.2023      1    0.01276 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05
'.' 0.1 ' ' 1


The p-values from mcmc are:

##
markov1=mcmcsamp(m2,5000)

HPDinterval(markov1)
                             lower      upper
(Intercept)          -1.394287660  0.6023229
logpatch              0.031154910  0.1906861
loghab                0.002961281  0.2165285
landscape_diversity   0.245623183  1.6442544
log(site.(In))      -41.156007604 -1.6993996
attr(,"Probability")
[1] 0.95

##

mcmcpvalue(as.matrix(markov1[,1])) #i.e. the p value for the intercept
[1] 0.3668
 > mcmcpvalue(as.matrix(markov1[,2])) #i.e. the p-value for logpatch
[1] 0.004
 > mcmcpvalue(as.matrix(markov1[,3])) #i.e. the p-value for loghab
[1] 0.0598
 > mcmcpvalue(as.matrix(markov1[,4])) #i.e. the p-value for landscape.div
[1] 0.0074

If one runs the mcmcsamp function for, say, 50,000 runs, the p-values 
are slightly different (not shown here).

##here are the p-values summarized in tabular form:

(Intercept)        	0.3668
logpatch         	0.004
loghab            	0.0598
landscape_diversity    	0.0074


##and from the single-term deletions:

(Intercept)        N.A.
logpatch        	0.007106
loghab            	0.1704
landscape_diversity    	0.01276


## Compare this with the p-values from the poisson model:
Fixed effects:
                     Estimate Std. Error z value Pr(>|z|)
(Intercept)         -0.40302    0.57403 -0.7021  0.48262
logpatch             0.10915    0.04111  2.6552  0.00793 **
loghab               0.08750    0.06128  1.4279  0.15331
landscape_diversity  1.02338    0.40604  2.5204  0.01172 *

##

To summarize, at least for quasipoisson models, the p-values obtained 
from mcmcpvalue() are quite different from those obtained using 
single-term deletions followed by a chisquare test.

Especially in the case of "loghab", the difference is so huge that one
could tend to interpret one of the p-values as "marginally
significant".

However, the mcmc chains look allright.

What would your suggestions be on how to proceed?

Thanks a lot for your help!

Best wishes,
Christoph.



##
I am using R 2.4.1 and the lme4 version I use is  0.9975-11 (Date: 
2007-01-25)

Manuel Morales

2007-Feb-14 20:54 UTC

head link

[R] lme4/lmer: P-Values from mcmc samples or chi2-tests?

On Tue, 2007-02-13 at 14:45 +0100, Christoph Scherber
wrote:> Dear R users,
> 
> I have now tried out several options of obtaining p-values for 
> (quasi)poisson lmer models, including Markov-chain Monte Carlo sampling 
> and single-term deletions with subsequent chi-square tests (although I 
> am aware that the latter may be problematic).
> 
> However, I encountered several problems that can be classified as
> (1) the quasipoisson lmer model does not give p-values when summary() is 
> called (see below)
> (2) dependence on the size of the mcmc sample
> (3) lack of correspondence between different p-value estimates.
> 
> How can I proceed, left with these uncertainties in the estimations of 
> the p-values?
> 
> Below is the corresponding R code with some output so that you can see 
> it all for your own:
> 
> ##
>
m1<-lmer(number_pollinators~logpatch+loghab+landscape_diversity+(1|site),primula,poisson,method="ML")
>
m2<-lmer(number_pollinators~logpatch+loghab+landscape_diversity+(1|site),primula,quasipoisson,method="ML")
> summary(m1);summary(m2)
> 
> #m1: [...]
> Fixed effects:
>                      Estimate Std. Error z value Pr(>|z|)
> (Intercept)         -0.40302    0.57403 -0.7021  0.48262
> logpatch             0.10915    0.04111  2.6552  0.00793 **
> loghab               0.08750    0.06128  1.4279  0.15331
> landscape_diversity  1.02338    0.40604  2.5204  0.01172 *
<snip>> The p-values from mcmc are:
> 
> ##
> markov1=mcmcsamp(m2,5000)
> 
> HPDinterval(markov1)
>                              lower      upper
> (Intercept)          -1.394287660  0.6023229
> logpatch              0.031154910  0.1906861
> loghab                0.002961281  0.2165285
> landscape_diversity   0.245623183  1.6442544
> log(site.(In))      -41.156007604 -1.6993996
> attr(,"Probability")
> [1] 0.95
> 
> ##
> 
> mcmcpvalue(as.matrix(markov1[,1])) #i.e. the p value for the intercept
> [1] 0.3668
>  > mcmcpvalue(as.matrix(markov1[,2])) #i.e. the p-value for logpatch
> [1] 0.004
>  > mcmcpvalue(as.matrix(markov1[,3])) #i.e. the p-value for loghab
> [1] 0.0598
>  > mcmcpvalue(as.matrix(markov1[,4])) #i.e. the p-value for
landscape.div
> [1] 0.0074
> 
> If one runs the mcmcsamp function for, say, 50,000 runs, the p-values 
> are slightly different (not shown here).
> 
> ##here are the p-values summarized in tabular form:<snip>
[MCMC] > logpatch         	0.004
> loghab            	0.0598
> landscape_diversity    	0.0074<snip>
[single-term deletions] > logpatch        	0.007106
> loghab            	0.1704
> landscape_diversity    	0.01276
<snip> > To summarize, at least for quasipoisson models, the p-values obtained 
> from mcmcpvalue() are quite different from those obtained using 
> single-term deletions followed by a chisquare test.
> 
> Especially in the case of "loghab", the difference is so huge
that one
> could tend to interpret one of the p-values as "marginally
significant".
The P-values from the MCMC chain look suspiciously like 1/2 the others.
Are you effectively doing a 1-tailed test in your MCMC comparison?

-- 
Manuel A. Morales
http://mutualism.williams.edu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url :
https://stat.ethz.ch/pipermail/r-help/attachments/20070214/8ea87faa/attachment.bin

Reasonably Related Threads

Search for more apparently analagous threads

R help - Feb 2007 - lme4/lmer: P-Values from mcmc samples or chi2-tests?

[R] lme4/lmer: P-Values from mcmc samples or chi2-tests?

[R] lme4/lmer: P-Values from mcmc samples or chi2-tests?

Reasonably Related Threads