Dear all,
A question related to the following has been asked on R-help before, but
I could not find any answer to it. Input will be much appreciated.
I got an unexpected sign of the "slope" parameter associated with a
covariate (diam) using zeroinfl(). It led me to compare the estimates
given by zeroinfl() and hurdle():
The (significant) negative estimate here is surprising, given the
biology of the species:
> summary(zeroinfl(bnl ~ 1 | diam, dist = "poisson", data = valdaekar,
                   EM = TRUE))

Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  21.7510     7.6525   2.842  0.00448 **
diam         -1.1437     0.3941  -2.902  0.00371 **

Number of iterations in BFGS optimization: 1
Log-likelihood: -582.8 on 3 Df
The hurdle model gives the same estimates, but with opposite (and
expected) signs of the parameters:
> summary(hurdle(bnl ~ 1 | diam, dist = "poisson", data = valdaekar))

Count model coefficients (truncated poisson with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***

Zero hurdle model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -21.7510     7.6525  -2.842  0.00448 **
diam          1.1437     0.3941   2.902  0.00371 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Number of iterations in BFGS optimization: 8
Log-likelihood: -582.8 on 3 Df
Why is this so?
thanks,
Tord
Windows NT, R 2.8.1, pscl 1.03
--
Tord Snäll
Department of Ecology / Swedish Species Information Centre
Swedish University of Agricultural Sciences (SLU)
P.O. 7044, SE-750 07 Uppsala, Sweden
Office/Mobile/Fax
+46-18-672612/+46-76-7662612/+46-18-673537
www.ekol.slu.se/staff_tordsnall
www.artdata.slu.se/personal/fototsn.asp
Tord: The logistic zero-inflation portion of the zeroinfl()
implementation of ZIP or ZINB predicts the probability of a 0 rather than
the probability of a 1 (a count > 0), so the signs of the coefficients are
often reversed from what you would expect had you fit an ordinary
logistic regression. I'm guessing that the hurdle model, as a two-stage
model, uses a logistic regression predicting the probability of a 1,
hence the reversed signs of the estimates in the logistic regression
portion of the model.
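
To make the sign reversal concrete, here is a minimal base-R sketch that plugs the coefficients reported in this thread into the inverse logit. The diameter values are hypothetical, picked only for illustration, and the reading of the hurdle part as modelling P(count > 0) is the interpretation described above, not something checked against the package source here.

```r
# Coefficients copied from the two summaries in this thread
b_zeroinfl <- c(intercept = 21.7510, diam = -1.1437)   # zero-inflation part
b_hurdle   <- c(intercept = -21.7510, diam = 1.1437)   # zero hurdle part

# Hypothetical diameters, chosen only for illustration
diam <- c(15, 19, 23)

# zeroinfl(): the linear predictor is the logit of P(excess zero)
p_excess_zero <- plogis(b_zeroinfl[["intercept"]] + b_zeroinfl[["diam"]] * diam)

# hurdle(): the linear predictor is read as the logit of P(count > 0)
p_positive <- plogis(b_hurdle[["intercept"]] + b_hurdle[["diam"]] * diam)

# The two columns are complements: each row sums to 1
print(cbind(diam, p_excess_zero, p_positive))
```

Under that reading both fits say the same thing: larger diam means a lower probability of a zero and a higher probability of a positive count.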
Brian
Brian S. Cade, PhD
U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO 80526-8818
email: brian_cade@usgs.gov
tel: 970 226-9326
From: Tord Snäll <tord.snall@ekol.slu.se>
To: r-help@r-project.org
Date: 10/23/2009 07:40 AM
Subject: [R] opposite estimates from zeroinfl() and hurdle()
Sent by: r-help-bounces@r-project.org
Tord Snäll-4 wrote:
>
> Dear all,
> A question related to the following has been asked on R-help before, but
> I could not find any answer to it. Input will be much appreciated.
>
> I got an unexpected sign of the "slope" parameter associated with a
> covariate (diam) using zeroinfl(). It led me to compare the estimates
> given by zeroinfl() and hurdle():
> [snip]
>

The right thing to do in this case is to poke through the code of
hurdle() and zeroinfl(), but a simple (?) demonstration shows that
hurdle() and zeroinfl() are indeed reporting opposite values:

hurdle() reports -log(p/(1-p)) = -qlogis(p), where p is the probability
of a zero count:

  library(pscl)  ## provides hurdle() and zeroinfl()
  z = rpois(500,lambda=3)
  z = (z[z>0])[1:90]   ## keep 90 positive counts
  z = c(z,rep(0,10))   ## add exactly 10 zeros
  hurdle(z~1)  ## -qlogis(0.1)
  ## zero coefficient always == -qlogis(0.1)

zeroinfl() reports log(p/(1-p)), where p is the zero-inflation
probability:

  z = rpois(90,lambda=3)
  z = c(z,rep(0,10))
  zeroinfl(z~1)  ## qlogis(0.1)

  ## repeat the fit to see the sampling distribution of the zero coefficient
  tmpf = function() {
    z = rpois(90,lambda=3)
    z = c(z,rep(0,10))
    coef(zeroinfl(z~1))[2]
  }
  rr = replicate(1000,tmpf())
  hist(rr,breaks=1000)
  summary(rr)
  qlogis(0.1)

Perhaps it would be worth sending an e-mail to the package maintainers to
request a note to this effect in the documentation, particularly if this
is a FAQ ...
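
A possible reason the two fits in the original post agree in magnitude, not just in (reversed) sign, can be sketched in base R. It relies on the standard zero-inflated Poisson identity P(Y = 0) = pi + (1 - pi) * exp(-lambda); the diameter of 19 is a hypothetical value used only for illustration.

```r
# Count-model mean implied by the intercept reported in both summaries
lambda <- exp(3.74604)

# Probability that the Poisson part itself produces a zero
p_pois_zero <- exp(-lambda)   # same as dpois(0, lambda)

# Zero-inflation probability at a hypothetical diam = 19
pi_zero <- plogis(21.7510 - 1.1437 * 19)

# Total probability of a zero under the zero-inflated Poisson model
p0_zip <- pi_zero + (1 - pi_zero) * p_pois_zero

print(c(lambda = lambda, p_pois_zero = p_pois_zero,
        pi_zero = pi_zero, p0_zip = p0_zip))
```

With a mean this large the Poisson part almost never yields a zero, so essentially every zero is an "excess" zero and the zero-inflation probability coincides with the overall probability of a zero. That would explain the identical log-likelihoods and mirrored coefficients; with a smaller count mean the two models would not be expected to mirror each other exactly.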