Lauria, Valentina
2013-Aug-13 15:06 UTC
[R] Problem with zero-inflated negative binomial model in sediment river dynamics
Dear All,

I am running a negative binomial model in R using the package pscl in order to estimate bed sediment movements versus river discharge. Currently we have deployed 4 different plates to test if a combination of more than one plate would better describe the sediment movements when the river discharge changes over time.

My data are positively skewed and zero-inflated. I ran both zero-inflated Poisson and zero-inflated negative binomial regressions and compared them using the Vuong test, which showed that the negative binomial works better than a simple zero-inflated Poisson.

My models look like:

1) plate1 ~ river discharge
2) (plate 1 + plate 2) ~ river discharge
3) (plate 1 + plate 2 + plate 3) ~ river discharge
4) (plate 1 + plate 2 + plate 3 + plate 4) ~ river discharge

My main problem, as I am new to this type of model, is that the coefficient of discharge has opposite signs in the count and zero-inflation parts of the zero-inflated negative binomial output (please see below). What does this mean? Also, how could I compare the different models (1-4), i.e. what tells me which is performing best?

Thank you very much in advance for any comments and suggestions!!

Kind Regards,
Valentina

Call:
zeroinfl(formula = plate1 ~ discharge, data = datafit_plates, dist = "negbin", EM = TRUE)

Pearson residuals:
    Min      1Q  Median      3Q     Max
-0.6770 -0.3564 -0.2101 -0.0814 12.3421

Count model coefficients (negbin with log link):
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.557066   0.036593   69.88   <2e-16 ***
discharge    0.064698   0.001983   32.63   <2e-16 ***
Log(theta)  -0.775736   0.012451  -62.30   <2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 13.01011    0.22602   57.56   <2e-16 ***
discharge   -1.64293    0.03092  -53.14   <2e-16 ***

Theta = 0.4604
Number of iterations in BFGS optimization: 1
Log-likelihood: -6.933e+04 on 5 Df
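As a reading aid for the count part of the output above, here is a minimal base-R sketch that uses only the printed coefficients. It back-transforms the log-link estimates: exp(discharge coefficient) is the multiplicative change in the expected count of the negative binomial component per unit of discharge, and exp(Log(theta)) should reproduce the reported Theta. The value discharge_example is a made-up illustration, not taken from the data.

```r
# Coefficients copied from the printed zeroinfl() summary.
count_intercept <- 2.557066
count_discharge <- 0.064698
log_theta       <- -0.775736

# Log link: exp(coef) is a multiplicative effect on the expected count
# of the negative binomial component (roughly a 6.7% increase per unit).
rate_ratio <- exp(count_discharge)

# Theta is estimated on the log scale; back-transform to compare
# with the "Theta = 0.4604" line of the output.
theta <- exp(log_theta)

# Expected count of the negbin component at a hypothetical discharge.
discharge_example <- 10  # assumption: illustrative value only
mu_count <- exp(count_intercept + count_discharge * discharge_example)

print(c(rate_ratio = rate_ratio, theta = theta, mu_count = mu_count))
```

Note that mu_count is the mean of the count component only; the overall expected value also depends on the zero-inflation part.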
Cade, Brian
2013-Aug-13 19:46 UTC
[R] Problem with zero-inflated negative binomial model in sediment river dynamics
Lauria:

For historical reasons the logistic regression (binomial with logit link) portion of a zero-inflated count model is usually structured to predict the probability of the 0 counts rather than the nonzero (>=1) counts, so the coefficients will have the opposite sign from what you would expect based on the count model portion (as in your output). To interpret the logistic regression portion as the probability of the nonzero counts, simply take the negative of the coefficient estimates provided for the probability of the zero counts.

Brian

Brian S. Cade, PhD
U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO 80526-8818
email: cadeb@usgs.gov <brian_cade@usgs.gov>
tel: 970 226-9326
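The sign flip described in the reply can be checked numerically. The sketch below uses base R only and plugs in the zero-inflation coefficients from the posted output; the discharge values are hypothetical and chosen just for illustration. One caveat worth hedging: in a zero-inflated model the logit part describes the probability of an excess zero, so its complement is the probability that an observation comes from the count process, which can itself still produce zeros.

```r
# Zero-inflation coefficients copied from the printed output (logit link).
zero_intercept <- 13.01011
zero_discharge <- -1.64293

# Hypothetical discharge values, for illustration only.
discharge <- c(5, 8, 10)

# As printed: probability of an excess zero at each discharge.
p_zero <- plogis(zero_intercept + zero_discharge * discharge)

# Negated coefficients: probability of NOT being an excess zero.
p_not_zero <- plogis(-zero_intercept - zero_discharge * discharge)

# The two probabilities are complements of each other.
stopifnot(isTRUE(all.equal(p_zero + p_not_zero, rep(1, length(discharge)))))

# Discharge at which the excess-zero probability equals 0.5.
crossover <- -zero_intercept / zero_discharge

print(data.frame(discharge, p_zero, p_not_zero))
print(crossover)
```

Read this way, the two parts of the model tell the same story: higher discharge lowers the chance of an excess zero and raises the expected count.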