Sarah Valencia
2009-Nov-29 05:48 UTC
[R] Convergence problem with zeroinfl() and hurdle() when interaction term added
Hello,

I have a data frame with 1425 observations, 539 of which are zeros. I am trying to fit the following ZINB:

    f3 <- formula(Nbr_Abs ~ Zone * Year + Source)
    ZINB2 <- zeroinfl(f3, dist = "negbin", link = "logit", data = TheData,
                      offset = log(trans.area), trace = TRUE)

Zone is a factor with 4 levels, Year a factor with 27 levels, and Source a factor with 3 levels. Nbr_Abs is counts of a species that shows a high level of aggregation. These counts are offset by the area searched per transect.

The trace output and error message are as follows:

    Zero-inflated Count Model
    count model: negbin with log link
    zero-inflation model: binomial with logit link
    dependent variable:
      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
    539 125  72  41  33  35  35  31  15  22  22  11  13  16  13
    .... (truncated for brevity)...
    285 286 287 288 289 290 291 292 293 294 <NA>
      0   0   0   0   0   0   0   0   0   1    0
    generating starting values...done
    calling optim() for ML estimation:
    Error in optim(fn = loglikfun, gr = gradfun, par = c(start$count, start$zero, :
      non-finite value supplied by optim
    In addition: Warning message:
    In glm.fit(Z, as.integer(Y0), weights = weights, family binomial(link = linkstr)) :
      fitted probabilities numerically 0 or 1 occurred

I get the same optim error when I run a similar call using hurdle instead of zeroinfl. However, both commands work fine when the interaction term is removed (Nbr_Abs ~ Zone + Year + Source). Is this a case of some kind of linear relationship between my covariates? In addition, I can run a negative binomial glm with the interaction term, which I didn't think would be possible if that were the case.

Any help would be much appreciated!

Thanks,
Sarah

--
Sarah Valencia, PhD student
Bren School of Environmental Science and Management
University of California Santa Barbara, CA 93106
Lab: (805) 893-5054
Achim Zeileis
2009-Nov-29 09:03 UTC
[R] Convergence problem with zeroinfl() and hurdle() when interaction term added
On Sat, 28 Nov 2009, Sarah Valencia wrote:

> Hello,
>
> I have a data frame with 1425 observations, 539 of which are zeros. I
> am trying to fit the following ZINB:
>
> f3<-formula(Nbr_Abs~ Zone * Year + Source)
> ZINB2<-zeroinfl(f3, dist="negbin", link= "logit", data=TheData,
> offset=log(trans.area), trace=TRUE)
>
> Zone is a factor with 4 levels, Year a factor with 27 levels, and
> Source a factor with 3 levels. Nbr_Abs is counts of a species that
> shows a high level of aggregation. These counts are offset by the area
> searched per transect.
>
> The trace output and error message are as follows:
>
> Zero-inflated Count Model
> count model: negbin with log link
> zero-inflation model: binomial with logit link
> dependent variable:
>   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
> 539 125  72  41  33  35  35  31  15  22  22  11  13  16  13
> .... (truncated for brevity)...
> 285 286 287 288 289 290 291 292 293 294 <NA>
>   0   0   0   0   0   0   0   0   0   1    0
> generating starting values...done
> calling optim() for ML estimation:
> Error in optim(fn = loglikfun, gr = gradfun, par = c(start$count,
> start$zero, :
> non-finite value supplied by optim
> In addition: Warning message:
> In glm.fit(Z, as.integer(Y0), weights = weights, family
> binomial(link = linkstr)) :
> fitted probabilities numerically 0 or 1 occurred
>
> I get the same optim error when I run a similar call using hurdle
> instead of zeroinfl. However, both commands work fine when the
> interaction term is removed ( Nbr_Abs~ Zone + Year + Source). Is this
> a case of some kind of linear relationship between my covariates? In
> addition, I can run a negative binomial glm with the interaction term,
> which I didn't think would be possible if that were the case.

My guess is that there is (quasi-)complete separation when you add the interaction term, i.e., that in at least one of the Zone-by-Year interaction groups all counts are zero or all counts are non-zero.
See

    xtabs(~ factor(Nbr_Abs > 0) + Zone + Year, data = TheData)

In this case the maximum likelihood estimate does not exist and the same warnings as above will occur when you try to fit a logit model for non-zero counts:

    glm(factor(Nbr_Abs > 0) ~ Zone * Year + Source, data = TheData, family = binomial)

hth,
Z

> Any help would be much appreciated!
>
> Thanks,
> Sarah
>
> --
> Sarah Valencia, PhD student
> Bren School of Environmental Science and Management
> University of California Santa Barbara, CA 93106
> Lab: (805) 893-5054
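
For readers without access to TheData, the check suggested above can be tried on simulated data. The following is only a rough sketch under stated assumptions: the data frame `d`, its columns `y` and `nonzero`, and all sizes and parameters are invented for illustration, and one Zone-by-Year cell is deliberately forced to contain only zeros. It cross-tabulates zero versus non-zero counts per cell, as in the xtabs() call above, and flags cells where only one of the two outcomes occurs.

```r
# Hypothetical stand-in for TheData: 4 zones, 5 years, 20 transects per cell.
# All values are simulated; nothing here comes from the original data set.
set.seed(1)
d <- expand.grid(Zone = factor(1:4), Year = factor(1:5), rep = 1:20)
d$y <- rnbinom(nrow(d), size = 0.5, mu = 3)

# Force one interaction cell to contain only zeros (quasi-complete separation).
d$y[d$Zone == "2" & d$Year == "3"] <- 0L

# Zero vs. non-zero indicator, with both levels fixed so the table always
# has a "FALSE" and a "TRUE" slice.
d$nonzero <- factor(d$y > 0, levels = c(FALSE, TRUE))

# Cross-tabulate as suggested in the reply, then work on a plain array.
tab    <- xtabs(~ nonzero + Zone + Year, data = d)
counts <- unclass(tab)
n_zero <- counts["FALSE", , ]  # Zone x Year matrix of zero counts
n_pos  <- counts["TRUE", , ]   # Zone x Year matrix of non-zero counts

# A non-empty cell is separated if it has only zeros or only non-zeros.
separated <- (n_zero == 0 | n_pos == 0) & (n_zero + n_pos) > 0
print(which(separated, arr.ind = TRUE))

# Logit model for non-zero counts with the interaction term. R may or may
# not warn here; the estimate for a separated cell is not finite in theory,
# so inspect the size of that coefficient and its standard error.
fit <- glm(nonzero ~ Zone * Year, data = d, family = binomial)
print(coef(summary(fit))["Zone2:Year3", ])
```

If any cell is flagged, the interaction coefficient for that cell usually shows a very large magnitude with an even larger standard error, which is consistent with the non-existent maximum likelihood estimate described above. Common ways to proceed include dropping the interaction from the zero part of the model or merging sparse factor levels, but which is appropriate depends on the data.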