Hello,

I am having a problem with the zero-inflated negative binomial (package pscl). I have 6 sites with plant populations, and I am trying to model the number of seeds produced as a function of their size and their site. There are a lot of zeros because many of my plants get eaten before flowering, thereby producing 0 seeds, and that varies by site. Because of that and because the variance exceeds the mean, I'm pretty sure the zero-inflated negative binomial is the appropriate model to use.

Anyways, the code I have used for the zero-inflated negative binomial is:

fit.fec=zeroinfl(seeds~site/size-1 | site-1,na.action=na.omit,dist="negbin",link="log",EM=TRUE)

This code works fine on the data I have. The problem I'm having is when I bootstrap this equation, after about 250 iterations, I get the following error message:

"Error: NA/NaN/Inf in foreign function call (arg 1)."

The bootstrap code is:

n.boot=1000
for(b.samp in 1:n.boot){
  sample.boot=c(sample(1:104,replace=T),104+sample(1:70,replace=T),174+sample(1:71,replace=T),
                245+sample(1:87,replace=T),332+sample(1:55,replace=T),387+sample(1:125,replace=T))
  size.boot=size[sample.boot]
  seeds.boot=seeds[sample.boot]
  fit.fec.boot=zeroinfl(seeds.boot~site/size.boot-1 | site-1,na.action=na.omit,dist="negbin",link="log",EM=TRUE)
  infl.slopes.boot=fit.fec.boot$coef$zero[1:6]
  r.intercepts.boot=fit.fec.boot$coef$count[1:6]
  r.slopes.boot=fit.fec.boot$coef$count[7:12]

  ...
}

In the above code, size and seeds are both vectors ordered by the site vector, and the sample sizes for each site are found in the sample.boot line.

I have looked at size.boot, seeds.boot, and site, and none of them have missing values. I'm really puzzled why it takes 250+ iterations for this problem to crop up. When I used a subset of this data, it took fewer iterations for this problem to occur - so maybe it has to do with sample size? However, the number of individuals per site remains constant through each iteration - only the values of size and seeds are changing. Therefore, it must be some problematic combination of values that's being chosen, though I can't for the life of me figure it out (for example, it doesn't seem like there are an outrageous number of 0's when it crashes). Also, I tried running the problematic size.boot and seeds.boot vectors with just a negative binomial model (glm.nb), and I get the same error. Can anyone provide some insight into what is going on?

Thanks!

Best,
Melissa

--
Ph.D. Candidate
Department of Biology
University of Virginia
P.O. Box 400328
Charlottesville, VA 22904-4328
Melissa Aikens <mla2j <at> virginia.edu> writes:

> I am having a problem with the zero-inflated negative binomial (package
> pscl). I have 6 sites with plant populations, and I am trying to model the
> number of seeds produced as a function of their size and their site.

[snip]

> Anyways, the code I have used for the zero-inflated negative binomial is:
> fit.fec=zeroinfl(seeds~site/size-1 |
> site-1,na.action=na.omit,dist="negbin",link="log",EM=TRUE)
>
> This code works fine on the data I have. The problem I'm having is when I
> bootstrap this equation, after about 250 iterations, I get the following
> error message: "Error: NA/NaN/Inf in foreign function call (arg 1)."
> The bootstrap code is:

[snip]

> I have looked at size.boot, seeds.boot, and site, and none of them have
> missing values. I'm really puzzled why it takes 250+ iterations for this
> problem to crop up. When I used a subset of this data, it took fewer
> iterations for this problem to occur - so maybe it has to do with sample
> size?

[snip]

> Therefore, it must be some problematic combination of values that's being
> chosen, though I can't for the life of me figure it out (for example, it
> doesn't seem like there are an outrageous number of 0's when it crashes).
> Also, I tried running the problematic size.boot and seeds.boot vectors with
> just a negative binomial model (glm.nb), and I get the same error. Can
> anyone provide some insight into what is going on?

This is not reproducible (see e.g. http://tinyurl.com/reproducible-000 ), so it's a little hard to say exactly. Can you post the problematic size.boot and seeds.boot vectors? You could also use 'try()' to skip over bad data sets ... (Using set.seed() for reproducibility is also a good idea.)

Ben Bolker
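Ben's two suggestions (skipping bad data sets and using set.seed()) could be combined roughly as follows. This is only a sketch on simulated data, not the data from the original post. The data frame `dat`, the helper `resample_within_site()` and the wrapper `fit_fun()` are names made up for illustration, and a Poisson glm() stands in for the zeroinfl() call so that the example runs without pscl. tryCatch() is used rather than try() so that the error message and the offending row indices can be stored.

```r
# Sketch: stratified bootstrap that records failures instead of stopping.
# All names below (dat, resample_within_site, fit_fun) are hypothetical.

set.seed(42)  # fixes the random stream so every resample can be regenerated

# Simulated stand-in data: 3 sites, counts with extra zeros
n_site <- c(30, 20, 25)
dat <- data.frame(
  site = factor(rep(seq_along(n_site), times = n_site)),
  size = runif(sum(n_site), 1, 10)
)
dat$seeds <- rpois(nrow(dat), lambda = exp(0.2 * dat$size)) *
  rbinom(nrow(dat), 1, 0.6)

# Resample row indices within each site, so per-site sample sizes stay fixed
resample_within_site <- function(site) {
  unlist(lapply(split(seq_along(site), site),
                function(idx) idx[sample.int(length(idx), replace = TRUE)]),
         use.names = FALSE)
}

# Stand-in fitter; in the thread this would wrap the zeroinfl() call
fit_fun <- function(d) glm(seeds ~ site/size - 1, family = poisson, data = d)

n_boot <- 200
coefs  <- vector("list", n_boot)
failed <- list()  # one entry per resample whose fit raised an error

for (b in seq_len(n_boot)) {
  idx <- resample_within_site(dat$site)
  fit <- tryCatch(fit_fun(dat[idx, ]), error = function(e) e)
  if (inherits(fit, "error")) {
    # keep the indices so the failing data set can be rebuilt and inspected
    failed[[length(failed) + 1]] <- list(iter = b, idx = idx,
                                         msg = conditionMessage(fit))
    next
  }
  coefs[[b]] <- coef(fit)
}

# Number of resamples that were fitted without an error
n_ok <- sum(!vapply(coefs, is.null, logical(1)))
```

If `failed` ends up non-empty, any entry can be rebuilt with `dat[failed[[1]]$idx, ]` and posted as a small reproducible example. Resampling whole rows also keeps site, size and seeds aligned without relying on the vectors being sorted by site.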
As Ben pointed out, this is not reproducible. My guess is that the offending sample is degenerate in some way, e.g., there are no observations in one of the six sites or no non-zero responses.

hth,
Z

On Mon, 26 Dec 2011, Melissa Aikens wrote:

> Hello,
>
> I am having a problem with the zero-inflated negative binomial (package
> pscl).

[snip]

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
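One way to look into Z's guess is to tabulate each resample by site before fitting. The helper below is a hypothetical sketch: the name `check_resample()` and the choice of flags are assumptions made for illustration, not anything provided by pscl. It flags sites with no observations, no zeros, no non-zero responses, or fewer than two distinct sizes among the non-zero plants. Whether any of these patterns actually triggers the error in this data set is not established in the thread.

```r
# Hypothetical helper: per-site summary of one resample, flagging patterns
# that might leave a site-specific coefficient poorly determined.
check_resample <- function(seeds, size, site) {
  site <- as.factor(site)
  pos  <- seeds > 0
  out <- data.frame(
    n          = as.vector(table(site)),              # observations per site
    n_zero     = as.vector(tapply(!pos, site, sum)),  # zero responses
    n_positive = as.vector(tapply(pos, site, sum)),   # non-zero responses
    # distinct sizes among the non-zero plants of each site
    n_sizes_positive = as.vector(tapply(
      seq_along(seeds), site,
      function(i) length(unique(size[i][pos[i]]))
    )),
    row.names = levels(site)
  )
  out$flag <- out$n == 0 | out$n_zero == 0 |
    out$n_positive == 0 | out$n_sizes_positive < 2
  out
}

# Toy resample: site B has only zeros, site C has no zeros and a single size
seeds <- c(0, 3, 5, 0,   0, 0, 0,   2, 4, 1)
size  <- c(2, 4, 6, 3,   1, 2, 3,   5, 5, 5)
site  <- factor(rep(c("A", "B", "C"), times = c(4, 3, 3)))

res <- check_resample(seeds, size, site)
print(res)
```

Run on the size.boot, seeds.boot and site vectors of a resample that fails, a table like this should show quickly whether a single site is responsible.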