Hi, I am trying to replicate Lambert (1992)'s simulation with zero-inflated Poisson models. The citation is here: @article{lambert1992zero, Author = {Lambert, D.}, Journal = {Technometrics}, Pages = {1--14}, Publisher = {JSTOR}, Title = {Zero-inflated {P}oisson regression, with an application to defects in manufacturing}, Year = {1992}} Specifically I am trying to recreate Table 2. But my estimates for Gamma are not working out. Any ideas why? Please cc me on a reply! Thanks, Chris ## ZIP simulations based on Lambert (1992)'s conditions # Empty workspace rm(list=ls()) # Load zero-inflation package library(pscl) # Simulation conditions #3 NumSimulations=2000 X=seq(from=0,to=1,length.out=N) Model.Matrix=cbind(rep(1,length(X)),X) Gamma=c(-1.5,2) Beta=c(1.5,-2) P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma) Lambda=exp(Model.Matrix%*%Beta) CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2]) for(i in 1 : NumSimulations){ Lambda.Draw=rpois(N,Lambda) U=runif(N) Y=ifelse(U<=P,0,Lambda.Draw) CoefSimulations[i,]=coef(zeroinfl(Y~X|X)) } # What were the estimates? colMeans(CoefSimulations) # My gamma estimates aren't even close ... [[alternative HTML version deleted]]
On Thu, 26 Apr 2012, Christopher Desjardins wrote:> Hi, > I am trying to replicate Lambert (1992)'s simulation with zero-inflated > Poisson models. The citation is here: > > @article{lambert1992zero, > Author = {Lambert, D.}, > Journal = {Technometrics}, > Pages = {1--14}, > Publisher = {JSTOR}, > Title = {Zero-inflated {P}oisson regression, with an application to defects > in manufacturing}, > Year = {1992}} > > Specifically I am trying to recreate Table 2. But my estimates for Gamma > are not working out. Any ideas why?Your implementation of the inverse logit link is wrong, it should be exp(x)/(1 + exp(x)) and not exp(x)/exp(1 + x). In R I never code this by hand but either use qlogis()/plogis() or make.link("logit"). Your setting resulting in an almost constant probability of zero inflation and hence gamma was completely off. Below is my cleaned up code which nicely replicates the results for n = 100. The other two would require additional work because one would need to catch non-convergence etc. hth, Z ## data-generating process dgp <- function(nobs = 100) { gamma <- c(-1.5, 2) beta <- c(1.5, -2) x <- seq(0, 1, length.out = nobs) lambda <- exp(beta[1] + beta[2] * x) p <- plogis(gamma[1] + gamma[2] * x) y <- ifelse(runif(nobs) <= p, 0, rpois(nobs, lambda = lambda)) data.frame(y = y, x = x) } ## simulation of coefficients and standard errors sim <- function(nrep = 2000, ...) { onesim <- function(i) { d <- dgp(...) m <- zeroinfl(y ~ x, data = d) c(coef(m), sqrt(diag(vcov(m)))) } t(sapply(1:nrep, onesim)) } ## run simulation #3 library("pscl") set.seed(1) cfse <- sim(nobs = 100) ## mean coefficient estimates apply(cfse[, 1:4], 2, mean) ## median coefficient estimates apply(cfse[, 1:4], 2, median) ## sd of coefficient estimates apply(cfse[, 1:4], 2, sd) ## mean of the standard deviation estimates from observed information apply(cfse[, 5:8], 2, mean)> Please cc me on a reply! > Thanks, > Chris > > ## ZIP simulations based on Lambert (1992)'s conditions > > # Empty workspace > rm(list=ls()) > > # Load zero-inflation package > library(pscl) > > # Simulation conditions #3 > NumSimulations=2000 > X=seq(from=0,to=1,length.out=N) > Model.Matrix=cbind(rep(1,length(X)),X) > Gamma=c(-1.5,2) > Beta=c(1.5,-2) > P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma) > Lambda=exp(Model.Matrix%*%Beta) > CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2]) > > for(i in 1 : NumSimulations){ > Lambda.Draw=rpois(N,Lambda) > U=runif(N) > Y=ifelse(U<=P,0,Lambda.Draw) > CoefSimulations[i,]=coef(zeroinfl(Y~X|X)) > } > > # What were the estimates? > colMeans(CoefSimulations) # My gamma estimates aren't even close ... > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
On Sat, 5 May 2012, Christopher Desjardins wrote:> Hi, > I am a little confused at the output from predict() for a zeroinfl object. > > Here's my confusion: > > ## From zeroinfl package > fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin") > > > ## The raw zero-inflated overdispersed data > > table(bioChemists$art) > > ? 0?? 1?? 2?? 3?? 4?? 5?? 6?? 7?? 8?? 9? 10? 11? 12? 16? 19 > 275 246 178? 84? 67? 27? 17? 12?? 1?? 2?? 1?? 1?? 2?? 1?? 1 > > ## The default output from predict. It looks like it is doing a horrible > job. Does it really predict 7 zeros?No, see also this R-help post on "Zero-inflated regression models: predicting no 0s": https://stat.ethz.ch/pipermail/r-help/2011-June/279765.html The predicted _mean_ of a negative binomial distribution is not the most likely outcome (i.e., the _mode_) of the distribution. The post above presents some hands on examples.> > table(round(predict(fm_zinb2)) ) > > ? 0?? 1?? 2?? 3?? 4?? 5?? 6? 10 > ? 7 354 487? 45? 12?? 6?? 3?? 1 > > ##? The output from predict using "count" > > table(round(predict(fm_zinb2,type="count"))) > > ? 1?? 2?? 3?? 4?? 5?? 6? 10 > 312 536? 45? 12?? 6?? 3?? 1 > > ## The output from predict using "zero", but here it predicts 24 > "structural" zeros? > > table(round(predict(fm_zinb2,type="zero"))) > > ? 0?? 1 > 891? 24 > > > So my question is how do I interpret these different outputs from the > zeroinf object? What are the differences? The help page just left me > confused. I would expect that table(round(predict(fm_zinb2))) would be E(Y) > and would most accurately track table(bioChemists$art) but I am wrong. How > can I find the E(Y) that would most closely track the raw data? > > Thanks, > Chris > >
Reasonably Related Threads
- Obtaining a value of pie in a zero inflated model (fm-zinb2)
- Obtaining a value of pie in a zero inflated model (fm-zinb2)
- Prediction intervals for zero inflated Poisson regression
- Getting predicted values from a zero-inflated negative binomial using zeroinfl()
- zerinfl() vs. Stata's zinb