thr3ads.net - R help - [R] fitted.values from zeroinfl (pscl package) [Feb 2008]

Sarah J Thomas

2008-Feb-18 06:50 UTC

[R] fitted.values from zeroinfl (pscl package)

Hello all:

I have a question regarding the fitted.values returned from the 
zeroinfl() function. The values seem to be nearly identical to those 
fitted.values returned by the ordinary glm(). Why is this? Shouldn't 
they be more "zero-inflated"?

I construct a zero-inflated series of counts, called Y, like so:

b= as.vector(c(1.5, -2))
g= as.vector(c(-3, 1))
x <- runif(100) # x is the covariate
X <- cbind(1,x)

p <- exp(X%*%g)/(1+exp(X%*%g))
m <- exp(X%*%b)   # log-link for the mean process
                  # of the Poisson
Y <- rep(0, 100)

u <- runif(100)
for(i in 1:100) {
    if( u[i] < p[i] ) { Y[i] = 0 }
    else { Y[i] <- rpois(1, m[i]) }
}

# now let's compare the fitted.values from zeroinfl()
# and from glm()

library(pscl) # provides zeroinfl()
z1 <- glm(Y ~ x, family=poisson)
z2 <- zeroinfl(Y ~ x|x) #poisson is the default

z1$fitted.values[1:20]
# 1.3254209 0.7458029 2.0300505 1.1292954 1.4512862
# 0.6513798 1.8980126 0.6558228 1.5302057 0.6993626
# 2.6875736 0.7586985 2.0622238 2.1009979 1.4254607
# 1.8130159 3.6603137 2.1330030 2.9409379 3.3203350

z2$fitted.values[1:20]
# 1.3587457 0.7254296 2.0730982 1.1497492 1.4902778
# 0.6178648 1.9429778 0.6229478 1.5717923 0.6726527
# 2.7010395 0.7400369 2.1045779 2.1424025 1.4634459
# 1.8583877 3.5830697 2.1735319 2.9354839 3.2800839


You can see that they are almost identical... and the fitted.values from 
zeroinfl don't seem to be zero-inflated at all! What is going on?

Ultimately I want these fitted.values for a goodness-of-fit type of test 
to see if the zeroinfl model is needed or not for a given data series. 
With these fitted.values as they are, I am rejecting the assumption of a 
zero-inflated model even when the data really are zero-inflated.

many thanks,
Sarah Thomas

-- 
Sarah J. Thomas
Research Assistant, Department of Statistics
Rice University, Houston, TX

Achim Zeileis

2008-Feb-18 13:40 UTC

[R] fitted.values from zeroinfl (pscl package)

On Mon, 18 Feb 2008, Sarah J Thomas wrote:
> Hello all:
>
> I have a question regarding the fitted.values returned from the
> zeroinfl() function. The values seem to be nearly identical to those
> fitted.values returned by the ordinary glm(). Why is this? Shouldn't
> they be more "zero-inflated"?
>
> I construct a zero-inflated series of counts, called Y, like so:

To make this reproducible, I set the random seed to

set.seed(123)

in advance and then ran your source code

b= as.vector(c(1.5, -2))
g= as.vector(c(-3, 1))
x <- runif(100) # x is the covariate
X <- cbind(1,x)

p <- exp(X%*%g)/(1+exp(X%*%g))
m <- exp(X%*%b)   # log-link for the mean process
                  # of the Poisson
Y <- rep(0, 100)

u <- runif(100)
for(i in 1:100) {
    if( u[i] < p[i] ) { Y[i] = 0 }
    else { Y[i] <- rpois(1, m[i]) }
}

# now let's compare the fitted.values from zeroinfl()
# and from glm()

library(pscl) # provides zeroinfl()
z1 <- glm(Y ~ x, family=poisson)
z2 <- zeroinfl(Y ~ x|x) #poisson is the default

[snip]
> You can see that they are almost identical... and the fitted.values from
> zeroinfl don't seem to be zero-inflated at all! What is going on?

Well, let's see how zero inflated your observations are:

R> sum(u < p)
[1] 2

Wow, two (!) observations that have been zero-inflated. Let's see what
the probability of observing a zero from the Poisson part would have been
for these two anyway:

R> dpois(0, m[u < p])
[1] 0.3147816 0.1409670

which is not so low, in particular for the first one.
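
A quick look at the simulation design itself suggests why so few observations end up inflated. The sketch below uses only base R and the coefficients from the original post; `x_grid`, `p_grid`, `m_grid`, `pois_zero` and `expected_structural` are names introduced here for illustration.

```r
# Coefficients from the original post
g <- c(-3, 1)    # zero-inflation (logit) part
b <- c(1.5, -2)  # Poisson (log) part

x_grid <- c(0, 0.5, 1)                   # illustrative covariate values

# Inflation probability; plogis(eta) equals exp(eta) / (1 + exp(eta))
p_grid <- plogis(g[1] + g[2] * x_grid)

# Poisson mean and the chance of a zero from the Poisson part alone
m_grid <- exp(b[1] + b[2] * x_grid)
pois_zero <- dpois(0, m_grid)            # equals exp(-m_grid)

print(round(cbind(x = x_grid, p = p_grid, m = m_grid, pois_zero = pois_zero), 3))

# Expected number of inflated observations among 100 draws with x ~ U(0, 1)
expected_structural <- 100 *
  integrate(function(x) plogis(g[1] + g[2] * x), 0, 1)$value
print(expected_structural)
```

Under these coefficients the inflation probability stays between roughly 5% and 12%, so only about 8 inflated observations are expected in 100 draws, while the Poisson part alone already yields a zero more than half the time for x near 1.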

Overall, you've got

R> sum(Y < 1)
[1] 23

zeros in that data set and the expected number of zeros in a Poisson GLM
is

R> sum(dpois(0, fitted(z1)))
[1] 23.35615

So you have observed *fewer* zeros than expected under a Poisson GLM. Surely,
this is not the kind of data that zero-inflated models have been developed
for.
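
It may also help to recall that both sets of fitted values estimate the mean E[Y | x]. For a zero-inflated Poisson that mean is (1 - p) * m, so with a small p it stays close to the plain Poisson mean; the inflation is visible in P(Y = 0) rather than in the mean. A minimal base-R sketch, using the true simulation parameters rather than fitted ones; `zip_mean` and `zip_zero` are illustrative names. As far as I know, pscl offers the fitted count probabilities via `predict(z2, type = "prob")`, but that call is not run here.

```r
g <- c(-3, 1); b <- c(1.5, -2)           # coefficients from the original post
x <- seq(0, 1, by = 0.25)                # illustrative covariate values

p <- plogis(g[1] + g[2] * x)             # inflation probability
m <- exp(b[1] + b[2] * x)                # mean of the Poisson part

zip_mean <- (1 - p) * m                  # mean of the zero-inflated Poisson
zip_zero <- p + (1 - p) * dpois(0, m)    # P(Y = 0) under the zero-inflated Poisson

# The ZIP mean is only about 5-12% below the plain Poisson mean here
print(round(cbind(x, m, zip_mean, ratio = zip_mean / m), 3))

# Zero inflation shows up in P(Y = 0), not in the mean
print(round(cbind(x, pois_zero = dpois(0, m), zip_zero), 3))
```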
> Ultimately I want these fitted.values for a goodness-of-fit type of test
> to see if the zeroinfl model is needed or not for a given data series.
> With these fitted.values as they are, I am rejecting the assumption of a
> zero-inflated model even when the data really are zero-inflated.

Maybe you ought to think about useful data-generating processes first
before designing tests or criticizing software...
Z
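
As an illustration of a data-generating process with more pronounced zero inflation, the sketch below raises the intercept of the inflation part and repeats the zero-count check used in the reply. The values `g_strong = c(0, 1)`, `n = 1000` and the seed are assumptions chosen for illustration; they do not come from the thread.

```r
set.seed(123)                             # arbitrary seed, for reproducibility
n <- 1000                                 # assumed sample size
b <- c(1.5, -2)                           # Poisson coefficients from the original post
g_strong <- c(0, 1)                       # assumed: much stronger inflation than c(-3, 1)

x <- runif(n)
p <- plogis(g_strong[1] + g_strong[2] * x)  # inflation probability, about 0.50 to 0.73
m <- exp(b[1] + b[2] * x)                   # Poisson mean

inflated <- runif(n) < p                  # TRUE where a structural zero is forced
Y <- ifelse(inflated, 0, rpois(n, m))     # vectorized version of the original loop

z1 <- glm(Y ~ x, family = poisson)

observed_zeros <- sum(Y == 0)
expected_zeros <- sum(dpois(0, fitted(z1)))  # zeros expected under the Poisson GLM
print(c(observed = observed_zeros, expected = expected_zeros))
```

With a design like this, the observed number of zeros should clearly exceed what the Poisson GLM expects, which is the situation zero-inflated models are meant for.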
