thr3ads.net - R help - [R] When to use quasipoisson instead of poisson family [Apr 2007]

If this information is useful, please help other people find it:
Share via:

ronggui

2007-Apr-10 05:25 UTC

[R] When to use quasipoisson instead of poisson family

It seems that MASS suggest to judge on the basis of
sum(residuals(mode,type="pearson"))/df.residual(mode). My question: Is
there any rule of thumb of the cutpoiont value?

The paper "On the Use of Corrections for Overdispersion"  suggests
overdispersion exists if the deviance is at least twice the number of
degrees of freedom.

Are there any further hints? Thanks.

-- 
Ronggui Huang
Department of Sociology
Fudan University, Shanghai, China

Achim Zeileis

2007-Apr-10 07:11 UTC

head link

[R] When to use quasipoisson instead of poisson family

On Tue, 10 Apr 2007, ronggui wrote:
> It seems that MASS suggest to judge on the basis of
> sum(residuals(mode,type="pearson"))/df.residual(mode). My
question: Is
> there any rule of thumb of the cutpoiont value?
>
> The paper "On the Use of Corrections for Overdispersion" 
suggests
> overdispersion exists if the deviance is at least twice the number of
> degrees of freedom.
There are also formal tests for over-dispersion. I've implemented one for
a package which is not yet on CRAN (code/docs attached), another one is
implemented in odTest() in package "pscl". The latter also contains
further count data regression models which can deal with both
over-dispersion and excess zeros in count data. A vignette explaining the
tools is about to be released.

hth,
Z
> Are there any further hints? Thanks.
>
> --
> Ronggui Huang
> Department of Sociology
> Fudan University, Shanghai, China
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>-------------- next part --------------
\name{dispersiontest}
\alias{dispersiontest}
\title{Dispersion Test}
\description{
 Tests the null hypothesis of equidispersion in Poisson GLMs against
 the alternative of overdispersion and/or underdispersion.
}
\usage{
dispersiontest(object, trafo = NULL, alternative = c("greater",
"two.sided", "less"))
}
\arguments{
  \item{object}{a fitted Poisson GLM of class \code{"glm"} as fitted
    by \code{\link{glm}} with family \code{\link{poisson}}.}
 \item{trafo}{a specification of the alternative (see also details),
   can be numeric or a (positive) function or \code{NULL} (the default).}
 \item{alternative}{a character string specifying the alternative hypothesis:
   \code{"greater"} corresponds to overdispersion,
\code{"less"} to
   underdispersion and \code{"two.sided"} to either one.}
}
\details{
The standard Poisson GLM models the (conditional) mean
\eqn{\mathsf{E}[y] = \mu}{E[y] = mu} which is assumed to be equal to the
variance \eqn{\mathsf{VAR}[y] = \mu}{VAR[y] = mu}. \code{dispersiontest}
assesses the hypothesis that this assumption holds (equidispersion) against
the alternative that the variance is of the form:
  \deqn{\mathsf{VAR}[y] \quad = \quad \mu \; + \; \alpha \cdot
\mathrm{trafo}(\mu).}{VAR[y] = mu + alpha * trafo(mu).}
Overdispersion corresponds to \eqn{\alpha > 0}{alpha > 0} and
underdispersion to
\eqn{\alpha < 0}{alpha < 0}. The coefficient \eqn{\alpha}{alpha} can be
estimated
by an auxiliary OLS regression and tested with the corresponding t (or z)
statistic
which is asymptotically standard normal under the null hypothesis.

Common specifications of the transformation function \eqn{\mathrm{trafo}}{trafo}
are
\eqn{\mathrm{trafo}(\mu) = \mu^2}{trafo(mu) = mu^2} or \eqn{\mathrm{trafo}(\mu)
= \mu}{trafo(mu) = mu}.
The former corresponds to a negative binomial (NB) model with quadratic variance
function
(called NB2 by Cameron and Trivedi, 2005), the latter to a NB model with linear
variance
function (called NB1 by Cameron and Trivedi, 2005) or quasi-Poisson model with
dispersion
parameter, i.e.,
  \deqn{\mathsf{VAR}[y] \quad = \quad (1 + \alpha) \cdot \mu =
\mathrm{dispersion} \cdot \mu.}{VAR[y] = (1 + alpha) * mu = dispersion * mu.}
By default, for \code{trafo = NULL}, the latter dispersion formulation is used
in
\code{dispersiontest}. Otherwise, if \code{trafo} is specified, the test is
formulated
in terms of the parameter \eqn{\alpha}{alpha}. The transformation \code{trafo}
can either
be specified as a function or an integer corresponding to the function
\code{function(x) x^trafo},
such that \code{trafo = 1} and \code{trafo = 2} yield the linear and quadratic
formulations
respectively.
}

\value{An object of class \code{"htest"}.}

\references{
Cameron, A.C. and Trivedi, P.K. (1990). Regression-based Tests for
Overdispersion in the Poisson Model.
\emph{Journal of Econometrics}, \bold{46}, 347--364.

Cameron, A.C. and Trivedi, P.K. (1998). \emph{Regression Analysis of Count
Data}.
Cambridge: Cambridge University Press.

Cameron, A.C. and Trivedi, P.K. (2005). \emph{Microeconometrics: Methods and
Applications}.
Cambridge: Cambridge University Press.
}

\seealso{\code{\link{glm}}, \code{\link{poisson}}, \code{\link[MASS]{glm.nb}}}

\examples{
data("RecreationDemand")
fm <- glm(trips ~ ., data = RecreationDemand, family = poisson)

## linear specification (in terms of dispersion)
dispersiontest(fm)
## linear specification (in terms of alpha)
dispersiontest(fm, trafo = 1)
## quadratic specification (in terms of alpha)
dispersiontest(fm, trafo = 2)
dispersiontest(fm, trafo = function(x) x^2)

## further examples
data("DoctorVisits")
fm <- glm(visits ~ . + I(age^2), data = DoctorVisits, family = poisson)
dispersiontest(fm)

data("DebTrivedi")
fm <- glm(ofp ~ health + age + gender + married + faminc + privins, data =
DebTrivedi, family = poisson)
dispersiontest(fm)
}
\keyword{htest}
-------------- next part --------------
dispersiontest <- function(object, trafo = NULL, alternative =
c("greater", "two.sided", "less"))
{
  if(!inherits(object, "glm") || family(object)$family !=
"poisson")
    stop("only Poisson GLMs can be tested")
  alternative <- match.arg(alternative)
  otrafo <- trafo
  if(is.numeric(otrafo)) trafo <- function(x) x^otrafo
  
  y <- if(is.null(object$y)) model.response(model.frame(object)) else
object$y
  yhat <- fitted(object)
  aux <- ((y - yhat)^2 - y)/yhat
    
  if(is.null(trafo)) {
    STAT <- sqrt(length(aux)) * mean(aux)/sd(aux)
    NVAL <- c(dispersion = 1)
    EST <- c(dispersion = mean(aux) + 1)    
  } else {
    auxreg <- lm(aux ~ 0 + I(trafo(yhat)/yhat))
    STAT <- as.vector(summary(auxreg)$coef[1,3])
    NVAL <- c(alpha = 0)
    EST <- c(alpha = as.vector(coef(auxreg)[1]))
  }
  
  rval <- list(statistic = c(z = STAT),
               p.value = switch(alternative,
                           "greater" = pnorm(STAT, lower.tail =
FALSE),
		           "two.sided" = pnorm(abs(STAT), lower.tail = FALSE)*2,
		           "less" = pnorm(STAT)),
	       estimate = EST,
	       null.value = NVAL,
	       alternative = alternative,
	       method = switch(alternative,
	                  "greater" = "Overdispersion test",
			  "two.sided" = "Dispersion test",
			  "less" = "Underdispersion test"),
	       data.name = deparse(substitute(object)))
  class(rval) <- "htest"
  return(rval)
}

Possibly Parallel Threads

Search for more seemingly similar threads

R help - Apr 2007 - When to use quasipoisson instead of poisson family

[R] When to use quasipoisson instead of poisson family

[R] When to use quasipoisson instead of poisson family

Possibly Parallel Threads