John Maindonald
2000-May-09 00:12 UTC
[R] Dispersion in summary.glm() with binomial & poisson link
Following p.206 of "Statistical Models in S", I wish to change the code for summary.glm() so that it estimates the dispersion for binomial & poisson models when the parameter dispersion is set to zero. The following changes [insertion of ||dispersion==0 at one point; and !is.null(dispersion) at another] will do the trick: "summary.glm" <- function(object, dispersion = NULL, correlation = FALSE, ...) { Qr <- .Alias(object$qr) est.disp <- FALSE df.r <- object$df.residual if(is.null(dispersion)||dispersion==0) # calculate dispersion if needed dispersion <- if(any(object$family$family == c("poisson", "binomial"))&&is.null(dispersion)) 1 else if(df.r > 0) { est.disp <- TRUE if(any(object$weights==0)) warning(paste("observations with zero weight", "not used for calculating dispersion")) sum(object$weights*object$residuals^2)/ df.r } else Inf ## calculate scaled and unscaled covariance matrix ...... Here is an example:> counts <- c(18,17,15,20,10,20,25,13,12) > outcome <- gl(3,1,9) > treatment <- gl(3,3) > print(d.AD <- data.frame(treatment, outcome, counts)) > glm.D93 <- glm(counts ~ outcome + treatment, family=poisson()) > anova(glm.D93) > summary(glm.D93,dispersion=0) #current code...... Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 3.045e+00 0.000e+00 Inf <2e-16 outcome2 -4.543e-01 0.000e+00 -Inf <2e-16 outcome3 -2.930e-01 0.000e+00 -Inf <2e-16 treatment2 1.924e-08 0.000e+00 Inf <2e-16 treatment3 8.383e-09 0.000e+00 Inf <2e-16 (Dispersion parameter for poisson family taken to be 0) Null deviance: 10.5814 on 8 degrees of freedom Residual deviance: 5.1291 on 4 degrees of freedom AIC: 56.761 Number of Fisher Scoring iterations: 3 Warning message: NAs introduced by coercion> summary(glm.D93,dispersion=0) # with above change....... Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 3.045e+00 1.943e-01 15.667 9.7e-05 outcome2 -4.543e-01 2.299e-01 -1.976 0.119 outcome3 -2.930e-01 2.192e-01 -1.337 0.252 treatment2 1.924e-08 2.274e-01 8.46e-08 1.000 treatment3 8.383e-09 2.274e-01 3.69e-08 1.000 (Dispersion parameter for poisson family taken to be 1.293029) Null deviance: 10.5814 on 8 degrees of freedom Residual deviance: 5.1291 on 4 degrees of freedom AIC: 56.761 Number of Fisher Scoring iterations: 3 John Maindonald email : john.maindonald at anu.edu.au Statistical Consulting Unit, phone : (6249)3998 c/o CMA, SMS, fax : (6249)5549 John Dedman Mathematical Sciences Building Australian National University Canberra ACT 0200 Australia -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Prof Brian D Ripley
2000-May-09 05:29 UTC
[R] Dispersion in summary.glm() with binomial & poisson link
On Tue, 9 May 2000, John Maindonald wrote:> Following p.206 of "Statistical Models in S", I wish to change > the code for summary.glm() so that it estimates the dispersion > for binomial & poisson models when the parameter dispersion is > set to zero. The following changes [insertion of ||dispersion==0 > at one point; and !is.null(dispersion) at another] will do the trick:I know S does that, but R is not documented to do so (so your example does works as documented). I think this is at best confusing. Once phi is estimated, they are quasi-likelihood models not binomial nor Poisson and in particular have no likelihood, so e.g. drop1 becomes inappropriate. And it is all too easy to have different treatments of dispersion in different ancilliary functions (as S managed for many years). My preference is to treat such models as quasi models. If enough people really want them as `binomial'' or `poisson'' then we need to make much wider changes to ensure consistency (and incompitibility with S). There are quite a lot of things in R''s glm that are not the same as S''s.> > "summary.glm" <- > function(object, dispersion = NULL, > correlation = FALSE, ...) > { > Qr <- .Alias(object$qr) > est.disp <- FALSE > df.r <- object$df.residual > if(is.null(dispersion)||dispersion==0) # calculate dispersion if needed > dispersion <- > if(any(object$family$family == c("poisson", > "binomial"))&&is.null(dispersion)) > 1 > else if(df.r > 0) { > est.disp <- TRUE > if(any(object$weights==0)) > warning(paste("observations with zero weight", > "not used for calculating dispersion")) > sum(object$weights*object$residuals^2)/ df.r > } else Inf > ## calculate scaled and unscaled covariance matrix > ...... > > Here is an example: > > > counts <- c(18,17,15,20,10,20,25,13,12) > > outcome <- gl(3,1,9) > > treatment <- gl(3,3) > > print(d.AD <- data.frame(treatment, outcome, counts)) > > glm.D93 <- glm(counts ~ outcome + treatment, family=poisson()) > > anova(glm.D93) > > summary(glm.D93,dispersion=0) #current code > ...... > Coefficients: > Estimate Std. Error z value Pr(>|z|) > (Intercept) 3.045e+00 0.000e+00 Inf <2e-16 > outcome2 -4.543e-01 0.000e+00 -Inf <2e-16 > outcome3 -2.930e-01 0.000e+00 -Inf <2e-16 > treatment2 1.924e-08 0.000e+00 Inf <2e-16 > treatment3 8.383e-09 0.000e+00 Inf <2e-16 > > (Dispersion parameter for poisson family taken to be 0) > > Null deviance: 10.5814 on 8 degrees of freedom > Residual deviance: 5.1291 on 4 degrees of freedom > AIC: 56.761 > > Number of Fisher Scoring iterations: 3 > > Warning message: > NAs introduced by coercion > > > summary(glm.D93,dispersion=0) # with above change > ....... > Coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) 3.045e+00 1.943e-01 15.667 9.7e-05 > outcome2 -4.543e-01 2.299e-01 -1.976 0.119 > outcome3 -2.930e-01 2.192e-01 -1.337 0.252 > treatment2 1.924e-08 2.274e-01 8.46e-08 1.000 > treatment3 8.383e-09 2.274e-01 3.69e-08 1.000 > > (Dispersion parameter for poisson family taken to be 1.293029) > > Null deviance: 10.5814 on 8 degrees of freedom > Residual deviance: 5.1291 on 4 degrees of freedom > AIC: 56.761 > > Number of Fisher Scoring iterations: 3 > > > John Maindonald email : john.maindonald at anu.edu.au > Statistical Consulting Unit, phone : (6249)3998 > c/o CMA, SMS, fax : (6249)5549 > John Dedman Mathematical Sciences Building > Australian National University > Canberra ACT 0200 > Australia > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
John Maindonald
2000-May-09 05:46 UTC
[R] Dispersion in summary.glm() with binomial & poisson link
> From ripley at stats.ox.ac.uk Tue May 9 15:30:33 2000 > Date: Tue, 9 May 2000 06:29:10 +0100 (BST)> > Following p.206 of "Statistical Models in S", I wish to change > > the code for summary.glm() so that it estimates the dispersion > > for binomial & poisson models when the parameter dispersion is > > set to zero. The following changes [insertion of ||dispersion==0 > > at one point; and !is.null(dispersion) at another] will do the trick: > > I know S does that, but R is not documented to do so (so your example does > works as documented). I think this is at best confusing. Once phi is > estimated, they are quasi-likelihood models not binomial nor Poisson and in > particular have no likelihood, so e.g. drop1 becomes inappropriate. And it > is all too easy to have different treatments of dispersion in different > ancilliary functions (as S managed for many years).That is why I did not submit a bug report. The problem is that in many application areas phi is much greater than one.> My preference is to treat such models as quasi models. If enough > people really want them as `binomial'' or `poisson'' then we need to > make much wider changes to ensure consistency (and incompitibility > with S).I agree with the sentiments. So would it be feasible to have quasi-poisson and quasi-binomial errors? Would an immediate recourse be to create functions summary.quasi and predict.quasi? Or perhaps summary.phi, etc. John Maindonald email : john.maindonald at anu.edu.au Statistical Consulting Unit, phone : (6249)3998 c/o CMA, SMS, fax : (6249)5549 John Dedman Mathematical Sciences Building Australian National University Canberra ACT 0200 Australia -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Prof Brian D Ripley
2000-May-09 05:57 UTC
[R] Dispersion in summary.glm() with binomial & poisson link
On Tue, 9 May 2000, John Maindonald wrote:> > From ripley at stats.ox.ac.uk Tue May 9 15:30:33 2000 > > Date: Tue, 9 May 2000 06:29:10 +0100 (BST) > > > > Following p.206 of "Statistical Models in S", I wish to change > > > the code for summary.glm() so that it estimates the dispersion > > > for binomial & poisson models when the parameter dispersion is > > > set to zero. The following changes [insertion of ||dispersion==0 > > > at one point; and !is.null(dispersion) at another] will do the trick: > > > > I know S does that, but R is not documented to do so (so your example does > > works as documented). I think this is at best confusing. Once phi is > > estimated, they are quasi-likelihood models not binomial nor Poisson and in > > particular have no likelihood, so e.g. drop1 becomes inappropriate. And it > > is all too easy to have different treatments of dispersion in different > > ancilliary functions (as S managed for many years). > > That is why I did not submit a bug report. The problem is that in > many application areas phi is much greater than one. > > > My preference is to treat such models as quasi models. If enough > > people really want them as `binomial'' or `poisson'' then we need to > > make much wider changes to ensure consistency (and incompitibility > > with S). > > I agree with the sentiments. So would it be feasible to have > quasi-poisson and quasi-binomial errors? Would an immediate recourse > be to create functions summary.quasi and predict.quasi? Or perhaps > summary.phi, etc.All you need is a function to estimate dispersion as you want (and there are other ways for gammas, e.g. in the MASS library). So you call, e.g. summary(object, dispersion=chisquare.dispersion(object)) but a simpler interface is desirable, and yes, quasibinomial and quasipoisson look an excellent idea. Because I was aware of the (in)consistency issues I am looking into this. Expect something along these lines for 1.1.x. (BTW, now is a good time to raise such design issues. As you will see from http://developer.r-project.org 1.1 is about a month off, and our plans are for 1.1.1 to be the version for teaching in academic year 2000/1 in Europe and N. America.) Brian -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Jim Lindsey
2000-May-09 13:25 UTC
[R] Dispersion in summary.glm() with binomial & poisson link
> > > From ripley at stats.ox.ac.uk Tue May 9 15:30:33 2000 > > Date: Tue, 9 May 2000 06:29:10 +0100 (BST) > > > > Following p.206 of "Statistical Models in S", I wish to change > > > the code for summary.glm() so that it estimates the dispersion > > > for binomial & poisson models when the parameter dispersion is > > > set to zero. The following changes [insertion of ||dispersion==0 > > > at one point; and !is.null(dispersion) at another] will do the trick: > > > > I know S does that, but R is not documented to do so (so your example does > > works as documented). I think this is at best confusing. Once phi is > > estimated, they are quasi-likelihood models not binomial nor Poisson and in > > particular have no likelihood, so e.g. drop1 becomes inappropriate. And it > > is all too easy to have different treatments of dispersion in different > > ancilliary functions (as S managed for many years). > > That is why I did not submit a bug report. The problem is that in > many application areas phi is much greater than one.The gnlr function in my gnlm library (at www.luc.ac.be/~jlindsey/rcode.html) will fit quite a variety of different overdispersed Poisson- and binomial-based distributions (i.e. phi different from one) using their exact likelihoods. Jim -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._