Menelaos Stavrinides
2009-Mar-02 23:50 UTC
[R] Unrealistic dispersion parameter for quasibinomial
I am running a binomial glm with response variable the number of mites of two species, y <- cbind(mitea, miteb), against two continuous variables (temperature and predatory mites); see below. My model shows overdispersion, as the residual deviance is 48.81 on 5 degrees of freedom. If I use quasibinomial to account for overdispersion, the dispersion parameter estimate is 2501139, which seems unrealistic. Any ideas as to why I am getting such a huge dispersion parameter?

> y<-cbind(psmno,wsmno)
> ldhours<-log(idhours+1)
> lwpm<-log(wpm2wkb+1)
> y
     psmno wsmno
[1,]     1     4
[2,]     0    54
[3,]     8     1
[4,]     0    63
[5,]     0    28
[6,]     4   291
[7,]    46     3
[8,]   117    85
> ldhours
[1] 0.000000 2.308567 5.078473 4.875035 2.339399 3.723039 5.572344 5.250384
> lwpm
[1] 0.6931472 2.1972246 0.0000000 0.6931472 2.3025851 0.0000000 0.0000000
[8] 0.0000000
> model<-glm(y~ldhours+lwpm,binomial)
> summary(model)

Call:
glm(formula = y ~ ldhours + lwpm, family = binomial)

Deviance Residuals:
         1           2           3           4           5           6
 5.5586025  -0.0007385   2.4925511  -1.4198734  -0.0004242  -0.9438916
         7           8
 2.7298663  -1.1576062

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -14.4029     1.3705 -10.509  < 2e-16 ***
ldhours       2.8357     0.2656  10.677  < 2e-16 ***
lwpm         -5.1188     1.4689  -3.485 0.000492 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 441.20  on 7  degrees of freedom
Residual deviance:  48.81  on 5  degrees of freedom
AIC: 70.398

Number of Fisher Scoring iterations: 8

> model2<-glm(y~ldhours+lwpm,quasibinomial)
> summary(model2)

Call:
glm(formula = y ~ ldhours + lwpm, family = quasibinomial)

Deviance Residuals:
         1           2           3           4           5           6
 5.5586025  -0.0007385   2.4925511  -1.4198734  -0.0004242  -0.9438916
         7           8
 2.7298663  -1.1576062

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -14.403   2167.435  -0.007    0.995
ldhours        2.836    420.015   0.007    0.995
lwpm          -5.119   2323.044  -0.002    0.998

(Dispersion parameter for quasibinomial family taken to be 2501139)

    Null deviance: 441.20  on 7  degrees of freedom
Residual deviance:  48.81  on 5  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 8

Thanks,
Mel

--
Menelaos Stavrinides
Ph.D. Candidate
Environmental Science, Policy and Management
137 Mulford Hall MC #3114
University of California
Berkeley, CA 94720-3114 USA
Tel: 510 717 5249
Menelaos Stavrinides <menstav <at> gmail.com> writes:

> I am running a binomial glm with response variable the number of mites of two
> species, y <- cbind(mitea, miteb), against two continuous variables (temperature
> and predatory mites); see below. My model shows overdispersion, as the
> residual deviance is 48.81 on 5 degrees of freedom. If I use quasibinomial
> to account for overdispersion, the dispersion parameter estimate is 2501139,
> which seems unrealistic. Any ideas as to why I am getting such a huge
> dispersion parameter?

The dispersion parameter is estimated from the Pearson residuals (i.e., raw residuals scaled by the square root of the expected variance), not from the deviance residuals. I haven't checked into this in great detail, but the Pearson residual of your first data point is huge, probably because the fitted value is tiny (and hence the expected variance is tiny) while the observed proportion is 0.2.

    dfr <- df.residual(model2)                 # residual degrees of freedom
    deviance(model2)/dfr                       # deviance-based dispersion
    d2 <- sum(residuals(model2,"pearson")^2)   # Pearson chi-square statistic
    (disp2 <- d2/dfr)                          # Pearson-based dispersion
    fitted(model2)                             # fitted proportions
    residuals(model2,"pearson")                # per-observation Pearson residuals

Ben Bolker
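The check suggested in the reply above can be run end to end. What follows is only a minimal sketch, assuming the data can be retyped from the printed output in the original post (so `ldhours` and `lwpm` are already on the log scale and rounded to the printed precision). The names `dat`, `fit`, `pearson`, `share`, and the `disp_*` variables are illustrative and do not come from the thread.

```r
# Data retyped from the printed output in the original post.
# Assumption: the printed precision is enough to reproduce the fit closely.
dat <- data.frame(
  psmno   = c(1, 0, 8, 0, 0, 4, 46, 117),
  wsmno   = c(4, 54, 1, 63, 28, 291, 3, 85),
  ldhours = c(0.000000, 2.308567, 5.078473, 4.875035,
              2.339399, 3.723039, 5.572344, 5.250384),
  lwpm    = c(0.6931472, 2.1972246, 0.0000000, 0.6931472,
              2.3025851, 0.0000000, 0.0000000, 0.0000000)
)

# Same model as model2 in the thread, written with a data argument.
fit <- glm(cbind(psmno, wsmno) ~ ldhours + lwpm,
           family = quasibinomial, data = dat)

dfr     <- df.residual(fit)
pearson <- residuals(fit, type = "pearson")

disp_deviance <- deviance(fit) / dfr        # deviance-based estimate
disp_pearson  <- sum(pearson^2) / dfr       # Pearson-based estimate
disp_summary  <- summary(fit)$dispersion    # value reported by summary()

# Fraction of the Pearson chi-square contributed by each observation.
share <- pearson^2 / sum(pearson^2)
print(round(share, 6))
```

The deviance-based figure corresponds to the 48.81 on 5 degrees of freedom quoted in the question, while the Pearson-based figure is the one in the millions. The value reported by `summary()` is computed from the working weights and residuals of the final iteration, so it may differ slightly from `disp_pearson` without changing the conclusion. Printing `share` shows how much of the total comes from a single observation.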
Prof Brian Ripley
2009-Mar-04 18:01 UTC
[R] Unrealistic dispersion parameter for quasibinomial
For the record

> residuals(model)
          1           2           3           4           5
 5.55860143 -0.00073852  2.49255235 -1.41987341 -0.00042425
          6           7           8
-0.94389158  2.72987046 -1.15760836
> residuals(model, "pearson")
          1           2           3           4           5
 3.5362e+03 -5.2222e-04  2.3366e+00 -1.0080e+00 -2.9999e-04
          6           7           8
-8.8378e-01  2.4038e+00 -1.1646e+00
> fitted(model)
         1          2          3          4          5
1.5994e-08 5.0502e-09 4.9946e-01 1.5873e-02 3.2140e-09
         6          7          8
2.0924e-02 8.0191e-01 6.1900e-01

so according to the model a very rare event occurred. That is what is 'unrealistic' (and Ben Bolker supposed correctly). How dispersion should be estimated is a matter of some debate (see e.g. McCullagh and Nelder), but the model here is simply inadequate.

On Mon, 2 Mar 2009, Menelaos Stavrinides wrote:

> I am running a binomial glm with response variable the number of mites of two
> species, y <- cbind(mitea, miteb), against two continuous variables (temperature
> and predatory mites); see below. My model shows overdispersion, as the
> residual deviance is 48.81 on 5 degrees of freedom. If I use quasibinomial
> to account for overdispersion, the dispersion parameter estimate is 2501139,
> which seems unrealistic. Any ideas as to why I am getting such a huge
> dispersion parameter?

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
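The size of the first Pearson residual can be checked by hand from the numbers printed in this reply. This is a rough sketch only: it uses the fitted value as printed (five significant digits) and the standard binomial Pearson residual formula, and the names `y1`, `n1`, `mu1`, `r1`, `contrib1`, and `p_rare` are illustrative.

```r
# Values taken from the output quoted in this thread (row 1 of y).
y1  <- 1            # psmno count in row 1
n1  <- 1 + 4        # psmno + wsmno in row 1
mu1 <- 1.5994e-08   # fitted(model)[1] as printed

# Pearson residual for a binomial count:
# (observed - expected) / sqrt(binomial variance)
r1 <- (y1 - n1 * mu1) / sqrt(n1 * mu1 * (1 - mu1))

# Contribution of this one observation to the Pearson-based
# dispersion, using the 5 residual degrees of freedom.
contrib1 <- r1^2 / 5

# Probability, under the fitted binomial model, of at least one
# psmno among the n1 mites in row 1.
p_rare <- pbinom(0, size = n1, prob = mu1, lower.tail = FALSE)

print(c(r1 = r1, contrib1 = contrib1, p_rare = p_rare))
```

The residual comes out close to the 3.5362e+03 printed above, and its squared value divided by 5 accounts for nearly all of the reported dispersion. The tiny value of `p_rare` is one way to read "a very rare event occurred": the fitted model gives almost no probability to an outcome that was actually observed.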