I have fitted the faults data with glm.nb and with the function negbin from the package aod. The output of both is the following:

> summary(glm.nb(n~ll, data=faults))

Call:
glm.nb(formula = n ~ ll, data = faults, init.theta = 8.667407437,
    link = log)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.0470  -0.7815  -0.1723   0.4275   2.0896

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -3.7951     1.4577  -2.603  0.00923 **
ll            0.9378     0.2280   4.114 3.89e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(8.6674) family taken to be 1)

    Null deviance: 50.28  on 31  degrees of freedom
Residual deviance: 30.67  on 30  degrees of freedom
AIC: 181.39

Number of Fisher Scoring iterations: 1

              Theta:  8.67
          Std. Err.:  4.17

 2 x log-likelihood:  -175.387

The function negbin with a global dispersion parameter should, if I understood it correctly, yield the same estimates as glm.nb. It does, with only slight differences.

> negbin(n~ll,~1, data=faults)
Negative-binomial model
-----------------------
negbin(formula = n ~ ll, random = ~1, data = faults)

Convergence was obtained after 112 iterations.

Fixed-effect coefficients:
              Estimate Std. Error    z value Pr(> |z|)
(Intercept) -3.795e+00  1.421e+00 -2.671e+00 7.570e-03
ll           9.378e-01  2.221e-01  4.222e+00 2.417e-05

Overdispersion coefficients:
                 Estimate Std. Error   z value   Pr(> z)
phi.(Intercept) 1.154e-01   5.56e-02 2.076e+00 1.895e-02

Log-likelihood statistics
   Log-lik nbpar df res. Deviance       AIC      AICc
 -8.77e+01     3      29 5.209e+01 1.814e+02 1.822e+02

The thing I really don't understand is why there is such a big difference between the deviances (glm.nb = 30.67 and negbin = 52.09). Shouldn't they be nearly the same?

Thanks for your help,
sabine
sabwo <sabsiw <at> gmx.at> writes:

[big snip; comparing aod::negbin and MASS::glm.nb fits]

> The thing i really dont understand is why there is such a big difference
> between the deviances? (glm.nb = 30.67 and negbin=52.09?) Shouldnt they be
> nearly the same??

I don't have time to dig into this right now, but calculations of log-likelihoods or deviances often drop additive constants (such as the normalizing constant in a probability distribution), and different implementations often make different choices about which constant terms to include or not. If you dig around in the code you should be able to find out which terms are included or not (although admittedly this would be a nice thing to have included in the documentation). This does make it hard to compare across fits in different packages.

The important thing (and the thing I'm fairly certain of, since I've used both packages and they both seem to be well-written) is that the **differences** in deviances when comparing models A and B both fitted in the same package should be the same (because the additive constants that are included or not cancel out in this case).

Ben Bolker
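To make the cancellation argument concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: the counts `y`, the fitted means `mu.A` and `mu.B`, the value of `theta`, and the helper `deviance.with` are hypothetical and are not taken from the faults data or from either package. It computes the deviance of two models under two different conventions for the reference term and compares the resulting deviance differences.

```r
# Hypothetical toy counts and fitted means; NOT the faults data.
y     <- c(2, 5, 1, 7, 3, 4)
theta <- 8.67                               # NB size parameter, held fixed here
mu.A  <- rep(mean(y), length(y))            # "model A": common mean
mu.B  <- c(2.5, 4.0, 1.5, 6.0, 3.5, 4.5)    # "model B": observation-specific means

# Log-likelihoods of the two models, all constants included
logL.A <- sum(dnbinom(y, size = theta, mu = mu.A, log = TRUE))
logL.B <- sum(dnbinom(y, size = theta, mu = mu.B, log = TRUE))

# Two conventions for the reference (additive) term in the deviance
ref.pois <- sum(dpois(y, lambda = y, log = TRUE))  # Poisson saturated model
ref.none <- 0                                      # no reference: plain -2 * logL

# Hypothetical helper: deviance relative to a given reference term
deviance.with <- function(logL, ref) -2 * (logL - ref)

diff.pois <- deviance.with(logL.A, ref.pois) - deviance.with(logL.B, ref.pois)
diff.none <- deviance.with(logL.A, ref.none) - deviance.with(logL.B, ref.none)

# The reference term drops out of the difference
all.equal(diff.pois, diff.none)
```

The cancellation relies on the reference term being identical for both models. A reference that depends on a parameter estimated separately in each model would not cancel exactly.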
Matthieu Lesnoff
2011-Feb-12 09:03 UTC
[R] Comparison of glm.nb and negbin from the package aod
Dear Sabine

In negbin(aod), the deviance is calculated by:

# full model
logL.max <- sum(dpois(x = y, lambda = y, log = TRUE))
# fitted model
logL <- -res$value
dev <- -2 * (logL - logL.max)

(the log-likelihood contains all the constants)

As Ben Bolker said, whatever the formula used for the deviance, differences between the deviances of two models should be the same.

Regards

--
Matthieu Lesnoff

On 10/02/2011 18:00, sabwo wrote:
> [snip: original message, quoted in full above]
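Note that the two fitted log-likelihoods in the original post agree (2 x log-likelihood = -175.387, i.e. about -87.69, from glm.nb against -8.77e+01 from negbin), so the gap between the deviances has to come from the "full model" term. The sketch below, in base R, shows how that term can shift the deviance. It rests on assumptions: the counts `y`, fitted means `mu` and `theta` are hypothetical toy values, not the faults data, and the negative-binomial saturated model with `mu = y` at fixed `theta` is only a guess at what another implementation might use; this thread does not confirm what glm.nb does.

```r
# Hypothetical toy counts and fitted means; NOT the faults data.
y     <- c(2, 5, 1, 7, 3, 4)
mu    <- c(2.5, 4.0, 1.5, 6.0, 3.5, 4.5)   # stand-in for fitted values
theta <- 8.67                               # NB size parameter (phi = 1 / theta)

# Fitted-model log-likelihood, all constants included
logL <- sum(dnbinom(y, size = theta, mu = mu, log = TRUE))

# Full model as in the aod snippet: Poisson with lambda = y
logL.max.pois <- sum(dpois(x = y, lambda = y, log = TRUE))
dev.pois <- -2 * (logL - logL.max.pois)

# Assumed alternative full model: NB with mu = y and the same theta
logL.max.nb <- sum(dnbinom(y, size = theta, mu = y, log = TRUE))
dev.nb <- -2 * (logL - logL.max.nb)

# Same fitted logL, different reference, different deviance
c(dev.pois = dev.pois, dev.nb = dev.nb)
```

The two deviances differ by exactly `2 * (logL.max.pois - logL.max.nb)`, which does not involve the fitted model at all.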