Martin Maechler
1998-Feb-03 10:49 UTC
glm(.) / summary.glm(.); [over]dispersion and returning AIC..
I have been implementing a proposal of Jim Lindsey for glm(.) to return
AIC values, and print.glm(.) and print.summary.glm(.) printing them....
however:

>>>>> "Jim" == Jim Lindsey <jlindsey@luc.ac.be> writes:

    Jim> The problem still remains of getting the correct AIC when the user
    Jim> wants the scale parameter to be fixed. (The calculation should use
    Jim> the fixed, supplied value, instead of estimating it from the
    Jim> deviance.) As I mentioned earlier, the problem is that this is
    Jim> specified in summary, after the AIC has already been
    Jim> calculated. Would it not be possible to add an option in glm
    Jim> itself, allowing the scale parameter to be specified there?
    Jim> Otherwise, the AIC should in fact be recalculated every time
    Jim> summary is called.

Yes, I think that's what should really happen.

For binomial and poisson, there are even three possibilities:

 1. no dispersion (as by the proper GLM)
 2. overdispersion estimated by the deviance (ratio)
 3. overdispersion specified by the user

S has adopted the concept that the glm(.) model is always the same,
the dispersion being an orthogonal nuisance parameter,
which the user should specify in summary(....), i.e.,

    summary.glm(object, dispersion = NULL, correlation=FALSE, ..)
                        ^^^^^^^^^^^^^^^^^

[but wouldn't the dispersion also be used in predict.glm(..., se = TRUE) ?].

As a consequence, glm(.) wouldn't (and shouldn't ??) have a
`dispersion = ' argument, and print.glm(.) maybe also shouldn't print
the AIC.

BTW, V&R's MASS library contains the following functions:

    > apropos("[Aa][Ii][Cc]")
    [1] "extractAIC"         "extractAIC.aov"     "extractAIC.coxph"
    [4] "extractAIC.glm"     "extractAIC.lm"      "extractAIC.negbin"
    [7] "extractAIC.survreg" "stepAIC"

where "stepAIC" is the main function, calling the generic "extractAIC"
(and one of its methods).

Maybe we should try to adopt what they've done.
(haven't looked at it really).

-------

Opinions, proposals, please ?

Martin Maechler <maechler@stat.math.ethz.ch> <><
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
http://www.stat.math.ethz.ch/~maechler/
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
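
[Editorial illustration] The three dispersion choices listed above can be sketched with the `dispersion` argument of summary() as it exists in current R releases. This is only a sketch under assumptions: the toy Poisson data, the variable names and the user value 2 are hypothetical and do not come from the thread.

```r
# Toy Poisson data (hypothetical, for illustration only).
set.seed(1)
x <- seq(0, 1, length.out = 50)
y <- rpois(50, lambda = exp(0.5 + x))
fit <- glm(y ~ x, family = poisson)

# 1. no overdispersion: dispersion taken as 1 (the default for poisson)
s1 <- summary(fit)

# 2. overdispersion estimated by the deviance ratio
phi_dev <- deviance(fit) / df.residual(fit)
s2 <- summary(fit, dispersion = phi_dev)

# 3. overdispersion specified by the user (2 is an arbitrary value)
s3 <- summary(fit, dispersion = 2)

# Coefficients are identical in all three cases; only the standard
# errors change, scaling with sqrt(dispersion).
se <- cbind(fixed1   = s1$coefficients[, "Std. Error"],
            deviance = s2$coefficients[, "Std. Error"],
            user2    = s3$coefficients[, "Std. Error"])
print(se)

# The AIC stored on the fitted object is computed by glm() itself,
# so it is the same whatever dispersion is later passed to summary().
print(fit$aic)
```

The last line illustrates Jim's point: a dispersion supplied only at the summary() stage cannot influence an AIC that was computed when the model was fitted.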
Thomas Lumley
1998-Feb-03 16:56 UTC
glm(.) / summary.glm(.); [over]dispersion and returning AIC..
On Tue, 3 Feb 1998, Martin Maechler wrote:

> I have been implementing a proposal of Jim Lindsey for glm(.)
> to return AIC values, and
> print.glm(.) and print.summary.glm(.) printing them....
> however:
<snip>
>
> For binomial and poisson,
> there are even three possibilities:
>
>  1. no dispersion (as by the proper GLM)
>  2. overdispersion estimated by the deviance (ratio)
>  3. overdispersion specified by the user
>

How about overdispersion as estimated by the mean squared Pearson
residual. Unlike the deviance estimate this works for quasilikelihood (ie
when only first two moments are specified correctly), at least according
to McCullagh & Nelder.

Thomas Lumley
------------------------------------------------------+------
Biostatistics           : "Never attribute to malice what  :
Uni of Washington       :  can be adequately explained by  :
Box 357232              :  incompetence" - Hanlon's Razor  :
Seattle WA 98195-7232   :                                  :
------------------------------------------------------------
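
[Editorial illustration] A minimal sketch of the Pearson-based estimate suggested above, next to the deviance-based one. The simulated overdispersed counts and all variable names are assumptions made for illustration. The sum of squared Pearson residuals is divided here by the residual degrees of freedom rather than by the number of observations, which is the convention summary() uses for quasi families in current R.

```r
# Overdispersed toy counts drawn from a negative binomial (hypothetical data).
set.seed(2)
x <- seq(0, 1, length.out = 60)
y <- rnbinom(60, mu = exp(1 + x), size = 2)
fit <- glm(y ~ x, family = poisson)

# Pearson estimate: sum of squared Pearson residuals / residual df
phi_pearson <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

# Deviance estimate: residual deviance / residual df
phi_deviance <- deviance(fit) / df.residual(fit)

print(c(pearson = phi_pearson, deviance = phi_deviance))

# For comparison: the dispersion reported for a quasipoisson fit
qfit <- glm(y ~ x, family = quasipoisson)
print(summary(qfit)$dispersion)
```

Either estimate can then be passed on as `summary(fit, dispersion = phi_pearson)`.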
Martin Maechler
1998-Feb-04 16:00 UTC
glm(.) / summary.glm(.); [over]dispersion and returning AIC..
>>>>> "Jim" == Jim Lindsey <jlindsey@luc.ac.be> writes:

    MM>> S has adopted the concept that the glm(.) model is always the same,
    MM>> the dispersion being an orthogonal nuisance parameter,
    MM>> which the user should specify in
    MM>> summary(....) , i.e.,
    MM>> summary.glm(object, dispersion = NULL, correlation=FALSE, ..)
    MM>>                     ^^^^^^^^^^^^^^^^^

    Jim> But in fact it is unity for binomial and poisson so some action must
    Jim> be taken in summary. The orthogonality is a characteristic of
    Jim> exponential dispersion models.

Sure; the above meant:

 - dispersion *is* an optional parameter to summary.glm(.).
 - it has a default depending on the model fitted (e.g. =1 for binomial)
 - if the user specifies a different value, that one is used ...

    MM>> [but wouldn't the dispersion also be used in predict.glm(..., se=TRUE) ?]

    Jim> Dispersion does not affect predictions, only their precision.

Exactly, and the point is that

    predict( ... , se = TRUE )
                   ^^^^^^^^^^

asks for 'standard errors' !
it is short for 'se.fit' and actually not (yet) available in R,
but in S-plus predict.lm & predict.glm.

The question remains (actually an S-plus only question for the current R):

How to get standard errors for predicted values
WITH user-specified / non-standard estimated dispersion value?
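
[Editorial illustration] In current R releases, predict() for glm objects accepts both `se.fit` and `dispersion`, which is one way to address the closing question. The sketch below assumes toy Poisson data; the names `new`, `p1`, `p4` and the dispersion value 4 are hypothetical. It describes present-day R, not the 1998 version discussed in the thread.

```r
# Toy Poisson data (hypothetical, for illustration only).
set.seed(3)
x <- seq(0, 1, length.out = 40)
y <- rpois(40, lambda = exp(0.3 + 1.2 * x))
fit <- glm(y ~ x, family = poisson)

new <- data.frame(x = c(0.25, 0.75))

# Default: dispersion 1 for the poisson family
p1 <- predict(fit, newdata = new, se.fit = TRUE)

# User-specified dispersion (4 is an arbitrary value)
p4 <- predict(fit, newdata = new, se.fit = TRUE, dispersion = 4)

# Predictions on the link scale are unchanged; their standard errors
# are multiplied by sqrt(4) = 2.
print(cbind(fit = p1$fit, se_disp1 = p1$se.fit, se_disp4 = p4$se.fit))
```

This matches Jim's remark that dispersion does not affect the predictions, only their precision.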