Sebastien Bihorel
2012-May-11 19:34 UTC
[R] Difference of AIC computation between R (>2.12) and Splus (7.0.6) during stepwise GAM analysis
Dear R Users,

I was wondering if some members of the list could shed some light on the difference in AIC computation between R (>2.12; gam package) and Splus (7.0.6). Because I am not a statistician by training, I apologize in advance if I use the wrong terms or do not describe GAM appropriately.

As far as I understand, stepwise GAM analysis, as implemented in the gam package, relies on the gam, step.gam and gam.fit functions. The computation of AIC, which is used as the primary criterion to advance to the next step, is delegated to the family function provided by the user or set to "gaussian" by default. If one uses the "gaussian" default, AIC is computed as:

    AIC <- aic + 2*(n - fit$df.residual)

where:
- aic is the result of the function

      aic <- function(y, n, mu, wt, dev) {
          nobs <- length(y)
          nobs * (log(dev/nobs * 2 * pi) + 1) + 2 - sum(log(wt))
      }

- y is the vector of observations
- n is the number of observations with non-null weights
- mu is the vector of fitted values
- wt is the vector of weights
- dev is the deviance
- fit is the object containing the fitting information

Stepwise GAM analysis, as implemented in Splus, relies on similar but somewhat different gam and step.gam functions. For instance, the computation of AIC does not depend on any family function in Splus. It is hard-coded in the step.gam function and is based upon the formula given by Hastie and Pregibon (Hastie, T. J. and Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S. eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.):

    AIC <- dev + 2*(n - fit$df.residual)*deviance.lm(fit)/fit$df.resid

After running several GAM analyses in R and Splus, I see obvious differences in the computed AIC and, thus, in the final model selection. Overall, it looks to me like R relies on a maximum likelihood estimate of the dispersion, while Splus uses a non-parametric description of the dispersion. Is that right?
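To make the comparison concrete, the two formulas quoted above can be evaluated side by side on a small Gaussian fit. This is only a minimal sketch under stated assumptions: a base R glm() on the built-in mtcars data stands in for a gam() fit, and deviance.lm(fit) is assumed to reduce to the residual sum of squares, which equals the Gaussian deviance. The names aic_r, aic_splus and phi_hat are illustrative and do not come from either package.

```r
# Sketch only: a plain Gaussian glm() stands in for a gam() fit.
fit <- glm(mpg ~ wt + hp, family = gaussian(), data = mtcars)

y   <- fit$y
n   <- length(y)              # all prior weights are 1 in this example
wt  <- fit$prior.weights
dev <- deviance(fit)          # residual sum of squares for a Gaussian fit
edf <- n - fit$df.residual    # degrees of freedom used by the fit

# R-style criterion: the family's aic() gives -2 * log-likelihood evaluated
# at the ML variance estimate dev/n (plus 2 for that variance parameter),
# and 2 * edf is then added as the penalty.
aic_r <- gaussian()$aic(y, n, fit$fitted.values, wt, dev) + 2 * edf

# Splus-style criterion as quoted in the post: the deviance plus a penalty
# scaled by a dispersion estimate (assumed here to be dev / df.residual).
phi_hat   <- dev / fit$df.residual
aic_splus <- dev + 2 * edf * phi_hat

print(c(R = aic_r, Splus = aic_splus))
```

The first value is on the scale of -2 times a log-likelihood, while the second is on the scale of the deviance, so the two numbers are not directly comparable and need not rank candidate models in the same order.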
I looked into the help pages but could not find anything specific on this point. I guess my issue boils down to the following questions:
- Is there a reference in the literature that would indicate the benefits and drawbacks of the two approaches?
- Is there a way to provide arguments to the R gam function so that it behaves like the Splus function?

Thank you in advance for your feedback and your time.

Sebastien