I have a question about stepAIC and extractAIC and why they can
produce different answers.
Here's a stepAIC result (slightly edited - I removed the warning
about noninteger #successes):
stepAIC(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort +
Cohort2, family = binomial, data = ghs_70_79, subset =
ghs_70_full),direction = c("backward"))
Start: AIC=3151.41
(Morbid_70_79/Present_70_79) ~ 1 + Cohort + Cohort2
          Df Deviance    AIC
<none>         1797.6 3151.4
- Cohort   1   1826.2 3178.0
- Cohort2  1   1826.3 3178.2
Call: glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort + Cohort2,
family = binomial, data = ghs_70_79, subset = ghs_70_full)
Coefficients:
(Intercept)       Cohort      Cohort2
   -0.54094      0.35295     -0.01659
Degrees of Freedom: 2722 Total (i.e. Null); 2720 Residual
(2015 observations deleted due to missingness)
Null Deviance: 1826
Residual Deviance: 1798 AIC: 3151
Based on the above, the following models have these AIC scores:
1 + Cohort + Cohort2    3151.4
1 + Cohort2             3178.0
1 + Cohort              3178.2
Now consider the direct calculation of AIC:
> logLik(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort + Cohort2,
+ family = binomial, data = ghs_70_79, subset = ghs_70_full))
'log Lik.' -1572.703 (df=3)
> -2*-1572.703 + 6
[1] 3151.406
This matches the stepAIC result (= 3151.4).
> logLik(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort2,
+ family = binomial, data = ghs_70_79, subset = ghs_70_full))
'log Lik.' -1599.126 (df=2)
> -2*-1599.126 + 4
[1] 3202.252
This does not match the stepAIC result (= 3178.0).
> logLik(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort,
+ family = binomial, data = ghs_70_79, subset = ghs_70_full))
'log Lik.' -1599.264 (df=2)
> -2*-1599.264 + 4
[1] 3202.528
This does not match the stepAIC result (= 3178.2).
As you know, stepAIC uses extractAIC, e.g.:
> extractAIC(glm(formula = (Morbid_70_79/Present_70_79) ~ 1 + Cohort,
+ family = binomial, data = ghs_70_79, subset = ghs_70_full))
[1] 2.000 3202.527
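To make the mismatch concrete, the numbers above can be put side by side. This is a minimal sketch in base R that uses only values copied from the output above. The vector names are illustrative, and the deviance-based comparison is an assumption about what might explain the table, not something the output confirms.

```r
# Values copied from the output above; names are illustrative only.
loglik   <- c(full = -1572.703, cohort2_only = -1599.126, cohort_only = -1599.264)
n_par    <- c(full = 3,         cohort2_only = 2,         cohort_only = 2)
aic_step <- c(full = 3151.4,    cohort2_only = 3178.0,    cohort_only = 3178.2)
dev      <- c(full = 1797.6,    cohort2_only = 1826.2,    cohort_only = 1826.3)

# AIC computed directly as -2 * logLik + 2 * (number of parameters)
aic_direct <- -2 * loglik + 2 * n_par

# Change in AIC relative to the full model, from each source
delta_direct <- aic_direct - aic_direct[["full"]]
delta_step   <- aic_step - aic_step[["full"]]

# Assumption: change in deviance, less 2 per dropped parameter
delta_dev <- (dev - dev[["full"]]) - 2 * (n_par[["full"]] - n_par)

print(round(cbind(aic_direct, delta_direct, delta_step, delta_dev), 3))
```

On these numbers the direct differences are about 50.8 and 51.1, while the stepAIC differences (26.6 and 26.8) agree, up to rounding of the printed deviances, with the deviance-based differences (26.6 and 26.7). Whether stepAIC actually builds its table from deviance differences is an assumption here.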
Why are the AIC results from stepAIC different from those calculated
directly? Of course, AIC is only defined up to an arbitrary additive
constant, but a constant would shift all three values by the same
amount. Here the value for the full model matches and the values for
the two reduced models do not.
Many thanks!
--
Steven Orzack
The Fresh Pond Research Institute
173 Harvey Street
Cambridge, MA. 02140
617 864-4307
www.freshpond.org