thr3ads.net - R help - [R] AIC questions [Sep 2007]

If this information is useful, please help other people find it:
Share via:

Joe Yarmus

2007-Sep-26 15:18 UTC

[R] AIC questions

In accordance with Venables and Ripley, SAS documentation and other 
sources AIC with sigma^2 unknown is calculated as:
AIC = -2LL + 2* #parameters =  n log(RSS/n) + 2p
For the fitness data: 
(http://support.sas.com/ctx/samples/index.jsp?sid=927), SAS gets an AIC 
of 64.534 with model oxygen = runtime. (SAS STAT User's Guide. Chapter 
61. pp 3956, the REG Procedure). This value of AIC accords with p = 2.

When I run the same problem in R ver 2.5.1, I get

 > rt.glm =glm(oxy ~ runtime, data=fitness)
 > rt.glm
Call:  glm(formula = oxy ~ runtime, data = fitness)

Coefficients:
(Intercept)      runtime 
     82.422       -3.311 

Degrees of Freedom: 30 Total (i.e. Null);  29 Residual
Null Deviance:        851.4
Residual Deviance: 218.5     AIC: 154.5

I get very close to what R gets if the constant term is included in 
-2LL, (31*Log(2*pi)+n-1), divide RSS by n-1 and the number of parameters 
is 3 (the predictor, the intercept and the error term)
 > 31 * (log(2*pi)+log(sum(rt.glm$res^2)/30)) + 30 + 2 * 3
[1] 154.5248
 > AIC(rt.glm)
[1] 154.5083

3 questions:
1) Why the discrepancy between SAS and R?
2) Why the slight difference between my calculation in R and R's AIC?
3) How should AIC be computed if row weights are used in the linear model?

Thanks!

    -joe yarmus

Wensui Liu

2007-Sep-26 15:40 UTC

head link

[R] AIC questions

joe,
some procs in SAs calculates log likelihood differently than what it
is supposed to be. try using proc nlmixed and specifying the LL
explicitly.
in your case, I has stronger faith in R result instead of SAS result.

On 9/26/07, Joe Yarmus <joseph.yarmus at oracle.com>
wrote:> In accordance with Venables and Ripley, SAS documentation and other
> sources AIC with sigma^2 unknown is calculated as:
> AIC = -2LL + 2* #parameters =  n log(RSS/n) + 2p
> For the fitness data:
> (http://support.sas.com/ctx/samples/index.jsp?sid=927), SAS gets an AIC
> of 64.534 with model oxygen = runtime. (SAS STAT User's Guide. Chapter
> 61. pp 3956, the REG Procedure). This value of AIC accords with p = 2.
>
> When I run the same problem in R ver 2.5.1, I get
>
>  > rt.glm =glm(oxy ~ runtime, data=fitness)
>  > rt.glm
> Call:  glm(formula = oxy ~ runtime, data = fitness)
>
> Coefficients:
> (Intercept)      runtime
>      82.422       -3.311
>
> Degrees of Freedom: 30 Total (i.e. Null);  29 Residual
> Null Deviance:        851.4
> Residual Deviance: 218.5     AIC: 154.5
>
> I get very close to what R gets if the constant term is included in
> -2LL, (31*Log(2*pi)+n-1), divide RSS by n-1 and the number of parameters
> is 3 (the predictor, the intercept and the error term)
>  > 31 * (log(2*pi)+log(sum(rt.glm$res^2)/30)) + 30 + 2 * 3
> [1] 154.5248
>  > AIC(rt.glm)
> [1] 154.5083
>
> 3 questions:
> 1) Why the discrepancy between SAS and R?
> 2) Why the slight difference between my calculation in R and R's AIC?
> 3) How should AIC be computed if row weights are used in the linear model?
>
> Thanks!
>
>     -joe yarmus
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
=============================="I am dying with the help of too many
physicians." - Alexander the Great, on his deathbed
==============================WenSui Liu
(http://spaces.msn.com/statcompute/blog)

Joe Yarmus

2007-Sep-27 13:56 UTC

head link

[R] AIC questions

Digging into the R-code behind AIC for gaussian family models, I see:
AIC = nobs * (log(dev/nobs * 2 * pi) + 1) + 2 - sum(log(wt)) + 2 * p
dev = sum(wt * (y - mean(y))^2
For the unweighted case, this translates directly to -2LL with the 
penalty number of parameters including both intercept and error term (as 
represented by the constant + 2) and the unknown sigma-squared = sum((y 
- mean(y)^2)/ nobs (rather than nobs-1). However, with weights, I am at 
a loss to understand the expression, because, given
-2LL = nobs * (log(2 * pi * sigma^2) + sum(wt * (y - mean(y))^2/sigma^2
if sigma^2 = sum(wt * (y - mean(y))/sum(wt)
then
-2LL = nobs * (log(2 * pi *dev/nobs) + log(nobs) - log(sum(wt)) + sum(wt)
so if wt = 1 all is fine because
-2LL = nobs * (log(2 * pi * dev/nobs) + 1)
What am I missing? Thanks!

   -joe yarmus

Apparently Analagous Threads

Search for more reasonably related threads

R help - Sep 2007 - AIC questions

[R] AIC questions

[R] AIC questions

[R] AIC questions

Apparently Analagous Threads