-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at
stat.math.ethz.ch] On Behalf Of Aaron MacNeil
Sent: 20 February 2006 15:17
To: r-help at stat.math.ethz.ch
Subject: [R] Nested AIC
Greetings,
I have recently become confused over whether comparing models by AIC
requires that the models be nested.
Burnham & Anderson (2002) are explicit that nested models are
not required, but other respected statisticians have suggested that nesting is a
prerequisite for comparison. Could anyone who feels strongly about either
position post their arguments for or against nested models and AIC? This would
assist me greatly in an analysis I am currently conducting.
Many thanks,
Aaron
----
Hi Aaron,
Burnham & Anderson are explicit, but they do not go into any depth
regarding this issue. Akaike's colleagues Sakamoto, Ishiguro, and Kitagawa
(Akaike Information Criterion Statistics, 1986, KTK Scientific Publishers) do not
deal with it directly either, and the examples they present that I have
examined (fewer than half of those in the book) all involve nested models.
However, from reading some of Akaike's papers and the book cited above, it
does not appear to me that there is any restriction on the use of the AIC
related to nestedness. In fact, the theory does not preclude the comparison of
models with different *probability density (or mass) functions*, as long as you
keep all constants (like 1/sqrt(2*pi) in the normal) in the calculation.
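
To make the point about constants concrete, here is a minimal sketch in Python (standard library only) that compares a normal and a Laplace model fitted to the same data. The data values and the function names are my own made-up illustration, not anything from Akaike or from Burnham & Anderson; the only thing it is meant to show is that each log-likelihood is computed from the full density, normalizing constants included.

```python
import math

def normal_loglik(xs):
    """Maximized normal log-likelihood, keeping the 1/sqrt(2*pi) constant."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # ML estimate (divides by n)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def laplace_loglik(xs):
    """Maximized Laplace log-likelihood, keeping the 1/(2*b) constant."""
    n = len(xs)
    s = sorted(xs)
    # The sample median is an ML estimate of the Laplace location.
    med = s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])
    b = sum(abs(x - med) for x in xs) / n  # ML estimate of the scale
    return -n * (math.log(2 * b) + 1.0)

def aic(loglik, k):
    """AIC = -2 * (maximized log-likelihood) + 2 * (number of parameters)."""
    return -2.0 * loglik + 2.0 * k

# Hypothetical data, chosen only for illustration.
data = [-2.1, -0.4, -0.1, 0.0, 0.1, 0.2, 0.3, 0.5, 1.9, 6.0]

aic_normal = aic(normal_loglik(data), k=2)    # mean and variance
aic_laplace = aic(laplace_loglik(data), k=2)  # location and scale
print("AIC normal :", aic_normal)
print("AIC Laplace:", aic_laplace)
```

If the -n/2 * log(2*pi) term were dropped from the normal log-likelihood only, the normal AIC would shift by n * log(2*pi) while the Laplace AIC stayed put, and the comparison between the two families would no longer mean anything.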
Akaike (1973) stated his general principle, which he called an extension of
the maximum likelihood principle, in the first sentence of his paper:
"Given a set of estimates theta_hat's of the vector of parameters theta
of a probability distribution with density f(x|theta) we adopt as our final
estimate the one which will give the maximum of the expected log-likelihood,
which is by definition
E(log f(X|theta_hat))=E(INTEGRAL f(x|theta)log f(x|theta_hat)dx)
Where X is a random variable following the distribution with the density
function f(x|theta) and is independent of theta_hat".
All subsequent derivations in the paper, like the choice of distance measure,
class of estimates, and elimination of the true parameter value, revolve around
this principle. Now, nestedness is a mathematical property of what Burnham &
Anderson call "the structural model", whereas Akaike's principle
only concerns the probabilistic model f(x|theta) in which the structural model
is embedded.
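
For what it is worth, the integral in Akaike's sentence quoted above can be evaluated numerically. The following is only a sketch under my own assumptions: the true density is taken to be a standard normal, the three candidate estimates are invented, and the function names are hypothetical. It evaluates INTEGRAL f(x|theta) log f(x|theta_hat) dx for each fixed theta_hat (without the outer expectation over theta_hat) and checks the result against the closed form for the normal case.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density f(x | mu, sigma), with its normalizing constant."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def expected_loglik(mu, sigma, mu_hat, sigma_hat, lo=-12.0, hi=12.0, steps=20000):
    """Midpoint-rule approximation of INTEGRAL f(x|theta) log f(x|theta_hat) dx."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        log_f_hat = (-0.5 * math.log(2 * math.pi * sigma_hat ** 2)
                     - 0.5 * ((x - mu_hat) / sigma_hat) ** 2)
        total += normal_pdf(x, mu, sigma) * log_f_hat
    return total * h

def expected_loglik_closed(mu, sigma, mu_hat, sigma_hat):
    """Closed form of the same integral when both densities are normal."""
    return (-0.5 * math.log(2 * math.pi * sigma_hat ** 2)
            - (sigma ** 2 + (mu - mu_hat) ** 2) / (2 * sigma_hat ** 2))

true_theta = (0.0, 1.0)  # assumed true (mu, sigma)
candidates = {           # hypothetical estimates theta_hat
    "exact": (0.0, 1.0),
    "shifted": (0.5, 1.0),
    "too wide": (0.0, 2.0),
}

scores = {name: expected_loglik(*true_theta, *est) for name, est in candidates.items()}
best = max(scores, key=scores.get)
for name, value in scores.items():
    print(name, value)
print("largest expected log-likelihood:", best)
```

Nothing in this calculation refers to whether one candidate is a special case of another; only the densities f(x|theta) and f(x|theta_hat) enter, which is the point made above.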
I am replying even though you asked for responses from people who feel strongly
about this issue, and I do not.
Ruben