Hofert Marius
2008-Apr-24 05:57 UTC
[R] Coefficient of determination in a regression model with AR(1) residuals
Dear R-users,

I used lm() to fit a standard linear regression model to a given data set, which gave a coefficient of determination (R^2) of about 0.96. After checking the residuals, I realized that they follow an autoregressive (AR) process of order 1, which contradicts the i.i.d. assumption of the regression model. I then used gls() [package nlme] to fit a linear regression model with AR(1) residuals. The residuals of this model look perfect (residual plot, ACF, PACF, Q-Q plot, Ljung-Box test).

Following en.wikipedia.org/wiki/Coefficient_of_determination (citation [2008-04-24]: "For cases other than fitting by ordinary least squares, the R^2 statistic can be calculated as above" and later: "Values for R^2 can be calculated for any type of predictive model"), I tried to calculate the standard R^2 for the model with AR(1) residuals. However, I ended up with an R^2 larger than 1!

According to the German Wikipedia page (de.wikipedia.org/wiki/Bestimmtheitsmaß), the coefficient of determination does _not_ exist for models fitted by Maximum Likelihood Estimation (MLE) (citation [2008-04-24]: "Bei bestimmten statistischen Modellen, z.B. bei Maximum-Likelihood-Schätzungen, existiert das Bestimmtheitsmaß R^2 nicht", i.e. "For certain statistical models, e.g. maximum likelihood estimations, the coefficient of determination R^2 does not exist"). Any comments on that?

The German Wikipedia page mentions McFadden's pseudo-coefficient of determination; the English Wikipedia page mentions Nagelkerke's. I know there are others, too. My questions:

1. Is there general agreement on which "coefficient of determination" (or goodness-of-fit measure in general) to use for a regression model with autocorrelated errors?
2. Is there a way to compare (non-graphically) the standard regression model with the model with AR(1) residuals, in order to justify the better fit of the latter?

Any comments are appreciated.

Best regards,
Marius
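For reference, the setup described above can be sketched on simulated data. Everything specific here is an assumption for illustration only: the seed, the regressor `x`, the AR(1) coefficient of 0.7, and the names `fit_iid`, `fit_ar1`, `r2_resid` and `r2_expl` are hypothetical and do not come from the original data set. The sketch computes R^2 in two ways for the gls() fit and compares the i.i.d. and AR(1) models by a likelihood ratio test, which is only one possible non-graphical comparison.

```r
library(nlme)

set.seed(1)                                        # assumed, arbitrary seed
n <- 200
x <- seq_len(n) / 10                               # hypothetical regressor
e <- as.numeric(arima.sim(list(ar = 0.7), n = n))  # simulated AR(1) errors (phi assumed)
d <- data.frame(x = x, y = 1 + 2 * x + e)

# Both models are fitted by ML (not the default REML) so that their
# likelihoods are directly comparable.
fit_iid <- gls(y ~ x, data = d, method = "ML")     # i.i.d. errors
fit_ar1 <- gls(y ~ x, data = d, method = "ML",
               correlation = corAR1(form = ~ 1))   # AR(1) errors

# Two textbook formulas for R^2. They coincide for OLS with an intercept,
# but need not coincide for a GLS fit.
sst      <- sum((d$y - mean(d$y))^2)
r2_resid <- 1 - sum((d$y - fitted(fit_ar1))^2) / sst   # 1 - SSE/SST, never above 1
r2_expl  <- sum((fitted(fit_ar1) - mean(d$y))^2) / sst # SSR/SST, not bounded by 1
print(c(r2_resid = r2_resid, r2_expl = r2_expl))

# Likelihood ratio test plus AIC/BIC for the nested models.
print(anova(fit_iid, fit_ar1))
```

If the R^2 above 1 was obtained from the "explained over total sum of squares" form, this may be the reason: outside OLS the decomposition SST = SSR + SSE no longer holds, so that ratio is not bounded by 1, whereas 1 - SSE/SST cannot exceed 1.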