James
The main reason for the adjusted R^2 (Fisher) is that it is less biased than
the ordinary R^2. The ordinary R^2 has a positive bias that is a function of
the true Rho^2, the number of predictors p, and the sample size n. The bias is
largest at Rho^2 = 0, where the expected R^2 is p/(n-1). The adjusted R^2 has
a slight negative bias (at most on the order of -1/(2n), attained near
Rho^2 = .5), and that bias does not depend on p.
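A quick way to see the positive bias is to simulate the null case in R. This
is just my own sketch, not taken from the references: with Rho^2 = 0, p = 4,
and n = 20, the average R^2 should come out near p/(n-1) = 4/19 ~ 0.21, while
the average adjR^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1) should come out near 0.

## Simulate the null case (Rho^2 = 0) and compare the mean R^2 and
## mean adjusted R^2 over many replications.
set.seed(1)
p <- 4; n <- 20; nrep <- 5000
r2 <- adjr2 <- numeric(nrep)
for (i in seq_len(nrep)) {
  X <- matrix(rnorm(n * p), n, p)   # predictors independent of y
  y <- rnorm(n)                     # so the true Rho^2 is 0
  fit <- summary(lm(y ~ X))
  r2[i]    <- fit$r.squared
  adjr2[i] <- fit$adj.r.squared
}
c(mean.R2 = mean(r2), p.over.nm1 = p / (n - 1), mean.adjR2 = mean(adjr2))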
In your example, the R^2 for the first equation will be 1 even if Rho^2 = 0,
while the expected R^2 for the second will be roughly Rho^2 + 0.04. (I am
interpreting "parameters" as "predictors", which is strictly speaking not
correct, as the regression intercept and error variance are also parameters.)
The adjR^2 will have a maximum expected bias of about -0.1 in the first case
and -0.005 in the second.
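You can see the degenerate first case directly in R (hypothetical data; y is
generated independently of X, so Rho^2 = 0 by construction):

## Four predictors plus an intercept fitted to five points is a
## saturated model: zero residual degrees of freedom, R^2 = 1
## no matter what y is.
set.seed(2)
X <- matrix(rnorm(5 * 4), 5, 4)
y <- rnorm(5)
summary(lm(y ~ X))$r.squared   # 1 (up to rounding)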
Any between-regression comparisons using R^2 will founder on differences in
bias induced by differences in Rho^2, p, and n. Any between-regression
comparisons using adjR^2 will founder on differences in bias induced by
differences in Rho^2 and n. However, the maximum possible difference in bias
for adjR^2 may not be large.
Note also:
1. The standard errors of the estimators should also be taken into account in
such comparisons.
2. There is an unbiased estimator of Rho^2 (Olkin and Pratt); a sketch of it
follows this list.
3. There is another adjR^2 that has slightly better MSE than the Fisher adjR^2
(Lucke and Embretson).
4. There is a difference in results if the predictors are considered fixed
rather than multivariate normal (Barten).
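For point 2: the Olkin-Pratt estimator, as I read the 1958 paper, is
rho^2-hat = 1 - ((n-3)/(n-p-1)) (1-R^2) 2F1(1, 1; (n-p+1)/2; 1-R^2), where
2F1 is the Gauss hypergeometric function. Here is a minimal R sketch (the
function names are my own; check the formula against the paper before relying
on it):

hyper2F1.11 <- function(cc, x, tol = 1e-12, maxit = 1000) {
  ## 2F1(1, 1; cc; x) by direct summation: sum_k (k!/(cc)_k) x^k,
  ## which converges for |x| < 1, i.e. for observed R^2 > 0
  total <- 1; term <- 1
  for (k in seq_len(maxit)) {
    term <- term * k * x / (cc + k - 1)  # ratio of successive terms
    total <- total + term
    if (abs(term) < tol) break
  }
  total
}

olkin.pratt <- function(R2, n, p) {
  ## n observations, p predictors, applied to an observed R^2
  1 - ((n - 3) / (n - p - 1)) * (1 - R2) *
    hyper2F1.11((n - p + 1) / 2, 1 - R2)
}

olkin.pratt(R2 = 0.30, n = 100, p = 4)  # a bit above the Fisher adjR^2 (~0.27)

(At R^2 = 0 the series sits at its convergence boundary and needs n > p + 3.)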
Joe
References
Barten AP. Note on unbiased estimation of the squared multiple correlation
coefficient. Statistica Neerlandica, 1962, 16, 151-163.
Fisher RA. The influence of rainfall on the yield of wheat at Rothamsted.
Philosophical Transactions of the Royal Society of London, Series B, 1924,
213, 89-142.
Lucke JF and Embretson SE. Biases and mean squared errors of estimators of
multinormal squared multiple correlation. Journal of Educational Statistics,
1984, 9(3), 183-192.
Olkin I and Pratt JW. Unbiased estimation of certain correlation coefficients.
Annals of Mathematical Statistics, 1958, 29, 201-211.
________________________________
From: r-help-bounces@stat.math.ethz.ch on behalf of James Salsman
Sent: Fri 6/17/2005 4:16 PM
To: r-help@stat.math.ethz.ch
Subject: [R] adjusted R^2 vs. ordinary R^2
I thought the point of adjusting the R^2 for degrees of
freedom is to allow comparisons of goodness of fit between
similar models fitted to different numbers of data points. Someone
has suggested to me off-list that this might not be the case.
Is an ADJUSTED R^2 for a four-parameter, five-point model
reliably comparable to the adjusted R^2 of a four-parameter,
100-point model? If such values can't be reliably compared
with one another, then what is the reasoning behind adjusting
R^2 for degrees of freedom?
What are the good published authorities on this topic?
Sincerely,
James Salsman