James
The main reason for the adjusted R^2 (Fisher) is that it is less biased than
the ordinary R^2. The ordinary R^2 has a positive bias that is a function of
the true Rho^2, the number of predictors p, and the sample size n. The bias is
largest at Rho^2 = 0, where the expected R^2 is p/(n-1). The adjusted R^2 has
a slight negative bias (at most on the order of -1/(2n), attained near
Rho^2 = .5), and that bias does not depend on p.
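A quick way to see the positive bias is to simulate the null case in R. This
is just my own sketch, not taken from the references: with Rho^2 = 0, p = 4,
and n = 20, the average R^2 should come out near p/(n-1) = 4/19 ~ 0.21, while
the average adjR^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1) should come out near 0.

## Simulate the null case (Rho^2 = 0) and compare the mean R^2 and
## mean adjusted R^2 over many replications.
set.seed(1)
p <- 4; n <- 20; nrep <- 5000
r2 <- adjr2 <- numeric(nrep)
for (i in seq_len(nrep)) {
  X <- matrix(rnorm(n * p), n, p)   # predictors independent of y
  y <- rnorm(n)                     # so the true Rho^2 is 0
  fit <- summary(lm(y ~ X))
  r2[i]    <- fit$r.squared
  adjr2[i] <- fit$adj.r.squared
}
c(mean.R2 = mean(r2), p.over.nm1 = p / (n - 1), mean.adjR2 = mean(adjr2))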
In your example, the R^2 for the first equation will be 1 even if Rho^2 = 0,
while the expected R^2 for the second will be roughly Rho^2 + 0.04. (I am
interpreting "parameters" as "predictors", which is strictly speaking not
correct, as the regression intercept and error variance are also parameters.)
The adjR^2 will have a maximum expected bias of about -0.1 in the first case
and -0.005 in the second.
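You can see the degenerate first case directly in R (hypothetical data; y is
generated independently of X, so Rho^2 = 0 by construction):

## Four predictors plus an intercept fitted to five points is a
## saturated model: zero residual degrees of freedom, R^2 = 1
## no matter what y is.
set.seed(2)
X <- matrix(rnorm(5 * 4), 5, 4)
y <- rnorm(5)
summary(lm(y ~ X))$r.squared   # 1 (up to rounding)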
Any between-regression comparisons using R^2 will founder on differences in
bias induced by differences in Rho^2, p, and n. Any between-regression
comparisons using adjR^2 will founder on differences in bias induced by
differences in Rho^2 and n. However, the maximum possible difference in bias
for adjR^2 may not be large.
Note also:
1. The standard errors of the estimators should also be taken into account in
such comparisons.
2. There is an unbiased estimator of Rho^2 (Olkin and Pratt); a sketch of it
follows this list.
3. There is another adjR^2 that has slightly better MSE than the Fisher adjR^2
(Lucke and Embretson).
4. There is a difference in results if the predictors are considered fixed
rather than multivariate normal (Barten).
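For point 2: the Olkin-Pratt estimator, as I read the 1958 paper, is
rho^2-hat = 1 - ((n-3)/(n-p-1)) (1-R^2) 2F1(1, 1; (n-p+1)/2; 1-R^2), where
2F1 is the Gauss hypergeometric function. Here is a minimal R sketch (the
function names are my own; check the formula against the paper before relying
on it):

hyper2F1.11 <- function(cc, x, tol = 1e-12, maxit = 1000) {
  ## 2F1(1, 1; cc; x) by direct summation: sum_k (k!/(cc)_k) x^k,
  ## which converges for |x| < 1, i.e. for observed R^2 > 0
  total <- 1; term <- 1
  for (k in seq_len(maxit)) {
    term <- term * k * x / (cc + k - 1)  # ratio of successive terms
    total <- total + term
    if (abs(term) < tol) break
  }
  total
}

olkin.pratt <- function(R2, n, p) {
  ## n observations, p predictors, applied to an observed R^2
  1 - ((n - 3) / (n - p - 1)) * (1 - R2) *
    hyper2F1.11((n - p + 1) / 2, 1 - R2)
}

olkin.pratt(R2 = 0.30, n = 100, p = 4)  # a bit above the Fisher adjR^2 (~0.27)

(At R^2 = 0 the series sits at its convergence boundary and needs n > p + 3.)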
Joe
References
Barten AP. Note on unbiased estimation of the squared multiple correlation
coefficient. Statistica Neerlandica, 1962, 16, 151-163.
Fisher RA. The influence of rainfall on the yield of wheat at Rothamsted.
Philosophical Transactions of the Royal Society of London, Series B, 1924,
213, 89-142.
Lucke JF and Embretson SE. Biases and mean squared errors of estimators of
multinormal squared multiple correlation. Journal of Educational Statistics,
1984, 9(3), 183-192.
Olkin I and Pratt JW. Unbiased estimation of certain correlation coefficients.
Annals of Mathematical Statistics, 1958, 29, 201-211.
________________________________
From: r-help-bounces@stat.math.ethz.ch on behalf of James Salsman
Sent: Fri 6/17/2005 4:16 PM
To: r-help@stat.math.ethz.ch
Subject: [R] adjusted R^2 vs. ordinary R^2
I thought the point of adjusting the R^2 for degrees of
freedom is to allow comparisons of goodness of fit between
similar models fitted to different numbers of data points. Someone
has suggested to me off-list that this might not be the case.
Is an ADJUSTED R^2 for a four-parameter, five-point model
reliably comparable to the adjusted R^2 of a four-parameter,
100-point model? If such values can't be reliably compared
with one another, then what is the reasoning behind adjusting
R^2 for degrees of freedom?
What are the good published authorities on this topic?
Sincerely,
James Salsman