thr3ads.net - R help - [R] odd behavior of summary()$r.squared [Oct 2004]

J.R. Lockwood

2004-Oct-06 18:27 UTC

[R] odd behavior of summary()$r.squared

I may be missing something obvious here, but consider the following simple
dataset simulating repeated measures on 5 individuals with pretty strong
between-individual variance.

set.seed(1003)
n<-5
v<-rep(1:n,each=2)
d<-data.frame(factor(v),v+rnorm(2*n))
names(d)<-c("id","y")

Now consider the following two linear models that provide identical fitted
values, residuals, and estimated residual variance:
  
m1<-lm(y~id,data=d)
m2<-lm(y~id-1,data=d)
print(max(abs(fitted(m1)-fitted(m2))))

The r-squared reported by summary(m1) appears to be correct in that it is
equal to the squared correlation between the fitted and observed values:

print(summary(m1)$r.squared - cor(fitted(m1),d$y)^2)

However, the same is not true of m2.

print(summary(m2)$r.squared - cor(fitted(m2),d$y)^2)

> R.version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    9.0
year     2004
month    04
day      12
language R


J.R. Lockwood
412-683-2300 x4941
lockwood at rand.org
http://www.rand.org/methodology/stat/members/lockwood/


Sundar Dorai-Raj

2004-Oct-06 19:21 UTC

[R] odd behavior of summary()$r.squared

J.R. Lockwood wrote:
> I may be missing something obvious here, but consider the following simple
> dataset simulating repeated measures on 5 individuals with pretty strong
> between-individual variance.
> 
> set.seed(1003)
> n<-5
> v<-rep(1:n,each=2)
> d<-data.frame(factor(v),v+rnorm(2*n))
> names(d)<-c("id","y")
> 
> Now consider the following two linear models that provide identical fitted
> values, residuals, and estimated residual variance:
>   
> m1<-lm(y~id,data=d)
> m2<-lm(y~id-1,data=d)
> print(max(abs(fitted(m1)-fitted(m2))))
> 
> The r-squared reported by summary(m1) appears to be correct in that it is
> equal to the squared correlation between the fitted and observed values:
> 
> print(summary(m1)$r.squared - cor(fitted(m1),d$y)^2)
> 
> However, the same is not true of m2.
> 
> print(summary(m2)$r.squared - cor(fitted(m2),d$y)^2)
> 
> 
> > R.version
>          _
> platform i686-pc-linux-gnu
> arch     i686
> os       linux-gnu
> system   i686, linux-gnu
> status
> major    1
> minor    9.0
> year     2004
> month    04
> day      12
> language R

I think what you're trying to do is better accomplished by looking at
the anova tables of the two fits:

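a1 <- anova(m1)
a2 <- anova(m2)
# column 2 is "Sum Sq"; row 1 is the "id" term, the last row is Residuals
r2.1 <- a1[1, 2]/sum(a1[, 2])
r2.2 <- a2[1, 2]/sum(a2[, 2])

# both differences should be zero up to rounding error
summary(m1)$r.squared - r2.1
summary(m2)$r.squared - r2.2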

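The calculation you did above with "cor" still centers your data at the
grand mean, which m2 doesn't fit. Because m2 has no intercept, summary()
computes its r-squared from sums of squares about zero rather than about
the mean of y.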
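
A minimal sketch of the two definitions, assuming the data and fits from the original post (the names rss, tss_centered, tss_uncentered, r2_centered and r2_uncentered are introduced here only for illustration):

```r
# Recreate the data and the two fits from the original post.
set.seed(1003)
n <- 5
v <- rep(1:n, each = 2)
d <- data.frame(id = factor(v), y = v + rnorm(2 * n))
m1 <- lm(y ~ id, data = d)       # with intercept
m2 <- lm(y ~ id - 1, data = d)   # without intercept

# Both fits have the same residuals, hence the same residual sum of squares.
rss <- sum(resid(m2)^2)

# Total sum of squares about the grand mean (used when an intercept is fitted).
tss_centered <- sum((d$y - mean(d$y))^2)
# Total sum of squares about zero (used when there is no intercept).
tss_uncentered <- sum(d$y^2)

r2_centered   <- 1 - rss / tss_centered
r2_uncentered <- 1 - rss / tss_uncentered

# summary() reports the centered version for m1 and the uncentered one for m2,
# while cor()^2 always corresponds to the centered version.
all.equal(summary(m1)$r.squared, r2_centered)
all.equal(summary(m2)$r.squared, r2_uncentered)
all.equal(cor(fitted(m2), d$y)^2, r2_centered)
```

Each of the three comparisons should return TRUE, which is why the difference printed for m2 in the original post is not zero even though the fitted values are identical.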

HTH,

--sundar
