thr3ads.net - R help - [R] Average R-squared of model1 to model n [Jun 2004]

kan Liu

2004-Jun-06 09:47 UTC

[R] Average R-squared of model1 to model n

Hi,
 
We have a question about interpreting R-squared.
 
The actual outputs for a test dataset are X=(x1,x2, ..., xn).
Model 1 predicted the outputs as Y1=(y11,y12,..., y1n)
Model 2 predicted the outputs as Y2=(y21,y22,..., y2n)
 
... 
Model m predicted the outputs as Ym=(ym1,ym2,..., ymn)
 
Now we have two ways to calculate R squared to evaluate the average performance
of the committee model.
 
(a) Calculate R squared between (X, Y1), (X, Y2), ..., (X, Ym), and then
average the R squared values.
(b) Calculate the average Y=(Y1+Y2+ ... +Ym)/m, and then calculate the R squared
between (X, Y).
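
The two calculations above can be sketched as follows. This is only an illustration: it assumes "R squared between (X, Y)" means the squared Pearson correlation (the thread does not say which definition was used), the function names `r_squared` and `committee_scores` are made up for this sketch, and the data are invented so that the two models' errors cancel.

```python
def r_squared(actual, predicted):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: this is the meaning of "R squared between (X, Y)".
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    sxy = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    sxx = sum((a - mean_a) ** 2 for a in actual)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return sxy * sxy / (sxx * syy)


def committee_scores(actual, predictions):
    """Return (a, b): the mean of the per-model R squared values, and the
    R squared of the element-wise averaged prediction."""
    m = len(predictions)
    score_a = sum(r_squared(actual, y) for y in predictions) / m
    averaged = [sum(values) / m for values in zip(*predictions)]
    score_b = r_squared(actual, averaged)
    return score_a, score_b


# Invented data: the two models err in opposite directions at every point,
# so their average reproduces X exactly.
X = [1, 2, 3, 4, 5, 6]
Y1 = [2, 1, 4, 3, 6, 5]
Y2 = [0, 3, 2, 5, 4, 7]

score_a, score_b = committee_scores(X, [Y1, Y2])
print("(a) mean of individual R squared:", score_a)
print("(b) R squared of averaged model: ", score_b)
```

With these particular numbers (b) comes out higher than (a), because averaging cancels the opposing errors. That is a property of this data, not a general guarantee.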
 
We found that the R squared calculated in (b) seems to be 'always' higher
than that in (a).
 
Does this result depend on the test dataset, or did it happen by chance? Can you
point me to any reference on this issue?

Many thanks in advance!

Kan

 


		

Gabor Grothendieck

2004-Jun-06 18:01 UTC


Suppose m=2, Y1=Y and Y2= -Y.  Then the average of Y1 and Y2 is identically
zero, so (b) is zero and (a) must be greater than or equal to (b).  Thus (b)
is not necessarily greater than (a).
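
This counterexample can be checked numerically. The sketch below again assumes R squared means the squared Pearson correlation. Strictly speaking, the correlation with a constant prediction is undefined, so the code adopts the convention (an assumption made here to match "(b) is zero") of scoring a constant prediction as 0. The function name and the data are invented.

```python
def r_squared(actual, predicted):
    """Squared Pearson correlation between two equal-length sequences.

    Convention assumed for this sketch: a constant prediction scores 0.0,
    although the correlation is strictly undefined in that case.
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    sxy = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    sxx = sum((a - mean_a) ** 2 for a in actual)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    if syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


# Invented data: Y2 is the exact negation of Y1.
X = [1, 2, 3, 4, 5]
Y1 = [1.2, 1.9, 3.2, 3.8, 5.1]
Y2 = [-y for y in Y1]

# (a): mean of the two individual R squared values. Negating a prediction
# flips the sign of the correlation but leaves its square unchanged.
score_a = (r_squared(X, Y1) + r_squared(X, Y2)) / 2

# (b): R squared of the averaged prediction, which is all zeros here.
averaged = [(y1 + y2) / 2 for y1, y2 in zip(Y1, Y2)]
score_b = r_squared(X, averaged)

print("(a):", score_a)
print("(b):", score_b)
```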




Liaw, Andy

2004-Jun-07 12:35 UTC


The Y1, Y2, etc. that Kan mentioned are predicted values for a test set from
models that were supposedly fitted to the same (or similar) data.  It's hard
for me to imagine the outcome being as 'severe' as Y1 = -Y2.

That said, I do not think that the R-squared (or q-squared, as some call it)
of the aggregate model is necessarily larger than or equal to the average
R-squared of the component models.  It obviously depends on how the
component models are generated.  As a hypothetical example (I haven't
actually tried it, just speculating):  Suppose the data are generated from a
step function, the sort that would be perfect for regression trees.  If one
grows several well-pruned trees, I'd guess that the average R-squared of the
individual trees has a chance of being larger than the R-squared of the
averaged model.
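
A less extreme counterexample than Y1 = -Y2 is easy to construct by hand. The sketch below is not the regression-tree experiment speculated about above: it uses a step-function target, one prediction that matches it exactly, and one deliberately poor, high-variance prediction that is uncorrelated with the target. R squared is again assumed to mean the squared Pearson correlation, and all names and data are invented for illustration.

```python
def r_squared(actual, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    sxy = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    sxx = sum((a - mean_a) ** 2 for a in actual)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return sxy * sxy / (sxx * syy)


# Step-function target (invented data).
X = [0, 0, 0, 0, 1, 1, 1, 1]
# Model 1 recovers the step exactly.
Y1 = [0, 0, 0, 0, 1, 1, 1, 1]
# Model 2 oscillates with a large amplitude and is uncorrelated with X.
Y2 = [0, 5, 0, 5, 0, 5, 0, 5]

# (a): mean of the individual R squared values.
score_a = (r_squared(X, Y1) + r_squared(X, Y2)) / 2

# (b): R squared of the averaged prediction; the large swings of model 2
# dominate the average and mask the step.
averaged = [(y1 + y2) / 2 for y1, y2 in zip(Y1, Y2)]
score_b = r_squared(X, averaged)

print("(a):", score_a)
print("(b):", score_b)
```

Here (a) exceeds (b), which is consistent with the point that the ordering depends on how the component models are generated.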

Best,
Andy
