thr3ads.net - R help - [R] Strange R squared, possible error [Mar 2011]

derek

2011-Mar-16 19:49 UTC

[R] Strange R squared, possible error

k=lm(y~x)
summary(k)
returns R^2=0.9994

lm(y~x) is supposed to find coef. a and b in y=a*x+b

l=lm(y~x+0)
summary(l)
returns R^2=0.9998
lm(y~x+0) is supposed to find coef. a in y=a*x+b while setting b=0

The question is: why do I get a better R^2 without the intercept, when it should be the other way around?

I'm sorry to use the words "MS Excel" here, but I verified it in Excel and
it gives:
R^2=0.9994 when y=a*x+b is used
R^2=0.99938 when y=a*x+0 is used

--
View this message in context:
http://r.789695.n4.nabble.com/Strange-R-squared-possible-error-tp3382818p3382818.html
Sent from the R help mailing list archive at Nabble.com.

Ista Zahn

2011-Mar-16 20:35 UTC

Hi Derek,
R^2 doesn't mean the same thing when you omit the intercept, as has
been discussed on this list before. See
http://r.789695.n4.nabble.com/lm-without-intercept-td3312429.html

Best,
Ista

-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

Bert Gunter

2011-Mar-16 20:40 UTC

?summary.lm

The R^2 section of that help page explains that R^2 is computed differently
depending on whether or not an intercept is in the model.
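
As far as I can tell from ?summary.lm, the formula is R^2 = 1 - Sum(R[i]^2) / Sum((y[i] - y*)^2), where y* is the mean of y[i] when there is an intercept and zero otherwise. A minimal sketch of both cases follows. The built-in BOD data frame is only a stand-in here (an assumption, since the poster's x and y are not shown), and the variable names are made up for illustration.

```r
# BOD stands in for the poster's data, which is not available in the thread.
fm  <- lm(demand ~ Time, data = BOD)      # model with an intercept
fm0 <- lm(demand ~ Time + 0, data = BOD)  # intercept forced to zero
y   <- BOD$demand

# With an intercept: residual SS relative to the SS of y about its mean.
r2_with <- 1 - sum(resid(fm)^2) / sum((y - mean(y))^2)

# Without an intercept: residual SS relative to the SS of y about zero.
r2_without <- 1 - sum(resid(fm0)^2) / sum(y^2)

# Both hand computations should agree with what summary() reports.
all.equal(r2_with,    summary(fm)$r.squared)
all.equal(r2_without, summary(fm0)$r.squared)

# Compare the two values side by side.
c(with_intercept = r2_with, without_intercept = r2_without)
```

Because the two denominators differ, the two R^2 values are not on the same scale, so a larger value for the no-intercept model does not by itself indicate a better fit.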

-- Bert

-- 
Bert Gunter
Genentech Nonclinical Biostatistics

JLucke at ria.buffalo.edu

2011-Mar-16 20:43 UTC

lm(y~x+0) yields the regression on x without the constant, i.e., y = bx + e,
not y = a + e.
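
A small sketch of what the no-constant fit estimates, again using the built-in BOD data as a stand-in (an assumption; the original x and y are not in the thread). For a single predictor, the least-squares slope through the origin has the closed form sum(x*y)/sum(x^2), and `+ 0` and `- 1` are two spellings of the same formula term.

```r
# BOD is only illustrative data; variable names are hypothetical.
x <- BOD$Time
y <- BOD$demand

fit_plus  <- lm(y ~ x + 0)  # drop the intercept with "+ 0"
fit_minus <- lm(y ~ x - 1)  # drop the intercept with "- 1"

# Closed-form slope for regression through the origin.
b_closed <- sum(x * y) / sum(x^2)

# Exactly one coefficient is estimated, and both spellings agree with it.
coef(fit_plus)
all.equal(unname(coef(fit_plus)), b_closed)
all.equal(coef(fit_plus), coef(fit_minus))
```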






Ben Bolker

2011-Mar-16 22:02 UTC

<JLucke <at> ria.buffalo.edu> writes:
> 
> lm(y~x+0) yields the regression on x without the constant, i.e., y=bx+e, 
> not y = a +e
> 
  Would someone like to (please!) write this up and submit it to
Kurt Hornik for inclusion in the R FAQ?

  Ben Bolker

Gabor Grothendieck

2011-Mar-17 15:36 UTC

The idea is that if you have a positive quantity that can be broken
down into two nonnegative quantities, X = X1 + X2, then it makes sense
to ask what proportion X1 is of X.  For example: 10 = 6 + 4 and 6 is
0.6 of the total.

Now, in the case of a model with an intercept, it's a mathematical fact
that the variance of the response equals the variance of the fitted
values plus the variance of the residuals.  Thus it makes sense to ask
what fraction of the variance of the response is represented by the
variance of the fitted values (this fraction is R^2).

But if there is no intercept then that mathematical fact breaks down.
That is, it's no longer true that the variance of the response equals
the variance of the fitted values plus the variance of the residuals.
So how meaningful is it to ask what proportion the variance of the
fitted values is of the variance of the response in the first place?
In this case we need to rethink the entire approach, which is why a
different formula is required.
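
The decomposition argument can be checked numerically. This is a sketch on the built-in BOD data (chosen only for illustration; the variable names are hypothetical). It also shows the identity that does survive without an intercept: the sums of squares about zero still add up, which is consistent with R switching to that baseline.

```r
fm  <- lm(demand ~ Time, data = BOD)      # with intercept
fm0 <- lm(demand ~ Time + 0, data = BOD)  # without intercept
y   <- BOD$demand

# With an intercept: var(y) = var(fitted) + var(residuals).
holds_with <- isTRUE(all.equal(var(y), var(fitted(fm)) + var(resid(fm))))

# Without an intercept the residuals need not average to zero,
# so the same variance identity is not guaranteed to hold.
holds_without <- isTRUE(all.equal(var(y), var(fitted(fm0)) + var(resid(fm0))))

# What still holds without an intercept: sums of squares about zero add up.
holds_uncentered <- isTRUE(all.equal(sum(y^2),
                                     sum(fitted(fm0)^2) + sum(resid(fm0)^2)))

c(with = holds_with, without = holds_without, uncentered = holds_uncentered)
```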

Also, maybe the real problem is not this at all.  That is, perhaps you
are not really trying to find the goodness of fit but rather you are
trying to compare two particular models: one with intercept and one
without.  In that case R^2 is not really what you want.  Instead use
the R anova command.  For example, using the built-in BOD data frame:
> fm <- lm(demand ~ Time, BOD)
> fm0 <- lm(demand ~ Time - 1, BOD)
> anova(fm, fm0)
Analysis of Variance Table

Model 1: demand ~ Time
Model 2: demand ~ Time - 1
  Res.Df     RSS Df Sum of Sq      F  Pr(>F)
1      4  38.069
2      5 135.820 -1   -97.751 10.271 0.03275 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here we see that the residual sum of squares is much smaller for the full
model than for the reduced model, and the difference is significant at
the 3.275% level.
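
For readers who want to see where the F and Pr(>F) columns come from, here is a sketch that rebuilds them from the two residual sums of squares. It assumes the same BOD fits as above; the helper variable names are made up for illustration.

```r
fm  <- lm(demand ~ Time, data = BOD)      # full model (with intercept)
fm0 <- lm(demand ~ Time - 1, data = BOD)  # reduced model (no intercept)

rss_full    <- deviance(fm)    # residual sum of squares, full model
rss_reduced <- deviance(fm0)   # residual sum of squares, reduced model
df_full     <- df.residual(fm)
df_reduced  <- df.residual(fm0)

# F = (drop in RSS per extra parameter) / (residual mean square of full model)
f_stat  <- ((rss_reduced - rss_full) / (df_reduced - df_full)) /
           (rss_full / df_full)
p_value <- pf(f_stat, df_reduced - df_full, df_full, lower.tail = FALSE)

# Compare with the second row of the anova table.
tab <- anova(fm, fm0)
all.equal(f_stat,  tab$F[2])
all.equal(p_value, tab[["Pr(>F)"]][2])
```

This is a test of whether the intercept can be dropped, which is a different question from how large R^2 is under either definition.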

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
