thr3ads.net - R help - [R] lm and R-squared (newbie) [Dec 2011]

PtitBleu

2011-Dec-15 13:35 UTC

[R] lm and R-squared (newbie)

Hello,

I've two data.frames (data1 and data4), dec="." and sep=";".
http://r.789695.n4.nabble.com/file/n4199964/data1.txt data1.txt 
http://r.789695.n4.nabble.com/file/n4199964/data4.txt data4.txt 

When I do
plot(data1$nx, data1$ny, col="red")
points(data4$nx, data4$ny, col="blue")
the results seem very similar (at least to me), but the R-squared values of
summary(lm(data1$ny ~ data1$nx))
and
summary(lm(data4$ny ~ data4$nx))
are very different (0.48 against 0.89).

Could someone explain the reason to me?

To be complete, I am looking for a simple indicator telling me if it is
worthwhile to keep the values provided by lm. I thought that R-squared could
do the job. For me, if R-squared is far from 1, the data are not good enough
to perform a linear fit.
It seems that I'm wrong.

Thanks for your explanations.
Ptit Bleu.
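
For readers who want to reproduce the setup, here is a minimal sketch of reading a semicolon-separated file with "." as the decimal mark and fitting the model. The file contents are invented for illustration, and the header row naming the columns nx and ny is an assumption; the real data1.txt and data4.txt may be laid out differently.

```r
# Invented sample in the format described above (sep = ";", dec = ".").
# The header row with column names nx and ny is an assumption.
sample_lines <- c(
  "nx;ny",
  "0.10;0.12",
  "0.25;0.24",
  "0.40;0.43",
  "0.55;0.52",
  "0.70;0.74",
  "0.85;0.83"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# Read the file into a data frame
demo <- read.table(tmp, header = TRUE, sep = ";", dec = ".")

# Passing data = keeps the formula short and the coefficient names readable
fit <- lm(ny ~ nx, data = demo)

# R-squared as reported by summary()
r2 <- summary(fit)$r.squared
print(r2)
```

Writing the formula as ny ~ nx with a data argument, rather than data1$ny ~ data1$nx, fits the same model but makes it easier to reuse the code for a second data frame.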


 


--
View this message in context:
http://r.789695.n4.nabble.com/lm-and-R-squared-newbie-tp4199964p4199964.html
Sent from the R help mailing list archive at Nabble.com.

Gabor Grothendieck

2011-Dec-15 14:20 UTC

On Thu, Dec 15, 2011 at 8:35 AM, PtitBleu <ptit_bleu at yahoo.fr> wrote:
> Hello,
>
> I've two data.frames (data1 and data4), dec="." and sep=";".
> http://r.789695.n4.nabble.com/file/n4199964/data1.txt data1.txt
> http://r.789695.n4.nabble.com/file/n4199964/data4.txt data4.txt
>
> When I do
> plot(data1$nx, data1$ny, col="red")
> points(data4$nx, data4$ny, col="blue")
> the results seem very similar (at least to me), but the R-squared values of
> summary(lm(data1$ny ~ data1$nx))
> and
> summary(lm(data4$ny ~ data4$nx))
> are very different (0.48 against 0.89).
>
> Could someone explain the reason to me?
>
> To be complete, I am looking for a simple indicator telling me if it is
> worthwhile to keep the values provided by lm. I thought that R-squared could
> do the job. For me, if R-squared is far from 1, the data are not good enough
> to perform a linear fit.
> It seems that I'm wrong.

The problem is the outliers. Try using a robust measure instead.  If
we replace Pearson correlations with Spearman (rank) correlations they
are much closer:
> # R^2 based on Pearson correlations
> cor(fitted(lm(ny ~ nx, data4)), data4$ny)^2
[1] 0.8916924
> cor(fitted(lm(ny ~ nx, data1)), data1$ny)^2
[1] 0.4868575
>
> # R^2 based on Spearman (rank) correlations
> cor(fitted(lm(ny ~ nx, data4)), data4$ny, method = "spearman")^2
[1] 0.8104026
> cor(fitted(lm(ny ~ nx, data1)), data1$ny, method = "spearman")^2
[1] 0.7266705
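
The effect described above can be sketched on simulated data. The real data1 and data4 files are not used here; the sample size, noise level, and the four outlying points are all invented. The sketch also checks that, for a least-squares fit with an intercept, the squared Pearson correlation between fitted and observed values is the same number as the R-squared reported by summary().

```r
set.seed(1)

# Simulated, nearly linear data (all values are invented for illustration)
n  <- 100
nx <- seq(0, 1, length.out = n)
ny <- nx + rnorm(n, sd = 0.05)
clean <- data.frame(nx = nx, ny = ny)

# The same data with four gross outliers added to ny
dirty <- clean
bad <- c(10, 35, 60, 85)
dirty$ny[bad] <- dirty$ny[bad] + c(2, -2, 2, -2)

# Squared correlation between fitted and observed values
r2 <- function(d, method = "pearson") {
  fit <- lm(ny ~ nx, data = d)
  cor(fitted(fit), d$ny, method = method)^2
}

pearson_clean  <- r2(clean)
pearson_dirty  <- r2(dirty)
spearman_clean <- r2(clean, "spearman")
spearman_dirty <- r2(dirty, "spearman")

# For a least-squares fit with an intercept, the Pearson version
# agrees with summary()$r.squared
same_as_summary <- isTRUE(all.equal(
  pearson_dirty,
  summary(lm(ny ~ nx, data = dirty))$r.squared
))

print(c(pearson_clean = pearson_clean, pearson_dirty = pearson_dirty,
        spearman_clean = spearman_clean, spearman_dirty = spearman_dirty))
print(same_as_summary)
```

A handful of extreme points inflates the total variance of ny, which pulls the Pearson-based value down sharply. The rank-based value moves less, because an outlier can shift a rank by at most n - 1 no matter how extreme it is.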

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

David Winsemius

2011-Dec-15 14:35 UTC

On Dec 15, 2011, at 8:35 AM, PtitBleu wrote:
> Hello,
>
> I've two data.frames (data1 and data4), dec="." and sep=";".
> http://r.789695.n4.nabble.com/file/n4199964/data1.txt data1.txt
> http://r.789695.n4.nabble.com/file/n4199964/data4.txt data4.txt
>
> When I do
> plot(data1$nx, data1$ny, col="red")
> points(data4$nx, data4$ny, col="blue")
> the results seem very similar (at least to me), but the R-squared values of
> summary(lm(data1$ny ~ data1$nx))
> and
> summary(lm(data4$ny ~ data4$nx))
> are very different (0.48 against 0.89).
>
> Could someone explain the reason to me?

Because you failed to do an adequate assessment of your data. Try this
plotting exercise and I think you will see the reason for the
differences:

plot(data1$nx, data1$ny, col="red", xlim=range(c(data1$nx, data4$nx)),
     ylim=range(c(data1$ny, data4$ny)))
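
The plot() call above draws only data1, but with axis limits wide enough for both data sets, so that points which the default limits would cut off become visible. A minimal sketch of the full overlay follows, using simulated stand-ins (d_a and d_b are hypothetical names with invented values, not the real data1 and data4):

```r
set.seed(2)

# Simulated stand-ins for the two data sets (invented values)
d_a <- data.frame(nx = runif(50))
d_a$ny <- d_a$nx + rnorm(50, sd = 0.05)
d_a$ny[c(5, 20)] <- c(4, -3)   # two gross outliers, added for illustration

d_b <- data.frame(nx = runif(50))
d_b$ny <- d_b$nx + rnorm(50, sd = 0.05)

# Axis limits that cover both data sets
xlim <- range(c(d_a$nx, d_b$nx))
ylim <- range(c(d_a$ny, d_b$ny))

# Draw the first set with the shared limits, then overlay the second
plot(d_a$nx, d_a$ny, col = "red", xlim = xlim, ylim = ylim,
     xlab = "nx", ylab = "ny")
points(d_b$nx, d_b$ny, col = "blue")
```

Without explicit xlim and ylim, plot() sizes the axes to the first data set only, and points() silently drops anything from the second set that falls outside them.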

-- 
David.
>
> To be complete, I am looking for a simple indicator telling me if it is
> worthwhile to keep the values provided by lm. I thought that R-squared could
> do the job. For me, if R-squared is far from 1, the data are not good enough
> to perform a linear fit.
> It seems that I'm wrong.
>
> Thanks for your explanations.
> Ptit Bleu.
>
David Winsemius, MD
West Hartford, CT
