Joseph LeBouton
2006-Aug-06 22:21 UTC
[R] removing intercept from lm() results in oddly high Rsquared
Can anyone help me understand why an lm model summary would return an r.squared of ~0.18 with an intercept term, and an r.squared of ~0.98 without the intercept? The fit is NOT that much better, according to plot.lm: residuals are similar between the two models, and a plot of observed x predicted is almost identical.

Thanks,

-Joseph

--
Joseph P. LeBouton
Forest Ecology PhD Candidate
Department of Forestry
Michigan State University
East Lansing, Michigan 48824
Office phone: 517-355-7744
email: lebouton at msu.edu
Christoph Buser
2006-Aug-07 07:26 UTC
[R] removing intercept from lm() results in oddly high Rsquared
Dear Joseph

Have a look at the questions and answers in the two links below. There the topic has been discussed.

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/68905.html
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/6943.html

Best regards,

Christoph Buser

--------------------------------------------------------------
Christoph Buser <buser at stat.math.ethz.ch>
Seminar fuer Statistik, LEO C13
ETH Zurich 8092 Zurich SWITZERLAND
phone: x-41-44-632-4673 fax: 632-1228
http://stat.ethz.ch/~buser/
--------------------------------------------------------------
Dieter Menne
2006-Aug-07 07:38 UTC
[R] removing intercept from lm() results in oddly high Rsquared
Joseph LeBouton <lebouton <at> msu.edu> writes:
> Can anyone help me understand why an lm model summary would return an
> r.squared of ~0.18 with an intercept term, and an r.squared of ~0.98
> without the intercept? The fit is NOT that much better, according to
> plot.lm: residuals are similar between the two models, and a plot of
> observed x predicted is almost identical.

There are reasons why the standard textbooks and Bill Venables
http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf
tell you that removing Intercepts can be dangerous for your health.

Dieter

##
set.seed(10)
x = runif(20,5,10)
y = 2 * x + rnorm(20,0,0.3)
# a fit with good data
summary(lm(y~x))$r.squared
# 0.98

# add one outlier at 0
x = c(x,0)
y = c(y,20)
summary(lm(y~x))$r.squared
# 0.00008

# removing the intercept: perfect correlation again
summary(lm(y~x-1))$r.squared
# 0.91

#... because it is similar to adding MANY data points
# at (0,0)
x = c(x,rep(0,1000))
y = c(y,rep(0,1000))
summary(lm(y~x))$r.squared
# 0.90
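The jump in r.squared can also be reproduced by hand. As documented in ?summary.lm, R compares the residual sum of squares against the mean of y when the model has an intercept, and against zero when it does not. The sketch below is only an illustration under stated assumptions: the data are simulated (Joseph's data were not posted), and the variable names (fit_int, fit_noint, r2_noint_centered, and so on) are made up for this example.

```r
# Simulated data (an assumption, not the original poster's data):
# y is far from zero, x varies little, and the noise is large
# relative to the spread that x explains.
set.seed(1)
x <- runif(100, 9, 10)
y <- 2 * x + rnorm(100, 0, 1.2)

fit_int   <- lm(y ~ x)       # with intercept
fit_noint <- lm(y ~ x - 1)   # forced through the origin

rss_int   <- sum(resid(fit_int)^2)
rss_noint <- sum(resid(fit_noint)^2)

# With an intercept, the baseline is the mean-only model:
# centered total sum of squares.
r2_int <- 1 - rss_int / sum((y - mean(y))^2)

# Without an intercept, summary.lm() uses the baseline y = 0:
# uncentered total sum of squares, which is much larger when
# y is far from zero.
r2_noint <- 1 - rss_noint / sum(y^2)

# Both hand-computed values agree with what summary() reports.
all.equal(r2_int,   summary(fit_int)$r.squared)
all.equal(r2_noint, summary(fit_noint)$r.squared)

# Judging the no-intercept fit against the same (centered) baseline
# gives a number that is comparable to r2_int; it can be negative.
r2_noint_centered <- 1 - rss_noint / sum((y - mean(y))^2)

print(c(with_intercept        = r2_int,
        no_intercept          = r2_noint,
        no_intercept_centered = r2_noint_centered))
```

The residual sums of squares of the two fits are nearly the same here; only the denominator changes. That is consistent with the observation that the residual plots look alike while the reported r.squared values differ widely, so the two r.squared values should not be compared directly.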