Wolfgang Raffelsberger
2007-Nov-28 16:14 UTC
[R] alternatives to traditional least squares method in linear regression ?
Dear list,

I have a case where I am not satisfied with the linear regression obtained by the traditional least squares method (sometimes called OLS), i.e. by minimizing the squared residuals about the regression line (see code below). Basically, one group of my x-y data lies a bit off the diagonal (in my case the diagonal represents the ideal or theoretical fit between x and y, which are on the same scale), and these points have enough influence to impose a slope that deviates too much from the diagonal. Using rlm() didn't help, since this is not a problem of rare outliers.

From a pragmatic point of view, a linear regression approach fits the nature of the data and the comparison I'd like to perform very well, so I'd like to stay with something linear.

Has anybody already implemented a function or package in R that allows modifying the exponent (of the least squares method) or, more generally, defining the criterion used for estimating/optimizing the residuals?

Thanks in advance,
Wolfgang Raffelsberger

> plot(x,y)   # x and y are my data
> regr <- lm(y~x)
> abline(regr)
> # I'm not satisfied with the line since there is one group of points
> # following the diagonal very well, but the regression is pulled away
> # by another group of points ...
>
> sessionInfo()
R version 2.6.0 (2007-10-03)
i386-pc-mingw32

locale:
LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252

attached base packages:
[1] stats     graphics  grDevices datasets  tcltk     utils     methods
[8] base

other attached packages:
[1] svSocket_0.9-5 svIO_0.9-5     R2HTML_1.58    svMisc_0.9-5   svIDE_0.9-5

loaded via a namespace (and not attached):
[1] tools_2.6.0

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
CNRS UMR7104, IGBMC
1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300    Fax (+33) 388 65 3276
wolfgang.raffelsberger at igbmc.u-strasbg.fr
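For reference, the "modify the exponent" idea in the question can be sketched with base R alone: minimize sum(|residual|^p) with optim() instead of the sum of squares. This is only an illustration under stated assumptions. The helper name lp_fit and the simulated data are hypothetical and do not come from any package or from the poster's data, and a smaller exponent is not guaranteed to help when a whole group of points (rather than a few outliers) is off the diagonal.

```r
## Hypothetical helper (not from any package): fit y = a + b*x by
## minimising sum(|residual|^p) rather than the sum of squares.
lp_fit <- function(x, y, p = 1.5) {
  loss  <- function(beta) sum(abs(y - beta[1] - beta[2] * x)^p)
  start <- coef(lm(y ~ x))          # OLS estimate as starting point
  opt   <- optim(start, loss)       # Nelder-Mead is optim()'s default
  setNames(opt$par, c("intercept", "slope"))
}

## Simulated stand-in for the real data (an assumption):
## 100 points near the diagonal, 50 points from a shifted group.
set.seed(1)
x <- c(rnorm(100, sd = 5), rnorm(50, sd = 5))
y <- c(x[1:100] + rnorm(100), -5 + 0.5 * x[101:150] + rnorm(50))

coef(lm(y ~ x))        # ordinary least squares (p = 2)
lp_fit(x, y, p = 1)    # least absolute deviations
lp_fit(x, y, p = 2)    # should be close to the lm() result
```

For p >= 1 the criterion is convex, so the choice of starting point matters little; for p < 1 it is not, and optim() may stop in a local minimum.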
Gabor Grothendieck
2007-Nov-28 16:42 UTC
[R] alternatives to traditional least squares method in linear regression ?
You could use the weights= argument of lm() or, if these points belong to a different group (factor level), you could add a dummy variable which is 1 for those points and 0 otherwise. Also check out quantile regression in the quantreg package.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
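The three suggestions in the reply above (weights=, a dummy variable, quantile regression) might look roughly like this. It is a sketch on simulated data, not the poster's: the group indicator grp is assumed to be known, the weight 0.1 is an arbitrary choice, and the interaction model goes one step beyond the plain dummy variable that was suggested. The quantreg part only runs if that package is installed.

```r
## Simulated stand-in for the poster's data (an assumption):
## 100 points near the diagonal and 50 points from a shifted group.
set.seed(42)
x   <- rnorm(150, sd = 5)
grp <- rep(c(0, 1), times = c(100, 50))  # 1 marks the off-diagonal group
y   <- ifelse(grp == 1, -5 + 0.5 * x, x) + rnorm(150)

fit_ols <- lm(y ~ x)                     # plain OLS for comparison

## 1. Down-weight the off-diagonal group (0.1 is arbitrary)
w     <- ifelse(grp == 1, 0.1, 1)
fit_w <- lm(y ~ x, weights = w)

## 2. Dummy variable: shifts the intercept for the second group
fit_d  <- lm(y ~ x + grp)
## Extension: the interaction also lets the slope differ by group
fit_di <- lm(y ~ x * grp)

## 3. Median (quantile) regression, only if quantreg is available
if (requireNamespace("quantreg", quietly = TRUE)) {
  fit_q <- quantreg::rq(y ~ x, tau = 0.5)
  print(coef(fit_q))
}

coef(fit_ols)   # slope pulled towards the second group
coef(fit_w)     # slope closer to the diagonal
coef(fit_di)    # "x" is the slope of the grp == 0 group
```

The weights and dummy-variable approaches both require knowing which points belong to the second group; quantile regression does not, but like rlm() it may still be pulled away if that group is large.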
Liaw, Andy
2007-Nov-30 19:05 UTC
[R] alternatives to traditional least squares method in linear regression ?
Coming to this late, but hopefully not too late... You may want to try a mixture of regression models:

    install.packages("flexmix")
    library(flexmix)

    ## simulate some data: two groups with different regression lines
    x1 <- rnorm(100, sd=5)
    y1 <- rnorm(100, mean=x1)               # group 1: on the diagonal
    x2 <- rnorm(50, sd=5)
    y2 <- rnorm(50, mean=-5 + 0.5 * x2)     # group 2: shifted, flatter slope
    x <- c(x1, x2)
    y <- c(y1, y2)
    plot(x, y)

    ## fit a two-component mixture of linear regressions
    fit <- flexmix(y ~ x, k=2)
    parameters(fit)                         # coefficients and sigma per component

Andy
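As a possible follow-up to the mixture fit above, the component assignments can be pulled out with clusters() and the fitted lines drawn per component. This is a sketch that assumes flexmix is installed and reuses the same kind of simulated data. stepFlexmix() with a few random restarts is used here instead of a single flexmix() call because the EM algorithm can end in a poor local solution or drop a component; the number of restarts (5) is an arbitrary choice.

```r
if (requireNamespace("flexmix", quietly = TRUE)) {
  suppressPackageStartupMessages(library(flexmix))

  ## Same kind of simulated data as in the message above
  set.seed(123)
  x1 <- rnorm(100, sd = 5); y1 <- rnorm(100, mean = x1)
  x2 <- rnorm(50, sd = 5);  y2 <- rnorm(50, mean = -5 + 0.5 * x2)
  d  <- data.frame(x = c(x1, x2), y = c(y1, y2))

  ## Several random restarts; the best fit is returned
  fit <- stepFlexmix(y ~ x, data = d, k = 2, nrep = 5, verbose = FALSE)

  pars <- parameters(fit)   # one column per component
  print(pars)

  d$cl <- clusters(fit)     # most likely component for each point
  print(table(d$cl))

  ## Colour points by component and draw each component's line
  ## (rows 1 and 2 of pars hold intercept and slope)
  plot(d$x, d$y, col = d$cl, xlab = "x", ylab = "y")
  for (k in seq_len(ncol(pars))) {
    abline(a = pars[1, k], b = pars[2, k], col = k)
  }
}
```

The component whose line lies closest to the diagonal would then be the one of interest for the original question.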