Wolfgang Raffelsberger
2007-Nov-28 16:14 UTC
[R] alternatives to traditional least squares method in linear regression ?
Dear list,

I have encountered a special case of linear regression where I'm not satisfied with the results obtained using the traditional least squares method (sometimes called OLS) for fitting the regression line (see code below). Basically, a group of my x-y data are a bit off the diagonal line (in my case the diagonal represents the ideal or theoretical fit between x and y, which are on the same scale), and these points have enough influence to pull the slope (too far) away from the diagonal. Using rlm() didn't help, since this is not a problem of rare outliers.

From a pragmatic point of view, a linear regression approach fits the nature of the data and the comparison I'd like to perform very well, so that's why I'd like to stay with something linear.

Has anybody already implemented a function or package in R that allows modifying the exponent (of the least squares method) or, more generally, defining the criterion used for minimizing the residuals?

Thanks in advance,
Wolfgang Raffelsberger

> plot(x,y) # x and y are my data
> regr <- lm(y~x)
> abline(regr)
> # I'm not satisfied with the line, since one group of points follows the diagonal very well but the regression is pulled away by another group of points ...
>
> sessionInfo()
R version 2.6.0 (2007-10-03)
i386-pc-mingw32

locale:
LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252

attached base packages:
[1] stats graphics grDevices datasets tcltk utils methods
[8] base

other attached packages:
[1] svSocket_0.9-5 svIO_0.9-5 R2HTML_1.58 svMisc_0.9-5 svIDE_0.9-5

loaded via a namespace (and not attached):
[1] tools_2.6.0

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
CNRS UMR7104, IGBMC
1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300 Fax (+33) 388 65 3276
wolfgang.raffelsberger at igbmc.u-strasbg.fr
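The "modify the exponent" idea in the question can be sketched with base R's optim(): instead of minimizing the sum of squared residuals, minimize sum(|residual|^p) for a chosen p. This is only a rough illustration under stated assumptions, not an established implementation; the function name lp_fit and the simulated data are hypothetical and do not come from any package or from the original post.

```r
# Sketch only: fit y = a + b*x by minimising sum(|residual|^p).
# lp_fit and its arguments are hypothetical names used for illustration.
lp_fit <- function(x, y, p = 1.5) {
  # objective: sum of absolute residuals raised to the power p
  loss <- function(beta) sum(abs(y - beta[1] - beta[2] * x)^p)
  start <- coef(lm(y ~ x))               # OLS estimate as starting values
  res <- optim(start, loss, method = "Nelder-Mead")
  setNames(res$par, c("intercept", "slope"))
}

# simulated data (an assumption, not the poster's data)
set.seed(1)
x <- rnorm(200, sd = 5)
y <- x + rnorm(200)

fit_p2 <- lp_fit(x, y, p = 2)   # p = 2 is ordinary least squares
fit_p1 <- lp_fit(x, y, p = 1)   # p = 1 is least absolute deviations
fit_p2
fit_p1
```

With p = 2 the result should agree with lm(); smaller exponents penalize large residuals less heavily. A general-purpose optimizer is a blunt tool here, and for p = 1 a dedicated routine such as median regression is normally the better choice.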
Gabor Grothendieck
2007-Nov-28 16:42 UTC
[R] alternatives to traditional least squares method in linear regression ?
You could use the weights= argument of lm() or, if these points represent a different factor, you could add a dummy variable that is 1 for those points and 0 otherwise. Also check out quantile regression in the quantreg package.
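The two lm()-based suggestions can be sketched as follows. This is a minimal illustration under strong assumptions: the data are simulated, the group indicator `off` is assumed to be known in advance, and the weight of 0.1 is an arbitrary choice. The interaction term in the second model goes slightly beyond a plain dummy variable, so that the off-diagonal group gets its own slope as well as its own intercept.

```r
# Simulated data (an assumption): 100 points on the diagonal, 50 off it
set.seed(42)
x1 <- rnorm(100, sd = 5); y1 <- x1 + rnorm(100)
x2 <- rnorm(50, sd = 5);  y2 <- -5 + 0.5 * x2 + rnorm(50)
x <- c(x1, x2)
y <- c(y1, y2)
off <- rep(c(0, 1), c(100, 50))   # 1 = off-diagonal group (assumed known)

fit_ols <- lm(y ~ x)              # plain least squares, for comparison

# (a) down-weight the off-diagonal points; 0.1 is an arbitrary weight
fit_w <- lm(y ~ x, weights = ifelse(off == 1, 0.1, 1))

# (b) dummy variable plus interaction: the coefficient "x" is then the
#     slope of the on-diagonal group only
fit_d <- lm(y ~ x * off)

rbind(ols = coef(fit_ols), weighted = coef(fit_w))
coef(fit_d)
```

Both approaches need some way of deciding which points belong to the deviating group; if that is not known beforehand, neither applies directly.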
Liaw, Andy
2007-Nov-30 19:05 UTC
[R] alternatives to traditional least squares method in linear regression ?
Coming to this late, but hopefully not too late...
You may want to try mixture of regression models:
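install.packages("flexmix")   # one-time installation
library(flexmix)
## simulate some data: two groups following different lines
set.seed(1)                   # only to make the example reproducible
x1 <- rnorm(100, sd=5)
y1 <- rnorm(100, mean=x1)              # group 1: on the diagonal
x2 <- rnorm(50, sd=5)
y2 <- rnorm(50, mean=-5 + 0.5 * x2)    # group 2: off the diagonal
x <- c(x1, x2)
y <- c(y1, y2)
plot(x, y)
## fit a mixture of two linear regressions
fit <- flexmix(y ~ x, k=2)
## intercept, slope and sigma of each component
parameters(fit)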
Andy
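To see which points a fitted mixture assigns to which line, the component memberships and per-component coefficients can be pulled out of the fit and drawn on the scatter plot. The following is a rough sketch, assuming the flexmix package is installed (the code does nothing otherwise); the data are simulated again so that the block is self-contained, and the seed is arbitrary.

```r
# Simulated data (an assumption): two groups following different lines
set.seed(123)
x <- c(rnorm(100, sd = 5), rnorm(50, sd = 5))
y <- c(x[1:100] + rnorm(100), -5 + 0.5 * x[101:150] + rnorm(50))

if (requireNamespace("flexmix", quietly = TRUE)) {
  library(flexmix)
  fit <- flexmix(y ~ x, k = 2)
  group <- clusters(fit)     # most likely component for each point
  co <- parameters(fit)      # one column per component: intercept, slope, sigma
  plot(x, y, col = group)
  # overlay the regression line of each fitted component
  for (j in seq_len(ncol(co))) abline(a = co[1, j], b = co[2, j], col = j)
}
```

The component whose slope is close to 1 and intercept close to 0 would then correspond to the group lying on the diagonal. Because the EM fit starts from a random initialization, component labels and estimates can differ between runs.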