Guy Green
2010-Feb-22 12:46 UTC
[R] Alternatives to linear regression with multiple variables
I wonder if someone can give some pointers on alternatives to linear regression (e.g. Loess) when dealing with multiple variables.

Taking any simple table with three variables, you can very easily get the intercept and coefficients with:

    summary(lm(read_table))

For obvious reasons, the coefficients in a multiple regression are quite different from what you get if you calculate regressions for the single variables separately. Alternative approaches such as Loess seem straightforward when you have only one variable, and have the advantage that they can cope even if the relationship is not linear.

My question is: how can you extend a flexible approach like Loess to a multi-variable scenario? I assume that any non-parametric calculation becomes very resource-intensive very quickly. Can anyone suggest alternatives (preferably R-based) that cope with multiple variables, even when the relationship (linear, etc.) is not known in advance?

Thanks,

Guy
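[Editor's note: base R's loess() itself accepts more than one numeric predictor (its help page allows up to four), so a first step needs no extra packages at all. A minimal illustrative sketch on simulated data — the variable names and data below are invented for the example, not taken from the post:]

```r
# Illustrative only: stats::loess() with three predictors on simulated data.
set.seed(1)
n  <- 500
x1 <- runif(n)
x2 <- runif(n)
x3 <- runif(n)
y  <- sin(2 * pi * x1) + x2^2 + 0.5 * x3 + rnorm(n, sd = 0.1)

# degree = 2 fits local quadratics; span controls the neighbourhood size
fit  <- loess(y ~ x1 + x2 + x3, span = 0.5, degree = 2)

# Evaluate the fitted surface at a new point
pred <- predict(fit, data.frame(x1 = 0.5, x2 = 0.5, x3 = 0.5))
```

As the poster suspects, the memory and time cost of local fitting grows quickly with dimension, which is why the packages suggested below cap the number of predictors.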
Liaw, Andy
2010-Feb-22 17:50 UTC
[R] Alternatives to linear regression with multiple variables
You can try the locfit package, which I believe can handle up to 5 variables. E.g.,

    R> library(locfit)
    Loading required package: akima
    Loading required package: lattice
    locfit 1.5-6    2010-01-20
    R> x <- matrix(runif(1000 * 3), 1000, 3)
    R> y <- rnorm(1000)
    R> mydata <- data.frame(x, y)
    R> str(mydata)
    'data.frame':   1000 obs. of  4 variables:
     $ X1: num  0.21 0.769 0.661 0.978 0.15 ...
     $ X2: num  0.426 0.132 0.214 0.774 0.472 ...
     $ X3: num  0.971 0.659 0.474 0.867 0.479 ...
     $ y : num  -0.496 -0.636 1.778 -0.876 0.657 ...
    R> fit <- locfit(y ~ lf(X1, X2, X3), data=mydata)
    R> plot(fit)

Andy
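[Editor's note: continuing Andy's sketch, predictions from the fitted locfit object go through the usual predict() method. An illustrative continuation, assuming the locfit package is installed; the data are simulated as in Andy's example:]

```r
# Illustrative continuation of Andy's example (requires the locfit package).
library(locfit)
set.seed(42)
x <- matrix(runif(1000 * 3), 1000, 3)
y <- rnorm(1000)
mydata <- data.frame(x, y)

fit <- locfit(y ~ lf(X1, X2, X3), data = mydata)

# Evaluate the fitted surface at a new point
newpt <- data.frame(X1 = 0.5, X2 = 0.5, X3 = 0.5)
pred  <- predict(fit, newdata = newpt)
```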
Dieter Menne
2010-Feb-23 07:42 UTC
[R] Alternatives to linear regression with multiple variables
Guy Green wrote:
> I wonder if someone can give some pointers on alternatives to linear
> regression (e.g. Loess) when dealing with multiple variables.

For two variables, there is also interp.loess in package tgp. It can be rather slow depending on the parameters, so I fear a generalization to more dimensions would require a better algorithm.

Dieter
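[Editor's note: for the two-variable case Dieter mentions, interp.loess() takes scattered (x, y, z) observations and returns a regular grid in the same list form as akima::interp, suitable for image() or persp(). A hypothetical minimal sketch, assuming the tgp package is installed; the span value is an illustrative choice:]

```r
# Hypothetical sketch (requires the tgp package); simulated scattered data.
library(tgp)
set.seed(3)
x <- runif(200)
y <- runif(200)
z <- sin(2 * pi * x) * cos(2 * pi * y) + rnorm(200, sd = 0.05)

# Smooth the scattered points onto a regular grid via loess
surf <- interp.loess(x, y, z, span = 0.3)
image(surf, xlab = "x", ylab = "y")
```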
Greg Snow
2010-Feb-25 21:13 UTC
[R] Alternatives to linear regression with multiple variables
Well, the help page for the loess function says that the formula can include up to 4 predictor variables. There are also additive models (the mgcv or gam (or other) packages).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
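[Editor's note: the additive-model route Greg mentions can be sketched with mgcv, which ships with R as a recommended package. Each s() term gets its own one-dimensional smooth, which sidesteps the cost of fitting a full multi-dimensional local surface. Simulated data, illustrative only:]

```r
# Illustrative additive model via mgcv (a recommended package shipped with R).
library(mgcv)
set.seed(4)
n   <- 400
dat <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
dat$y <- sin(2 * pi * dat$x1) + dat$x2^2 + 0.5 * dat$x3 + rnorm(n, sd = 0.1)

# One smooth per predictor; the degree of smoothness is chosen automatically
fit <- gam(y ~ s(x1) + s(x2) + s(x3), data = dat)
summary(fit)
```

The additive structure assumes the predictors contribute separately; mgcv also offers te() tensor-product smooths when an interaction surface between two predictors is needed.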
David Winsemius
2010-Feb-25 22:15 UTC
[R] Alternatives to linear regression with multiple variables
On Feb 22, 2010, at 7:46 AM, Guy Green wrote:
> My question is: how can you extend a flexible approach like Loess to a
> multi-variable scenario? [...]

Frank Harrell illustrates several methods for appropriate consideration and computation of non-linear relationships in a regression framework. His book "Regression Modeling Strategies" has been uniformly praised by the people to whom I have recommended it. At one point he compares graphically the effect measures using a 2-d loess fit to those achieved with a crossed regression spline approach.

Another text that demonstrates R-implemented multi-dimensional non- (or semi-)parametric regression approaches is Simon Wood's "Generalized Additive Models: An Introduction with R". I have less experience with the methods in that text, but hope to increase my familiarity in the future, since it would extend the types of models I have access to.

And Andy has mentioned "Local Regression and Likelihood" by Loader, which if you use Bookfinder.com will save you $30 off the $90 price in Amazon at the moment. (No financial interests to declare.)

I surmise that the geospatial applications are of necessity dealing with 2- and 3-dimensional data arrangements, so you might look at their Task View and mailing list archive for worked examples and advice.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
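[Editor's note: the crossed regression-spline approach David refers to can be tried with nothing beyond base R's splines package: a natural-spline basis per predictor, crossed through the formula interface. An illustrative sketch on simulated data — not Harrell's own code, which lives in his Design/rms package:]

```r
# Illustrative sketch using base R only (splines is a base package).
library(splines)
set.seed(5)
n   <- 400
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- sin(2 * pi * dat$x1) * dat$x2 + rnorm(n, sd = 0.1)

# Crossing the two natural-spline bases lets the effect of x1 vary with x2
fit  <- lm(y ~ ns(x1, df = 4) * ns(x2, df = 4), data = dat)
pred <- predict(fit, data.frame(x1 = 0.25, x2 = 0.75))
```

Unlike loess, the fitted object here is an ordinary lm, so all the usual inference tools (summary, anova, confint) apply directly.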