Hi all,

I just cannot think of how to do it: I want to take the first variable (column) of a data frame and regress it against all other variables.

    bla <- function (dat) {
      reg <- lm(whateverthefirstofthevariablenamesis ~ ., data=dat)
      return(reg)
    }

What kind of function do I have to use in place of whateverthefirstofthevariablenamesis (eval(), substitute(), get(), ...) to compute this regression correctly?

With lm(get(names(dat)[1]) ~ ., data=dat) there are no errors, but the first variable also shows up among the regressors.

Thanks for help.
Christian Hoffmann
-- 
Dr.sc.math.Christian W. Hoffmann, http://www.wsl.ch/staff/christian.hoffmann
Mathematics + Statistical Computing   e-mail: christian.hoffmann at wsl.ch
Swiss Federal Research Institute WSL  Tel: ++41-44-73922-   -77  (office)
CH-8903 Birmensdorf, Switzerland         -11(exchange), -15  (fax)
You might be trying too hard:

    > dat <- data.frame(y=rnorm(10), x1=rnorm(10), x2=rnorm(10))
    > fit <- lm(dat)
    > summary(fit)

    Call:
    lm(formula = dat)

    Residuals:
         Min       1Q   Median       3Q      Max
    -1.11643 -0.42746 -0.01442  0.55902  1.04890

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   0.7804     0.2700   2.891   0.0233 *
    x1            0.7469     0.2925   2.553   0.0379 *
    x2           -0.1099     0.2155  -0.510   0.6257
    ---
    Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

    Residual standard error: 0.8234 on 7 degrees of freedom
    Multiple R-Squared: 0.5016,     Adjusted R-squared: 0.3592
    F-statistic: 3.522 on 2 and 7 DF,  p-value: 0.08742

HTH,
Andy

> From: Christian Hoffmann
>
> Hi all,
>
> I just cannot think of how to do it:
> I want to take the first variable (column) of a data frame and regress
> it against all other variables.
>
> bla <- function (dat) {
>   reg <- lm(whateverthefirstofthevariablenamesis ~., data=dat)
>   return(reg)
> }
>
> What kind of function do I have to take instead of the
> whateverthefirstofthevariablenamesis,
>
> eval(), substitute(), get(), ...
>
> to correctly compute this regression?
>
> With lm(get(names(dat)[1]) ~., data=dat) there are no errors, but the
> first variable also shows up among the regressors.
>
> Thanks for help.
> Christian Hoffmann
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
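A minimal sketch of why lm(dat) behaves this way, assuming the usual behaviour of formula() on a data frame: it builds a formula with the first column as the response and the remaining columns as additive terms. The set.seed() call and the object names (implied, fit_short, fit_long) are illustrative and not part of the thread.

```r
# Illustrative data; set.seed() is only here to make the run repeatable.
set.seed(1)
dat <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))

# formula() applied to a data frame uses the first column as the response.
implied <- formula(dat)
print(implied)

# lm(dat) should therefore fit the same model as the spelled-out formula.
fit_short <- lm(dat)
fit_long  <- lm(y ~ x1 + x2, data = dat)
print(all.equal(coef(fit_short), coef(fit_long)))
```

The shortcut depends on the response being the first column, which is the limitation the later replies address.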
On Wed, 24 Mar 2004, Christian Hoffmann wrote:

> Hi all,
>
> I just cannot think of how to do it:
> I want to take the first variable (column) of a data frame and regress
> it against all other variables.
>
> bla <- function (dat) {
>   reg <- lm(whateverthefirstofthevariablenamesis ~., data=dat)
>   return(reg)
> }
>
> What kind of function do I have to take instead of the
> whateverthefirstofthevariablenamesis,
>
> eval(), substitute(), get(), ...
>
> to correctly compute this regression?
>
> With lm(get(names(dat)[1]) ~., data=dat) there are no errors, but the
> first variable also shows up among the regressors.

Andy Liaw has pointed out that lm(dat) happens to work. But for a more generalizable solution try

    bla <- function (dat)
      eval(substitute(lm(foo ~., data=dat), list(foo=as.name(names(dat)[1]))))

which has the advantage of embedding a clean value of $call.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
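To see what "a clean value of $call" means in practice, here is a small sketch that runs the function above on made-up data and prints the stored call. The data frame and the set.seed() call are illustrative assumptions; the function body is the one given in the reply.

```r
bla <- function(dat) {
  # Turn the first column name into a symbol and splice it into the lm() call
  # before evaluating it, so the call records the real variable name.
  eval(substitute(lm(foo ~ ., data = dat),
                  list(foo = as.name(names(dat)[1]))))
}

# Illustrative data only.
set.seed(1)
dat <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))

fit <- bla(dat)
print(fit$call)          # shows the response by name rather than as 'foo'
print(names(coef(fit)))  # the response does not appear among the regressors
```

Because the formula contains the actual column name, functions that re-use the call, such as update(), have a meaningful formula to work with.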
Christian Hoffmann <christian.hoffmann <at> wsl.ch> writes:

> Hi all,
>
> I just cannot think of how to do it:
> I want to take the first variable (column) of a data frame and regress
> it against all other variables.
>
> bla <- function (dat) {
>   reg <- lm(whateverthefirstofthevariablenamesis ~., data=dat)
>   return(reg)
> }

Andy has already given a particularly concise solution but if your variable is not in first position then you could rearrange the order of the variables to allow his solution or use this which works for any specified position of the dependent variable:

    data(longley)
    lm( longley[,7] ~. , data = longley[,-7] )
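The same idea can be wrapped in a function that takes the position of the response as an argument. This is only a sketch: regress_on_rest and its argument k are hypothetical names, not something proposed in the thread, and it assumes no column of the data frame is itself named dat or k.

```r
# Hypothetical helper: 'k' is the column position of the response.
regress_on_rest <- function(dat, k = 1) {
  # The response is taken by position; '.' then expands to the columns of the
  # data frame passed as 'data', which no longer contains column k.
  # drop = FALSE keeps a data frame even if only one regressor remains.
  lm(dat[, k] ~ ., data = dat[, -k, drop = FALSE])
}

# longley ships with R's datasets package, so it is available directly.
fit <- regress_on_rest(longley, k = 7)
print(names(coef(fit)))
```

The stored call will show dat[, k] rather than the column name, which is the trade-off against the substitute() approach.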
First: Thanks to everyone who develops R, maintains r-help, and participates in the list :)

This is a silly follow up question.

From Andy Liaw:

    > dat <- data.frame(y=rnorm(10), x1=rnorm(10), x2=rnorm(10))

(Silly question - if the answer is on the lm or formula help page I didn't get it :)

Why does lm | formula treat dat[,1] slightly differently than dat$y? I see what it is doing - I am curious as to why :)

    > lm(dat$y ~ ., data = dat)

    Call:
    lm(formula = dat$y ~ ., data = dat)

    Coefficients:
    (Intercept)           x1           x2
       -0.08754     -0.04456     -0.16905

    > lm(dat[,1] ~ ., data = dat)

    Call:
    lm(formula = dat[, 1] ~ ., data = dat)

    Coefficients:
    (Intercept)            y           x1           x2
     -5.266e-17    1.000e+00    4.121e-17   -3.274e-17

As Gabor Grothendieck pointed out:

    lm(formula = dat[, 1] ~ ., data = dat[,-1])

works like

    lm(formula = dat$y ~ ., data = dat)

Curious Minds Want to Know
Bob

-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendieck at myway.com]
Sent: Wednesday, March 24, 2004 12:51 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] First Variable in lm

Christian Hoffmann <christian.hoffmann <at> wsl.ch> writes:

> Hi all,
>
> I just cannot think of how to do it:
> I want to take the first variable (column) of a data frame and regress
> it against all other variables.
>
> bla <- function (dat) {
>   reg <- lm(whateverthefirstofthevariablenamesis ~., data=dat)
>   return(reg)
> }

Andy has already given a particularly concise solution but if your variable is not in first position then you could rearrange the order of the variables to allow his solution or use this which works for any specified position of the dependent variable:

    data(longley)
    lm( longley[,7] ~. , data = longley[,-7] )
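One way to look at the difference without fitting anything is to ask terms() how it expands the dot. The sketch below is an illustration on made-up data, and the interpretation is an assumption drawn from the output above rather than from the help pages: the expansion appears to leave out any column whose name occurs as a symbol on the left-hand side, and dat$y contains the symbol y while dat[, 1] does not.

```r
# Illustrative data only.
set.seed(1)
dat <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))

# Expand '.' against 'dat' and list the resulting right-hand-side terms.
labels_dollar <- attr(terms(dat$y ~ ., data = dat), "term.labels")
labels_index  <- attr(terms(dat[, 1] ~ ., data = dat), "term.labels")

print(labels_dollar)  # left-hand side mentions 'y' by name
print(labels_index)   # left-hand side refers to the column only by position
```

Under that reading, dropping the column from data, as in dat[, -1], removes the ambiguity because y is no longer available to the dot at all.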
> I want to take the first variable (column) of a data frame and regress
> it against all other variables.
>
> bla <- function (dat) {
>   reg <- lm(whateverthefirstofthevariablenamesis ~., data=dat)
>   return(reg)
> }

Thanks to all who answered my question:

Prof. Brian Ripley:
-------------------
bla <- function (dat)
  eval(substitute(lm(foo ~., data=dat), list(foo=as.name(names(dat)[1]))))

which has the advantage of embedding a clean value of $call.

andy_liaw at merck.com:
--------------------
lm(formula = dat)

rolf at math.unb.ca:
-----------------
for(j in 1:ncol(dat)) {
  fff <- as.formula(paste(names(dat)[j],"~",
                          paste(names(dat)[-j],collapse="+")))
  nm <- paste("rslt",j,sep=".")
  assign(nm,lm(fff,data=dat))
}

ggrothendieck at myway.com
-----------------------
data(longley)
lm( longley[,7] ~. , data = longley[,-7] )

You cannot call data() inside a function:

data(dat)
reg <- lm(dat[,1] ~ dat[,-1])
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
        invalid variable type
In addition: Warning message:
Data set 'dat' not found in: data(dat)

Regards
Christian
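The loop that regresses each column on all the others can also be written so that the fits end up in a named list instead of separate rslt.j objects created with assign(). This is a sketch of that variation, not the code as posted: lapply() replaces the for loop, and the example data are made up.

```r
# Illustrative data only.
set.seed(1)
dat <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))

# For each column j, build "name_j ~ all other names" and fit it.
fits <- lapply(seq_along(dat), function(j) {
  fff <- as.formula(paste(names(dat)[j], "~",
                          paste(names(dat)[-j], collapse = "+")))
  lm(fff, data = dat)
})

# Keep the rslt.j naming from the original loop, but as list names.
names(fits) <- paste("rslt", seq_along(dat), sep = ".")

print(names(coef(fits$rslt.2)))  # x1 regressed on y and x2
```

Building the formula from pasted names assumes the column names are syntactically valid R names.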
Gabor Grothendieck
2004-Mar-25 12:12 UTC
[R] Calling data within a function (was First Variable in lm)
Christian Hoffmann <christian.hoffmann <at> wsl.ch> writes:

> ggrothendieck <at> myway.com
> -----------------------
> data(longley)
> lm( longley[,7] ~. , data = longley[,-7] )
>
> You cannot call data() inside a function:

To call data() within a function:

    f <- function() {
      data(longley, envir = environment())
      lm( longley[,7] ~. , data = longley[,-7] )
    }
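A small check of what envir = environment() does here, written as a sketch: the function below is the one above with an extra flag (loaded_locally, a name introduced only for this illustration) recording whether longley was created in the function's own frame.

```r
f <- function() {
  # Load the data set into this function's frame rather than the workspace.
  data(longley, envir = environment())

  # inherits = FALSE restricts the lookup to the current frame only.
  loaded_locally <- exists("longley", envir = environment(), inherits = FALSE)

  list(fit = lm(longley[, 7] ~ ., data = longley[, -7]),
       loaded_locally = loaded_locally)
}

res <- f()
print(res$loaded_locally)
print(names(coef(res$fit)))
```

Without the envir argument, data() loads into the global environment by default, so the function would create a side effect in the caller's workspace.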