Hello,

Say I want to fit a multiple regression model with the following expression:

lm(y ~ x1 + x2 + x3 + ... + x_n, data = mydata)

It gets boring to type in all of the independent variables, in this case the x_i. Is there a simple way to do the metaprogramming for this? (Sometimes the names of the independent variables follow an apparent pattern, and sometimes they do not.)
The special name "." may be used on the right side of the "~" operator, to stand for all the variables in a data.frame other than the response.
--John Chambers, Statistical Models in S, p. 101

So, if y and the Xi (in your case) were the only variables in mydata, then

lm(y ~ . , data = mydata)

would be of use.

Erik
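As a small sketch of the dot shorthand: the data frame below is made up, and the id column and the names fit_all and fit_noid are assumptions for illustration only. It also shows that a column can be subtracted from the expansion when the x's are not the only other variables in the data frame.

```r
# Hypothetical data: a response y, three predictors, and an id column
# that should not enter the model.
set.seed(1)
mydata <- data.frame(id = 1:20,
                     y  = rnorm(20),
                     x1 = rnorm(20),
                     x2 = rnorm(20),
                     x3 = rnorm(20))

# "." expands to every column of mydata other than the response,
# so id is included here as a predictor.
fit_all <- lm(y ~ ., data = mydata)

# Terms can be removed from the expansion with "-", here dropping id.
fit_noid <- lm(y ~ . - id, data = mydata)

names(coef(fit_noid))
```

Subtracting terms is handy for one or two stray columns; with many unwanted columns it is probably simpler to pass a subset of the data frame instead.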
Two possible ways around this are:

1. If the x's are *all* the other variables in your data frame you can use a dot:

   fm <- lm(y ~ ., data = myData)

2. Here is another idea:

   > as.formula(paste("y~", paste("x", 1:10, sep="", collapse="+")))
   y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10

(You bore easily!)

Bill Venables
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of June Kim
Sent: Thursday, 13 November 2008 10:27 AM
To: r-help at r-project.org
Subject: [R] metaprogramming with lm
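For what it is worth, the paste-then-parse idea can also be written with reformulate() from the stats package, which builds the formula from a character vector of term labels. This is only a sketch of an alternative, not something suggested in the thread; the names xs and f are made up for illustration.

```r
# Character vector of predictor names following the x1, ..., x10 pattern.
xs <- paste0("x", 1:10)

# reformulate() joins the term labels with "+" and attaches the response.
f <- reformulate(xs, response = "y")

# The variables and term labels can be inspected without fitting anything.
all.vars(f)
attr(terms(f), "term.labels")
```

The result can be passed to lm() in the same way as a formula typed by hand, e.g. lm(f, data = myData), assuming myData has columns with these names.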
You can construct the formula on the fly. Say you have a data frame with
columns y, x1, ..., x10:
dat <- data.frame(matrix(rnorm(1100), ncol=11,
                         dimnames=list(NULL, c("y",
                                               paste("x", 1:10, sep="")))))
Then you could construct the formula using:
form <- formula(paste("y ~ ", paste(names(dat)[which(names(dat)
                                                     != "y")], collapse="+")))
fit <- lm(form, data=dat)
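When the predictor names have no common pattern, or only some of them do, the same construction works once the names have been picked out. A minimal sketch, with made-up column names (age, dose_a, dose_b, notes) and made-up object names (dat2, all_x, dose_x, form2, fit2) used purely for illustration:

```r
# Hypothetical data frame whose predictor names share no single pattern.
set.seed(42)
dat2 <- data.frame(y      = rnorm(50),
                   age    = rnorm(50),
                   dose_a = rnorm(50),
                   dose_b = rnorm(50),
                   notes  = rnorm(50))

# Every column except the response, without needing which().
all_x <- setdiff(names(dat2), "y")

# Only the columns whose names start with "dose_".
dose_x <- grep("^dose_", names(dat2), value = TRUE)

# Build the formula from the selected names and fit as before.
form2 <- as.formula(paste("y ~", paste(dose_x, collapse = " + ")))
fit2 <- lm(form2, data = dat2)
```

Here form2 is y ~ dose_a + dose_b; swapping dose_x for all_x would give the model with all four predictors.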
HTH,
Simon.
--
Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072
Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au
Policies:
1. I will NOT analyse your data for you.
2. Your deadline is your problem.
The combination of some data and an aching desire for
an answer does not ensure that a reasonable answer can
be extracted from a given body of data. - John Tukey.