Hello,

Say I want to fit a multiple regression model with the following expression:

    lm(y~x1 + x2 + x3 + ... + x_n,data=mydata)

It gets boring to type out all the independent variables, in this case the x_i. Is there a simple way to do this with metaprogramming? (The names of the independent variables sometimes follow an apparent pattern and sometimes do not.)
The special name "." may be used on the right side of the "~" operator, to stand for all the variables in a data.frame other than the response.
  --John Chambers, Statistical Models in S, p. 101

So, if y and the x_i (in your case) were the only variables in mydata, then

    lm(y ~ . , data = mydata)

would be of use.

Erik

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
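When the data frame holds a few extra columns that should stay out of the model, the dot can still be used and the unwanted terms subtracted again. The following is a minimal sketch on made-up data; the `id` column and the simulated values are assumptions for illustration, not part of the original question.

```r
# Hypothetical data: a response y, three predictors, and an id column
# that should not enter the model.
set.seed(1)
mydata <- data.frame(
  id = 1:20,
  y  = rnorm(20),
  x1 = rnorm(20),
  x2 = rnorm(20),
  x3 = rnorm(20)
)

# "." expands to every column except the response;
# "- id" then removes that single term from the expansion.
fit <- lm(y ~ . - id, data = mydata)

# Names of the fitted coefficients (intercept plus the remaining predictors)
coef_names <- names(coef(fit))
print(coef_names)
```

Subtracting terms keeps the call short when only a handful of columns need to be excluded; with many exclusions it is probably clearer to subset the data frame first.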
Two possible ways around this are:

1. If the x's are *all* the other variables in your data frame, you can use a dot:

    fm <- lm(y ~ ., data = mydata)

2. Here is another idea:

    > as.formula(paste("y~", paste("x", 1:10, sep="", collapse="+")))
    y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10

(You bore easily!)

Bill Venables
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of June Kim
Sent: Thursday, 13 November 2008 10:27 AM
To: r-help at r-project.org
Subject: [R] metaprogramming with lm
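As an alternative to pasting the formula text by hand, base R's `reformulate()` (in the stats package) builds a formula from a character vector of term labels. The sketch below assumes the predictors are named x1 to x10, as in the example above, and that a reasonably recent R is available (`paste0()` is used for brevity).

```r
# Build the predictor names x1, ..., x10
predictors <- paste0("x", 1:10)

# reformulate() joins the term labels with "+" and attaches the response
f <- reformulate(predictors, response = "y")
print(f)

# The term labels stored in the formula are the names we supplied
term_labels <- attr(terms(f), "term.labels")
```

The resulting object is an ordinary formula, so it can be passed to `lm()` together with a `data` argument in the usual way.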
You can construct the formula on the fly. Say you have a data frame with columns y, x1, ..., x10:

    dat <- data.frame(matrix(rnorm(1100), ncol=11, dimnames=list(NULL, c("y", paste("x", 1:10, sep="")))))

Then you could construct the formula using:

    form <- formula(paste("y ~ ", paste(names(dat)[which(names(dat) != "y")], collapse="+")))
    fit <- lm(form, data=dat)

HTH,

Simon.

--
Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072 Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au

Policies:
1. I will NOT analyse your data for you.
2. Your deadline is your problem.

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.
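The same on-the-fly construction also covers the case where the predictor names follow no single pattern. A rough sketch, with made-up column names and a hypothetical helper `build_formula()` (neither comes from the thread): pick the names with a regular expression when there is a pattern, or list the columns to leave out when there is not.

```r
# Hypothetical data with a mix of patterned and unpatterned column names
set.seed(42)
dat <- data.frame(
  y         = rnorm(30),
  age       = rnorm(30),
  height    = rnorm(30),
  x_a       = rnorm(30),
  x_b       = rnorm(30),
  notes_len = rnorm(30)
)

# Hypothetical helper: paste a response and a vector of predictor names
# into a formula. It assumes the names are syntactically valid R names.
build_formula <- function(response, predictors) {
  as.formula(paste(response, "~", paste(predictors, collapse = " + ")))
}

# Case 1: the names share a pattern, so select them with a regular expression
patterned <- grep("^x_", names(dat), value = TRUE)

# Case 2: no pattern, so state which columns to leave out
others <- setdiff(names(dat), c("y", "notes_len"))

fit1 <- lm(build_formula("y", patterned), data = dat)
fit2 <- lm(build_formula("y", others), data = dat)

print(names(coef(fit1)))
print(names(coef(fit2)))
```

Column names that are not valid R names (for example, names containing spaces) would need to be wrapped in backticks before pasting; the helper above does not handle that.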