How about:

    x <- data.frame(matrix(rnorm(1550), c(50, 31)))
    model <- step(lm(x[,1] ~ as.matrix(x[,2:31])))

--Matt

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of rongguiwong
Sent: Monday, September 20, 2004 20:52 PM
To: r-help at stat.math.ethz.ch
Subject: [R] how to take this experiment with R?

You can express your model like this:

    lm(X1 ~ ., x)

?formula gives some help on formulas, although the . notation above does not seem to be referred to there.

Date: Tue, 21 Sep 2004 11:52:04 +0800
From: rongguiwong <0034058 at fudan.edu.cn>
To: <r-help at stat.math.ethz.ch>
Subject: [R] how to take this experiment with R?

I want to generate 30 independent variables and 1 dependent variable, each with 50 draws from a unit normal distribution, and then search for the independent variables that together do the best job of fitting the dependent variable.

My way of generating the data is:

    x <- data.frame(matrix(rnorm(1550), c(50, 31)))

Is there a better way to do it?

I want to use the following to search for the model:

    model <- step(lm(X1 ~ X2 + X3 + X4 .....))

but I don't know how to express the formula for the lm function. I think there is a way to express it efficiently. I tried ?lm, but found nothing.

Any help is welcome!

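As a minimal sketch of the `.` shorthand discussed above: it assumes the default column names X1..X31 that data.frame() assigns to an unnamed matrix. The set.seed() call and the reformulate() comparison are additions for illustration and are not part of the original posts.

```r
# Assumption: a fixed seed, used only so the example is reproducible.
set.seed(1)

# 50 draws for each of 31 standard normal variables; X1 is the response.
x <- data.frame(matrix(rnorm(50 * 31), nrow = 50, ncol = 31))

# "." stands for every column of `x` that is not already in the formula.
full <- lm(X1 ~ ., data = x)

# The same model with the right-hand side spelled out programmatically.
f <- reformulate(paste0("X", 2:31), response = "X1")
full2 <- lm(f, data = x)

length(coef(full))                    # intercept plus 30 slopes
all.equal(coef(full), coef(full2))    # both formulas give the same fit
```

Building the formula with reformulate() is handy when the set of predictors is computed rather than typed by hand.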
Suppose you just wanted independent variables 2, 4 and 7 (which are in columns 3, 5 and 8, respectively, because column 1 holds the dependent variable). Then you could do this:

    lm(X1 ~ X3 + X5 + X8, x)

or

    lm(X1 ~ ., x[, c(1, c(2, 4, 7) + 1)])

Note that the data frame passed with the . notation must still contain the X1 column.

Date: Tue, 21 Sep 2004 12:27:38 +0800
From: rongguiwong <0034058 at fudan.edu.cn>
To: <r-help at stat.math.ethz.ch>
Subject: Re: [R] how to take this experiment with R?

On Tuesday, 21 September 2004 at 12:10, Gabor Grothendieck wrote:
> You can express your model like this:
>
>     lm(X1 ~ ., x)

It works, but I have come across another problem. I just want to select the 3 best independent variables, but the result from step(lm(X1 ~ ., x)) is:

    Step:  AIC= -37.64
     X1 ~ X2 + X3 + X4 + X8 + X10 + X11 + X13 + X14 + X15 + X18 + X21 +
         X22 + X25 + X26 + X27 + X29 + X30 + X31

             Df Sum of Sq    RSS     AIC
    <none>                11.014 -37.642
    - X8      1     0.559 11.574 -37.165
    - X27     1     0.867 11.881 -35.853
    - X2      1     0.971 11.986 -35.416
    - X18     1     0.978 11.992 -35.390
    - X31     1     1.122 12.136 -34.793
    - X10     1     1.400 12.414 -33.659
    - X15     1     1.563 12.578 -33.005
    - X29     1     1.608 12.622 -32.828
    - X13     1     1.805 12.819 -32.055
    - X30     1     1.880 12.894 -31.763
    - X4      1     2.368 13.382 -29.906
    - X14     1     2.495 13.509 -29.433
    - X3      1     2.983 13.997 -27.659
    - X22     1     2.999 14.013 -27.600
    - X21     1     3.377 14.391 -26.271
    - X25     1     4.323 15.338 -23.086
    - X11     1     6.775 17.789 -15.672
    - X26     1     7.232 18.246 -14.403

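One hedged way to cap the model size with step() itself is forward selection from the intercept-only model, limited by the `steps` argument. This is only a sketch: the seed and the object names `null`, `upper` and `fwd` are assumptions, and forward selection may stop early, so the result has at most three predictors rather than exactly three.

```r
# Assumption: a fixed seed, used only so the example is reproducible.
set.seed(1)
x <- data.frame(matrix(rnorm(50 * 31), nrow = 50, ncol = 31))

# Start from the intercept-only model.
null <- lm(X1 ~ 1, data = x)

# Largest model step() may reach: all 30 candidate predictors.
upper <- reformulate(paste0("X", 2:31))

# Add one variable at a time by AIC, for at most 3 steps.
fwd <- step(null,
            scope = list(lower = ~ 1, upper = upper),
            direction = "forward",
            steps = 3,
            trace = 0)

formula(fwd)   # the selected model, with no more than 3 predictors
```

Forward selection is greedy, so the three variables it picks are not guaranteed to be the best three-variable subset overall.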
My mistake: you can't use the structure I proposed with lm() in combination with step(). If what Gabor suggested earlier was not what you wanted, and you instead want the 'best' three-variable model, then it may be easier to use the leaps package.

    > library(leaps)
    > x <- data.frame(matrix(rnorm(1550), c(50, 31)))
    > model <- regsubsets(y=x[,1], x=x[,2:31])
    > m1 <- summary(model, matrix=TRUE, matrix.logical=TRUE)
    > apply(m1$outmat, 1, which)[3]
    $"3 ( 1 )"
    X10 X14 X15
      9  13  14

    > lm.fit(x=as.matrix(x[,as.vector(unlist(apply(m1$outmat, 1, which)[3])+1)]), y=x[,1])
    $coefficients
           X10        X14        X15
    -0.2694923 -0.4055546 -0.2692063

--Matt

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Gabor Grothendieck
Sent: Monday, September 20, 2004 21:57 PM
To: 0034058 at fudan.edu.cn; r-help at stat.math.ethz.ch
Subject: Re: [R] how to take this experiment with R?

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
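For readers without the leaps package, the 'best' three-variable model can also be found by brute force in base R, since there are only choose(30, 3) = 4060 candidate subsets. This is a sketch under stated assumptions, not the method used in the thread: 'best' is taken to mean smallest residual sum of squares with an intercept included, and the seed and the names `subsets`, `rss`, `best` and `best_fit` are illustrative.

```r
# Assumption: a fixed seed, used only so the example is reproducible.
set.seed(1)
x <- data.frame(matrix(rnorm(50 * 31), nrow = 50, ncol = 31))
y <- x[[1]]

# Every way of choosing 3 of the 30 predictor columns (columns 2..31).
subsets <- combn(2:31, 3)

# Residual sum of squares for each subset, fitted with an intercept.
rss <- apply(subsets, 2, function(cols) {
  X <- cbind(1, as.matrix(x[, cols]))
  sum(lm.fit(X, y)$residuals^2)
})

# Columns of the subset with the smallest RSS, refitted with lm().
best <- subsets[, which.min(rss)]
best_fit <- lm(reformulate(names(x)[best], response = "X1"), data = x)

names(x)[best]
coef(best_fit)
```

Because the data are pure noise, whichever three variables win here do so by chance; the search only illustrates the mechanics.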