Hi,

I'm using the lm() function with a fairly big formula (300 arguments) and a data frame of 3000 values.

This runs in a loop: in each step the formula is reduced by one argument and lm() is called again (to check which arguments are useful).

This takes 1-2 minutes. Is there a way to speed this up? I checked the code of the lm function and it seems that it prepares the data and then calls lm.fit(). I thought about doing this preparation once up front and then only calling lm.fit() 300 times.
Yes, indeed, you can certainly speed things up by just changing the design matrix X and feeding it back to lm.fit(). In addition, if you just need the least squares estimates, then you gain a bit more by using constructs of the form:

    XtX <- crossprod(X)
    Xty <- crossprod(X, y)
    betas <- solve(XtX, Xty)

I hope it helps.

Best,
Dimitris

Paul Hermes wrote:
> This is running in a loop where in each step the formula is reduced by one argument, and the lm command is called again (to check which arguments are useful).
>
> This takes 1-2 minutes.
> Is there a way to speed this up?

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
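To make the suggestion above concrete, here is a minimal sketch of building the design matrix once and then calling lm.fit() repeatedly while columns are removed. The data are simulated and much smaller than the 3000 x 300 case in the question; the names dat, predictors, keep and rss are made up for illustration, and the removal order (x1, x2, ...) is arbitrary rather than chosen by any selection criterion.

```r
# Simulated stand-in data (hypothetical; not the poster's data)
set.seed(1)
n <- 200
p <- 10
dat <- as.data.frame(matrix(rnorm(n * p), n, p))
names(dat) <- paste0("x", seq_len(p))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)

# Build the design matrix once, instead of letting lm() rebuild it on every call
X <- model.matrix(y ~ ., data = dat)   # includes an "(Intercept)" column
y <- dat$y

# Remove one predictor column per step and refit with lm.fit()
predictors <- setdiff(colnames(X), "(Intercept)")
keep <- colnames(X)
rss <- numeric(0)
for (v in predictors) {
  keep <- setdiff(keep, v)
  fit <- lm.fit(X[, keep, drop = FALSE], y)
  rss[v] <- sum(fit$residuals^2)       # residual sum of squares after dropping v
}

# Least squares estimates for the full design matrix via the normal equations
XtX <- crossprod(X)
Xty <- crossprod(X, y)
betas <- solve(XtX, Xty)
```

lm.fit() returns only the bare fit (coefficients, residuals, QR decomposition and so on), so summary() and the other lm methods are not available on its result. Solving the normal equations directly is typically faster still, but it is less numerically stable than the QR decomposition used by lm.fit() when predictors are nearly collinear.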
Look at:

    ?update

For example:

    lm.obj <- lm(y ~ x1 + ... + x300)
    lm.obj1 <- update(lm.obj, . ~ . - x1)
    lm.obj2 <- update(lm.obj1, . ~ . - x2)

Ravi.

____________________________________________________________________
Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

----- Original Message -----
From: ph84
Date: Thursday, March 12, 2009 3:28 pm
Subject: [R] stats lm() function
To: r-help at r-project.org

> This is running in a loop where in each step the formula is reduced by one
> argument, and the lm command is called again (to check which arguments are
> useful).
>
> This takes 1-2 minutes.
> Is there a way to speed this up?
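As a runnable version of the update() idea, here is a small sketch on simulated data with three predictors instead of 300. The names dat, full, fits and n_coef, and the order in which terms are dropped, are assumptions made for illustration only.

```r
# Simulated stand-in data (hypothetical; three predictors instead of 300)
set.seed(2)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)

full <- lm(y ~ x1 + x2 + x3, data = dat)

# Drop the terms one after another, keeping every intermediate fit
fits <- list(full)
for (term in c("x3", "x2")) {
  prev <- fits[[length(fits)]]
  # ". ~ . - x3" means: same response, same right-hand side, minus x3
  fits[[length(fits) + 1]] <- update(prev, as.formula(paste(". ~ . -", term)))
}

# Number of coefficients in each successive model
n_coef <- sapply(fits, function(f) length(coef(f)))
```

Note that update() works by editing the stored call and evaluating it again, so each step still runs a full lm() fit. It makes the loop easier to write, but it should not be expected to remove the data preparation cost on its own.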
I think you will find that many readers of this list would rather try to dissuade you from this misguided strategy. You are unlikely to arrive at a sensible solution by using step-down procedures in this sort of situation (a large number of predictors with a modest amount of data).

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

On Mar 12, 2009, at 1:59 PM, Paul Hermes wrote:
> This is running in a loop where in each step the formula is reduced
> by one argument, and the lm command is called again (to check which
> arguments are useful).
>
> This takes 1-2 minutes.
> Is there a way to speed this up?