dear R wizards: apologies for two queries in one day. I have a long-form
data set that identifies about 5,000 regressions, each with about 1,000
observations.
unit date y x
1 20060101 <two values>
1 20060102 <two values>
...
5000 20081230 <two values>
5000 20081231 <two values>
I need to run these regressions many, many times, because they are part of an
optimization, so getting my code to be fast is paramount. I will need
to pick off the 5,000 coefficients on x (i.e., the b's) and their standard
errors. I can ignore the 5,000 intercepts.
    by(dataset, dataset$unit, function(d) coef(lm(y ~ x, data = d)))
gives me the coefficients. of course, I could use the summary method for lm
to pick off the coefficient standard errors, too; my guess is that this
would be slow.
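for concreteness, the baseline I have in mind is something like this
(untested sketch; the unit, y, and x names match the layout above):

    ## baseline: summary(lm()) per unit -- simple, but presumably slow
    slow <- by(dataset, dataset$unit, function(d) {
        cf <- summary(lm(y ~ x, data = d))$coefficients
        c(b = cf["x", "Estimate"], se = cf["x", "Std. Error"])
    })
    slow <- do.call(rbind, slow)   # 5000 x 2 matrix of b and se(b)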
I think the alternative would be to delete all NAs first and then use a
building-block function (such as lm.fit(), or solve(qr(X), y) with X the
design matrix). this would be fast for getting the coefficients, but I
wonder whether there is a *FAST* way to obtain the standard error of b.
(I do know slow ways, but these would defeat the purpose.) is this the
right idea? or will I just end up with more code but not more speed than I
would get with summary(lm())? can someone tell me the "fastest" way to
generate b and se(b)?
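to make the idea concrete, here is roughly what I mean (an untested sketch,
assuming d already has no NAs; the residual variance and (X'X)^{-1} both
come from the same QR decomposition that lm.fit() computes anyway):

    ## sketch: lm.fit() plus a hand-rolled standard error
    fast_b_se <- function(d) {
        X   <- cbind(1, d$x)     # design matrix: intercept + x
        fit <- lm.fit(X, d$y)
        ## residual variance estimate, sigma^2 hat
        s2  <- sum(fit$residuals^2) / (length(d$y) - fit$rank)
        ## (X'X)^{-1} from the R factor of the QR (pivoting should be
        ## trivial for this two-column design)
        XtXinv <- chol2inv(qr.R(fit$qr))
        c(b = fit$coefficients[[2]], se = sqrt(XtXinv[2, 2] * s2))
    }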
is there anything else that comes to mind as a recommended way to speed this
up in R, short of writing everything in C?
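one thought I had: because each regression is just y ~ x, the textbook
closed-form formulas could be vectorized across all 5,000 units in a single
pass with rowsum() (again an untested sketch, assuming complete cases):

    ## sketch: closed-form simple-regression b and se(b), all units at once
    g   <- dataset$unit
    n   <- rowsum(rep(1, nrow(dataset)), g)      # obs per unit
    sx  <- rowsum(dataset$x, g)
    sy  <- rowsum(dataset$y, g)
    sxx <- rowsum(dataset$x^2, g) - sx^2 / n     # sum((x - xbar)^2)
    sxy <- rowsum(dataset$x * dataset$y, g) - sx * sy / n
    syy <- rowsum(dataset$y^2, g) - sy^2 / n
    b   <- sxy / sxx                             # slopes
    rss <- syy - sxy^2 / sxx                     # residual sums of squares
    se  <- sqrt(rss / (n - 2) / sxx)             # standard errors of b

but I am unsure whether the centered-sums approach is numerically safe
enough, or whether the QR route is preferable.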
as always, advice highly appreciated.
/iaw
--
Ivo Welch (ivo.welch@brown.edu, ivo.welch@gmail.com)