Dear list,
I would like to do something like:
# Simulate some example data
tmp <- matrix(rnorm(60), ncol=6)
colnames(tmp) <- c("y", "x1", "x2",
                   "x3", "x4", "x5")
# Fit a linear model with random noise added to x5 n times
n <- 100
replicate(n, lm(y ~ x1+x2+x3+x4+I(x5+rnorm(nrow(tmp))),
data=as.data.frame(tmp)))
I am wondering about ways to speed up this procedure (the data dimensions will
be a lot larger in my real examples, so each iteration does take a bit of time).
The procedure is of course trivial to parallelize, but I am also interested in
ways to speed up the actual fitting of the linear model. I am aware that lm.fit
is faster than lm, and that there are faster ways to do the linear-model
computations using the Cholesky decomposition (as is nicely described in
Douglas Bates's Comparisons vignette for the Matrix package).
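For concreteness, this is the kind of thing I mean by using lm.fit: build the
model matrix once up front, and inside the loop overwrite only the x5 column
before calling lm.fit directly (a sketch with my simulated data; the setup
lines just recreate the example above):

```r
# Build the model matrix once; only the x5 column changes per replicate.
set.seed(1)
d <- as.data.frame(matrix(rnorm(60), ncol = 6,
                          dimnames = list(NULL, c("y","x1","x2","x3","x4","x5"))))
mm <- cbind(`(Intercept)` = 1, as.matrix(d[, -1]))

n <- 100
fits <- replicate(n, {
  mm[, "x5"] <- d$x5 + rnorm(nrow(d))   # perturb just one column
  lm.fit(mm, d$y)$coefficients          # skip formula/model-frame overhead
})
# fits is a 6 x n matrix of coefficient vectors
```

This avoids re-parsing the formula and rebuilding the model frame on every
iteration, but each call still does a full QR decomposition.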
What I would be very happy to get help and ideas about is whether there are
clever ways to exploit the fact that the RHS is almost the same in each
iteration (I will always add noise to just one independent variable). Can this
fact be used to speed up the calculations? Or are there other ways to make the
fits faster?
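To illustrate the kind of shortcut I am imagining: if only the x5 coefficient
is needed, the Frisch-Waugh-Lovell idea would let me factorize the fixed
columns once and then do only a cheap residualization per replicate (a sketch;
I have not verified this is actually faster at my problem sizes):

```r
# Residualize against the fixed columns once (QR computed a single time);
# each replicate then needs only one projection and a scalar ratio.
set.seed(1)
d <- as.data.frame(matrix(rnorm(60), ncol = 6,
                          dimnames = list(NULL, c("y","x1","x2","x3","x4","x5"))))
X   <- cbind(1, as.matrix(d[, c("x1", "x2", "x3", "x4")]))  # fixed part
qrX <- qr(X)                # factorized once, reused every iteration
ry  <- qr.resid(qrX, d$y)   # y with the fixed columns projected out

n <- 100
beta5 <- replicate(n, {
  z  <- d$x5 + rnorm(nrow(d))     # the only column that changes
  rz <- qr.resid(qrX, z)
  sum(ry * rz) / sum(rz * rz)     # x5 coefficient of the full regression
})
```

By the Frisch-Waugh-Lovell theorem this ratio equals the x5 coefficient from
the full lm fit, though the other coefficients would need extra work to
recover.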
Thanks for any help!
Best regards,
Martin.