Stefan Mischke
2005-Jun-16 21:05 UTC
[R] regressing each column of a matrix on all other columns
DeaR list I would like to predict the values of each column of a matrix A by regressing it on all other columns of the same matrix A. I do this with a for loop: A <- B <- matrix(round(runif(10*3,1,10),0),10) A for (i in 1:length(A[1,])) B[,i] <- as.matrix(predict(lm( A[,i] ~ A[,-i] ))) B It works fine, but I need it to be faster. I've looked at *apply but just can't seem to figure it out. Maybe the solution could look somewhat like this: mylm <- function(y,ci) { x <- A[,-ci] b <- lm(y~x) } B <- apply(A,2,mylm,ci=current_column_index(A)) Is there a way to pass the index of the current column in apply to my function? Am I on the right path at all? Thanks for your help. Regards, Stefan
Prof Brian Ripley
2005-Jun-17 07:49 UTC
[R] regressing each column of a matrix on all other columns
apply() is just a for() loop internally so why do you expect it to be faster? Some comments: 1) Here predict() is just extracting the fitted values. 2) Using lm.fit will be faster if fitted values is all you want. 3) You are actually regressing each column on all other columns plus an intercept. 4) The as.matrix is wasteful. So it would be faster to use A1 <- cbind(1, A) for (i in 2:ncol(A)) B[,i-1] <- lm.fit(A1[,-i], A1[,i])$fitted You can do this reasonably efficiently in matrix algebra: one way is to form the inverse of X^TX after removing column means and use the Goodnight sweep operation on each column in turn. On Thu, 16 Jun 2005, Stefan Mischke wrote:> DeaR list > > I would like to predict the values of each column of a matrix A by > regressing it on all other columns of the same matrix A. I do this with > a for loop: > > A <- B <- matrix(round(runif(10*3,1,10),0),10) > A > for (i in 1:length(A[1,])) B[,i] <- as.matrix(predict(lm( A[,i] ~ > A[,-i] ))) > B > > It works fine, but I need it to be faster. I've looked at *apply but > just can't seem to figure it out. > Maybe the solution could look somewhat like this: > > mylm <- function(y,ci) { > x <- A[,-ci] > b <- lm(y~x) > } > B <- apply(A,2,mylm,ci=current_column_index(A)) > > Is there a way to pass the index of the current column in apply to my > function? Am I on the right path at all? > Thanks for your help.-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595