Søren Højsgaard
2012-Jun-24 20:50 UTC
[R] Indexing matrices from the Matrix package with [i, j] seems to be very slow. Are there "faster alternatives"?
Dear all,

Indexing matrices from the Matrix package with [i,j] seems to be very slow. For example:

    library(rbenchmark)
    library(Matrix)
    mm <- matrix(c(1,0,0,0,0,0,0,0), nr=20, nc=20)
    MM <- as(mm, "Matrix")
    lookup <- function(mat){
      for (i in 1:nrow(mat)){
        for (j in 1:ncol(mat)){
          mat[i,j]
        }
      }
    }

    benchmark(lookup(mm), lookup(MM),
              columns=c("test", "replications", "elapsed", "relative"),
              replications=50)

            test replications elapsed relative
    1 lookup(mm)           50    0.01        1
    2 lookup(MM)           50    8.77      877

I would have expected a small overhead when indexing a matrix from the Matrix package, but this result is really surprising...

Does anybody know if there are faster alternatives to [i,j]?

Best regards
Søren
Duncan Murdoch
2012-Jun-25 09:27 UTC
[R] Indexing matrices from the Matrix package with [i, j] seems to be very slow. Are there "faster alternatives"?
On 12-06-24 4:50 PM, Søren Højsgaard wrote:
> Dear all,
>
> Indexing matrices from the Matrix package with [i,j] seems to be very slow. For example:
>
> library(rbenchmark)
> library(Matrix)
> mm <- matrix(c(1,0,0,0,0,0,0,0), nr=20, nc=20)
> MM <- as(mm, "Matrix")
> lookup <- function(mat){
>   for (i in 1:nrow(mat)){
>     for (j in 1:ncol(mat)){
>       mat[i,j]
>     }
>   }
> }
>
> benchmark(lookup(mm), lookup(MM), columns=c("test", "replications", "elapsed", "relative"), replications=50)
>         test replications elapsed relative
> 1 lookup(mm)           50    0.01        1
> 2 lookup(MM)           50    8.77      877
>
> I would have expected a small overhead when indexing a matrix from the Matrix package, but this result is really surprising...
> Does anybody know if there are faster alternatives to [i,j]?

There's also a large overhead when indexing a dataframe, though Matrix appears to be slower. It's designed to work on whole matrices at a time, not single entries.

So I'd suggest that if you need to use [i,j] indexing, then try to arrange your code to localize the access, and extract a submatrix as a regular fast matrix first. (Or if it will fit in memory, convert the whole thing to a matrix just for the access. If I just add the line

    mat <- as.matrix(mat)

at the start of your lookup function, it becomes several hundred times faster.)
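For readers who want to try both suggestions, here is a minimal sketch of what they might look like. The function names lookup_dense and lookup_block, the running total, and the row/column ranges are made up for illustration and are not from the thread; only as.matrix() and the [i,j] loop come from the messages above.

```r
library(Matrix)

# Same test data as in the original post.
mm <- matrix(c(1, 0, 0, 0, 0, 0, 0, 0), nrow = 20, ncol = 20)
MM <- as(mm, "Matrix")

# Hypothetical variant 1: convert the whole object to a base R matrix once,
# then do all the single-element [i, j] lookups on the base matrix.
lookup_dense <- function(mat) {
  mat <- as.matrix(mat)  # the one-line change suggested in the reply
  total <- 0
  for (i in seq_len(nrow(mat))) {
    for (j in seq_len(ncol(mat))) {
      total <- total + mat[i, j]
    }
  }
  total
}

# Hypothetical variant 2: if the whole matrix will not fit in memory as a
# dense matrix, extract only the block that is needed (one vectorised call
# into Matrix), convert that block, and index the block element by element.
lookup_block <- function(mat, rows, cols) {
  block <- as.matrix(mat[rows, cols, drop = FALSE])
  total <- 0
  for (i in seq_len(nrow(block))) {
    for (j in seq_len(ncol(block))) {
      total <- total + block[i, j]
    }
  }
  total
}

dense_total <- lookup_dense(MM)
block_total <- lookup_block(MM, rows = 1:5, cols = 1:8)
```

Both variants call into the Matrix package only once per matrix (or per block), so the per-element dispatch cost is paid on a plain matrix instead. Whether the speed-up matches the figure quoted above will depend on the machine and the package version.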