Benilton Carvalho
2012-Mar-11 16:18 UTC
[R] Efficient access to elements of a list of lists
Hi,

I have a long list of lists from which I want to efficiently extract
and rbind elements. So I'm using the approach below:

f <- function(i){
  out <- replicate(5, list(matrix(rnorm(80), nc=20)))
  names(out) <- letters[1:5]
  out
}
set.seed(1)
lst <- lapply(1:1.5e6, f)
(t0 <- system.time(tmp <- do.call(rbind, lapply(lst, '[[', 'b'))))

Is there anything better/faster than the do.call+rbind+lapply combo
above?

On this example, the combo takes roughly 20s on my machine...
but on the data I'm working with, it takes more than 1 minute... And
given that I need to repeat the task several times, the cumul. amount
of time is significant for me.

Thank you for any suggestion/comment,

benilton
Henrik Bengtsson
2012-Mar-11 18:34 UTC
[R] Efficient access to elements of a list of lists
On Sun, Mar 11, 2012 at 9:18 AM, Benilton Carvalho <beniltoncarvalho at gmail.com> wrote:
> Hi,
>
> I have a long list of lists from which I want to efficiently extract
> and rbind elements. So I'm using the approach below:
>
> f <- function(i){
>   out <- replicate(5, list(matrix(rnorm(80), nc=20)))
>   names(out) <- letters[1:5]
>   out
> }
> set.seed(1)
> lst <- lapply(1:1.5e6, f)
> (t0 <- system.time(tmp <- do.call(rbind, lapply(lst, '[[', 'b'))))
>
> Is there anything better/faster than the do.call+rbind+lapply combo
> above?

The "[[" function involves method dispatching. You can avoid that by
using .subset2(). That may save you some (micro?)seconds.

Now, if all extracted elements are truly of the same dimensions:

> bList <- lapply(lst, FUN='[[', 'b')
> str(head(bList))
List of 6
 $ : num [1:4, 1:20] 0.936 -0.844 -0.221 -0.581 -2.513 ...
 $ : num [1:4, 1:20] -0.2618 0.0259 -1.3131 -0.0547 -0.3296 ...
 $ : num [1:4, 1:20] -1.589 0.844 -1.121 0.21 -0.846 ...
 $ : num [1:4, 1:20] -1.192 -1.268 1.688 -0.295 0.466 ...
 $ : num [1:4, 1:20] 2.504 -0.833 -1.751 1.117 -0.775 ...
 $ : num [1:4, 1:20] 0.119 -0.313 1.741 0.403 -0.261 ...

then you can avoid the rbind() by doing an unlist()/dim()/aperm(), e.g.

# Extract 'b' as a 4-by-20-by-1.5e6 array
dim <- dim(bList[[1]]);
n <- length(bList);
bArray <- unlist(bList, use.names=FALSE);
dimA <- c(dim, n);
dim(bArray) <- dimA;

# If you really need a matrix, then...
# Turning into a (4*1.5e6)-by-20 matrix
dimM <- dim;
dimM[1] <- n*dimM[1];
bMatrix <- aperm(bArray, perm=c(1,3,2));
dim(bMatrix) <- dimM;

You owe me a beer ;)

/Henrik

> On this example, the combo takes roughly 20s on my machine...
> but on the data I'm working with, it takes more than 1 minute... And
> given that I need to repeat the task several times, the cumul. amount
> of time is significant for me.
>
> Thank you for any suggestion/comment,
>
> benilton
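
The two suggestions in the reply (extracting with .subset2() and replacing rbind() with unlist()/dim()/aperm()) can be checked against the original do.call+rbind+lapply combo on a small list. This is only a sketch under stated assumptions: the list has 200 elements instead of 1.5e6 (an arbitrary size picked so it runs quickly), the names viaRbind and d are introduced here for illustration, and the reshape is only valid when every extracted matrix has the same dimensions, so that is checked first. It makes no claim about timings.

```r
# Same generator as in the original post, at a much smaller scale
# (200 elements is an assumption made here for a quick check).
f <- function(i) {
  out <- replicate(5, list(matrix(rnorm(80), ncol = 20)))
  names(out) <- letters[1:5]
  out
}
set.seed(1)
lst <- lapply(1:200, f)

# Baseline: the do.call + rbind + lapply combo from the question.
viaRbind <- do.call(rbind, lapply(lst, "[[", "b"))

# Extraction without method dispatch, as suggested in the reply.
bList <- lapply(lst, .subset2, "b")

# The reshape below assumes all extracted matrices share one dimension.
d <- dim(bList[[1]])
stopifnot(all(vapply(bList, function(m) identical(dim(m), d), logical(1))))

# Stack the matrices along a third dimension, then move that dimension
# next to the rows so that collapsing it reproduces the rbind() layout.
n <- length(bList)
bArray <- unlist(bList, use.names = FALSE)
dim(bArray) <- c(d, n)                       # 4 x 20 x n
bMatrix <- aperm(bArray, perm = c(1, 3, 2))  # 4 x n x 20
dim(bMatrix) <- c(d[1] * n, d[2])            # (4 * n) x 20

# Compare the two results element by element and attribute by attribute.
print(identical(viaRbind, bMatrix))
```

Wrapping each variant in system.time() on the full-size list would show whether the difference matters for a given data set.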