thr3ads.net - R devel - [Rd] (PR#11064) how to reproduce... [Jun 2008]

simon.debernard at altrabio.com

2008-Jun-13 16:45 UTC

[Rd] (PR#11064) how to reproduce...

You can try this:

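# Test data: a 100000 x 2 matrix and a grouping vector with 10000 groups of 10 rows
data <- cbind("a"=sample(1:100000), "b"=sample(1:100000))
fact <- sample(rep(1:10000, each=10))
# Baseline timing with the stock by() method
system.time(std <- by(data, fact, colSums))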
by.matrix <- function (data, INDICES, FUN, ...) {
     if (!is.list(INDICES)) {
         IND <- vector("list", 1)
         IND[[1]] <- INDICES
         names(IND) <- deparse(substitute(INDICES))[1]
     }
     else IND <- INDICES
     FUNx <- function(x) FUN(data[x, , drop = FALSE], ...)
     nd <- nrow(data)
     ans <- eval(substitute(tapply(1:nd, IND, FUNx)), as.data.frame(data))
     attr(ans, "call") <- match.call()
     class(ans) <- "by"
     ans
}
# Timing with the by.matrix defined above, then compare the two results
system.time(mod <- by(data, fact, colSums))
all.equal(std, mod)

I get a 30x speedup.
(I'm not sure why the attributes differ, but I'm sure this can be fixed...)
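
To check that only the attributes differ and not the computed values, something like the following sketch may help. It uses smaller data than above so it runs quickly, repeats the by.matrix definition so it is self-contained, and relies on the check.attributes argument of all.equal. The guess that the "call" attribute and the dimnames names are the ones to look at is an assumption, not something confirmed in this thread.

```r
# Smaller data than in the post, so this runs in a moment.
set.seed(1)
data <- cbind(a = sample(1:1000), b = sample(1:1000))
fact <- sample(rep(1:100, each = 10))

# Result from the stock method.
std <- by(data, fact, colSums)

# The replacement method from the post, repeated here unchanged.
by.matrix <- function (data, INDICES, FUN, ...) {
    if (!is.list(INDICES)) {
        IND <- vector("list", 1)
        IND[[1]] <- INDICES
        names(IND) <- deparse(substitute(INDICES))[1]
    }
    else IND <- INDICES
    FUNx <- function(x) FUN(data[x, , drop = FALSE], ...)
    nd <- nrow(data)
    ans <- eval(substitute(tapply(1:nd, IND, FUNx)), as.data.frame(data))
    attr(ans, "call") <- match.call()
    class(ans) <- "by"
    ans
}
mod <- by(data, fact, colSums)

# Compare the values only, ignoring all attributes.
values_match <- all.equal(std, mod, check.attributes = FALSE)
print(values_match)

# Print the attributes assumed most likely to differ, side by side.
print(attr(std, "call"))
print(attr(mod, "call"))
print(names(dimnames(std)))
print(names(dimnames(mod)))
```

If the first comparison reports no differences, the remaining mismatch is cosmetic and could be addressed by setting those attributes explicitly in by.matrix.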
