Benoit Thieurmel
2015-Jul-27 13:16 UTC
[Rd] parallel performance inline code vs using function ?
Hi,

I am trying to understand why, with the parallel package, the same code seems to be slower when it runs inside a function. For example:

# data
don <- lapply(1:150, function(x) {
  data.frame(a = rnorm(100000), b = rnorm(100000))
})

# inline test
t0 <- Sys.time()
require(parallel)
cl <- makeCluster(4)
res <- parLapplyLB(cl, don, function(x){1})
stopCluster(cl)
Sys.time() - t0
# 3.5 sec, each worker process up to 90 MB

# using a function
parF <- function(data){
  require(parallel)
  cl <- makeCluster(4)
  result <- parLapply(cl, data, function(x){1})
  stopCluster(cl)
  result  # return the computed list
}

system.time(res2 <- parF(don))
# 9.5 sec, each worker process up to 320 MB ...!

It seems that, when the code runs inside a function:
- it is about 3x slower (9.5 sec vs 3.5 sec);
- more data is loaded into each worker process (320 MB vs 90 MB).

Thanks.

-- 
Benoit Thieurmel