Henrik Bengtsson
2016-Oct-05 18:28 UTC
[Rd] parallel: Memory improvement to PSOCK clusters (PATCH)
I would like to draw attention to a very simple patch to parallel:::slaveLoop(), which I've already submitted as https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17115.

The patch lowers the memory overhead for anyone using parallel::makeCluster(). It makes sure that the workers remove their results / values as soon as they've been transferred back to the master process. They also remove any incoming objects / values as soon as possible.

For instance, if a PSOCK worker produces 1 GiB objects in each iteration, it currently holds on to the old result while working on the new one, resulting in an unnecessary 1 GiB memory overhead. This patch avoids this.

Index: src/library/parallel/R/worker.R
===================================================================
--- src/library/parallel/R/worker.R (revision 70874)
+++ src/library/parallel/R/worker.R (working copy)
@@ -44,7 +44,9 @@
                 t2 <- proc.time()
                 value <- list(type = "VALUE", value = value, success = success,
                               time = t2 - t1, tag = msg$data$tag)
+                rm(list = "msg")
                 sendData(master, value)
+                rm(list = "value")
             }
         }, interrupt = function(e) NULL)
 }

Thanks,

Henrik