> From: Joshua Bradley <jgbradley1 at gmail.com>
>
> I have been having issues using parallel::mclapply in a memory-efficient
> way and would like some guidance. I am using a 40 core machine with 96 GB
> of RAM. I've tried to run mclapply with 20, 30, and 40 mc.cores and it
> has practically brought the machine to a standstill each time to the
> point
> where I do a hard reset.

When mclapply forks to start a new process, the memory is initially
shared with the parent process. However, a memory page has to be
copied whenever either process writes to it. Unfortunately, R's
garbage collector writes to each object to mark and unmark it whenever
a full garbage collection is done, so it's quite possible that every R
object will be duplicated in each process, even though many of them
are not actually changed (from the point of view of the R programs).
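
To put rough numbers on that worst case, here is a minimal sketch in R.
It assumes that every worker ends up with a full private copy of the
parent's heap, which is the pessimistic bound described above, not a
measurement. The 96 GB figure is from the original post; the 4 GB parent
footprint and the helper names worst_case_gb and max_safe_workers are
hypothetical, chosen only for illustration.

```r
library(parallel)

# Worst-case estimate: the parent keeps its heap and every forked worker
# ends up with its own full copy after garbage collection touches each object.
worst_case_gb <- function(parent_gb, n_workers) {
  parent_gb * (1 + n_workers)
}

# Largest worker count whose worst case still fits in RAM (at least 1).
max_safe_workers <- function(ram_gb, parent_gb) {
  max(1L, as.integer(floor(ram_gb / parent_gb)) - 1L)
}

ram_gb    <- 96  # from the original post
parent_gb <- 4   # hypothetical footprint of the parent R session

worst_case_gb(parent_gb, 40)                      # 164, well over 96
n_workers <- max_safe_workers(ram_gb, parent_gb)  # 23

# Small demonstration: workers read slices of a shared vector and
# return only a scalar each, so the results sent back stay tiny.
big    <- rnorm(1e6)
chunks <- splitIndices(length(big), 4)
sums   <- mclapply(chunks, function(idx) sum(big[idx]),
                   mc.cores = min(n_workers, 2L))
total  <- sum(unlist(sums))
```

Under these assumptions, 40 workers could need about 164 GB, which
would explain heavy swapping on a 96 GB machine. Capping mc.cores this
way does not stop the copying; it only keeps the worst case within
physical memory. Note that mclapply relies on fork, so it runs in
parallel only on Unix-like systems.
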

One thing on my near-term to-do list for pqR is to re-implement R's
garbage collector in a way that will avoid this (as well as having
various other advantages, including less memory overhead per object).

Radford Neal