Hi all,

I'm working on a large simulation and I'm using the doParallel package to parallelize my work. I have 20 cores on my machine and would like to preserve some for day-to-day activities - word processing, sending emails, etc.

I started by holding back 1 core, and it was clear that *everything* was so slow as to be nearly unusable.

Any suggestions on how many cores to hold back (i.e., not put to work on the parallel process)?

Thanks,
Leslie
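[Editor's note: a minimal sketch of the kind of setup Leslie describes; simulate_one() is a hypothetical stand-in for the actual simulation, and the worker count shown is the "all but one core" choice that proved too aggressive.]

```r
# Minimal doParallel setup: register N workers, run iterations with %dopar%.
library(doParallel)

n_workers <- parallel::detectCores() - 1   # 19 of 20 -- this left the machine unusable
cl <- makeCluster(n_workers)
registerDoParallel(cl)

simulate_one <- function(i) mean(rnorm(1e5))  # placeholder workload

results <- foreach(i = 1:1000, .combine = c) %dopar% simulate_one(i)

stopCluster(cl)  # always release the workers when done
```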
Hi Leslie and all,

You may want to investigate using sparklyr in a cloud environment such as AWS, where more packages are designed for cluster computing environments and you have finer control over these kinds of parallel operations.

V/r,
Tom W.
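[Editor's note: a rough sketch of what Tom's suggestion looks like in practice; it assumes the sparklyr and dplyr packages plus a local Spark installation, and uses the built-in mtcars data purely for illustration. On AWS the master argument would point at the cluster instead of "local".]

```r
# Sketch: run an aggregation through a Spark session via sparklyr.
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")  # swap in your cluster master on AWS
mtcars_tbl <- copy_to(sc, mtcars)      # push an example data frame to Spark

mtcars_tbl %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg)) %>%   # computed by Spark, not by R
  collect()                            # bring the small result back into R

spark_disconnect(sc)
```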
Do you have 20 actual cores, or 10 cores / 20 threads? detectCores() usually doesn't know the difference, and the CPU may be too busy accessing memory for that last thread to get any useful work done. I often find that allocating physical cores is more practical than thinking in terms of threads, so try allocating 9 workers and watch your CPU usage. Experiment with your settings: the right balance depends on your other activities as well as your hardware, since your analysis may not be completely memory-access limited, and extra threads might make (some) sense for you.
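[Editor's note: detectCores() can in fact be asked to distinguish the two via its logical argument, though on some platforms the physical-core count comes back NA. A small sketch of choosing a worker count along the lines suggested above:]

```r
# Physical cores vs. hardware threads; fall back to halving the thread
# count when the physical count is unavailable (NA on some platforms).
library(parallel)

threads <- detectCores(logical = TRUE)   # hardware threads (the default)
cores   <- detectCores(logical = FALSE)  # physical cores, or NA

n_workers <- if (!is.na(cores)) {
  max(1, cores - 1)            # leave one physical core free
} else {
  max(1, threads %/% 2 - 1)    # assume 2 threads per core
}
n_workers
```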
Dear Leslie,

You can also use the core parallel package; see ?parallel::makeCluster and ?parallel::parSapply. Here I run the function FUN on the vector 1:3 over three workers. FUN needs otherFun, so export it to the cluster, and remember to stop the cluster at the end:

    cl <- parallel::makeCluster(3)
    FUN <- function(x) {
        return(otherFun(x^2))
    }
    otherFun <- function(x) {
        return(x + 1)
    }
    parallel::clusterExport(cl, "otherFun")
    parallel::parSapply(cl = cl, X = 1:3, FUN = FUN)
    parallel::stopCluster(cl)

In your case you could run e.g. 15 cores or so: parallel::makeCluster(15).

Best,
Rasmus
The big question is whether each worker uses parallel processing itself, or whether it contends for shared resources such as the processor cache, in which case 20 threads fighting over the cache would slow you down substantially. If your simulations use operations implemented in BLAS or LAPACK, be aware that some R installations use a custom fast BLAS that can itself use multiple cores and the processor cache; sessionInfo() shows some of this. The other issue is memory usage: if you exhaust your physical RAM, your computer will slow down not so much because of CPU load but because of memory management (swapping to and from disk). I would do some smaller experimental runs that take just a minute or two to finish with, say, 4, 8, and 12 workers and see how fast these go - you may find little or no speed-up past 8, or perhaps even past 4-6, workers.

HTH,
Peter
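[Editor's note: the timing experiment suggested above might be sketched as follows; simulate_once() is a hypothetical placeholder for one replication of the real simulation, and the replication count should be scaled so each run takes a minute or two.]

```r
# Time a short version of the workload at several worker counts to find
# the point of diminishing returns.
library(doParallel)

simulate_once <- function(i) mean(rnorm(1e5))  # placeholder workload

for (w in c(4, 8, 12)) {
  cl <- makeCluster(w)
  registerDoParallel(cl)
  elapsed <- system.time(
    foreach(i = 1:200, .combine = c) %dopar% simulate_once(i)
  )["elapsed"]
  stopCluster(cl)
  cat(sprintf("%2d workers: %.2f s elapsed\n", w, elapsed))
}
```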