peter_petersen
2011-Jul-12 11:31 UTC
[R] MC-Simulation with foreach: Some cores finish early
Dear R-Users,

I am running an MC simulation using the packages "foreach" and "doMC" on a PowerMac with 24 cores. There are roughly a hundred parameter sets, and I parallelized the program so that each core computes one of these parameter sets completely.

The problem is that some parameter sets take a lot longer to compute than others. After a while, only a quarter of the cores are still computing (their first parameter set), while the others have already finished. Yet some parameter sets are still untouched.

I have thought about changing my parameter file so that every combination takes roughly the same time (longer computations are offset by fewer repetitions), but maybe there is a more elegant solution.

"Is it somehow possible to wake the finished cores while there is still work to do?" ;-)

Sincerely,
H. Bumann

--
View this message in context: http://r.789695.n4.nabble.com/MC-Simulation-with-foreach-Some-cores-finish-early-tp3661998p3661998.html
Sent from the R help mailing list archive at Nabble.com.
Markus Schmidberger
2011-Jul-12 12:03 UTC
[R] MC-Simulation with foreach: Some cores finish early
If you switch directly to the multicore package, you can use the mclapply() function. There, look at the parameter mc.preschedule (TRUE / FALSE); you can use it to improve the load balancing. I do not know of a corresponding parameter for tuning foreach.

Best,
Markus
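A minimal sketch of the mc.preschedule suggestion, under stated assumptions: it uses the "parallel" package that ships with current R and provides mclapply() (the multicore package named above is its predecessor), and run_param_set() and param_sets are hypothetical stand-ins for the real simulation and parameter file. With mc.preschedule = TRUE the input is divided into at most as many jobs as there are cores before anything starts; with FALSE, one process is forked per element.

```r
library(parallel)  # ships with R; assumed here in place of the older multicore package

# Hypothetical stand-in for one Monte Carlo run; its cost grows with `reps`.
run_param_set <- function(p) {
  set.seed(p$id)                      # fixed seed per parameter set, for reproducibility
  mean(rnorm(p$reps, mean = p$mu))
}

# Hypothetical parameter sets with deliberately uneven workloads.
param_sets <- lapply(1:20, function(i) {
  list(id = i, mu = i, reps = if (i %% 5 == 0) 2e5 else 2e3)
})

# mc.preschedule = FALSE: one forked process per parameter set, with at most
# mc.cores of them running at a time, instead of a fixed split made up front.
n_cores <- 2
results <- mclapply(param_sets, run_param_set,
                    mc.cores = n_cores, mc.preschedule = FALSE)
```

Forking one process per element adds overhead, so this tends to pay off when individual parameter sets are long-running, as described in the question. For foreach with doMC, the doMC documentation describes a `.options.multicore = list(preschedule = FALSE)` argument to foreach(); that route is not exercised in this sketch.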
peter_petersen <henning.bumann <at> gmail.com> writes:

> I run a MC-Simulation using the packages "foreach" and "doMC" on a
> PowerMac with 24 cores. There are roughly a hundred parametersets and I
> parallelized the program in a way, that each core computes one of these
> parametersets completely.
>
> The problem is, that some parametersets take a lot longer to compute than
> others. After a while there are only a quarter of the cores still computing
> (their first parameterset), while others are already finished. But some
> parametersets are still untouched.
>
> I have thought about changing my parameterfile in a way, that every
> combination takes roughly the same time (longer computations are offset with
> less repetitions), but maybe there is a more elegant solution.

It sounds to me like this would require writing an entire batch scheduling system within R, i.e., the system would have to maintain a queue and track which cores have finished. I'd love to know if someone has written it, but I sort of doubt it ...
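As a hedged illustration of the queue idea: the "parallel" package that ships with current R has clusterApplyLB(), a load-balancing apply that gives the next task to whichever worker finishes first. The sketch below assumes a small local cluster; simulate() and task_sizes are hypothetical, and submitting the largest tasks first is a common heuristic, not something proposed in this thread.

```r
library(parallel)

# Hypothetical workload: the cost of a task is proportional to `reps`.
simulate <- function(reps) {
  set.seed(reps)                      # deterministic per task size
  mean(runif(reps))
}

# Hypothetical, deliberately uneven task sizes.
task_sizes <- c(4e5, 1e3, 1e3, 3e5, 1e3, 1e3, 1e3, 2e5)

# Heuristic (an assumption of this sketch): queue the expensive tasks first
# so that the cheap ones can fill the gaps at the end.
submit_order <- order(task_sizes, decreasing = TRUE)

cl <- makeCluster(2)                  # two local worker processes
balanced <- tryCatch(
  # Each worker receives a new task as soon as it returns the previous one.
  clusterApplyLB(cl, task_sizes[submit_order], simulate),
  finally = stopCluster(cl)           # always shut the workers down
)

# Restore the original task order.
estimates <- numeric(length(task_sizes))
estimates[submit_order] <- unlist(balanced)
```

The workers here are separate R processes rather than forks, so anything the task function needs must be passed as an argument or exported to the cluster first.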