Chee Chen
2013-Sep-29 17:22 UTC
[R] Help: concurrent R sessions for different settings of simulations
Dear All,

I have spent almost 2 days on this but have not succeeded yet.

Problem description: I have 3 parameters, p1, p2 and p3, where p1 takes 1 of 5 possible distributions (e.g., normal, Laplace), p2 takes 1 of 3 possible distributions, and p3 takes 1 of 5 possible distributions. These 3 parameters create 75 settings, and they are arguments of a function F, which is part of the simulation code. To summarize: a different value of the ordered triple (p1,p2,p3) means a different setting, and this is the only difference in the simulation code.

Target to achieve: instead of looping through each of the 75 settings one after another, I would like to run all 75 settings concurrently on the cluster.

My attempts: via loops, I used Perl to create 75 files, one for each triple (p1,p2,p3), and Perl uses "system(R ..)" to execute each setting once its file is created. The Perl code is submitted to the cluster correctly. But when I looked into the log file, the cluster still executes one setting after another.

Request: any help is appreciated! Is it because the Perl loop executes a setting once it is created?

Have a nice day!
Chee
Ted Harding
2013-Sep-29 18:31 UTC
[R] Help: concurrent R sessions for different settings of simulations
Just a simple comment (which does not consider the technicalities of using Perl, using a cluster, etc.).

From your description, it looks as though the system waits for one item in the loop to finish before it starts the next one. If that is the case, and *if* you are using UNIX/Linux (or another UNIX-like OS), then you could try appending " &" to each submitted command. An outline exemplar:

  for( s in settings ){
    system("R <something depending on s> &")
  }

The " &" has the effect, in a UNIX command line, of detaching the command from the executing program. So the program can continue to run (and take as long as it likes) while the system command-shell is immediately freed up for the next command.
Therefore, with the above exemplar, if there were, say, 75 settings, then that loop would complete in a very short time, after which you would have 75 copies of R executing simulations, and your original R command line would be available.

Just a suggestion (which may have missed the essential point of your query, but worth a try ...). I have no idea how to achieve a similar effect in Windows ...

Ted.
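As a rough sketch of the background-launch idea above, the 75 commands can be built from a grid of settings and started without waiting. Several things here are assumptions and not from the thread: the script name sim.R (imagined to read its three arguments via commandArgs(trailingOnly = TRUE)), the log file names, and every distribution label other than "normal" and "laplace". R's system() also accepts wait = FALSE, which has much the same effect as appending " &" on a UNIX-like system.

```r
# Sketch only. "sim.R" is a hypothetical script that reads three distribution
# names from commandArgs(trailingOnly = TRUE). Only "normal" and "laplace"
# appear in the thread; the other labels are placeholders.
settings <- expand.grid(
  p1 = c("normal", "laplace", "dist3", "dist4", "dist5"),
  p2 = c("distA", "distB", "distC"),
  p3 = c("normal", "laplace", "dist3", "dist4", "dist5"),
  stringsAsFactors = FALSE
)

# One shell command per setting, each writing to its own (hypothetical) log.
# The redirection syntax assumes a UNIX-like shell.
cmds <- sprintf(
  "Rscript sim.R %s %s %s > log_%02d.txt 2>&1",
  settings$p1, settings$p2, settings$p3, seq_len(nrow(settings))
)

launch <- FALSE  # flip to TRUE to actually start the processes
if (launch) {
  for (cmd in cmds) {
    # wait = FALSE returns immediately instead of blocking until cmd finishes
    system(cmd, wait = FALSE)
  }
}
```

Note that this starts all the processes on the machine where the loop runs; whether they end up spread across cluster nodes still depends on how the cluster's scheduler is used.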
Henrik Bengtsson
2013-Sep-29 18:51 UTC
[R] Help: concurrent R sessions for different settings of simulations
I strongly suggest using the BatchJobs package [http://cran.r-project.org/web/packages/BatchJobs] for this. It is easy to install, cross-platform, and does not rely on external software such as Perl. It allows you to develop your script running sequentially/interactively on your local machine/laptop; then, via *a single configuration file* (./.BatchJobs.R), you can use the exact same script to distribute the jobs to separate R sessions, either on multiple cores on the same machine or on a cluster (most common cluster types are supported). The learning curve is not that steep - as with most parallel computations, you have to move away from using for loops to using lapply(), and then you're almost done.

/Henrik
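A minimal sketch of the for-loop-to-lapply() step, followed by how the same mapping might be handed to BatchJobs. The function run_setting is a placeholder standing in for the real simulation function F, the distribution labels other than "normal" and "laplace" are invented, and the registry id "sim75" is arbitrary. The BatchJobs part is wrapped in a function that is not called here, since it needs the package installed and, for a cluster, a suitable .BatchJobs.R configuration.

```r
# Placeholder for the real simulation function F from the thread.
run_setting <- function(p1, p2, p3) {
  paste(p1, p2, p3, sep = "/")
}

# Only "normal" and "laplace" appear in the thread; the rest are placeholders.
settings <- expand.grid(
  p1 = c("normal", "laplace", "dist3", "dist4", "dist5"),
  p2 = c("distA", "distB", "distC"),
  p3 = c("normal", "laplace", "dist3", "dist4", "dist5"),
  stringsAsFactors = FALSE
)

# Loop-free form: one independent call per row of the grid.
results <- lapply(seq_len(nrow(settings)), function(i) {
  run_setting(settings$p1[i], settings$p2[i], settings$p3[i])
})

# The same mapping via BatchJobs (defined but not run here).
run_with_batchjobs <- function(settings) {
  library(BatchJobs)
  # On a real cluster, file.dir would need to be on a shared file system;
  # tempfile() is used here only to keep the sketch self-contained.
  reg <- makeRegistry(id = "sim75", file.dir = tempfile("sim75_"))
  batchMap(reg, run_setting, settings$p1, settings$p2, settings$p3)
  submitJobs(reg)
  waitForJobs(reg)
  loadResults(reg)
}
```

Calling run_with_batchjobs(settings) should give one result per setting; where the jobs actually execute is then decided by the configuration file, not by the script.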