soren wilkening
2012-Jun-25 10:11 UTC
[R] using multiple cpu's - scaling in processing power
Hi All,

In the past I have worked with parallel processing in R where a function F is applied to the elements of a list L. The more CPU cores one has, the faster the process will run. At the time of launching the process for (F,L) I have a certain fixed number of CPUs that I can use. I have tested this approach and it works fine (i.e. package 'multicore', using 'mapply').

But now I am encountering a slightly different situation.

I have a task (F,L) that will run for a *long* time (a week) even if I have N CPUs processing it. N is the maximum possible number of CPUs that I can use; however, they will not all be available when I start the process. So the problem is that, when I start the process, I may have only n1 < N CPUs at my disposal. After some time, I then have n2 CPUs at my disposal, with n1 < n2 < N. After some more time, I have n3 CPUs, with n2 < n3 < N, and finally, at some point, I will have all N CPUs to work with. I "scale in" CPU power over the duration of the process. Why this is the case does not matter. Essentially, I cannot control when new CPUs become available, nor how many of them will become available at that point.

Because of this I cannot use the standard approach above, where all the CPU cores have to be available before I launch the process!

It would help me if someone knew whether R offers a solution for this type of processing. But I would also be happy for pointers to non-R resources that could deal with this.

Thanks

Soren

-----
http://censix.com
--
View this message in context: http://r.789695.n4.nabble.com/using-multiple-cpu-s-scaling-in-processing-power-tp4634405.html
Sent from the R help mailing list archive at Nabble.com.
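For reference, the fixed-core approach described in this post can be sketched roughly as follows. This is only an illustration, not the poster's actual code: `F_task` and `L_tasks` are hypothetical stand-ins for F and L, and the sketch uses `mclapply()` from the 'parallel' package (which absorbed the 'multicore' functionality) rather than the call named in the post.

```r
library(parallel)  # ships with R >= 2.14

# Hypothetical stand-ins for the poster's F and L.
F_task  <- function(x) x^2
L_tasks <- as.list(1:8)

# The number of workers is fixed here, at launch time.
# Forking is unavailable on Windows, so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Apply F_task to every element of L_tasks, spread over n_cores processes.
results <- mclapply(L_tasks, F_task, mc.cores = n_cores)
```

The point to notice is that `mc.cores` is chosen once, before the work starts, which is exactly the limitation the post is about.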
Hi Soren,

Have you looked into using the Condor scheduler (http://research.cs.wisc.edu/condor/)? I'm not aware of it linking up with multicore or other parallel processing code inside R, but I've used it to run multiple R processes on a variable number of processors, where N can both increase and decrease over time.

hth

--
Drew Tyre
School of Natural Resources
University of Nebraska-Lincoln
416 Hardin Hall, East Campus
3310 Holdrege Street
Lincoln, NE 68583-0974
phone: +1 402 472 4054
fax: +1 402 472 2946
email: atyre2 at unl.edu
http://snr.unl.edu/tyre
http://aminpractice.blogspot.com
http://www.flickr.com/photos/atiretoo
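One way to picture the "multiple independent R processes" pattern is a shared task queue that any worker can join at any time. The sketch below is a minimal illustration under stated assumptions, not Condor's interface and not a tested production setup: `F_task`, `L_tasks`, `run_worker()` and the lock-directory scheme are all hypothetical, and the code assumes every worker can see the same directory and that creating a directory there is atomic (usually true on a local filesystem, less certain on network filesystems).

```r
# Hypothetical stand-ins for the poster's F and L.
F_task  <- function(x) x^2
L_tasks <- as.list(1:8)

# Directory shared by all workers (assumed); a fresh temp dir for this demo.
queue_dir <- tempfile("queue")
dir.create(queue_dir)

# A worker walks the task list and claims a task by creating a lock directory.
# dir.create() returns FALSE if the directory already exists, so each task is
# claimed by at most one worker. max_tasks only exists to simulate a worker
# that stops early in this single-process demo.
run_worker <- function(queue_dir, tasks, f, max_tasks = Inf) {
  done <- integer(0)
  for (i in seq_along(tasks)) {
    if (length(done) >= max_tasks) break
    lock <- file.path(queue_dir, sprintf("task_%05d.lock", i))
    if (!dir.create(lock, showWarnings = FALSE)) next  # claimed elsewhere
    saveRDS(f(tasks[[i]]), file.path(queue_dir, sprintf("task_%05d.rds", i)))
    done <- c(done, i)
  }
  done
}

# The first worker handles three tasks; a worker that joins later picks up
# whatever is still unclaimed.
first  <- run_worker(queue_dir, L_tasks, F_task, max_tasks = 3)
second <- run_worker(queue_dir, L_tasks, F_task)

# Collect the results once every task has been written.
results <- lapply(seq_along(L_tasks), function(i) {
  readRDS(file.path(queue_dir, sprintf("task_%05d.rds", i)))
})
```

In a real run each `run_worker()` call would live in its own R process, started whenever a processor becomes free (for example by a scheduler), so the number of workers never has to be known up front. A worker that crashes mid-task would leave a lock with no result file, so a real setup would need some form of timeout or cleanup.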