rhelp.20.trevva at spamgourmet.com
2007-Mar-06 15:33 UTC
[R] How to utilise dual cores and multi-processors on WinXP
Hello,

I have a question that I was wondering if anyone had a fairly straightforward answer to: what is the quickest and easiest way to take advantage of the extra cores / processors that are now commonplace on modern machines? And how do I do that in Windows?

I realise that this is a complex question that is not answered easily, so let me refine it some more. The type of scripts that I'm dealing with are well suited to parallelisation - often they involve mapping out parameter space by changing a single parameter, re-running the simulation 10 (or n) times, and then bringing all the results back together at the end for analysis. If I can distribute the runs over all the processors available in my machine, I'm going to roughly halve the run time. The question is, how to do this?

I've looked at many of the packages in this area: Rmpi, snow, snowFT, rpvm, and taskPR - these all seem to have the functionality that I want, but don't exist for Windows. The best solution is to switch to Linux, but unfortunately that's not an option.

Another option is to divide the task in half from the beginning, spawn two "slave" instances of R (e.g. via Rcmd), let them run, and then collate the results at the end. But how exactly would I do this, and how would I know when they're done? (One rough way is sketched just below.)

Can anyone recommend a nice solution? I'm sure that I'm not the only one who'd love to double their computational speed...

Cheers,

Mark
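One rough sketch of the divide-and-spawn route Mark asks about. Everything here is assumed rather than taken from the thread: R's bin directory is on the PATH, and worker1.R / worker2.R are hypothetical scripts that each run half of the parameter values and, as their final step, save() their results as res1.RData / res2.RData. Completion is detected crudely by polling for those output files.

    ## Launch two non-interactive "slave" R sessions in the background;
    ## wait = FALSE returns control to this session immediately.
    system("Rcmd BATCH worker1.R", wait = FALSE)
    system("Rcmd BATCH worker2.R", wait = FALSE)

    ## Crude completion check: poll until both result files exist
    ## (each hypothetical worker writes its .RData file as its last step).
    while (!all(file.exists(c("res1.RData", "res2.RData"))))
        Sys.sleep(5)

    ## Collate: load both workers' results into one environment.
    e <- new.env()
    load("res1.RData", envir = e)
    load("res2.RData", envir = e)
    ls(e)  # the combined result objects, ready for analysis

Writing the result file last makes the file's existence a workable (if imperfect) completion signal; a more robust variant has each worker write a small sentinel file only after its save() has finished.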
Greg Snow
2007-Mar-06 17:19 UTC
[R] How to utilise dual cores and multi-processors on WinXP
The nws package does run on Windows and can split calculations between multiple R processes. I have not tried it on a single multiprocessor PC (I don't have one), but I have used it with multiple PCs. It looks like a multiprocessor PC would work pretty much with the defaults.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
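A minimal sketch of the nws route Greg describes, assuming the nws package and its NetWorkSpaces server are installed and the local defaults work on the machine; the toy simulation function and parameter grid are hypothetical stand-ins, not from the thread.

    library(nws)

    s <- sleigh(workerCount = 2)      # two local R worker processes

    params <- seq(0.1, 1, by = 0.1)   # parameter values to map out

    ## eachElem() applies the function to each element of 'params',
    ## spreading the calls across the sleigh's workers.
    results <- eachElem(s, function(p) mean(rnorm(1e5, mean = p)),
                        list(params))

    stopSleigh(s)                     # shut the workers down
    unlist(results)                   # collate for analysis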
Martin Morgan
2007-Mar-06 18:07 UTC
[R] How to utilise dual cores and multi-processors on WinXP
rhelp.20.trevva at spamgourmet.com writes:

> I've looked at many of the packages in this area: Rmpi, snow, snowFT,
> rpvm, and taskPR - these all seem to have the functionality that I
> want, but don't exist for Windows. The best solution is to switch to
> Linux, but unfortunately that's not an option.

Rmpi runs on Windows (see http://www.stats.uwo.ca/faculty/yu/Rmpi/). You'll end up modifying your code, probably using one of the many parLapply-like functions (from Rmpi; comparable functions are in snow and the papply package) to do an 'lapply' spread over the different compute processors; one such call is sketched after this message. This is likely to require some thought: for instance, data transmission costs can overwhelm any speedup, and the FUN argument to the lapply-like functions should probably reference only local variables. The classic first attempt performs the equivalent of 1000 bootstraps on each node, rather than dividing the 1000 replicates amongst the nodes (which is actually quite hard to do).

In principle I think you might also be able to use a parallelized LAPACK, following the general instructions in the R Installation and Administration guide. I have not done this; it would likely be a challenge, and would (perhaps) benefit only code that uses the LAPACK linear algebra routines.

> Another option is to divide the task in half from the beginning, spawn
> two "slave" instances of R (e.g. via Rcmd), let them run, and then
> collate the results at the end.

The Bioconductor package Biobase has a function Aggregate that might be fun to explore; I don't think it receives much use.

--
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org
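A minimal sketch of the parLapply-like route Martin describes, using Rmpi. It assumes a working MPI installation on the Windows machine (see the URL above for instructions); the simulate() function and parameter grid are hypothetical stand-ins, not from the thread.

    library(Rmpi)

    mpi.spawn.Rslaves(nslaves = 2)    # one slave R process per core

    params   <- seq(0.1, 1, by = 0.1)                   # values to map out
    simulate <- function(p) mean(rnorm(1e5, mean = p))  # toy stand-in

    ## mpi.parLapply() divides 'params' among the slaves and applies
    ## simulate() to each value. simulate() uses only its argument, so
    ## little data has to be shipped from the master to the slaves.
    res <- mpi.parLapply(params, simulate)

    mpi.close.Rslaves()
    unlist(res)                       # collate for analysis

Martin's caveat applies here: if simulate() needed a large dataset from the master, the cost of shipping it to each slave could cancel out the two-core speedup.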