I started to look at ways to improve the times of certain highly parallel tasks and thought that foreach should be a valid candidate for the job. So, I opened the foreach tutorial by Steve Weston and started timing examples from it. The first example from the tutorial is

> system.time(for(i in 1:100000) sqrt(i))
   user  system elapsed
   0.06    0.00    0.06
> system.time(foreach(i=1:100000) %do% sqrt(i))
   user  system elapsed
 102.37    0.21  103.38

Hmm, 1700 times slower?

The second example is

> system.time(x <- exp(1:1000000))
   user  system elapsed
   0.34    0.03    0.42
> system.time(x <- foreach(i=1:1000000, .combine='c') %do% exp(i))

I stopped it at 958 seconds, didn't have enough patience -- it basically seems that foreach slows this naive example down by more than 2000 times. I must be doing something very wrong. Am I supposed to set some environment variables before it works properly? I am running 64-bit R on a Win7 laptop with dual-core 2.27 GHz CPUs and 4 GB of memory.
You're probably being killed by the overhead of parallelization, which is, in this case, far more than the actual computation time. I've not dug through foreach() in a while, but I think this winds up spawning many, many subprocesses, which isn't cheap on Windows.

MW

On Mon, Jan 21, 2013 at 3:59 PM, Andre Zege <azege at yahoo.com> wrote:
> I started to look at ways to improve the times of certain highly parallel tasks and thought that foreach should be a valid candidate for the job.
> [...]
> Hmm, 1700 times slower?
> [...]
> I stopped it at 958 seconds, didn't have enough patience -- it basically seems that foreach slows this naive example down by more than 2000 times. I must be doing something very wrong.
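If per-task overhead is the problem, the standard remedy is to hand foreach a few large chunks rather than one task per element. A minimal sketch, assuming the itertools package is available (its isplitVector() helper splits a vector into roughly equal pieces):

    library(foreach)
    library(itertools)   # for isplitVector()

    ## Four tasks, each a vectorized exp() over a quarter of the input,
    ## instead of one million tasks that each compute a single exp().
    system.time(
      x <- foreach(chunk = isplitVector(1:1000000, chunks = 4),
                   .combine = "c") %do% exp(chunk)
    )

Chunking keeps the fixed cost of setting up each foreach task negligible next to the useful work the task does.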
Hi,

On Mon, Jan 21, 2013 at 10:59 AM, Andre Zege <azege at yahoo.com> wrote:
> I started to look at ways to improve the times of certain highly parallel tasks and thought that foreach should be a valid candidate for the job.
> [...]
> I must be doing something very wrong. Am I supposed to set some environment variables before it works properly?

You should keep reading that vignette you are working from :-)

From Section 5, "Parallel Execution":

"""
... But for the kinds of quick running operations that we've been
doing, there wouldn't be much point to executing them in parallel.
Running many tiny tasks in parallel will usually take more time to
execute than running them sequentially, and if it already runs fast,
there's no motivation to make it run faster anyway. But if the
operation that we're executing in parallel takes a minute or longer,
there starts to be some motivation.
"""

The task you are parallelizing is too trivial. The time spent coordinating the data splitting, forking, etc. is more than the time spent running sqrt. When the task you run within each iteration is more substantial, the benefit of parallelization becomes clear.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  | Memorial Sloan-Kettering Cancer Center
  | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
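To make that concrete, here is a minimal sketch of an iteration heavy enough for %dopar% to pay off; it assumes the doParallel backend and uses an arbitrary matrix computation as placeholder per-task work:

    library(doParallel)   # also loads foreach and parallel

    cl <- makeCluster(2)          # two workers for a dual-core laptop
    registerDoParallel(cl)

    ## Eight tasks, each doing enough linear algebra that the fixed
    ## cost of shipping the task to a worker is amortized.
    system.time(
      r <- foreach(i = 1:8, .combine = "rbind") %dopar% {
        m <- matrix(rnorm(1e6), nrow = 1000)
        colSums(m %*% t(m))       # heavy placeholder computation
      }
    )

    stopCluster(cl)

On a dual-core machine a loop like this should run close to twice as fast as the same loop with %do%, whereas the sqrt() example above only gets slower under foreach.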