Tomas:
Do some reading on parallelization.
Parallelizing code incurs the overhead of setting up, keeping track
of, and synchronizing the separate threads. Whether the speedup is
worth that overhead depends on the problem, its size, the algorithms,
the machines/hardware, ...
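
A rough way to see that trade-off is to time the same work serially and in
parallel at two task sizes. This is only a sketch: the `work` function, the
task counts, and the `cost` values are made-up stand-ins, not your smoothing
code, and the actual numbers will vary by machine.

```r
library(parallel)  # ships with R; mclapply() forks, so mc.cores > 1 needs a Unix-alike

# Hypothetical stand-in workload: 'cost' controls how much work one task does.
work <- function(i, cost) sum(sqrt(seq_len(cost))) + i

# Forking is unavailable on Windows, so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

compare <- function(n_tasks, cost) {
  serial <- system.time(r1 <- lapply(seq_len(n_tasks), work, cost = cost))
  forked <- system.time(r2 <- mclapply(seq_len(n_tasks), work, cost = cost,
                                       mc.cores = cores))
  # Both variants must compute the same thing, or the timings mean nothing.
  stopifnot(isTRUE(all.equal(r1, r2)))
  c(serial = serial[["elapsed"]], forked = forked[["elapsed"]])
}

cheap     <- compare(n_tasks = 200, cost = 10)   # many tiny tasks: overhead tends to dominate
expensive <- compare(n_tasks = 8, cost = 2e6)    # a few heavier tasks: forking has a chance to pay off
timings   <- rbind(cheap, expensive)
print(timings)
```

When comparing timings it also helps to confirm that every variant does the
same work per call: `apply(q, 1, f)` walks the rows of `q`, while a list built
with `as.list(as.data.frame(q))` holds its columns.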
Cheers,
Bert
On Thu, Aug 8, 2013 at 4:00 AM, Tomas Reigl <incivile at seznam.cz> wrote:
>
> Hello,
>
>
> I'm pretty confused. I want to speed up my algorithm by using mclapply
> (package parallel), but when I compare the timings, apply still wins.
>
> I'm smoothing log2ratio data with rq.fit.fnb (package quantreg), which is
> called by my function quantsm, and I'm wrapping my data into a matrix/list
> for apply/lapply (mclapply) usage.
>
>
>
>
> I adjust my data like this:
>
>     q = matrix(data, ncol=N)         # wrapping into matrix (using N = 2, 4, 6 or 8)
>     ql = as.list(as.data.frame(q))   # making list
>
> And the timing comparison:
>
>     apply=system.time(apply(q, 1, FUN=quantsm, 0.50, 2))
>     lapply=system.time(lapply(ql, FUN=quantsm, 0.50, 2))
>     mc2lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=2))
>     mc4lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=4))
>     mc6lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=6))
>     mc8lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=8))
>     timing=rbind(apply,lapply,mc2lapply,mc4lapply,mc6lapply,mc8lapply)
>
> Function quantsm:
>
>     quantsm <- function (y, p = 0.5, lambda) {
>        # Quantile smoothing
>        # Input: response y, quantile level p (0<p<1), smoothing parameter lambda
>        # Result: quantile curve
>
>        # Augment the data for the difference penalty
>        m <- length(y)
>        E <- diag(m);
>        Dmat <- diff(E);
>        X <- rbind(E, lambda * Dmat)
>        u <- c(y, rep(0, m - 1))
>
>        # Call quantile regression
>        q <- rq.fit.fnb(X, u, tau = p)
>        q
>     }
>
> Function rq.fit.fnb (quantreg library):
>
>     rq.fit.fnb <- function (x, y, tau = 0.5, beta = 0.99995, eps = 1e-06)
>     {
>         n <- length(y)
>         p <- ncol(x)
>         if (n != nrow(x))
>             stop("x and y don't match n")
>         if (tau < eps || tau > 1 - eps)
>             stop("No parametric Frisch-Newton method. Set tau in (0,1)")
>         rhs <- (1 - tau) * apply(x, 2, sum)
>         d <- rep(1, n)
>         u <- rep(1, n)
>         wn <- rep(0, 10 * n)
>         wn[1:n] <- (1 - tau)
>         z <- .Fortran("rqfnb", as.integer(n), as.integer(p),
>             a = as.double(t(as.matrix(x))),
>             c = as.double(-y), rhs = as.double(rhs), d = as.double(d),
>             as.double(u), beta = as.double(beta), eps = as.double(eps),
>             wn = as.double(wn), wp = double((p + 3) * p), it.count = integer(3),
>             info = integer(1), PACKAGE = "quantreg")
>         coefficients <- -z$wp[1:p]
>         names(coefficients) <- dimnames(x)[[2]]
>         residuals <- y - x %*% coefficients
>         list(coefficients = coefficients, tau = tau, residuals = residuals)
>     }
>
> For a data vector of length 2000 I get:
>
> (values = elapsed time in seconds; columns = number of columns of the
> smoothed matrix/list)
>
>                2cols  4cols  6cols  8cols
>     apply      0.178  0.096  0.069  0.056
>     lapply    16.555  4.299  1.785  0.972
>     mc2lapply 11.192  2.089  0.927  0.545
>     mc4lapply 10.649  1.326  0.694  0.396
>     mc6lapply 11.271  1.384  0.528  0.320
>     mc8lapply 10.133  1.390  0.560  0.260
>
> For data of length 4000 I get:
>
>                 2cols   4cols   6cols  8cols
>     apply       0.351   0.187   0.137  0.110
>     lapply    189.339  32.654  14.544  8.674
>     mc2lapply 186.047  20.791   7.261  4.231
>     mc4lapply 185.382  30.286   5.767  2.397
>     mc6lapply 184.048  30.170   8.059  2.865
>     mc8lapply 182.611  37.617   7.408  2.842
>
> Why is apply so much more efficient than mclapply? Maybe I'm just making
> some common beginner mistake.
>
> Thank you for your reactions.
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm