Dear R experts,

I have been experimenting with the foreach package (with doMC) for a while. My first impression is that it is a very easy way to acquire parallel processing capabilities. (Thanks, Revolution R.) The only two gotchas were installation (it required an exit and restart) and the operator precedence of foreach's %dopar% (higher than '+', I think). Once I understood these, everything went smoothly.

Now I get the impression that there are some serious process overhead issues, because foreach spawns processes for each parallel task, then collects them all, and then kills the processes. This means that if a program uses 20 %dopar% statements, each one starts and stops its own set of new R processes. Is this correct?

When I used foreach to run a few hundred thousand uniroot() calls, it sped up the calculations tremendously. However, when I partitioned a matrix into 8 parts to speed up matrix multiplications, it did the opposite: R's built-in linear algebra code was faster. (Of course, both appeared in loops, so they took a lot of time.) My current rule of thumb is to prefer R's built-in linear algebra over parallel execution. Please correct me if this intuition is wrong. If it is not, then I hope this post helps some novices to foreach: well worth having, but not a Swiss Army knife that works everywhere.

(Hopefully, there will eventually be a package that makes it as easy to use a graphics processor with OpenCL as it is now to use multiple cores. And maybe the doMC package can get an option to be more aggressive in its reuse of processes, although this may intrinsically not be feasible: it would still require transferring objects from the master process to the worker processes.)

Regards,

/iaw
----
Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)
CV Starr Professor of Economics (Finance), Brown University
http://welch.econ.brown.edu/
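
PS: For anyone who wants to try the points above, here is a minimal sketch. The core count, the matrix size, the 8-way row split, and the toy equation x^3 = a are placeholders chosen for illustration, not the original workload. It assumes a Unix-like system with the foreach and doMC packages installed. No timings are claimed here; which variant wins will likely depend on the matrix size, the BLAS in use, and the number of cores.

```r
library(foreach)
library(doMC)              # fork-based backend; Unix-like systems only

registerDoMC(cores = 2)    # assumption: at least 2 cores are available

## 1. Precedence gotcha: %dopar% binds more tightly than '+'.
## Without braces this parses as (foreach(...) %dopar% i) + 1.
unbraced <- foreach(i = 1:3, .combine = "+") %dopar% i + 1      # (1+2+3) + 1
braced   <- foreach(i = 1:3, .combine = "+") %dopar% { i + 1 }  # 2 + 3 + 4

## 2. Many independent uniroot() calls (toy problem: solve x^3 = a).
roots <- foreach(a = 1:100, .combine = c) %dopar% {
  uniroot(function(x) x^3 - a, lower = 0, upper = 10, tol = 1e-9)$root
}

## 3. Matrix multiplication: built-in %*% versus 8 row blocks via %dopar%.
set.seed(1)
n <- 400                   # illustrative size only
A <- matrix(rnorm(n * n), n, n)
B <- matrix(rnorm(n * n), n, n)

# Split the row indices of A into 8 contiguous blocks.
blocks <- split(seq_len(n), cut(seq_len(n), 8, labels = FALSE))

t_builtin <- system.time(C1 <- A %*% B)

# Each task multiplies one row block by B; rbind reassembles them in order.
t_foreach <- system.time(
  C2 <- foreach(rows = blocks, .combine = rbind) %dopar% {
    A[rows, , drop = FALSE] %*% B
  }
)

cat("built-in:", t_builtin[["elapsed"]],
    " foreach:", t_foreach[["elapsed"]], "\n")
```

Both multiplication variants should give the same product up to floating-point rounding, so all.equal(C1, C2) is a reasonable sanity check before comparing the elapsed times.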
