I have written an R package to collect some functions to run simulations
for a research project. The main functions are written in C and make use of
BLAS routines such as dsymm, dgemm, and ddot. I run the simulations in
parallel using mclapply, and the problem is that after some point all R
instances collapse into one: R seems to be running, but the simulations do
not progress any further. If I run the simulations on 16 cores I end up
with a single R instance whose CPU usage is about 1600%; I have never
experienced such behaviour before.

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so
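In rough terms, the parallel call looks like this (run_simulation and n_sim
stand in for the actual C-backed function and replicate count):

library(parallel)

## run_simulation() stands in for the package's .Call() wrapper around
## the C code that uses dsymm/dgemm/ddot through BLAS.
results <- mclapply(seq_len(n_sim),
                    function(i) run_simulation(i),
                    mc.cores = 16)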
> I have written an R package to collect some functions to run simulations
> for a research project. The main functions are written in C and make use
> of BLAS routines such as dsymm, dgemm, and ddot. I run the simulations in
> parallel using mclapply, and the problem is that after some point all R
> instances collapse into one: R seems to be running, but the simulations
> do not progress any further. If I run the simulations on 16 cores I end
> up with a single R instance whose CPU usage is about 1600%; I have never
> experienced such behaviour before.

This is outside my area. However, my suggestion is that you provide a
*minimal reproducible* example; I think you're more likely to get responses
that way. Also note that there is an R mailing list for high performance
computing; however, it doesn't appear to be very active, so I'm not sure
which is the best mailing list to use. Maybe someone else can comment on
this...?
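For example, something self-contained along these lines (the matrix size
and iteration counts are arbitrary; a plain %*% also goes through BLAS
dgemm) would let others run it without your package:

library(parallel)

## Arbitrary stand-in for the C code: an ordinary matrix product also
## goes through BLAS dgemm, so it exercises the same combination of
## forked mclapply workers and a (possibly threaded) BLAS.
A <- matrix(rnorm(500 * 500), 500)

res <- mclapply(1:16, function(i) {
  s <- 0
  for (k in 1:50) s <- s + sum(A %*% A)   # repeated dgemm calls
  s
}, mc.cores = 16)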
On Mon, 3 Jun 2019 06:37:46 +0200
Nicola Lunardon <nicola.lunardon.84 at gmail.com> wrote:

> R seems to be running, but the simulations do not progress any further.

Have you tried message() (or REprintf() in the C code) to narrow down the
specific part of the code where the simulations stop making progress? It's
less convenient than a good debugger, but with parallel code it is
sometimes the only way to reproduce the problem and get some information
about it.

> If I run the simulations on 16 cores I end up with a single R instance
> whose CPU usage is about 1600%; I have never experienced such behaviour
> before.

> BLAS: /usr/lib/openblas-base/libblas.so.3
> LAPACK: /usr/lib/libopenblasp-r0.2.18.so

OpenBLAS can, indeed, use multiple threads of its own inside a single
process. Combined with mclapply, this can create a situation where far more
threads are competing for CPU time than there are CPU cores available. Does
it help if you set an environment variable such as OPENBLAS_NUM_THREADS [*]
to a number less than or equal to (number of CPU cores / mc.cores argument
of mclapply)?

--
Best regards,
Ivan

[*] https://github.com/xianyi/OpenBLAS#setting-the-number-of-threads-using-environment-variables
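To spell out that last suggestion in code, a sketch of limiting OpenBLAS to
one thread per forked worker (run_simulation and n_sim are placeholders;
OpenBLAS reads the variable when it initialises, so setting it in the shell
before R starts is the more reliable option):

## Most reliable: set the variable before R starts, e.g. from the shell:
##   OPENBLAS_NUM_THREADS=1 Rscript run_simulations.R
##
## From within R it has to happen before the first BLAS call, e.g.:
Sys.setenv(OPENBLAS_NUM_THREADS = 1)

library(parallel)
results <- mclapply(seq_len(n_sim), run_simulation, mc.cores = 16)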