Henrik Bengtsson
2017-Jan-25 03:06 UTC
[Rd] parallel::mc*: Is it possible for a child process to know it is a fork?
When using multicore forking from the parallel package, is it possible for a child process to know that it is a fork? Something like:

    parallel::mclapply(1:10, FUN = function(i) {
      test_if_running_in_a_fork()
    })

I'm looking into ways to protect against further parallel processes (including threads), not necessarily created via the parallel::mc* API, being spawned off recursively. For instance, there are several packages that by default perform multi-threaded processing using native code, but I'm not sure there's a way for such a package to avoid running in multi-threaded mode when it is running in a forked child R process. Imagine

    y <- parallel::mclapply(1:10, FUN = function(i) {
      somepkg::threaded_calculation_using_all_cores()
    })

where the developer of `somepkg` has no control over whether the user calls it via mclapply() or via lapply(). I can see how the user of mclapply() / lapply() could pass on this information, but that's not safe, and the user might not be aware that deep down in the dependency hierarchy there are one or more functions that do multi-threaded or multi-process processing.

Thanks,

Henrik
Jeroen Ooms
2017-Jan-25 04:10 UTC
[Rd] parallel::mc*: Is it possible for a child process to know it is a fork?
On Tue, Jan 24, 2017 at 7:06 PM, Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:
> When using multicore-forking of the parallel package, is it possible
> for a child process to know that it is a fork?

R internally uses R_isForkedChild to prevent certain operations within the fork. However I don't think this is exported anywhere. You could do something like:

    extern Rboolean R_isForkedChild;
    SEXP is_forked(){
      return ScalarLogical(R_isForkedChild);
    }

But that won't be allowed on CRAN:

    * checking compiled code ... NOTE
    Found non-API call to R: ‘R_isForkedChild’
    Compiled code should not call non-API entry points in R.

Another method would be to look at getppid(2) and getpgid(2) to look up the parent ID and group ID of the current process and test whether they match those of the (parent) R process.

If you are only interested in limiting further parallelization within the fork, perhaps you can simply use parallel::mcaffinity to restrict the forked process to a single core.
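The PID-based check suggested above can be sketched in C roughly as follows. It relies on fork() copying the parent's memory, so a PID stored before forking is still visible in the child. The helper names (remember_master, is_forked, is_direct_child, child_reports_fork) are hypothetical and not part of R or the parallel package; this is a minimal sketch assuming a POSIX system, not a description of how R tracks forks internally.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helpers for illustration; none of this is R API. */
static pid_t master_pid = 0;   /* PID of the process that called remember_master() */

/* Call once in the original process, before any fork(). */
void remember_master(void) {
    master_pid = getpid();
}

/* Nonzero if the caller is not the process that called remember_master(). */
int is_forked(void) {
    return master_pid != 0 && getpid() != master_pid;
}

/* Nonzero if the caller is a direct child of the recorded master.
   Only meaningful while the master is still alive. */
int is_direct_child(void) {
    return is_forked() && getppid() == master_pid;
}

/* Fork once and return what the child observed:
   1 = child saw itself as a direct fork, 0 = it did not, -1 = fork/wait failed. */
int child_reports_fork(void) {
    remember_master();
    assert(!is_forked());              /* the master never sees itself as a fork */
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0)                      /* child: report through the exit status */
        _exit(is_direct_child() ? 1 : 0);
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

Calling child_reports_fork() in a fresh process should return 1. Note that getppid() alone is not enough: it changes if the parent exits, and a fork of a fork would be detected by is_forked() but not by is_direct_child().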
Henrik Bengtsson
2017-Jan-25 21:42 UTC
[Rd] parallel::mc*: Is it possible for a child process to know it is a fork?
On Tue, Jan 24, 2017 at 8:10 PM, Jeroen Ooms <jeroenooms at gmail.com> wrote:
> On Tue, Jan 24, 2017 at 7:06 PM, Henrik Bengtsson
> <henrik.bengtsson at gmail.com> wrote:
>> When using multicore-forking of the parallel package, is it possible
>> for a child process to know that it is a fork?
>
> R internally uses R_isForkedChild to prevent certain operations within
> the fork. However I don't think this is exported anywhere. You could
> do something like:
>
>   extern Rboolean R_isForkedChild;
>   SEXP is_forked(){
>     return ScalarLogical(R_isForkedChild);
>   }
>
> But that won't be allowed on CRAN:
>
>   * checking compiled code ... NOTE
>   Found non-API call to R: ‘R_isForkedChild’
>   Compiled code should not call non-API entry points in R.

Yes, that's a bummer. It could be useful to have this exposed. It's used by several core packages, not just 'parallel' itself:

    $ grep -F R_isForkedChild -r --include="*.h"
    src/include/Defn.h:extern Rboolean R_isForkedChild INI_as(FALSE); /* was this forked? */

    $ grep -F R_isForkedChild -r --include="*.c"
    src/library/tcltk/src/tcltk_unix.c://extern Rboolean R_isForkedChild;
    src/library/tcltk/src/tcltk_unix.c: if (!R_isForkedChild && !Tcl_lock
    src/library/parallel/src/fork.c:#include <Defn.h> // for R_isForkedChild
    src/library/parallel/src/fork.c: R_isForkedChild = 1;
    src/modules/X11/devX11.c: while (!R_isForkedChild && displayOpen && XPending(display)) {
    src/modules/X11/devX11.c: if(R_isForkedChild)
    src/unix/sys-unix.c: if (ptr_R_ProcessEvents && !R_isForkedChild) ptr_R_ProcessEvents();

> Another method would be to look at getppid(2) and getpgid(2) to lookup
> the parent-id and group-id of the current process and test if it
> matches that of the (parent) R process.

I'm not 100% sure I follow. Is the idea similar to the following in R?

    ppid <- Sys.getpid()
    is_child <- parallel::mclapply(1:10, FUN = function(i) {
      Sys.getpid() != ppid
    })

How can the child process know 'ppid'? getppid would give the parent PID for any process, which could be a non-R process.

> If you are only interested in limiting further parallelization within
> the fork, perhaps you can simply use parallel::mcaffinity to restrict
> the forked process to a single core.

This is tied to parallelization via parallel::mc*, correct? That is, is it only parallel:::mcfork() that respects those settings or does this go down deeper in the OS such that it affects forking / threading on a more general level?

Thanks for your pointers and suggestions,

Henrik
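On the question of how deep an affinity setting reaches, the following C sketch may help. It assumes Linux with glibc, and it assumes (my understanding, not confirmed in this thread) that parallel::mcaffinity is built on the same sched_setaffinity(2) mechanism. On Linux, a CPU affinity mask is a property of the process at the OS level and is inherited by children created with fork(2) and by threads created afterwards. The function names are hypothetical.

```c
#define _GNU_SOURCE            /* for sched_setaffinity() and the CPU_* macros (glibc) */
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Pin the calling process to the first CPU it is currently allowed to use.
   Returns 0 on success, -1 on failure. */
int pin_to_one_cpu(void) {
    cpu_set_t allowed, one;
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;
    CPU_ZERO(&one);
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            CPU_SET(cpu, &one);
            return sched_setaffinity(0, sizeof(one), &one);
        }
    }
    return -1;
}

/* Number of CPUs the calling process may run on, or -1 on failure. */
int allowed_cpu_count(void) {
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
    return CPU_COUNT(&set);
}

/* Fork a child, pin it to one CPU, let it fork a grandchild, and return the
   number of CPUs the grandchild is allowed to use (-1 on any failure).
   The count travels back through exit statuses; 255 marks an error. */
int grandchild_cpu_count(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {                              /* child */
        if (pin_to_one_cpu() != 0) _exit(255);
        pid_t gpid = fork();
        if (gpid < 0) _exit(255);
        if (gpid == 0) {                         /* grandchild */
            int n = allowed_cpu_count();
            _exit((n < 0 || n > 254) ? 255 : n);
        }
        int gstatus = 0;
        if (waitpid(gpid, &gstatus, 0) < 0 || !WIFEXITED(gstatus)) _exit(255);
        _exit(WEXITSTATUS(gstatus));
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    int n = WEXITSTATUS(status);
    return n == 255 ? -1 : n;
}

/* Example check: assert(grandchild_cpu_count() == 1); */
```

If the inheritance works as documented, grandchild_cpu_count() returns 1 while the original process keeps its own mask. Restricting affinity does not prevent a library from creating threads or forking; it only confines whatever is created to the pinned core, so oversubscription becomes time-slicing on one CPU rather than contention across all of them.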