Sun Yijiang
2019-May-20 13:36 UTC
[Rd] Race condition on parallel package's mcexit and rmChild
I've been hacking with parallel package for some time and built a parallel processing framework with it. However, although very rarely, I did notice "ignoring SIGPIPE signal" error every now and then. After a deep dig into the source code, I think I found something worth noticing. In short, wring to pipe in the C function mc_exit(SEXP sRes) may cause a SIGPIPE. Code from src/library/parallel/src/fork.c: SEXP NORET mc_exit(SEXP sRes) { int res = asInteger(sRes); ... ... if (master_fd != -1) { /* send 0 to signify that we're leaving */ size_t len = 0; /* assign result for Fedora security settings */ ssize_t n = write(master_fd, &len, sizeof(len)); ... ... } So a pipe write is made in mc_exit, and here's how this function is used in src/library/parallel/R/unix/mcfork.R: mcexit <- function(exit.code = 0L, send = NULL) { if (!is.null(send)) try(sendMaster(send), silent = TRUE) .Call(C_mc_exit, as.integer(exit.code)) } Between sendMaster() and mc_exit() calls, which are made in the child process, the master process may call readChild() followed by rmChild(). rmChild closes the pipe on the master side, and if it's called before child calls mc_exit, a SIGPIPE will be raised when child tries to write to the pipe in mc_exit. rmChild is defined but not used in parallel package, so this problem won't surface in most cases. However, it is a useful API and may be used by users like me for advanced control over child processes. I hope we can discuss a solution on it. In fact, I don't see why we need to write to the pipe on child exit and how it has anything to do with "Fedora security settings" as suggested in the comments. Removing it, IMHO, would be a good and clean way to solve this problem. Regards, Yijiang