Henrik Bengtsson
2018-Aug-30 23:18 UTC
[Rd] Detecting whether a process exists or not by its PID?
Hi,

I'd like to test whether a (localhost) PSOCK cluster node is still running or not by its PID, e.g. it may have crashed / core dumped. I'm OK with getting false-positive results when *another* process with the same PID has since started.

I can get the PID of each cluster node by querying the nodes for their Sys.getpid(), e.g.

  pids <- parallel::clusterEvalQ(cl, Sys.getpid())

Is there a function in core R for testing whether a process with a given PID exists or not? From trial and error, I found that on Linux:

  pid_exists <- function(pid) as.logical(tools::pskill(pid, signal = 0L))

returns TRUE for existing processes and FALSE otherwise, but I'm not sure if I can trust this. It's not a documented feature in ?tools::pskill, which also warns about 'signal' not being standardized across OSes.

The other Linux alternative I can imagine is:

  pid_exists <- function(pid) system2("ps", args = c("--pid", pid), stdout = FALSE) == 0L

Can I expect this to work on macOS as well? What about other *nix systems?

And, finally, what can be done on Windows?

I'm sure there are packages on CRAN that provide this, but I'd like to keep dependencies at a minimum.

I appreciate any feedback.

Thxs,

Henrik
Gábor Csárdi
2018-Aug-31 06:14 UTC
[Rd] Detecting whether a process exists or not by its PID?
On Fri, Aug 31, 2018 at 1:18 AM Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:
[...]
> pid_exists <- function(pid) as.logical(tools::pskill(pid, signal = 0L))
>
> returns TRUE for existing processes and FALSE otherwise, but I'm not
> sure if I can trust this. It's not a documented feature in
> ?tools::pskill, which also warns about 'signal' not being standardized
> across OSes.

Yes, as long as tools::pskill() is willing to make a kill() system call with signal 0, AFAIK this will work fine on all UNIX systems.

> The other Linux alternative I can imagine is:
>
> pid_exists <- function(pid) system2("ps", args = c("--pid", pid),
> stdout = FALSE) == 0L
>
> Can I expect this to work on macOS as well? What about other *nix systems?

There is no --pid option on macOS. I think simply `ps <pid>` is better, but some very minimal systems might not have ps at all.

> And, finally, what can be done on Windows?

You need to call OpenProcess from C, or find some base R function that does that without messing up the process. Seems like tools::psnice() does that.

> I'm sure there are packages on CRAN that provides this, but I'd like
> to keep dependencies at a minimum.

Yes, e.g. the ps package does this, and it does it properly, i.e. you don't need to worry about PID reuse. PID reuse does cause problems quite frequently, especially on Windows, and especially on a system that starts a lot of processes, like win-builder.

Gabor
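The suggestions in this reply could be combined into a single base-R helper. What follows is a minimal sketch, not a vetted recipe. The name pid_exists_sketch() is hypothetical. The Unix branch relies on the undocumented signal-0 behaviour of tools::pskill() discussed above and falls back to `ps -p <pid>`. The Windows branch shells out to tasklist, which is an assumption that nobody in the thread proposed. Like any PID-only check, it cannot tell a reused PID from the original process.

```r
# Hypothetical helper; a sketch under the assumptions stated above.
pid_exists_sketch <- function(pid) {
  pid <- as.integer(pid)
  stopifnot(length(pid) == 1L, !is.na(pid), pid > 0L)

  if (.Platform$OS.type == "windows") {
    # Do NOT use tools::pskill() here: on Windows it always calls
    # TerminateProcess, whatever 'signal' is.
    # Assumption: 'tasklist' is on the PATH; with /FO CSV /NH it prints one
    # quoted CSV row per matching process and an INFO message if none match.
    out <- tryCatch(
      suppressWarnings(system2(
        "tasklist",
        args = c("/FI", shQuote(paste("PID eq", pid), type = "cmd"),
                 "/FO", "CSV", "/NH"),
        stdout = TRUE, stderr = FALSE
      )),
      error = function(e) character(0)
    )
    return(any(grepl(sprintf("\"%d\"", pid), out, fixed = TRUE)))
  }

  # Unix: signal 0 performs the kill() error checks without sending a signal.
  # A process owned by another user fails with EPERM and is reported as FALSE
  # by this branch, so the 'ps' fallback below gets a chance to find it.
  if (isTRUE(tools::pskill(pid, signal = 0L))) return(TRUE)

  # Fallback: 'ps -p <pid>'; a non-zero exit status means nothing matched.
  status <- tryCatch(
    suppressWarnings(system2("ps", args = c("-p", pid),
                             stdout = FALSE, stderr = FALSE)),
    error = function(e) 1L
  )
  identical(as.integer(status), 0L)
}

# Usage: check the current R process.
pid_exists_sketch(Sys.getpid())
```

Keeping the two platforms in separate branches matters because the same tools::pskill() call that is a harmless probe on Unix would terminate the worker on Windows.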
Tomas Kalibera
2018-Aug-31 12:51 UTC
[Rd] Detecting whether a process exists or not by its PID?
On 08/31/2018 01:18 AM, Henrik Bengtsson wrote:
> Hi, I'd like to test whether a (localhost) PSOCK cluster node is still
> running or not by its PID, e.g. it may have crashed / core dumped.
> I'm ok with getting false-positive results due to *another* process
> with the same PID has since started.

kill(sig=0) is specified by POSIX but indeed, as you say, there is a race condition due to PID reuse. In principle, detecting that a worker process is still alive cannot be done correctly outside base R. At user level I would probably consider some watchdog, e.g. the parallel tasks would be repeatedly touching a file.

In base R, one can do this correctly for forked processes via mcparallel/mccollect, but not for PSOCK cluster workers, which are based on system() (and I understand it would be a useful feature):

  > j <- mcparallel(Sys.sleep(1000))
  > mccollect(j, wait=FALSE)
  NULL

  # kill the child process

  > mccollect(j, wait=FALSE)
  $`1542`
  NULL

More details are in ?mcparallel. The key part is that the job must be started as non-detached, and as soon as mccollect() collects it, mccollect() must never be called on it again.

Tomas
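The watchdog idea mentioned in this message (workers repeatedly touching a file) could look roughly like the sketch below. It uses only base R. The helper names heartbeat_touch() and heartbeat_alive(), the file path and the 30-second default timeout are all illustrative assumptions, not something proposed in the thread.

```r
# Hypothetical helpers for a file-based heartbeat; names are illustrative.

# Worker side: call this regularly from within the running task.
heartbeat_touch <- function(path) {
  if (!file.exists(path)) file.create(path)
  # Set the modification time to "now".
  invisible(Sys.setFileTime(path, Sys.time()))
}

# Master side: treat the worker as alive if the file was touched recently.
heartbeat_alive <- function(path, timeout = 30) {
  mtime <- file.mtime(path)  # NA if the file does not exist
  if (is.na(mtime)) return(FALSE)
  age <- as.numeric(difftime(Sys.time(), mtime, units = "secs"))
  age <= timeout
}

# Usage: one heartbeat file per worker, here simulated in a single process.
hb <- tempfile("worker-heartbeat-")
heartbeat_touch(hb)
heartbeat_alive(hb, timeout = 60)
```

The timeout has to be longer than the longest stretch during which a healthy worker cannot touch the file, for example while it is inside a long-running compiled routine. Otherwise a busy worker is reported as dead.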
Gábor Csárdi
2018-Aug-31 13:13 UTC
[Rd] Detecting whether a process exists or not by its PID?
On Fri, Aug 31, 2018 at 2:51 PM Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
[...]
> kill(sig=0) is specified by POSIX but indeed as you say there is a race
> condition due to PID-reuse. In principle, detecting that a worker
> process is still alive cannot be done correctly outside base R.

I am not sure why you think so.

> At user-level I would probably consider some watchdog, e.g. the parallel
> tasks would be repeatedly touching a file.

I am pretty sure that there are simpler and better solutions. E.g. one would be to ask the worker process for its startup time (with as much precision as possible) and then use the (pid, startup_time) pair as a unique ID. With this you can check whether the process is still running by checking that the PID exists and that its startup time matches. This is all very simple with the ps package, on Linux, macOS and Windows.

Gabor
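The (pid, startup_time) idea described in this message might be sketched with the ps package as follows. The helper names worker_id() and worker_alive() and the 0.01-second tolerance are assumptions made for illustration. The ps functions used are ps_handle(), ps_pid(), ps_create_time() and ps_is_running().

```r
# Sketch only; requires the CRAN package 'ps'. Helper names are hypothetical.

# Run on the worker: report its PID together with its start time.
worker_id <- function() {
  h <- ps::ps_handle()  # handle for the current process
  list(pid = ps::ps_pid(h), created = ps::ps_create_time(h))
}

# Run on the master: TRUE only if the PID exists and its start time matches.
worker_alive <- function(id, tol = 0.01) {
  tryCatch({
    h <- ps::ps_handle(as.integer(id$pid))
    same <- abs(as.numeric(ps::ps_create_time(h)) -
                as.numeric(id$created)) < tol
    same && ps::ps_is_running(h)
  }, error = function(e) FALSE)  # e.g. no process with that PID
}

# Usage with a PSOCK cluster 'cl' (ps must be installed for the workers too):
#   ids <- parallel::clusterCall(cl, worker_id)
#   vapply(ids, worker_alive, logical(1))
```

Comparing the stored start time against the one currently reported for that PID is what separates the original worker from an unrelated process that later received the same PID.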