Dear Ivan,
Here is the information you requested:
This is the output of top when I run a function LOWn() with mclapply in it. It
executes successfully (my machine has 2 cores):
> LOWn(OHLCDataEP[[63]])
Tasks: 127 total, 3 running, 124 sleeping, 0 stopped, 0 zombie
%Cpu0 : 82.3 us, 16.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu1 : 74.1 us, 24.9 sy, 0.0 ni, 0.0 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
MiB Mem : 15531.8 total, 11019.4 free, 3521.8 used, 990.6 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 11723.8 avail Mem
This is the output of top when I run the function LOWp(), which also calls
mclapply. It hangs:
top - 07:48:08 up 54 min, 2 users, load average: 0.02, 0.36, 0.34
Tasks: 127 total, 1 running, 126 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 15531.8 total, 10976.8 free, 3564.4 used, 990.7 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 11681.2 avail Mem
The mclapply call only works the first time I call it after starting an R
session.
Traceback after interrupting LOWp:
> LOWp(OHLCDataEP[[63]])
^C
There were 50 or more warnings (use warnings() to see the first 50)
> traceback()
3: selectChildren(jobs[!is.na(jobsp)], -1)
2: mclapply(LYGH, FUN = arfima, mc.cores = 2, mc.preschedule = FALSE) at <tmp>#26
1: LOWp(OHLCDataEP[[63]])
I think the child processes spawned by mclapply in FUN2 don't get killed.
This is from the top command AFTER interrupting FUN2 (sometimes there
is only one R process):
38615 ec2-user 20 0 1016432 400020 13392 S 0.0 2.5 0:02.05 R
38696 ec2-user 20 0 1016436 400416 13676 S 0.0 2.5 0:02.03 R
This is the output when FUN2 is running:
1526 ec2-user 20 0 1525784 651628 23040 S 0.0 4.1 0:11.90 R
2616 ec2-user 20 0 1525784 634688 6092 S 0.0 4.0 0:00.03 R
2617 ec2-user 20 0 1525784 634884 6288 S 0.0 4.0 0:00.02 R
This is AFTER successful completion of FUN1:
38615 ec2-user 20 0 1016432 400020 13392 S 0.0 2.5 0:02.05 R
38696 ec2-user 20 0 1016436 400416 13676 S 0.0 2.5 0:02.03 R
Please note that the PIDs are the same for FUN1 and FUN2, and also that when I
am not parallelising there is only one R process:
1526 ec2-user 20 0 1227788 491248 21368 S 0.0 3.1 0:02.95 R
> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.6 (Ootpa)
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblaso-r0.3.15.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] imputeTS_3.3 pbmcapply_1.5.1 attempt_0.3.1 forecast_8.21
loaded via a namespace (and not attached):
[1] Rcpp_1.0.10 urca_1.3-3 pillar_1.9.0 compiler_4.2.1
[5] tseries_0.10-54 xts_0.13.1 lifecycle_1.0.3 tibble_3.2.1
[9] gtable_0.3.3 nlme_3.1-157 lattice_0.20-45 pkgconfig_2.0.3
[13] rlang_1.1.1 cli_3.6.1 curl_5.0.0 xml2_1.3.4
[17] generics_0.1.3 vctrs_0.6.2 lmtest_0.9-40 grid_4.2.1
[21] nnet_7.3-17 ggtext_0.1.2 gridtext_0.1.5 glue_1.6.2
[25] R6_2.5.1 fansi_1.0.4 ggplot2_3.4.2 TTR_0.24.3
[29] magrittr_2.0.3 scales_1.2.1 quantmod_0.4.22 timeDate_4022.108
[33] colorspace_2.1-0 fracdiff_1.5-2 quadprog_1.5-8 utf8_1.2.3
[37] stinepack_1.4 munsell_0.5.0 zoo_1.8-12
This is the output of jobs -l (it prints nothing):
[ec2-user@ip-172-31-15-116 ~]$ jobs -l
[ec2-user@ip-172-31-15-116 ~]$
killall -SIGCONT R has no effect.
You had asked me to attach a debugger to the child processes. How do you find the
child processes spawned by mclapply? For example, how do I identify, among the
R processes listed above, the child processes?
Many thanks in advance....
Thanking you,
Yours sincerely,
AKSHAY M KULKARNI
________________________________
From: Ivan Krylov <krylov.r00t at gmail.com>
Sent: Saturday, June 10, 2023 12:54 PM
To: akshay kulkarni <akshay_e4 at hotmail.com>
Cc: R help Mailing list <r-help at r-project.org>
Subject: Re: [R] inconsistency in mclapply.....
On Fri, 9 Jun 2023 21:19:11 +0000
akshay kulkarni <akshay_e4 at hotmail.com> wrote:
> debug at <tmp>#26: LYG <- mclapply(LYGH, FUN = arfima, mc.cores = 2,
> mc.preschedule = FALSE)
> Browse[2]> LYG <- pbmclapply(LYGH,FUN = arfima,mc.cores = 2,mc.preschedule = FALSE)
> | | 0%, ETA NA
So if you interrupt the code _after_ it hangs at 0%, ETA NA, what's the
traceback? (We're doing this to confirm that the parent process hangs
in either selectChildren() or readChild().)
> You might be interested in this:
>
> [ec2-user@ip-172-31-15-116 ~]$ exit
> logout
> There are stopped jobs.
>
> This occurs when I close R and try to exit the shell prompt (I am on
> an AWS EC2 RHEL 8 instance). Can this lead you somewhere?
I guess this proves the existence of child processes, probably spawned
by mclapply, but why would they be _stopped_, I don't know. What's the
output of jobs -l at this point?
(This suggests trying to send them a SIGCONT and seeing what happens.
Does mclapply() get unstuck if you run the command killall -SIGCONT R
from a separate ssh connection? Would be strange if it worked, but
worth a try.)
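A narrower variant of that SIGCONT experiment can be sketched as below. It is only an illustration, assuming a Linux /proc filesystem; the function name cont_stopped_children is made up here and is not part of any package. It walks /proc, picks the direct children of the given parent PID that are in state "T" (stopped), and sends SIGCONT only to those, so unrelated R sessions are left alone.

```shell
# Hypothetical helper: resume every direct child of PARENT that is stopped.
# Usage: cont_stopped_children PARENT_PID
cont_stopped_children() (
    parent=$1
    for status in /proc/[0-9]*/status; do
        [ -r "$status" ] || continue          # process may have exited already
        pid=${status#/proc/}; pid=${pid%/status}
        ppid=""; state=""
        while read -r key value; do
            case $key in
                State:) state=$value ;;       # e.g. "T (stopped)"
                PPid:)  ppid=$value ;;
            esac
        done < "$status"
        if [ "$ppid" = "$parent" ]; then
            case $state in
                T*) kill -CONT "$pid" && echo "continued $pid" ;;
            esac
        fi
    done
)
```

The parent PID to pass in would be that of the interactive R session, which R itself reports with Sys.getpid(). If the function prints nothing, none of that session's children were stopped at that moment.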
> As of now I have quit R in my machine, so I can't get session
> info..
Knowing the output of sessionInfo() could still be useful for solving
the problem. It's best to show the output after loading all the
packages, ideally just before you reproduce the problem.
> by the by, how do you run top when running R? I think at least in my
> machine, you have to quit R to get to the shell prompt...
I can think of 3 options:
1) Type Ctrl+Z at the R prompt. R (and the rest of the process group, I
think) becomes suspended, you return to the command line prompt where
you can run other commands. At the system command line prompt, type
"fg" and press Enter in order to continue running R. (Press Enter a
second time so that R prints its command line prompt again.) This is
quick, doesn't require preparation, but messes up the state of the
processes you're interested in. (They become suspended instead of
running, which may complicate debugging.)
2) Open a second ssh connection to the same machine the same way you
had opened the first one. You won't be able to (easily) interact with
the R session running in the first connection, but you'll get a second
system command line where you'll be able to run top, gdb, and other
commands, which should let you inspect the state of the system.
3) Before starting R, install a "terminal multiplexer", that is, GNU
Screen or tmux. If you're still on RHEL, use sudo dnf install screen or
sudo dnf install tmux. One of these commands needs to be run once per
computer. Type the name of the program ("screen" or "tmux")
to start it.
Inside screen/tmux, start R.
In Screen, use Ctrl+A then C in order to create a new virtual terminal;
Ctrl+A then type the number in order to switch between terminals. In
tmux, use Ctrl+B then C in order to create a new virtual terminal;
Ctrl+B then type the number in order to switch between them. Terminals
are numbered from 0 upwards, so the second one will be numbered 1, and
so on.
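Once a second command line is available by any of the options above, one way to tell the parent R process from its children is to print each process's parent PID next to its own. The following is a rough sketch assuming a Linux /proc filesystem; list_by_name is a hypothetical helper name, and the command name R is taken from the COMMAND column of the top output earlier in this thread.

```shell
# Hypothetical helper: print "PID PPID STATE" for every process whose
# command name equals NAME.
# Usage: list_by_name NAME      (e.g. list_by_name R)
list_by_name() (
    name=$1
    for dir in /proc/[0-9]*; do
        [ -r "$dir/status" ] || continue       # process may have exited already
        pname=""; ppid=""; state=""
        while read -r key value; do
            case $key in
                Name:)  pname=$value ;;
                State:) state=${value%% *} ;;  # keep only the one-letter code
                PPid:)  ppid=$value ;;
            esac
        done < "$dir/status"
        if [ "$pname" = "$name" ]; then
            echo "${dir#/proc/} $ppid $state"
        fi
    done
)
```

Because mclapply forks its workers from the calling R process, the workers should show the master session's PID (the value of Sys.getpid() in that session) in the PPID column. A worker PID found this way can then be handed to gdb -p PID to attach a debugger.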
--
Best regards,
Ivan