Dear Ivan,
Thanks for the reply. I am pressurised by a fast approaching deadline and your reply calmed me...

Take a look at the following code:

debug at <tmp>#26: LYG <- mclapply(LYGH, FUN = arfima, mc.cores = 2, mc.preschedule = FALSE)
Browse[2]> length(LYGH)
[1] 357
Browse[2]>
^C

Browse[2]> LYG <- pbmclapply(LYGH,FUN = arfima,mc.cores = 2,mc.preschedule = FALSE)
  |                                                  |   0%, ETA NA

I am debugging a function FUN in which the above expressions appear. The pbmclapply code works well if called inside FUN:

> FUN(arg)
Result..

But as you may note, it doesn't work while in debug mode....

Also, if I replace pbmclapply by mclapply inside FUN, it hangs....

You might be interested in this:

[ec2-user@ip-172-31-15-116 ~]$ exit
logout
There are stopped jobs.

This occurs when I close R and try to exit the shell prompt (I am on an AWS EC2 RHEL 8 instance). Can this lead you somewhere? As of now I have quit R on my machine, so I can't get the session info... but please let me know if you really need it...

By the by, how do you run top when running R? I think, at least on my machine, you have to quit R to get to the shell prompt...

I request you TO PLEASE reply to this mail as early as possible. I am facing an imminent deadline... please excuse my blatant violation of protocol, but deadlines are deadlines, right?

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

________________________________
From: Ivan Krylov <krylov.r00t at gmail.com>
Sent: Saturday, June 10, 2023 1:43 AM
To: akshay kulkarni <akshay_e4 at hotmail.com>
Cc: R help Mailing list <r-help at r-project.org>
Subject: Re: [R] inconsistency in mclapply.....

On Fri, 9 Jun 2023 18:01:44 +0000
akshay kulkarni <akshay_e4 at hotmail.com> wrote:

> > LYG <- pbmclapply(LYGH,FUN = arfima,mc.cores = 2,mc.preschedule = FALSE)
>   |                                                  |   0%, ETA NA^
>
> It just hangs.

My questions from the last time still stand:

0) What is your sessionInfo()? Maybe you're running a parallel BLAS which doesn't always handle fork() or something. It may be worth disabling BLAS-level parallelism as long as you're already trying to use 100% of your processor by other means.

1) What does traceback() show after you interrupt pbmclapply? Most likely, you would be interrupting selectChildren(), but if not, the problem may lie in a very different place from what I'm expecting.

2) While pbmclapply is hung, without interrupting it, take a look at the state of the system and the processes on it (are you still on RHEL? use `top` or whatever task manager you're comfortable with).

 a) Is 100% of the CPU being used? 100% of one core? Is system mostly idle?

 b) Can you find the child processes launched by pbmclapply?

 c) Write down the PID of the child process and attach a debugger to it (If you're on RHEL, try following this guide: <https://beej.us/guide/bggdb/#attach>. If GDB asks you to install additional debug symbols by running debuginfo-install, follow its guidance and then restart GDB.) and obtain a backtrace. (In GDB, the command to obtain a backtrace is "backtrace".) Which function is the child process stuck in?

--
Best regards,
Ivan
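
As a rough sketch of step 2b) above (finding the child processes and seeing what state they are in), the loop below reads /proc directly, so it assumes Linux. The helper name list_children is made up for illustration and is not part of R or of any package mentioned in this thread. A state of T would mean the children are stopped, which would fit the "There are stopped jobs" message; S or R would point elsewhere.

```shell
#!/bin/sh
# Sketch: list the direct children of a process together with their state,
# read straight from /proc (Linux only). State letters follow proc(5):
# R = running, S = sleeping, D = uninterruptible wait, T = stopped, Z = zombie.

# list_children PARENT_PID   (hypothetical helper name)
# Prints one "PID STATE" line per direct child of PARENT_PID.
list_children() {
    parent=$1
    for stat_file in /proc/[0-9]*/stat; do
        # A process may exit between the glob and the read; skip it then.
        line=$(cat "$stat_file" 2>/dev/null) || continue
        pid=${line%% *}
        # The command name sits in parentheses and may contain spaces,
        # so drop everything up to the last ") " before splitting fields.
        rest=${line##*') '}
        state=${rest%% *}
        rest=${rest#* }
        ppid=${rest%% *}
        if [ "$ppid" = "$parent" ]; then
            printf '%s %s\n' "$pid" "$state"
        fi
    done
}

# Usage: pass the PID of the R session (as shown by top); with no argument
# the children of this shell are listed instead.
list_children "${1:-$$}"
```

The PID printed for a child is the one to hand to the debugger in step 2c).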
Hi Akshay,
You do not have to quit R to run 'top'. You can have, for example, two windows, with R running in one and top running in the other.

Eric

On Sat, Jun 10, 2023 at 12:19 AM akshay kulkarni <akshay_e4 at hotmail.com> wrote:

> [...]
> by the by, how do you run top when running R? I think at least in my
> machine, you have to quit R to get to the shell prompt...
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
On Fri, 9 Jun 2023 21:19:11 +0000
akshay kulkarni <akshay_e4 at hotmail.com> wrote:

> debug at <tmp>#26: LYG <- mclapply(LYGH, FUN = arfima, mc.cores = 2,
> mc.preschedule = FALSE)
> Browse[2]> LYG <- pbmclapply(LYGH,FUN = arfima,mc.cores = 2,mc.preschedule = FALSE)
>   |                                                  |   0%, ETA NA

So if you interrupt the code _after_ it hangs at 0%, ETA NA, what's the traceback? (We're doing this to confirm that the parent process hangs in either selectChildren() or readChild().)

> You might be interested in this:
>
> [ec2-user@ip-172-31-15-116 ~]$ exit
> logout
> There are stopped jobs.
>
> This occurs when I close R and try to exit the shell prompt (I am on
> an AWS EC2 RHEL 8 instance). Can this lead you somewhere?

I guess this proves the existence of child processes, probably spawned by mclapply, but I don't know why they would be _stopped_. What's the output of jobs -l at this point?

(This suggests trying to send them a SIGCONT and seeing what happens. Does mclapply() get unstuck if you run the command killall -SIGCONT R from a separate ssh connection? Would be strange if it worked, but worth a try.)

> As of now I have quit R in my machine, so I can't get session
> info..

Knowing the output of sessionInfo() could still be useful for solving the problem. It's best to show the output after loading all the packages, ideally just before you reproduce the problem.

> by the by, how do you run top when running R? I think at least in my
> machine, you have to quit R to get to the shell prompt...

I can think of 3 options:

1) Type Ctrl+Z at the R prompt. R (and the rest of the process group, I think) becomes suspended, you return to the command line prompt where you can run other commands. At the system command line prompt, type "fg" and press Enter in order to continue running R. (Press Enter a second time so that R prints its command line prompt again.)

This is quick, doesn't require preparation, but messes up the state of the processes you're interested in. (They become suspended instead of running, which may complicate debugging.)

2) Open a second ssh connection to the same machine the same way you had opened the first one. You won't be able to (easily) interact with the R session running in the first connection, but you'll get a second system command line where you'll be able to run top, gdb, and other commands, which should let you inspect the state of the system.

3) Before starting R, install a "terminal multiplexer", that is, GNU Screen or tmux. If you're still on RHEL, use sudo dnf install screen or sudo dnf install tmux. One of these commands needs to be run once per computer.

Type the name of the program ("screen" or "tmux") to start it. Inside screen/tmux, start R.

In Screen, use Ctrl+A then C in order to create a new virtual terminal; Ctrl+A then type the number in order to switch between terminals.

In tmux, use Ctrl+B then C in order to create a new virtual terminal; Ctrl+B then type the number in order to switch between them.

Terminals are numbered from 0 upwards, so the second one will be numbered 1, and so on.

--
Best regards,
Ivan
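
The SIGCONT experiment suggested above can be tried on a harmless stand-in process first. This is only a sketch under stated assumptions: sleep 60 plays the role of a forked worker (it is not an R process), proc_state and wait_for_state are hypothetical helper names, and reading /proc assumes Linux. It shows what "stopped" looks like from outside and that SIGCONT resumes such a process; it does not show why the real R children end up stopped.

```shell
#!/bin/sh
# proc_state PID   (hypothetical helper)
# Prints the one-letter state of PID from /proc/PID/stat; "T" means stopped.
proc_state() {
    line=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
    # Skip past "pid (command) "; the next field is the state letter.
    rest=${line##*') '}
    printf '%s\n' "${rest%% *}"
}

# wait_for_state PID STATE   (hypothetical helper)
# Polls for up to about 5 seconds until PID reports STATE.
wait_for_state() {
    tries=0
    while [ "$tries" -lt 5 ]; do
        if [ "$(proc_state "$1")" = "$2" ]; then
            return 0
        fi
        sleep 1
        tries=$((tries + 1))
    done
    return 1
}

sleep 60 &              # stand-in for a forked worker
worker=$!

# SIGSTOP suspends the process, much like the SIGTSTP sent by Ctrl+Z,
# except that SIGSTOP cannot be caught or ignored.
kill -STOP "$worker"
wait_for_state "$worker" T && echo "worker $worker is stopped"

# SIGCONT resumes it; this is the single-PID form of: killall -SIGCONT R
kill -CONT "$worker"
wait_for_state "$worker" S && echo "worker $worker is sleeping again"

kill "$worker"                      # clean up the stand-in
wait "$worker" 2>/dev/null || true  # reap it; its exit status is not of interest
```

Against the real session, the corresponding commands are the ones already mentioned: jobs -l in the shell that started R, and killall -SIGCONT R from a second connection.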