Jonathan T. Looney
2021-Mar-30 15:40 UTC
FreeBSD 13.0-RC4 and Nginx process "stuck" during restart
On Mon, Mar 29, 2021 at 3:57 PM Christos Chatzaras <chris at cretaforce.gr> wrote:

> Hello,
>
> I upgraded from 12.2 to 13.0-RC4 and I noticed a strange issue with Nginx.
>
> When I run "service nginx restart", on some (random) servers it
> doesn't complete the restart and gets stuck at "Waiting for PIDS: 20536.".
>
> I can kill the 20536 process and then the restart completes.
>
> procstat -kk 20536:
>
>   PID    TID COMM  TDNAME KSTACK
> 63094 100505 nginx -      mi_switch+0xc1
> sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1be
> kern_sigsuspend+0x164 sys_sigsuspend+0x31 amd64_syscall+0x10c
> fast_syscall_common+0xf8
>
> I found this commit:
>
> https://cgit.freebsd.org/src/commit/?id=dbec10e08808e375365fb2a2462f306e0cdfda32
>
> Could this be related?

Yes, it could be related. Because of the timing of when we first saw the behavior, I assumed the trigger for the behavior (which remains unknown) only existed in main. However, it is possible that this behavior is now being triggered in 13.0.

Are you able to reliably reproduce this? I was never able to do that. A reliable reproduction may help narrow down the change which triggered this behavior.

I can certainly MFC the patch to stable/13. re@ will need to decide whether to admit it to the release branch.

Jonathan
Christos Chatzaras
2021-Mar-30 16:01 UTC
FreeBSD 13.0-RC4 and Nginx process "stuck" during restart
> On 30 Mar 2021, at 18:40, Jonathan T. Looney <jtl at freebsd.org> wrote:
>
> Yes, it could be related. Because of the timing of when we first saw the behavior, I assumed the trigger for the behavior (which remains unknown) only existed in main. However, it is possible that this behavior is now being triggered in 13.0.
>
> Are you able to reliably reproduce this? I was never able to do that. A reliable reproduction may help narrow down the change which triggered this behavior.
>
> I can certainly MFC the patch to stable/13. re@ will need to decide whether to admit it to the release branch.
>
> Jonathan

Hello Jonathan,

I have 100+ nginx servers and I can reproduce it on many of them. It doesn't always happen on the same servers. Also, after a "killall -9 nginx && service nginx restart", a second "service nginx restart" a few seconds after the first one completes successfully. I believe nginx needs to have been running for a few minutes or hours before a restart gets stuck at "Waiting for PIDS: 85499."

Did you notice it with nginx too, or with something else?

I also run monit, bind, dovecot, postfix, mysql, php-fpm and pure-ftpd, and I haven't noticed a restart of any of these services getting stuck.
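
With this many servers, it may help to check quickly whether a PID that a restart is waiting on shows the same kernel stack as the one reported earlier in the thread. The following is only a rough sketch: the function names `stuck_in_sigsuspend` and `check_pid` are made up for illustration, and matching on `kern_sigsuspend` is an assumption based on the single stack trace posted above. A process that is merely idle can also sleep in sigsuspend, so a match is only meaningful for a PID that "service nginx restart" is already waiting on.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD or nginx): reads the output of
# `procstat -kk <pid>` on stdin and succeeds if any thread's kernel stack
# contains kern_sigsuspend, the frame seen in the trace from this thread.
stuck_in_sigsuspend() {
    grep -q 'kern_sigsuspend'
}

# Hypothetical wrapper: print a note when the given PID matches.
# Intended to be run for the PID shown in "Waiting for PIDS: <pid>."
check_pid() {
    pid="$1"
    if procstat -kk "$pid" 2>/dev/null | stuck_in_sigsuspend; then
        echo "PID $pid is sleeping in sigsuspend"
    fi
}
```

For example, if a restart hangs at "Waiting for PIDS: 85499.", running `check_pid 85499` from a second shell would show whether that process matches the reported stack before it is killed.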