Konstantin Belousov
2015-Aug-27 16:26 UTC
Latest stable (r287104) bash leaves zombies on exit
On Thu, Aug 27, 2015 at 02:04:05PM +0200, Mark Martinec wrote:
> Pete French wrote:
> >
> > I updated to stable yesterday, plus updated all my ports to
> > the latest precompiled packages, but I am now seeing odd problems
> > with bash on exit. Sometimes it quits, but leaves a zombie
> > process... e.g.
> >
> >   PID TT  STAT    TIME COMMAND
> > 44308 v0  IW   0:00.00 -bash (bash)
> > 44312 v0  IW+  0:00.00 /bin/sh /usr/local/bin/startx -listen_tcp
> > 44325 v0  IW+  0:00.00 xinit xterm -listen_tcp -- /usr/local/bin/X :0 -auth /ho
> > 44328 v0  IW   0:00.00 /usr/local/bin/wmaker
> > 44340 v0  S    0:03.35 /usr/local/bin/wmaker --for-real
> > 49101 0-  Z+   0:02.73 <defunct>
> > 49314 1-  Z+   0:00.17 <defunct>
> > 56068 2   Ss   0:00.01 bash
> > 56498 2   R+   0:00.00 ps
> > 56074 3   Is   0:00.01 bash
> > 56076 3   S+   0:00.00 mail freebsd-stable at freebsd.org
> > 56308 4   Is+  0:00.01 bash
> >
> > That's the current 'ps' on this machine. The bash processes are running
> > inside an xterm, so I am not sure if the issue is with bash or the
> > terminal. Kind of puzzled!
>
> I can reproduce this easily, although not every time.
>
> Running 10.2 under KDE, with bash as a default shell:
> start xterm from a KDE 'konsole', then move to within the xterm
> and try closing it (^D or exit). More often than not the xterm
> will block and stay open, and the bash process within goes <defunct>.
>
> A normal kill of xterm has no effect, although a kill -9 to the
> xterm blows away the xterm and the init process then clears
> the bash zombie leftover. Seems like running a simple command
> like 'date' in xterm before trying to close it does increase
> the likelihood that xterm will block on exit.
>
> > Currently I have to reboot the machine periodically once I have
> > accumulated enough zombies to be annoying. It's not really a long
> > term solution though.
>
> There is no need to reboot, just kill -9 the hanging xterm processes
> and the init will clear the zombies.

Try to obtain the backtrace from the hung xterm. Ideally, you would
rebuild xterm and the system libraries (rtld+libc+libthr) with debug
symbols and get the backtraces after that.

> Try to obtain the backtrace from the hung xterm. Ideally, you would
> rebuild xterm and the system libraries (rtld+libc+libthr) with debug
> symbols and get the backtraces after that.

I can try this tomorrow - what do I need to set in src.conf to add
debug symbols when I do a buildworld (that's what I would end up
doing)? Never played with this bit - am hoping it's as simple as
'DEBUG=yes' for both ports and world though!

cheers,

-pete.

The xterm program has a SIGCHLD signal handler that calls wait(). If the
handler is invoked while xterm is exiting, a deadlock occurs in rtld.

Cheers

Michiel

#0  _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
#1  0x000000080305a2b0 in __thr_rwlock_rdlock (rwlock=0x803272980, flags=<value optimized out>, tsp=<value optimized out>)
    at /usr/src/lib/libthr/thread/thr_umtx.c:277
#2  0x000000080306179c in _thr_rtld_rlock_acquire (lock=0x803272980) at thr_umtx.h:196
#3  0x00000008006a72c2 in rlock_acquire (lock=0x8008ba860, lockstate=0x7fffffffd5b8)
    at /usr/src/libexec/rtld-elf/rtld_lock.c:201
#4  0x00000008006a0c8d in _rtld_bind (obj=0x8006bc000, reloff=6840) at /usr/src/libexec/rtld-elf/rtld.c:701
#5  0x000000080069e46d in _rtld_bind_start () at /usr/src/libexec/rtld-elf/amd64/rtld_start.S:121
#6  0x0000000000445d34 in reapchild (n=20) at main.c:5177
#7  <signal handler called>
#8  objlist_call_fini () at /usr/src/libexec/rtld-elf/rtld.c:769
#9  0x00000008006a0c2b in rtld_exit () at /usr/src/libexec/rtld-elf/rtld.c:2710
#10 0x00000008024e5406 in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:200
#11 0x000000080248692c in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:67
#12 0x0000000000445f35 in Exit (n=0) at main.c:5078
#13 0x0000000000456020 in Cleanup (code=0) at misc.c:5238
#14 0x000000000044da49 in NormalExit () at misc.c:5222
#15 0x000000000045a616 in readPtyData (xw=0x804cdc000, select_mask=0x6add80, data=0x804d64000) at ptydata.c:221
#16 0x0000000000421c48 in in_put (xw=0x804cdc000) at charproc.c:4700
#17 0x0000000000421b6a in doinput () at charproc.c:4856
#18 0x000000000041d992 in VTparse (xw=0x804cdc000) at charproc.c:4382
#19 0x000000000041d87a in VTRun (xw=0x804cdc000) at charproc.c:6997
#20 0x0000000000442c01 in main (argc=3, argv=0x7fffffffe6d0) at main.c:2607

#6  0x0000000000445d34 in reapchild (n=20) at main.c:5177
5177        pid = wait(NULL);
Current language:  auto; currently minimal
(gdb) l
5172        int olderrno = errno;
5173        int pid;
5174
5175        DEBUG_MSG("handle:reapchild\n");
5176
5177        pid = wait(NULL);
5178