bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-13 19:44 UTC
[Bug 2646] New: zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

            Bug ID: 2646
           Summary: zombie processes when using privilege separation
           Product: Portable OpenSSH
           Version: 7.2p2
          Hardware: ix86
                OS: Linux
            Status: NEW
          Severity: minor
          Priority: P5
         Component: sshd
          Assignee: unassigned-bugs at mindrot.org
          Reporter: akshay.moghe at gmail.com

I'm using `OpenSSH_7.2p2 Ubuntu-4ubuntu1, OpenSSL 1.0.2g-fips` and I've explicitly enabled UsePrivilegeSeparation. With this I notice that the [priv] process does not get reaped by its parent (sshd) and as a result is adopted by whatever pid 1 happens to be. Normally this is okay since most init systems will handle this correctly, however in containers we might encounter homemade "init" systems that only serve to propagate signals but don't reap adopted zombie processes. In such cases we accumulate these zombies over time and can lead to obvious problems.

Is there any reason that sshd can't reap its children after they exit?

--
You are receiving this mail because:
You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-13 19:44 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

Akshay <akshay.moghe at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |akshay.moghe at gmail.com
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-13 19:49 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #1 from Akshay <akshay.moghe at gmail.com> ---
Steps to reproduce the issue:
- using a docker container running phusion/baseimage:latest.
- modify sshd_config to explicitly enable UsePrivilegeSeparation
- start sshd
- trace the init process in the container
- ssh into the container, then exit
- notice that the init process ends up 'wait'ing for the zombied sshd

Alternatively
- hack up an 'init' process that simply launches sshd in the container
- log in, log out
- notice `ps auxf` listing in the container now has zombie ssh process
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-13 21:39 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

Darren Tucker <dtucker at zip.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dtucker at zip.com.au

--- Comment #2 from Darren Tucker <dtucker at zip.com.au> ---
(In reply to Akshay from comment #0)
> I'm using `OpenSSH_7.2p2 Ubuntu-4ubuntu1, OpenSSL 1.0.2g-fips` and

That's a vendor-modified version of OpenSSH. Can you reproduce the problem with a binary built from the stock sources from openssh.com? What command line flags is sshd invoked with?

> Is there any reason that sshd can't reap its children after they
> exit?

It does (or at least it should):
https://anongit.mindrot.org/openssh.git/tree/sshd.c#n317
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-14 22:51 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #3 from Akshay <akshay.moghe at gmail.com> ---
> Can you reproduce the problem with a binary built from the stock sources from openssh.com

Sure, I'll go ahead and do that

> What command line flags is sshd invoked with

I'll provide those as well
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-14 22:58 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #4 from Akshay <akshay.moghe at gmail.com> ---
Okay, I was able to reproduce the issue using `OpenSSH_7.2p2, OpenSSL 1.0.2g 1 Mar 2016`

First, I have a simple 'init' program that runs in a container. All it does is launch sshd and wait for the TERM signal. On receipt of TERM, it TERMs sshd, and exits. So, initially, here is what I see:

root@4871a0e3589e:/# ps auxf
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         9  0.0  0.0  18248  3384 ?        Ss   22:47   0:00 bash
root        19  0.0  0.0  34424  2820 ?        R+   22:48   0:00  \_ ps auxf
root         1  0.4  0.0  40364  8220 ?        Ssl+ 22:47   0:00 /usr/bin/ruby -- /init.rb
root         8  0.0  0.0  26468  3844 ?        S+   22:47   0:00 /usr/sbin/sshd -D

The bash process (that spawns ps) is 'exec'd in the container using docker exec so that I can view the process listing "out-of-band" (i.e. without exercising sshd).

Next, I log in, and list the processes (in-band, this time). This is what I see:

nsadmin@4871a0e3589e:~$ ps auxf
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  40364  8220 ?        Ssl+ 22:47   0:00 /usr/bin/ruby -- /init.rb
root         8  0.0  0.0  26468  3844 ?        S+   22:47   0:00 /usr/sbin/sshd -D
root        20  0.0  0.0  29028  4532 ?        Ss   22:48   0:00  \_ sshd: nsadmin [priv]
nsadmin     22  0.0  0.0  29028  2624 ?        S    22:48   0:00      \_ sshd: nsadmin@pts/0
nsadmin     23  0.0  0.0  18256  3216 pts/0    Ss   22:48   0:00          \_ -bash
nsadmin     28  0.0  0.0  34424  2932 pts/0    R+   22:48   0:00              \_ ps auxf

Then, I log out of the ssh session, and get the process listing using an exec'd shell:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        29  0.0  0.0  18248  3264 ?        Ss   22:48   0:00 /bin/bash
root        40  0.0  0.0  34424  2876 ?        R+   22:48   0:00  \_ ps auxf
root         1  0.0  0.0  40364  8220 ?        Ssl+ 22:47   0:00 /usr/bin/ruby -- /init.rb
root         8  0.0  0.0  26468  3844 ?        S+   22:47   0:00 /usr/sbin/sshd -D
nsadmin     22  0.0  0.0      0     0 ?        Z    22:48   0:00 [sshd] <defunct>
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-14 23:32 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #5 from Darren Tucker <dtucker at zip.com.au> ---
(In reply to Akshay from comment #4)
> Okay, I was able to reproduce the issue using `OpenSSH_7.2p2,
> OpenSSL 1.0.2g 1 Mar 2016`

Thanks.

> nsadmin 22 0.0 0.0 0 0 ? Z 22:48 0:00
> [sshd] <defunct>

If I'm reading this correctly that's the post-auth unprivileged process (pid 22 in this example) not the [priv] process (pid 20 in this example).

I think I can see how this would happen. After accepting the connection and forking off a copy, sshd re-execs itself with the "-R" flag in order to (hopefully) get a new address space layout. -R sets:

		case 'R':
			rexeced_flag = 1;
			inetd_flag = 1;

then a bit later when the signal handlers are set up:

	/* Get a connection, either from inetd or a listening TCP socket */
	if (inetd_flag) {
		server_accept_inetd(&sock_in, &sock_out);
	} else {
		[...]
		signal(SIGCHLD, main_sigchld_handler);

You can test this theory by running your sshd with the (undocumented) "-r" option to disable the re-exec.
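
The body of `main_sigchld_handler` is not quoted in this thread, so the following is only a generic sketch of what installing a reaping SIGCHLD handler usually looks like on POSIX systems, not OpenSSH's implementation. The names `reap_on_sigchld`, `install_sigchld_reaper` and `reaped_children` are invented for the example. Such a handler only collects the installing process's own children; it does nothing for a process whose parent has already exited.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Number of children collected by the handler (illustrative counter). */
static volatile sig_atomic_t reaped_children = 0;

/* Async-signal-safe handler: collect every child that has already exited. */
static void reap_on_sigchld(int signo)
{
    int saved_errno = errno;    /* waitpid() may clobber errno */

    (void)signo;
    /* One SIGCHLD can stand for several exits, so loop until none remain. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_children++;
    errno = saved_errno;
}

/* Install the handler; returns 0 on success and -1 on error. */
static int install_sigchld_reaper(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = reap_on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

A code path that never calls something like `install_sigchld_reaper()` leaves SIGCHLD at its default disposition, so exited children stay in the process table until the parent waits for them explicitly or exits itself.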
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-14 23:37 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #6 from Darren Tucker <dtucker at zip.com.au> ---
Created attachment 2914
  --> https://bugzilla.mindrot.org/attachment.cgi?id=2914&action=edit
Add sigchld handler to inetd mode path

I think this patch would also fix it. Could you please try it?
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-15 01:16 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #7 from Akshay <akshay.moghe at gmail.com> ---
Here is what happened when I tested with the '-r' option:

Initially...

root@4871a0e3589e:/# ps auxf
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         9  0.0  0.0  18248  3308 ?        Ss   01:14   0:00 /bin/bash
root        27  0.0  0.0  34424  2908 ?        R+   01:14   0:00  \_ ps auxf
root         1  0.1  0.0  40356  8196 ?        Ssl+ 01:14   0:00 /usr/bin/ruby -- /init.rb
root         8  0.0  0.0  26468  3772 ?        S+   01:14   0:00 /usr/sbin/sshd -D -r
root        19  0.0  0.0  29028  4084 ?        Ss   01:14   0:00  \_ sshd: nsadmin [priv]
nsadmin     21  0.0  0.0  29028  2668 ?        S    01:14   0:00      \_ sshd: nsadmin@pts/0
nsadmin     22  0.0  0.0  18252  3204 pts/0    Ss+  01:14   0:00          \_ -bash

Later, (after login then logout)...

root@4871a0e3589e:/# ps auxf
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         9  0.0  0.0  18248  3324 ?        Ss   01:14   0:00 /bin/bash
root        29  0.0  0.0  34424  2824 ?        R+   01:14   0:00  \_ ps auxf
root         1  0.1  0.0  40356  8196 ?        Ssl+ 01:14   0:00 /usr/bin/ruby -- /init.rb
root         8  0.0  0.0  26468  3772 ?        S+   01:14   0:00 /usr/sbin/sshd -D -r
nsadmin     21  0.0  0.0      0     0 ?        Z    01:14   0:00 [sshd] <defunct>
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-15 07:57 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #8 from Akshay <akshay.moghe at gmail.com> ---
Also, adding the one-line patch you suggested (on to 7.2p2*) does not fix the problem. I still see processes marked 'defunct' once I log out.

* = your patch was probably on a different branch, because the line numbers didn't seem to align. I was able to find the appropriate line using the comment above it.
bugzilla-daemon at bugzilla.mindrot.org
2017-Jan-06 03:24 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |djm at mindrot.org

--- Comment #9 from Damien Miller <djm at mindrot.org> ---
(In reply to Akshay from comment #7)

I think this is a bug in your init program. We could probably tell more clearly if you include PPID in your process lists (e.g. "ps ajf").

Here is the process list from when the session is active:

> root@4871a0e3589e:/# ps auxf
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root         8  0.0  0.0  26468  3772 ?        S+   01:14   0:00 /usr/sbin/sshd -D -r

^^ this sshd process (pid=8) is listening to the network.

> root        19  0.0  0.0  29028  4084 ?        Ss   01:14   0:00  \_ sshd: nsadmin [priv]

^^ this one (pid=19) is the privilege separation monitor process.

> nsadmin     21  0.0  0.0  29028  2668 ?        S    01:14   0:00      \_ sshd: nsadmin@pts/0

^^ this one is the low-privilege child process.

> Later, (after login then logout)...
>
> root@4871a0e3589e:/# ps auxf
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root         8  0.0  0.0  26468  3772 ?        S+   01:14   0:00 /usr/sbin/sshd -D -r

^^ the listener process is still here.

> nsadmin     21  0.0  0.0      0     0 ?        Z    01:14   0:00 [sshd] <defunct>

This process was previously a child of the monitor process on pid=19, but its parent has already exited, so it's not around to call waitpid() to reap it. In this situation, init is supposed to do the reaping since pid=21 is clearly orphaned.

See https://en.wikipedia.org/wiki/Zombie_process for a bit more detail on how this is supposed to flow.

This might be your problem:
https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
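
The orphaning described in comment #9 can be reproduced in miniature. The sketch below is not OpenSSH code. It assumes Linux 3.4 or later, where `prctl(PR_SET_CHILD_SUBREAPER)` lets an ordinary process stand in for pid 1, and the function name `adopt_and_reap_orphan` is invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Model of the pid=19 / pid=21 situation: a "monitor" forks a "worker" and
 * exits first, so the worker is orphaned. The caller stands in for init by
 * marking itself a child subreaper (Linux-specific). Returns how many
 * processes the caller reaped, or -1 on error; 2 means the orphan was
 * re-parented to the caller and collected there.
 */
static int adopt_and_reap_orphan(void)
{
    pid_t monitor;
    int reaped = 0;

    if (prctl(PR_SET_CHILD_SUBREAPER, 1UL, 0UL, 0UL, 0UL) == -1)
        return -1;

    monitor = fork();
    if (monitor == -1)
        return -1;
    if (monitor == 0) {
        pid_t worker = fork();
        if (worker == 0) {
            sleep(1);                   /* outlive the monitor */
            _exit(0);
        }
        _exit(worker == -1 ? 1 : 0);    /* monitor exits without waiting */
    }

    /* What a working init does: keep waiting until no children are left. */
    for (;;) {
        pid_t pid = waitpid(-1, NULL, 0);
        if (pid > 0)
            reaped++;
        else if (errno != EINTR)
            break;                      /* ECHILD: nothing left to reap */
    }
    return reaped;
}
```

If the waiting loop at the end is removed, the worker stays behind as a defunct entry owned by the caller, which matches the listing in comment #7 with the Ruby init in the caller's role.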
bugzilla-daemon at bugzilla.mindrot.org
2017-Jan-06 21:23 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #10 from Akshay <akshay.moghe at gmail.com> ---
> In this situation, init is supposed to do the reaping

I understand that this is how normal systems might work. But as I mentioned in comment-1...

> does not get reaped by its parent (sshd) and as a result is adopted by whatever pid 1 happens to be. Normally this is okay since most init systems will handle this correctly, however in containers we might encounter homemade "init" systems that only serve to propagate signals but don't reap adopted zombie processes. In such cases we accumulate these zombies over time and can lead to obvious problems.

Is there any reason that sshd can't reap its children after they exit?

So the original intent of filing the bug was to find out if sshd behavior could be changed so that all parents are around long enough to reap the children and then exit, thereby leaving no zombies.
bugzilla-daemon at bugzilla.mindrot.org
2017-Jan-06 21:25 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #11 from Akshay <akshay.moghe at gmail.com> ---
> Is there any reason that sshd can't reap its children after they exit?

To be specific, I meant to ask if there is a reason the priv sep process doesn't wait around till its children exit.
bugzilla-daemon at bugzilla.mindrot.org
2017-Feb-03 04:25 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

--- Comment #12 from Damien Miller <djm at mindrot.org> ---
I don't want to add code to sshd to workaround broken init systems. init behaviour is basic system functionality that we shouldn't have to kludge around.
bugzilla-daemon at bugzilla.mindrot.org
2017-Aug-11 04:17 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WORKSFORME
bugzilla-daemon at bugzilla.mindrot.org
2018-Apr-06 02:26 UTC
[Bug 2646] zombie processes when using privilege separation
https://bugzilla.mindrot.org/show_bug.cgi?id=2646

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #13 from Damien Miller <djm at mindrot.org> ---
Close all resolved bugs after release of OpenSSH 7.7.