bugzilla-daemon at natsu.mindrot.org
2013-Oct-31 06:43 UTC
[Bug 2167] New: Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

            Bug ID: 2167
           Summary: Connection remains when fork() fails.
           Product: Portable OpenSSH
           Version: 5.3p1
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: sshd
          Assignee: unassigned-bugs at mindrot.org
          Reporter: penguin-kernel at I-love.SAKURA.ne.jp

Created attachment 2368
  --> https://bugzilla.mindrot.org/attachment.cgi?id=2368&action=edit
A patch which seems to solve this problem.

I got "sshd[$pid]: fatal: fork of unprivileged child failed" in /var/log/secure, but the connection with the ssh client remained.

I examined the cause and found that this problem happens when fork() in privsep_preauth()/privsep_postauth() fails. You can easily reproduce it by replacing fork() in privsep_preauth()/privsep_postauth() with -1.

I don't know what the right fix is, but at least forcibly closing all sockets before exit() seems to solve this problem.

I'm using RHEL 6.4's openssh-5.3p1-84.1.el6.src.rpm, but I think this problem exists in any version that has privsep_preauth()/privsep_postauth().

Regards.

--
You are receiving this mail because:
You are watching the assignee of the bug.
bugzilla-daemon at natsu.mindrot.org
2013-Oct-31 06:44 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |penguin-kernel at I-love.SAKURA.ne.jp
bugzilla-daemon at natsu.mindrot.org
2013-Oct-31 21:00 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |djm at mindrot.org

--- Comment #1 from Damien Miller <djm at mindrot.org> ---
I can't see how this patch is needed. fatal() calls exit(), which closes all process file descriptors. If exit() isn't closing file descriptors then that is a kernel bug AFAIK.

--
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.
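The claim in Comment #1 (process exit closes every descriptor, so the peer sees the connection end) can be checked with a small standalone sketch. It is not taken from OpenSSH; the function name read_after_peer_exit() is hypothetical, and a local socket pair stands in for the client connection.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not OpenSSH code).
 * Returns the result of read() on the parent's end of a socket pair after
 * the child has exited without closing anything: 0 means EOF was seen,
 * -1 means a setup or read error.
 */
ssize_t
read_after_peer_exit(void)
{
    int pair[2];
    int status;
    char buf[1];
    pid_t pid;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(pair[0]);
        close(pair[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: leave both descriptors open and exit immediately. */
        _exit(255);
    }

    /* Parent: drop our copy of pair[1] so only the child still holds it. */
    close(pair[1]);
    if (waitpid(pid, &status, 0) == -1) {
        close(pair[0]);
        return -1;
    }
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 255);

    /* The kernel closed the child's descriptors at exit, so this is EOF. */
    n = read(pair[0], buf, sizeof(buf));
    close(pair[0]);
    return n;
}
```

A return value of 0 from read_after_peer_exit() means the parent saw EOF once the child was gone. This only holds if the process actually reaches _exit(), which is the point the following comments examine.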
bugzilla-daemon at natsu.mindrot.org
2013-Nov-01 02:39 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

--- Comment #2 from Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> ---
This is not a kernel bug. fatal() is designed to eventually call exit(), but this bug prevents the sshd process from ever reaching exit().

The child process calls fatal() when fork() fails in privsep_postauth().
(Note that fork() is replaced with -1 here for illustration.)

---------- privsep_postauth() in sshd.c ----------
static void
privsep_postauth(Authctxt *authctxt)
{
    u_int32_t rnd[256];

#ifdef DISABLE_FD_PASSING
    if (1) {
#else
    if (authctxt->pw->pw_uid == 0 || options.use_login) {
#endif
        /* File descriptor passing is broken or root login */
        use_privsep = 0;
        goto skip;
    }

    /* New socket pair */
    monitor_reinit(pmonitor);

    pmonitor->m_pid = -1; // Emulate the fork() failure.
    if (pmonitor->m_pid == -1)
        fatal("fork of unprivileged child failed");
(...snipped...)
---------- privsep_postauth() in sshd.c ----------

fatal() calls cleanup_exit(255).
(Note that dummy write(-1, "", $step) lines are inserted so that progress can be matched against the strace output.)

---------- fatal() in fatal.c ----------
void
fatal(const char *fmt,...)
{
    va_list args;

    va_start(args, fmt);
    do_log(SYSLOG_LEVEL_FATAL, fmt, args);
    va_end(args);
    write(-1, "", 0);
    cleanup_exit(255);
}
---------- fatal() in fatal.c ----------

cleanup_exit(255) should eventually call _exit(255).

---------- cleanup_exit() in sshd.c ----------
/* server specific fatal cleanup */
void
cleanup_exit(int i)
{
    static int in_cleanup;
    int is_privsep_child;

    write(-1, "", 1);
    /* cleanup_exit can be called at the very least from the privsep
       wrappers used for auditing.  Make sure we don't recurse
       indefinitely. */
    if (in_cleanup)
        _exit(i);
    write(-1, "", 2);
    in_cleanup = 1;

    if (the_authctxt)
        do_cleanup(the_authctxt);
    write(-1, "", 3);
    is_privsep_child = use_privsep && pmonitor != NULL && !mm_is_monitor();
    write(-1, "", 4);
    if (sensitive_data.host_keys != NULL)
        destroy_sensitive_data(is_privsep_child);
    write(-1, "", 5);
    packet_destroy_all(1, is_privsep_child);
#ifdef SSH_AUDIT_EVENTS
    /* done after do_cleanup so it can cancel the PAM auth 'thread' */
    write(-1, "", 6);
    if ((the_authctxt == NULL || !the_authctxt->authenticated) &&
        (!use_privsep || mm_is_monitor()))
        audit_event(SSH_CONNECTION_ABANDON);
#endif
    write(-1, "", 7);
    _exit(i);
}
---------- cleanup_exit() in sshd.c ----------

Did we reach the _exit(i) line? Let's check the strace output.

---------- strace log start ----------
[pid 17153] socketpair(PF_FILE, SOCK_STREAM, 0, [5, 6]) = 0
[pid 17153] fcntl(5, F_SETFD, FD_CLOEXEC) = 0
[pid 17153] fcntl(6, F_SETFD, FD_CLOEXEC) = 0
[pid 17153] sendto(4, "<82>Nov 1 11:11:08 sshd[17153]:"..., 73, MSG_NOSIGNAL, NULL, 0) = 73
[pid 17153] close(4) = 0
[pid 17153] write(4294967295, "", 0) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0", 1) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0G", 2) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0Go", 3) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0Got", 4) = -1 EBADF (Bad file descriptor)
[pid 17153] getuid() = 0
[pid 17153] write(5, "\0\0\0DR", 5) = 5
[pid 17153] write(5, "\0\0\0/c1:bd:33:a5:66:d5:83:2d:0c:9"..., 67) = 67
[pid 17153] read(5, ^C <unfinished ...>
Process 17147 detached
Process 17153 detached
---------- strace log end ----------

We can see that the child process reached the write(-1, "", 4) line but did not reach the write(-1, "", 5) line. This means that the child process is sleeping at

    if (sensitive_data.host_keys != NULL)
        destroy_sensitive_data(is_privsep_child);

trying to read data from fd == 5.

What is fd == 5 connected to? According to the strace output, fd == 5 and fd == 6 are a socket pair created by the monitor_reinit() call in privsep_postauth().

---------- monitor_reinit() in monitor.c ----------
void
monitor_reinit(struct monitor *mon)
{
    int pair[2];

    monitor_socketpair(pair);

    mon->m_recvfd = pair[0];
    mon->m_sendfd = pair[1];
}
---------- monitor_reinit() in monitor.c ----------

destroy_sensitive_data() wrote to fd == 5 and is now trying to read from fd == 5, but there is no writer on fd == 6. This is a deadlock caused by doing I/O on the wrong file descriptor. This is why calling shutdown() seemed to solve the problem.
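The blocking behaviour described in Comment #2 can be reproduced outside sshd. The following is a minimal sketch, assuming a plain AF_UNIX socket pair behaves like the one monitor_reinit() creates. The function name reply_wait_times_out() is hypothetical, and poll() with a timeout replaces the blocking read() so that the example terminates.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not OpenSSH code).
 * Writes a request on pair[0] and then waits up to timeout_ms for a reply
 * on the same descriptor, with nobody servicing pair[1].
 * Returns 1 if the wait timed out (a plain read() would block),
 * 0 if data became readable, -1 on error.
 */
int
reply_wait_times_out(int timeout_ms)
{
    int pair[2];
    struct pollfd pfd;
    int ready;

    /* A negative timeout would make poll() wait forever. */
    assert(timeout_ms >= 0);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
        return -1;

    /* The request is queued for the peer end, pair[1]; it does not loop back. */
    if (write(pair[0], "req", 3) != 3) {
        close(pair[0]);
        close(pair[1]);
        return -1;
    }

    /* No process reads pair[1] or writes a reply, so nothing arrives here. */
    pfd.fd = pair[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    ready = poll(&pfd, 1, timeout_ms);

    close(pair[0]);
    close(pair[1]);

    if (ready == -1)
        return -1;
    return ready == 0;
}
```

Calling reply_wait_times_out(200) is expected to report a timeout. Because the other end of the pair is still open in the same process, a blocking read() gets neither data nor EOF, which matches the unfinished read(5, ...) in the strace log.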
bugzilla-daemon at natsu.mindrot.org
2013-Nov-01 04:32 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #2368|0                           |1
        is obsolete|                            |

--- Comment #3 from Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> ---
Created attachment 2369
  --> https://bugzilla.mindrot.org/attachment.cgi?id=2369&action=edit
A patch which fixes this problem.
bugzilla-daemon at natsu.mindrot.org
2013-Nov-01 04:34 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

--- Comment #4 from Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> ---
OK. I found the exact location.

fatal("fork of unprivileged child failed") calls cleanup_exit(255).
cleanup_exit(255) calls destroy_sensitive_data(1).
destroy_sensitive_data(1) calls mm_audit_destroy_sensitive_data(fp, pid, uid).
mm_audit_destroy_sensitive_data() uses the global pmonitor->m_recvfd, which now refers to the newly created socket pair.

---------- mm_audit_destroy_sensitive_data() in monitor_wrap.c ----------
void
mm_audit_destroy_sensitive_data(const char *fp, pid_t pid, uid_t uid)
{
    Buffer m;

    buffer_init(&m);
    buffer_put_cstring(&m, fp);
    buffer_put_int64(&m, pid);
    buffer_put_int64(&m, uid);

    mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUDIT_SERVER_KEY_FREE, &m);
    mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_AUDIT_SERVER_KEY_FREE,
        &m);
    buffer_free(&m);
}
---------- mm_audit_destroy_sensitive_data() in monitor_wrap.c ----------

Regarding privsep_preauth(), pmonitor = monitor_init() causes this problem when fork() fails. Fortunately, a timeout was previously configured via alarm(), which eventually terminates the process even if fork() fails.

---------- privsep_preauth() in sshd.c ----------
static int
privsep_preauth(Authctxt *authctxt)
{
    int status;
    pid_t pid;

    /* Set up unprivileged child process to deal with network data */
    pmonitor = monitor_init();
    /* Store a pointer to the kex for later rekeying */
    pmonitor->m_pkex = &xxx_kex;

    pid = fork();
    if (pid == -1) {
        fatal("fork of unprivileged child failed");
(...snipped...)
---------- privsep_preauth() in sshd.c ----------

Regarding privsep_postauth(), monitor_reinit(pmonitor) causes this problem when fork() fails. Unfortunately, the timeout previously configured via alarm() has been disabled by this point, so the process waits forever when fork() fails.

---------- privsep_postauth() in sshd.c ----------
static void
privsep_postauth(Authctxt *authctxt)
{
    u_int32_t rnd[256];

#ifdef DISABLE_FD_PASSING
    if (1) {
#else
    if (authctxt->pw->pw_uid == 0 || options.use_login) {
#endif
        /* File descriptor passing is broken or root login */
        use_privsep = 0;
        goto skip;
    }

    /* New socket pair */
    monitor_reinit(pmonitor);

    pmonitor->m_pid = fork();
    if (pmonitor->m_pid == -1)
        fatal("fork of unprivileged child failed");
(...snipped...)
---------- privsep_postauth() in sshd.c ----------

I confirmed that the patch in Comment 3 fixes this problem.
bugzilla-daemon at natsu.mindrot.org
2013-Nov-08 10:58 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |major

--- Comment #5 from Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> ---
As I have received no response, I have set the importance to "major" because of the following impact.

This bug actually blocked an unattended ssh session (execution of a batch job) on an enterprise server. The worst case might be "sleeping forever while trying to kdump over ssh", which means that the kdump procedure is unable to reboot after a kernel panic, severely affecting the availability (downtime) of enterprise servers.
bugzilla-daemon at natsu.mindrot.org
2013-Nov-08 11:34 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

Darren Tucker <dtucker at zip.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dtucker at zip.com.au
             Blocks|                            |2130

--- Comment #6 from Darren Tucker <dtucker at zip.com.au> ---
look at this for next release
bugzilla-daemon at natsu.mindrot.org
2013-Nov-11 14:01 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

Petr Lautrbach <plautrba at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |plautrba at redhat.com
         Resolution|---                         |INVALID

--- Comment #7 from Petr Lautrbach <plautrba at redhat.com> ---
It's an issue of RHEL's downstream audit patches -
https://bugzilla.redhat.com/show_bug.cgi?id=1028643 :

openssh-5.3p1-audit.patch:
+       is_privsep_child = use_privsep && pmonitor != NULL && !mm_is_monitor();

This hunk needs a change.
bugzilla-daemon at mindrot.org
2015-Aug-11 13:03 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #8 from Damien Miller <djm at mindrot.org> ---
Set all RESOLVED bugs to CLOSED with release of OpenSSH 7.1