bugzilla-daemon at natsu.mindrot.org
2013-Oct-31 06:43 UTC
[Bug 2167] New: Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
Bug ID: 2167
Summary: Connection remains when fork() fails.
Product: Portable OpenSSH
Version: 5.3p1
Hardware: Other
OS: Linux
Status: NEW
Severity: enhancement
Priority: P5
Component: sshd
Assignee: unassigned-bugs at mindrot.org
Reporter: penguin-kernel at I-love.SAKURA.ne.jp
Created attachment 2368
--> https://bugzilla.mindrot.org/attachment.cgi?id=2368&action=edit
A patch which seems to solve this problem.
I got "sshd[$pid]: fatal: fork of unprivileged child failed" in
/var/log/secure, but the connection with the ssh client remained open.
I examined the cause and found that this problem happens when fork() in
privsep_preauth()/privsep_postauth() fails. You can easily reproduce
this problem by replacing the fork() call in
privsep_preauth()/privsep_postauth() with -1.
I don't know what the right fix is, but at least forcibly closing
all sockets before exit() seems to solve this problem.
I'm using RHEL 6.4's openssh-5.3p1-84.1.el6.src.rpm, but I think
this problem exists in any version that has
privsep_preauth()/privsep_postauth().
Regards.
--
You are receiving this mail because:
You are watching the assignee of the bug.
bugzilla-daemon at natsu.mindrot.org
2013-Oct-31 06:44 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |penguin-kernel at I-love.SAKURA.ne.jp
bugzilla-daemon at natsu.mindrot.org
2013-Oct-31 21:00 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
Damien Miller <djm at mindrot.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |djm at mindrot.org
--- Comment #1 from Damien Miller <djm at mindrot.org> ---
I can't see how this patch is needed. fatal() calls exit(), which
closes all process file descriptors. If exit() isn't closing file
descriptors then that is a kernel bug AFAIK.
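The claim that exit() releases every descriptor can be checked in isolation. The following is a minimal sketch, not sshd code: the function name peer_sees_eof_after_exit() is made up for illustration, and a Unix-domain socket pair stands in for the client connection.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the parent sees EOF on its end of a
 * socket pair after the child calls _exit() without closing anything,
 * 0 if it does not, and -1 on setup errors.
 */
int
peer_sees_eof_after_exit(void)
{
	int pair[2];
	char c;
	pid_t pid;
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
		return -1;
	pid = fork();
	if (pid == -1) {
		close(pair[0]);
		close(pair[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: leave pair[1] open and exit without closing it. */
		close(pair[0]);
		_exit(255);
	}
	/* Parent: drop our copy of the child's end, then wait for EOF. */
	close(pair[1]);
	n = read(pair[0], &c, 1);
	close(pair[0]);
	waitpid(pid, NULL, 0);
	return n == 0;
}
```

If a process really reaches _exit(), its peer sees EOF, so a connection that stays open suggests that the process never got that far.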
bugzilla-daemon at natsu.mindrot.org
2013-Nov-01 02:39 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
--- Comment #2 from Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> ---
This is not a kernel bug. fatal() is designed to eventually call exit(),
but this bug prevents the sshd process from ever reaching exit().
The child process calls fatal() when fork() fails in privsep_postauth().
(Please note that fork() is replaced with -1 below for the sake of explanation.)
---------- privsep_postauth() in sshd.c ----------
static void
privsep_postauth(Authctxt *authctxt)
{
	u_int32_t rnd[256];

#ifdef DISABLE_FD_PASSING
	if (1) {
#else
	if (authctxt->pw->pw_uid == 0 || options.use_login) {
#endif
		/* File descriptor passing is broken or root login */
		use_privsep = 0;
		goto skip;
	}

	/* New socket pair */
	monitor_reinit(pmonitor);

	pmonitor->m_pid = -1; // Emulate the fork() failure.
	if (pmonitor->m_pid == -1)
		fatal("fork of unprivileged child failed");
(...snipped...)
---------- privsep_postauth() in sshd.c ----------
fatal() calls cleanup_exit(255). (Please note that dummy write(-1, "", $step)
lines are inserted so that progress can be matched against the strace output.)
---------- fatal() in fatal.c ----------
void
fatal(const char *fmt,...)
{
	va_list args;

	va_start(args, fmt);
	do_log(SYSLOG_LEVEL_FATAL, fmt, args);
	va_end(args);
	write(-1, "", 0);
	cleanup_exit(255);
}
---------- fatal() in fatal.c ----------
cleanup_exit(255) will eventually call _exit(255).
---------- cleanup_exit() in sshd.c ----------
/* server specific fatal cleanup */
void
cleanup_exit(int i)
{
	static int in_cleanup;
	int is_privsep_child;

	write(-1, "", 1);
	/* cleanup_exit can be called at the very least from the privsep
	   wrappers used for auditing. Make sure we don't recurse
	   indefinitely. */
	if (in_cleanup)
		_exit(i);
	write(-1, "", 2);
	in_cleanup = 1;
	if (the_authctxt)
		do_cleanup(the_authctxt);
	write(-1, "", 3);
	is_privsep_child = use_privsep && pmonitor != NULL &&
	    !mm_is_monitor();
	write(-1, "", 4);
	if (sensitive_data.host_keys != NULL)
		destroy_sensitive_data(is_privsep_child);
	write(-1, "", 5);
	packet_destroy_all(1, is_privsep_child);
#ifdef SSH_AUDIT_EVENTS
	/* done after do_cleanup so it can cancel the PAM auth 'thread' */
	write(-1, "", 6);
	if ((the_authctxt == NULL || !the_authctxt->authenticated) &&
	    (!use_privsep || mm_is_monitor()))
		audit_event(SSH_CONNECTION_ABANDON);
#endif
	write(-1, "", 7);
	_exit(i);
}
---------- cleanup_exit() in sshd.c ----------
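The write(-1, "", $step) markers work because a write() to an invalid descriptor fails immediately with EBADF, yet strace still prints the call together with its length argument. A minimal sketch of the same idea as a reusable helper follows; trace_step() is a made-up name, and the zero-filled buffer is an addition so that strace has nothing but zero bytes to decode, whereas passing "" with a longer length presumably makes it print whatever follows the literal in memory.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: emit a marker that shows up in strace as
 * write(-1, ..., step). The call always fails; the errno of the failed
 * call is returned so callers can confirm nothing was written.
 */
int
trace_step(size_t step)
{
	static const char zeros[64];

	/* Clamp so the length never exceeds the buffer we pass. */
	if (step > sizeof(zeros))
		step = sizeof(zeros);
	if (write(-1, zeros, step) == -1)
		return errno;
	return 0;
}
```

The last step number that appears in the strace log then identifies the last marker the process passed before it blocked.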
Did we reach the _exit(i) line? Let's check the strace output.
---------- strace log start ----------
[pid 17153] socketpair(PF_FILE, SOCK_STREAM, 0, [5, 6]) = 0
[pid 17153] fcntl(5, F_SETFD, FD_CLOEXEC) = 0
[pid 17153] fcntl(6, F_SETFD, FD_CLOEXEC) = 0
[pid 17153] sendto(4, "<82>Nov 1 11:11:08 sshd[17153]:"..., 73, MSG_NOSIGNAL, NULL, 0) = 73
[pid 17153] close(4) = 0
[pid 17153] write(4294967295, "", 0) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0", 1) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0G", 2) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0Go", 3) = -1 EBADF (Bad file descriptor)
[pid 17153] write(4294967295, "\0Got", 4) = -1 EBADF (Bad file descriptor)
[pid 17153] getuid() = 0
[pid 17153] write(5, "\0\0\0DR", 5) = 5
[pid 17153] write(5, "\0\0\0/c1:bd:33:a5:66:d5:83:2d:0c:9"..., 67) = 67
[pid 17153] read(5, ^C <unfinished ...>
Process 17147 detached
Process 17153 detached
---------- strace log end ----------
We can see that the child process reached the write(-1, "", 4) line but
did not reach the write(-1, "", 5) line. This means that the child process
is sleeping at
	if (sensitive_data.host_keys != NULL)
		destroy_sensitive_data(is_privsep_child);
trying to read data from fd == 5.
What is fd == 5 connected to?
According to the strace output, fd == 5 and fd == 6 are the socket pair
created by the monitor_reinit() call in privsep_postauth().
---------- monitor_reinit() in monitor.c ----------
void
monitor_reinit(struct monitor *mon)
{
	int pair[2];

	monitor_socketpair(pair);
	mon->m_recvfd = pair[0];
	mon->m_sendfd = pair[1];
}
---------- monitor_reinit() in monitor.c ----------
destroy_sensitive_data() wrote a request to fd == 5 and is now trying to
read the reply from fd == 5, but nothing ever writes to fd == 6, the other
end of the pair, so no reply can arrive.
This is a deadlock caused by doing I/O on the wrong file descriptor.
This is why calling shutdown() seemed to solve the problem.
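The wait on fd == 5 can be reproduced outside sshd. The sketch below is illustrative only: request_without_peer() and readable_within() are made-up names, and poll() with a short timeout stands in for the blocking read() so that the demonstration terminates. One process creates a socket pair, writes a request to one end, and then checks which end has data to read.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if fd becomes readable within timeout_ms, 0 on timeout, -1 on error. */
static int
readable_within(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int r = poll(&pfd, 1, timeout_ms);

	if (r == -1)
		return -1;
	return r > 0 && (pfd.revents & POLLIN) != 0;
}

/*
 * Hypothetical demonstration: write a request to pair[0] while nobody
 * services pair[1], then report where data is waiting.
 *   *reply_on_same_end    - could a reply be read back from pair[0]?
 *   *request_on_other_end - is the request sitting unread on pair[1]?
 * Returns 0 on success, -1 on setup errors.
 */
int
request_without_peer(int *reply_on_same_end, int *request_on_other_end)
{
	int pair[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
		return -1;
	if (write(pair[0], "req", 3) != 3) {
		close(pair[0]);
		close(pair[1]);
		return -1;
	}
	*reply_on_same_end = readable_within(pair[0], 100);
	*request_on_other_end = readable_within(pair[1], 100);
	close(pair[0]);
	close(pair[1]);
	return 0;
}
```

The request is queued on the far end and nothing ever arrives on the near end, so a blocking read() on the near end would sleep until some other process answered on the far end.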
bugzilla-daemon at natsu.mindrot.org
2013-Nov-01 04:32 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #2368 is obsolete|0 |1
--- Comment #3 from Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> ---
Created attachment 2369
--> https://bugzilla.mindrot.org/attachment.cgi?id=2369&action=edit
A patch which fixes this problem.
bugzilla-daemon at natsu.mindrot.org
2013-Nov-01 04:34 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
--- Comment #4 from Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> ---
OK. I found the exact location.
fatal("fork of unprivileged child failed") calls cleanup_exit(255).
cleanup_exit(255) calls destroy_sensitive_data(1).
destroy_sensitive_data(1) calls mm_audit_destroy_sensitive_data(fp, pid, uid).
mm_audit_destroy_sensitive_data() uses the global pmonitor->m_recvfd, which
now refers to one end of the new socket pair.
---------- mm_audit_destroy_sensitive_data() in monitor_wrap.c ----------
void
mm_audit_destroy_sensitive_data(const char *fp, pid_t pid, uid_t uid)
{
	Buffer m;

	buffer_init(&m);
	buffer_put_cstring(&m, fp);
	buffer_put_int64(&m, pid);
	buffer_put_int64(&m, uid);
	mm_request_send(pmonitor->m_recvfd,
	    MONITOR_REQ_AUDIT_SERVER_KEY_FREE, &m);
	mm_request_receive_expect(pmonitor->m_recvfd,
	    MONITOR_ANS_AUDIT_SERVER_KEY_FREE, &m);
	buffer_free(&m);
}
---------- mm_audit_destroy_sensitive_data() in monitor_wrap.c ----------
Regarding privsep_preauth(), pmonitor = monitor_init() causes this problem
when fork() fails. Fortunately, a timeout was previously configured via
alarm(), which will eventually terminate the process even if fork() fails.
---------- privsep_preauth() in sshd.c ----------
static int
privsep_preauth(Authctxt *authctxt)
{
	int status;
	pid_t pid;

	/* Set up unprivileged child process to deal with network data */
	pmonitor = monitor_init();
	/* Store a pointer to the kex for later rekeying */
	pmonitor->m_pkex = &xxx_kex;

	pid = fork();
	if (pid == -1) {
		fatal("fork of unprivileged child failed");
(...snipped...)
---------- privsep_preauth() in sshd.c ----------
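The effect of a pending alarm() on a process stuck in read() can be sketched as follows. This is not sshd code: blocked_read_killed_by() is a made-up name, and the child relies on the default SIGALRM disposition (terminate the process), whereas sshd installs its own handler.

```c
#include <assert.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demonstration: fork a child that arms alarm(timeout_sec)
 * and then blocks in read() on a socket pair nobody writes to.
 * Returns the signal that killed the child, 0 if it exited normally,
 * or -1 on error. timeout_sec must be non-zero, otherwise the child
 * would never terminate.
 */
int
blocked_read_killed_by(unsigned int timeout_sec)
{
	int pair[2], status;
	char c;
	pid_t pid;

	if (timeout_sec == 0)
		return -1;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
		return -1;
	pid = fork();
	if (pid == -1) {
		close(pair[0]);
		close(pair[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: default action for SIGALRM is to terminate. */
		signal(SIGALRM, SIG_DFL);
		alarm(timeout_sec);
		/* Blocks: both ends stay open and nobody writes. */
		(void)read(pair[0], &c, 1);
		_exit(0);
	}
	if (waitpid(pid, &status, 0) == -1)
		status = 0;
	close(pair[0]);
	close(pair[1]);
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With the alarm armed the stuck child is killed after the timeout. Without it, nothing would ever interrupt the read().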
Regarding privsep_postauth(), monitor_reinit(pmonitor) causes this problem
when fork() fails. Unfortunately, the timeout previously configured via
alarm() has already been disabled by then, so the process waits forever
when fork() fails.
---------- privsep_postauth() in sshd.c ----------
static void
privsep_postauth(Authctxt *authctxt)
{
	u_int32_t rnd[256];

#ifdef DISABLE_FD_PASSING
	if (1) {
#else
	if (authctxt->pw->pw_uid == 0 || options.use_login) {
#endif
		/* File descriptor passing is broken or root login */
		use_privsep = 0;
		goto skip;
	}

	/* New socket pair */
	monitor_reinit(pmonitor);

	pmonitor->m_pid = fork();
	if (pmonitor->m_pid == -1)
		fatal("fork of unprivileged child failed");
(...snipped...)
---------- privsep_postauth() in sshd.c ----------
I confirmed that the patch in Comment 3 can fix this problem.
bugzilla-daemon at natsu.mindrot.org
2013-Nov-08 10:58 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|enhancement |major
--- Comment #5 from Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp> ---
As I have received no response, I am raising the severity to "major"
because of the following impact.
This bug actually blocked an unattended ssh session (a batch job) on an
enterprise server. The worst case might be "sleeping forever while trying
to kdump over ssh", which means that the kdump procedure is unable to
reboot after a kernel panic, severely affecting the availability
(downtime) of enterprise servers.
bugzilla-daemon at natsu.mindrot.org
2013-Nov-08 11:34 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
Darren Tucker <dtucker at zip.com.au> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dtucker at zip.com.au
Blocks| |2130
--- Comment #6 from Darren Tucker <dtucker at zip.com.au> ---
look at this for next release
bugzilla-daemon at natsu.mindrot.org
2013-Nov-11 14:01 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
Petr Lautrbach <plautrba at redhat.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
CC| |plautrba at redhat.com
Resolution|--- |INVALID
--- Comment #7 from Petr Lautrbach <plautrba at redhat.com> ---
It's an issue in RHEL's downstream audit patches -
https://bugzilla.redhat.com/show_bug.cgi?id=1028643 :
openssh-5.3p1-audit.patch:
+ is_privsep_child = use_privsep && pmonitor != NULL && !mm_is_monitor();
This hunk needs a change.
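Why that expression classifies the failing process as a privsep child can be seen with a small model. This is a sketch under a stated assumption: mm_is_monitor() is not shown in this thread, and the model assumes it returns true only when pmonitor->m_pid is greater than zero. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the one field of struct monitor used here. */
struct monitor_model {
	pid_t m_pid;	/* fork() result: >0 monitor, 0 child, -1 failure */
};

/* ASSUMPTION: mirrors mm_is_monitor() as "m_pid holds a child pid". */
static int
model_is_monitor(const struct monitor_model *mon)
{
	return mon != NULL && mon->m_pid > 0;
}

/* The expression from the quoted hunk, applied to the model. */
int
model_is_privsep_child(int use_privsep, const struct monitor_model *mon)
{
	return use_privsep && mon != NULL && !model_is_monitor(mon);
}
```

Under that assumption a process whose fork() returned -1 is not a monitor, so the expression yields 1 and cleanup_exit() takes the path that sends a request to a monitor that does not exist.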
bugzilla-daemon at mindrot.org
2015-Aug-11 13:03 UTC
[Bug 2167] Connection remains when fork() fails.
https://bugzilla.mindrot.org/show_bug.cgi?id=2167
Damien Miller <djm at mindrot.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |CLOSED
--- Comment #8 from Damien Miller <djm at mindrot.org> ---
Set all RESOLVED bugs to CLOSED with release of OpenSSH 7.1