bugzilla-daemon at mindrot.org
2021-Mar-22 15:18 UTC
[Bug 3286] New: sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 Bug ID: 3286 Summary: sshd dumps core after authentication timeout Product: Portable OpenSSH Version: 8.5p1 Hardware: All OS: Linux Status: NEW Severity: normal Priority: P5 Component: sshd Assignee: unassigned-bugs at mindrot.org Reporter: daniel.harnett at gmail.com The sshd daemon in the latest OpenSSH release (8.5p1) running under Arch Linux occasionally dumps core when an authentication attempt times out. This doesn't appear to be specific to Arch. It seems pretty easy to reproduce by just leaving a ssh client waiting at the password prompt for the duration of LoginGraceTime. Mar 17 16:26:38 localhost systemd-coredump[4024]: [??] Process 4019 (sshd) of user 65534 dumped core. Stack trace of thread 4019: #0 0x00007fe3cf8f68fb __socket (libc.so.6 + 0x1008fb) #1 0x00007fe3cf8ef013 openlog_internal (libc.so.6 + 0xf9013) #2 0x00007fe3cf8ef4f7 __vsyslog_internal (libc.so.6 + 0xf94f7) #3 0x00007fe3cf8ef803 __syslog_chk (libc.so.6 + 0xf9803) #4 0x00005626488bcd4c n/a (sshd + 0x65d4c) #5 0x00005626488ba1ec n/a (sshd + 0x631ec) #6 0x000056264888ada9 n/a (sshd + 0x33da9) #7 0x00005626488bccc5 n/a (sshd + 0x65cc5) #8 0x00005626488bd0b1 n/a (sshd + 0x660b1) #9 0x0000562648866376 n/a (sshd + 0xf376) #10 0x00007fe3cf832f80 __restore_rt (libc.so.6 + 0x3cf80) #11 0x00007fe3cf8eca67 __select (libc.so.6 + 0xf6a67) #12 0x00005626488c3745 n/a (sshd + 0x6c745) #13 0x00005626488c96b3 n/a (sshd + 0x726b3) #14 0x00005626488c97b9 n/a (sshd + 0x727b9) #15 0x00005626488638ca n/a (sshd + 0xc8ca) #16 0x00007fe3cf81db25 __libc_start_main (libc.so.6 + 0x27b25) #17 0x000056264886609e n/a (sshd + 0xf09e) Compiled with debugging info, gdb gives the following backtrace: Using host libthread_db library "/usr/lib/libthread_db.so.1". Core was generated by `sshd: unknown [net]'. Program terminated with signal SIGSYS, Bad system call. #0 0x00007fb7ee8e28fb in socket () from /usr/lib/libc.so.6 (gdb) bt #0 0x00007fb7ee8e28fb in socket () from /usr/lib/libc.so.6 #1 0x00007fb7ee8db013 in openlog_internal () from /usr/lib/libc.so.6 #2 0x00007fb7ee8db4f7 in __vsyslog_internal () from /usr/lib/libc.so.6 #3 0x00007fb7ee8db803 in __syslog_chk () from /usr/lib/libc.so.6 #4 0x000055d42c649d4c in syslog (__fmt=0x55d42c6922f0 "%.500s", __pri=2) at /usr/include/bits/syslog.h:31 #5 do_log (args=0x7ffe1d1895e0, fmt=0x7ffe1d189100 "mm_log_handler: write: %s", suffix=0x0, force=, level=SYSLOG_LEVEL_FATAL, line=111, func=0x55d42c68a0a0 <__func__.27> "mm_log_handler", file=0x55d42c689940 "monitor_wrap.c") at log.c:416 #6 sshlogv (file=0x55d42c689940 "monitor_wrap.c", func=0x55d42c68a0a0 <__func__.27> "mm_log_handler", line=111, showfunc=, level=SYSLOG_LEVEL_FATAL, suffix=0x0, fmt=0x55d42c686cc2 "write: %s", args=0x7ffe1d1895e0) at log.c:485 #7 0x000055d42c6471ec in sshfatal (file=file at entry=0x55d42c689940 "monitor_wrap.c", func=func at entry=0x55d42c68a0a0 <__func__.27> "mm_log_handler", line=line at entry=111, showfunc=showfunc at entry=1, level=level at entry=SYSLOG_LEVEL_FATAL, suffix=suffix at entry=0x0, fmt=0x55d42c686cc2 "write: %s") at fatal.c:43 #8 0x000055d42c617da9 in mm_log_handler (file=0x55d42c680004 "sshd.c", func=, line=, level=SYSLOG_LEVEL_FATAL, msg=0x7ffe1d189bd0 "Timeout before authentication for 208.105.190.83 port 30569", ctx=0x55d42e235e40) at monitor_wrap.c:111 #9 0x000055d42c649cc5 in do_log (args=0x7ffe1d18a4b0, fmt=0x7ffe1d189fd0 "Timeout before authentication for %s port %d", suffix=0x0, force=, level=SYSLOG_LEVEL_FATAL, line=370, func=0x55d42c6818f0 <__func__.26> "grace_alarm_handler", file=0x55d42c680004 "sshd.c") at log.c:403 #10 sshlogv (file=0x55d42c680004 "sshd.c", func=0x55d42c6818f0 <__func__.26> "grace_alarm_handler", line=370, showfunc=, level=level at entry=SYSLOG_LEVEL_FATAL, suffix=0x0, fmt=0x55d42c680700 "Timeout before authentication for %s port %d", args=0x7ffe1d18a4b0) at log.c:485 #11 0x000055d42c64a0b1 in sshsigdie (file=file at entry=0x55d42c680004 "sshd.c", func=func at entry=0x55d42c6818f0 <__func__.26> "grace_alarm_handler", line=line at entry=370, showfunc=showfunc at entry=0, level=level at entry=SYSLOG_LEVEL_ERROR, suffix=suffix at entry=0x0, fmt=0x55d42c680700 "Timeout before authentication for %s port %d") at log.c:454 #12 0x000055d42c5f3376 in grace_alarm_handler (sig=) at sshd.c:370 #13 #14 0x00007fb7ee8d8a67 in select () from /usr/lib/libc.so.6 #15 0x000055d42c650745 in ssh_packet_read_seqnr (ssh=ssh at entry=0x55d42e2442e0, typep=typep at entry=0x7ffe1d18cc73 "", seqnr_p=seqnr_p at entry=0x7ffe1d18cc74) at packet.c:1364 #16 0x000055d42c6566b3 in ssh_dispatch_run (ssh=ssh at entry=0x55d42e2442e0, mode=mode at entry=0, done=done at entry=0x55d42e245fc0) at dispatch.c:96 #17 0x000055d42c6567b9 in ssh_dispatch_run_fatal (ssh=ssh at entry=0x55d42e2442e0, mode=mode at entry=0, done=done at entry=0x55d42e245fc0) at dispatch.c:133 #18 0x000055d42c603411 in do_authentication2 (ssh=ssh at entry=0x55d42e2442e0) at auth2.c:176 #19 0x000055d42c5f0903 in main (ac=, av=) at sshd.c:2218 It looks like the parent process closes the logging socket before the child process has written to syslog, and glibc then tries to re-open the socket, which isn't allowed in the seccomp sandbox. strace of a child process when everything works (no core dump; managed to write to the socket before the parent closed it): 3774 pselect6(5, [4<socket:[83561]>], NULL, NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted if no handler) 3774 --- SIGALRM {si_signo=SIGALRM, si_code=SI_USER, si_pid=3773, si_uid=0} --- 3774 getpgid(0) = 3773 3774 getpid() = 3774 3774 write(8<pipe:[83581]>, "\0\0\0h\0\0\0\6sshd.c\0\0\0\23grace_alarm_ha"..., 108) = 108 3774 exit_group(1) = ? 3774 +++ exited with 1 +++ strace of a child process when this issue is triggered (core dump; the parent has already closed the socket): 3799 pselect6(5, [4<socket:[83972]>], NULL, NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted if no handler) 3799 --- SIGALRM {si_signo=SIGALRM, si_code=SI_USER, si_pid=3798, si_uid=0} --- 3799 getpgid(0) = 3798 3799 getpid() = 3799 3799 write(8<pipe:[83992]>, "\0\0\0h\0\0\0\6sshd.c\0\0\0\23grace_alarm_ha"..., 108) = -1 EPIPE (Broken pipe) 3799 openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied) 3799 getpid() = 3799 3799 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = ? 3799 +++ killed by SIGSYS (core dumped) +++ strace of a parent process: 3773 restart_syscall(<... resuming interrupted read ...>) = ? ERESTART_RESTARTBLOCK (Interrupted by signal) 3773 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- 3773 kill(3774, SIGALRM) = 0 3773 getpgid(0) = 3773 3773 getpid() = 3773 3773 rt_sigaction(SIGTERM, {sa_handler=SIG_IGN, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f859affef80}, {sa_handler=SIG_DFL, sa_mask=~[KILL STOP RTMIN RT_1], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f859affef80}, 8) = 0 3773 kill(0, SIGTERM) = 0 3773 getpid() = 3773 3773 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 5<socket:[83963]> 3773 connect(5<socket:[83963]>, {sa_family=AF_UNIX, sun_path="/dev/log"}, 110) = 0 3773 sendto(5<socket:[83963]>, "<34>Mar 17 15:26:59 sshd[3773]: "..., 98, MSG_NOSIGNAL, NULL, 0) = 98 3773 close(5<socket:[83963]>) = 0 3773 exit_group(1) = ? 3773 +++ exited with 1 +++ -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Mar-22 15:23 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 --- Comment #1 from Dan Harnett <daniel.harnett at gmail.com> --- Created attachment 3482 --> https://bugzilla.mindrot.org/attachment.cgi?id=3482&action=edit systemd log -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Mar-22 15:23 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 --- Comment #2 from Dan Harnett <daniel.harnett at gmail.com> --- Created attachment 3483 --> https://bugzilla.mindrot.org/attachment.cgi?id=3483&action=edit gdb backtrace of sshd compiled with debugging info -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Mar-22 15:24 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 --- Comment #3 from Dan Harnett <daniel.harnett at gmail.com> --- Created attachment 3484 --> https://bugzilla.mindrot.org/attachment.cgi?id=3484&action=edit strace when the child logs successfully -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Mar-22 15:24 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 --- Comment #4 from Dan Harnett <daniel.harnett at gmail.com> --- Created attachment 3485 --> https://bugzilla.mindrot.org/attachment.cgi?id=3485&action=edit strace when the child fails to log -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Mar-22 15:25 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 --- Comment #5 from Dan Harnett <daniel.harnett at gmail.com> --- Created attachment 3486 --> https://bugzilla.mindrot.org/attachment.cgi?id=3486&action=edit strace of the parent, includes closing the logging socket -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Mar-22 15:26 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 --- Comment #6 from Dan Harnett <daniel.harnett at gmail.com> --- My apologies for the inline traces. I've added them as attachments. -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Mar-23 03:39 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |djm at mindrot.org --- Comment #7 from Damien Miller <djm at mindrot.org> --- It looks like we don't reset the LoginGraceTime SIGALRM handler in the privsep child. Perhaps this handler is racing the monitor process to terminate the session on timer expiry, but running afoul of the seccomp-bpf sandbox when it tries to create a socket to talk to syslogd. That doesn't explain why the logging isn't going via the monitor as it should be. A possibility here is the recursion protection in log.c:do_log() - if the signal arrived while a logging call was active, then the logging code falls back to attempting to syslog the message directly. AFAIK this hasn't changed recently though. In any case, we should skip the sigdie() in the privsep child, but I'd still like to figure out why this is triggering when it wasn't before. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Mar-23 03:39 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 --- Comment #8 from Damien Miller <djm at mindrot.org> --- oh, thanks for the detailed diagnostic information. It's a huge help when someone takes the time to get a backtrace with symbols. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2021-Mar-24 23:50 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 elie.huvier at fastmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |elie.huvier at fastmail.com -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Apr-30 05:25 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|unassigned-bugs at mindrot.org |djm at mindrot.org Status|NEW |ASSIGNED CC| |dtucker at dtucker.net Attachment #3504| |ok?(dtucker at dtucker.net) Flags| | --- Comment #9 from Damien Miller <djm at mindrot.org> --- Created attachment 3504 --> https://bugzilla.mindrot.org/attachment.cgi?id=3504&action=edit avoid sigdie() in privsep child This patch should avoid the problem, but I still don't understand why this condition did not exist previously. Darren, is there any chance the ssh_signal change could have triggered this? -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2021-Apr-30 06:08 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 Darren Tucker <dtucker at dtucker.net> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #3504|ok?(dtucker at dtucker.net) |ok+ Flags| | --- Comment #10 from Darren Tucker <dtucker at dtucker.net> --- Comment on attachment 3504 --> https://bugzilla.mindrot.org/attachment.cgi?id=3504 avoid sigdie() in privsep child I could imagine it happening if signal(SIGALRM..) was reset during fork but not for sigaction but I don't known that this is the case. The man page for alarm(1) only says: Alarms created by alarm() are preserved across execve(2) and are not inherited by children created via fork(2). -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2021-May-07 03:14 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |FIXED Status|ASSIGNED |RESOLVED Blocks| |3302 --- Comment #11 from Damien Miller <djm at mindrot.org> --- committed as 8c396f42c19 and will be in the OpenSSH 8.7 release Referenced Bugs: https://bugzilla.mindrot.org/show_bug.cgi?id=3302 [Bug 3302] Tracking bug for openssh-8.7 -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2022-Feb-25 02:59 UTC
[Bug 3286] sshd dumps core after authentication timeout
https://bugzilla.mindrot.org/show_bug.cgi?id=3286 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED --- Comment #12 from Damien Miller <djm at mindrot.org> --- closing bugs resolved before openssh-8.9 -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.