So, we just found some ugly behaviour of OpenSSH on Solaris.

Sometimes, it seems, sshd gets started with SIGCHLD blocked, this apparently being the setting of sshd's parent (a shell, no doubt); signal blocking is inherited across exec*(). I don't know exactly which shell, or what is really at fault, but it happens.

The problem is that the code in collect_children() first blocks SIGCHLD (SIGCLD) and then resets the signal mask to whatever it was before, so if SIGCHLD was blocked to begin with, it never gets unblocked in sshd. The resulting behaviour is that SSHv2 connections may hang.

The Solaris proc tools, specifically /usr/proc/bin/psig, along with truss/strace, show the bug in action quite nicely.

Even if this behaviour is not strictly a bug in OpenSSH, it may nonetheless be desirable to add a couple of calls to sigprocmask() in sshd.c:main() to make sure that SIGCHLD is not blocked.

While looking at this I noticed that the compatibility shim for sigprocmask() in openbsd-compat/sigact.c has a bug: the second argument may be NULL, but the shim does not check for this.

A patch to openbsd-compat/sigact.c:sigprocmask() and sshd.c:main() is attached.

Thoughts? Should I file a bug report in bugzilla?

Cheers,

Nico
--
-DISCLAIMER: an automatically appended disclaimer may follow. By posting-
-to a public e-mail mailing list I hereby grant permission to distribute-
-and copy this message.-

-------------- next part --------------
Index: 3_0_2p1_w_gssk5_ubsw_prod.2/openbsd-compat/sigact.c
--- 3_0_2p1_w_gssk5_ubsw_prod.2/openbsd-compat/sigact.c Wed, 21 Nov 2001 10:38:46 -0500
+++ 3_0_2p1_w_gssk5_ubsw_prod.2(w)/openbsd-compat/sigact.c Fri, 07 Jun 2002 15:42:50 -0400
@@ -61,6 +61,7 @@
 	sigset_t current = sigsetmask(0);
 	if (omask) *omask = current;
+	if (!mask) return 0;
 	if (mode==SIG_BLOCK)
 		current |= *mask;
Index: 3_0_2p1_w_gssk5_ubsw_prod.2/sshd.c
--- 3_0_2p1_w_gssk5_ubsw_prod.2/sshd.c Thu, 17 Jan 2002 17:53:49 -0500
+++ 3_0_2p1_w_gssk5_ubsw_prod.2(w)/sshd.c Fri, 07 Jun 2002 15:53:22 -0400
@@ -556,6 +556,11 @@
 	int startups = 0;
 	Key *key;
 	int ret, key_used = 0;
+	sigset_t curr_mask;
+
+	sigprocmask(0, NULL, &curr_mask);
+	sigdelset(&curr_mask, SIGCHLD);
+	sigprocmask(SIG_SETMASK, &curr_mask, NULL);
 	__progname = get_progname(av[0]);
 	init_rng();
-------------- next part --------------
Visit our website at http://www.ubswarburg.com

This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system.

E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message which arise as a result of e-mail transmission. If verification is required please request a hard-copy version. This message is provided for informational purposes and should not be construed as a solicitation or offer to buy or sell any securities or related financial instruments.
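
For readers following along, here is a minimal sketch in C of the failure mode described in the message above. It is not the OpenSSH code: guarded_section() is a hypothetical stand-in for the block-then-restore pattern attributed to collect_children(), and signal_is_blocked() and unblock_signal() are illustrative helper names. The sketch assumes a single-threaded process, where sigprocmask() is the appropriate call.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Return 1 if signo is blocked in the caller, 0 if not, -1 on error. */
int signal_is_blocked(int signo)
{
	sigset_t cur;

	/* With a NULL set argument, only the current mask is reported. */
	if (sigprocmask(SIG_BLOCK, NULL, &cur) == -1)
		return -1;
	return sigismember(&cur, signo);
}

/*
 * Hypothetical stand-in for the pattern described for collect_children():
 * block SIGCHLD, do some work, then restore whatever mask was in effect.
 * If SIGCHLD was already blocked on entry, it is still blocked on exit.
 */
int guarded_section(void)
{
	sigset_t nset, oset;

	if (sigemptyset(&nset) == -1 || sigaddset(&nset, SIGCHLD) == -1)
		return -1;
	if (sigprocmask(SIG_BLOCK, &nset, &oset) == -1)
		return -1;
	/* ... reap children here ... */
	return sigprocmask(SIG_SETMASK, &oset, NULL);
}

/* Startup fix in the spirit of the patch: make sure signo is unblocked. */
int unblock_signal(int signo)
{
	sigset_t set;

	if (sigemptyset(&set) == -1 || sigaddset(&set, signo) == -1)
		return -1;
	return sigprocmask(SIG_UNBLOCK, &set, NULL);
}
```

Calling unblock_signal(SIGCHLD) once, early in main(), means that a later save-and-restore of the mask restores a mask in which SIGCHLD is deliverable. Using SIG_UNBLOCK with a one-signal set has the same effect as the fetch, sigdelset(), SIG_SETMASK sequence in the patch, with one fewer call.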

Once upon a time, Nicolas Williams <Nicolas.Williams at ubsw.com> said:
> Sometimes, it seems, sshd gets started with SIGCHLD blocked, this,
> apparently, being the setting of sshd's parent (a shell no doubt);
> signal blocking is inherited across exec*(). I don't know exactly which
> shell, or what really is at fault, but it happens.

Funny; I just ran into a case of sshd running with SIGALRM blocked on Linux (caused problems because I restarted sendmail from an ssh login, and it would never time out connections because SIGALRM was always blocked).

Would it be a problem for sshd to clear the blocked signals at start? Is there a valid case for it to inherit blocked signals?
--
Chris Adams <cmadams at hiwaay.net>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.
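
The question of clearing every blocked signal at start, rather than only SIGCHLD, can be sketched as follows. This is an illustration under assumptions, not a proposed sshd change: clear_signal_mask() is a hypothetical helper name, and whether a daemon has a valid reason to keep an inherited mask is exactly the open question in the message above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/*
 * Unblock every signal in the calling process.
 * If inherited is non-NULL, the mask that was in effect on entry is
 * stored there, so the caller can log or inspect what it was given.
 * Returns 0 on success, -1 on error.
 */
int clear_signal_mask(sigset_t *inherited)
{
	sigset_t empty;

	if (sigemptyset(&empty) == -1)
		return -1;
	return sigprocmask(SIG_SETMASK, &empty, inherited);
}
```

A caller would run this near the top of main() and could then test the saved mask with sigismember() to report which signals, such as SIGALRM, had arrived blocked.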

Looking further into this: yes, it's clear that POSIX requires signal masks to be inherited across exec.

This kinda sucks, and the standard says as much. It puts the burden of unblocking signals on the programs themselves, both parents and children, though the emphasis appears to be on the parents, since many programs know nothing about signal masks.

In other words, programs that generally create new processes, such as shells, ought to ensure that signal masks are cleared before calling exec*(). I would dare say that this applies to sshd as well, and that it does so for all signals, not just SIGCHLD.

Anyways, that is my interpretation of the text quoted below.

See:

http://www.opengroup.org/onlinepubs/007904975/functions/exec.html

Specifically:

"
This volume of IEEE Std 1003.1-2001 specifies that signals set to SIG_IGN remain set to SIG_IGN, and that the process signal mask be unchanged across an exec. This is consistent with historical implementations, and it permits some useful functionality, such as the nohup command. However, it should be noted that many existing applications wrongly assume that they start with certain signals set to the default action and/or unblocked. In particular, applications written with a simpler signal model that does not include blocking of signals, such as the one in the ISO C standard, may not behave properly if invoked with some signals blocked. Therefore, it is best not to block or ignore signals across execs without explicit reason to do so, and especially not to block signals across execs of arbitrary (not closely co-operating) programs.
"

Nico