Darren Tucker
2022-Jan-10 11:04 UTC
OS X poll breakage (Was: Please help test recent changes)
On Fri, Jan 07, 2022 at 09:52:09AM +1100, Damien Miller wrote:> 1. Conversion of the ssh and sshd mainloop from select() to poll() > > This should be entirely invisible to users, so any behaviour change > is a bug. If you see something and want to help debug it further, > uncomment the DEBUG_CHANNEL_POLL #define in channels.c for helps of > extra debug logging.This change breaks (at least) the exit-status test on Mac OS X. I haven't had a chance to investigate yet, $ git bisect run sh -c "autoreconf && ./configure --without-openssl && make t-exec LTESTS=exit-status" [... lots ...] 17877bc81db3846e6e7d4cfb124d966bb9c9296b is the first bad commit commit 17877bc81db3846e6e7d4cfb124d966bb9c9296b Author: djm at openbsd.org <djm at openbsd.org> Date: Thu Jan 6 21:48:38 2022 +0000 upstream: convert ssh, sshd mainloops from select() to poll(); feedback & ok deraadt@ and markus@ has been in snaps for a few months OpenBSD-Commit-ID: a77e16a667d5b194dcdb3b76308b8bba7fa7239c $ uname -a Darwin osx-highsierra 17.7.0 Darwin Kernel Version 17.7.0: Fri Oct 30 13:34:27 PDT 2020; root:xnu-4570.71.82.8~1/RELEASE_X86_64 x86_64 -- Darren Tucker (dtucker at dtucker.net) GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new) Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.
Damien Miller
2022-Jan-11 04:00 UTC
OS X poll breakage (Was: Please help test recent changes)
On Mon, 10 Jan 2022, Darren Tucker wrote:> On Fri, Jan 07, 2022 at 09:52:09AM +1100, Damien Miller wrote: > > 1. Conversion of the ssh and sshd mainloop from select() to poll() > > > > This should be entirely invisible to users, so any behaviour change > > is a bug. If you see something and want to help debug it further, > > uncomment the DEBUG_CHANNEL_POLL #define in channels.c for helps of > > extra debug logging. > > This change breaks (at least) the exit-status test on Mac OS X. I > haven't had a chance to investigate yet, > > $ git bisect run sh -c "autoreconf && ./configure --without-openssl && > make t-exec LTESTS=exit-status" > [... lots ...] > 17877bc81db3846e6e7d4cfb124d966bb9c9296b is the first bad commit > commit 17877bc81db3846e6e7d4cfb124d966bb9c9296b > Author: djm at openbsd.org <djm at openbsd.org> > Date: Thu Jan 6 21:48:38 2022 +0000 > > upstream: convert ssh, sshd mainloops from select() to poll(); > > feedback & ok deraadt@ and markus@ has been in snaps for a few months > > OpenBSD-Commit-ID: a77e16a667d5b194dcdb3b76308b8bba7fa7239c > > $ uname -a > Darwin osx-highsierra 17.7.0 Darwin Kernel Version 17.7.0: Fri Oct 30 > 13:34:27 PDT 2020; root:xnu-4570.71.82.8~1/RELEASE_X86_64 x86_64Here's the client side log of the failure. It comes from the "same with early close of stdout/err" section of the test, but I can't actually see anything get closed... debug3: receive packet: type 91 debug2: channel_input_open_confirmation: channel 0: callback start debug2: client_session2_setup: id 0 debug1: Sending command: exec sh -c 'sleep 2; exec > /dev/null 2>&1; sleep 3; exit 0' debug2: channel 0: request exec confirm 1 debug3: send packet: type 98 debug2: channel_input_open_confirmation: channel 0: callback done debug2: channel 0: open confirm rwindow 0 rmax 32768 debug2: channel 0: rcvd adjust 2097152 debug3: receive packet: type 99 debug2: channel_input_status_confirm: type 99 id 0 debug2: exec request accepted on channel 0 channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1 and (separate run) with DEBUG_CHANNEL_POLL set: debug2: channel_input_status_confirm: type 99 id 0 debug2: exec request accepted on channel 0 debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x00 debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[3].fd=7 want 0x01 ev 0x00 ready 0x00 rev 0x00 debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[4].fd=8 want 0x01 ev 0x00 ready 0x00 rev 0x00 debug1: channel_after_poll: pfd[2].fd 4 rev 0x0020 debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x20 channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1 FAIL: exit code (with sleep) mismatch for: 255 != 0 It looks like it fails with HAVE_POLL set to 0 too. Still looking... -d