Damien Miller
2022-Jan-11 07:30 UTC
OS X poll breakage (Was: Please help test recent changes)
On Tue, 11 Jan 2022, Damien Miller wrote:> Here's the client side log of the failure. It comes from the > "same with early close of stdout/err" section of the test, but I can't > actually see anything get closed... > > debug3: receive packet: type 91 > debug2: channel_input_open_confirmation: channel 0: callback start > debug2: client_session2_setup: id 0 > debug1: Sending command: exec sh -c 'sleep 2; exec > /dev/null 2>&1; sleep 3; exit 0' > debug2: channel 0: request exec confirm 1 > debug3: send packet: type 98 > debug2: channel_input_open_confirmation: channel 0: callback done > debug2: channel 0: open confirm rwindow 0 rmax 32768 > debug2: channel 0: rcvd adjust 2097152 > debug3: receive packet: type 99 > debug2: channel_input_status_confirm: type 99 id 0 > debug2: exec request accepted on channel 0 > channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1 > > and (separate run) with DEBUG_CHANNEL_POLL set: > > debug2: channel_input_status_confirm: type 99 id 0 > debug2: exec request accepted on channel 0 > debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x00 > debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[3].fd=7 want 0x01 ev 0x00 ready 0x00 rev 0x00 > debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[4].fd=8 want 0x01 ev 0x00 ready 0x00 rev 0x00 > debug1: channel_after_poll: pfd[2].fd 4 rev 0x0020 > debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x20 > channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1 > FAIL: exit code (with sleep) mismatch for: 255 != 0 > > It looks like it fails with HAVE_POLL set to 0 too. > > Still looking...Wow, it looks like Darwin's poll(2) is completely broken for character- special devices (at least). E.g. the attached program spins shows similar behaviour when run on /dev/null - it spins, returning revents=POLLNVAL. It looks like I'm not the first to see this either, some people noticed it 17 years ago! https://lists.apple.com/archives/darwin-dev/2005/May/msg00220.html There's apparently a bug open (Apple bug 3710161), but I can't see it and if they haven't fixed it by now then they're presumably not in any great hurry. Unsetting HAVE_POLL lets the test pass. It seems like some other programs use a similar approach, e.g. https://www.mail-archive.com/bug-gnulib at gnu.org/msg00296.html So I think we need a HAVE_BROKEN_POLL :( -d
Damien Miller
2022-Jan-11 07:34 UTC
OS X poll breakage (Was: Please help test recent changes)
On Tue, 11 Jan 2022, Damien Miller wrote:> Wow, it looks like Darwin's poll(2) is completely broken for character- > special devices (at least). E.g. the attached program spins shows similar > behaviour when run on /dev/null - it spins, returning revents=POLLNVAL.oops, this was the test program I was using. Compare "./demo -" and "./demo /dev/null" #include <sys/stat.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <stdio.h> #include <string.h> #include <poll.h> #include <err.h> int main(int argc, char **argv) { struct pollfd p; struct stat s; int r; char c; setlinebuf(stdout); p.fd = 0; if (argc > 1 && strcmp(argv[1], "-") != 0) { if ((p.fd = open(argv[1], O_RDONLY)) == -1) err(1, "open %s", argv[1]); } if (fstat(p.fd, &s) != 0) err(1, "stat"); printf("fd %d mode 0%o\n", p.fd, s.st_mode); for (;;) { p.events = POLLIN; r = poll(&p, 1, -1); if (r == -1) err(1, "poll"); printf("poll %d rev 0x%02x\n", r, p.revents); if ((p.revents & (POLLIN|POLLHUP)) == 0) continue; r = read(p.fd, &c, 1); if (r == -1) err(1, "read"); if (r == 0) break; } return 0; }
Damien Miller
2022-Jan-11 07:44 UTC
OS X poll breakage (Was: Please help test recent changes)
On Tue, 11 Jan 2022, Damien Miller wrote:> So I think we need a HAVE_BROKEN_POLL :(This seems to work. I don't think the preprocessor will do the right thing if we have to support a system with BROKEN_POLL && HAVE_PPOLL, but we can cross that bridge if we come to it... diff --git a/configure.ac b/configure.ac index 1feb73ef9..eb265143b 100644 --- a/configure.ac +++ b/configure.ac @@ -731,6 +731,10 @@ main() { if (NSVersionOfRunTimeLibrary("System") >= (60 << 16)) # proc_pidinfo()-based closefrom() replacement. AC_CHECK_HEADERS([libproc.h]) AC_CHECK_FUNCS([proc_pidinfo]) + # poll(2) is broken for character-special devices (at least). + # cf. Apple bug 3710161 (not public, but searchable) + AC_DEFINE([BROKEN_POLL], [1], + [System poll(2) implementation is broken]) ;; *-*-dragonfly*) SSHDLIBS="$SSHDLIBS -lcrypt" diff --git a/openbsd-compat/bsd-poll.c b/openbsd-compat/bsd-poll.c index 2d28eed5b..8084776ce 100644 --- a/openbsd-compat/bsd-poll.c +++ b/openbsd-compat/bsd-poll.c @@ -15,7 +15,7 @@ */ #include "includes.h" -#if !defined(HAVE_PPOLL) || !defined(HAVE_POLL) +#if !defined(HAVE_PPOLL) || !defined(HAVE_POLL) || defined(BROKEN_POLL) #include <sys/types.h> #include <sys/time.h> @@ -29,7 +29,7 @@ #include <unistd.h> #include "bsd-poll.h" -#ifndef HAVE_PPOLL +#if !defined(HAVE_PPOLL) || defined(BROKEN_POLL) /* * A minimal implementation of ppoll(2), built on top of pselect(2). * @@ -109,9 +109,9 @@ out: errno = saved_errno; return ret; } -#endif /* HAVE_PPOLL */ +#endif /* !HAVE_PPOLL || BROKEN_POLL */ -#ifndef HAVE_POLL +#if !defined(HAVE_POLL) || defined(BROKEN_POLL) int poll(struct pollfd *fds, nfds_t nfds, int timeout) { @@ -126,6 +126,6 @@ poll(struct pollfd *fds, nfds_t nfds, int timeout) return ppoll(fds, nfds, tsp, NULL); } -#endif /* HAVE_POLL */ +#endif /* !HAVE_POLL || BROKEN_POLL */ -#endif /* HAVE_PPOLL || HAVE_POLL */ +#endif /* !HAVE_PPOLL || !HAVE_POLL || BROKEN_POLL */
Demi Marie Obenour
2022-Jan-11 18:43 UTC
OS X poll breakage (Was: Please help test recent changes)
On 1/11/22 02:30, Damien Miller wrote:> On Tue, 11 Jan 2022, Damien Miller wrote: > >> Here's the client side log of the failure. It comes from the >> "same with early close of stdout/err" section of the test, but I can't >> actually see anything get closed... >> >> debug3: receive packet: type 91 >> debug2: channel_input_open_confirmation: channel 0: callback start >> debug2: client_session2_setup: id 0 >> debug1: Sending command: exec sh -c 'sleep 2; exec > /dev/null 2>&1; sleep 3; exit 0' >> debug2: channel 0: request exec confirm 1 >> debug3: send packet: type 98 >> debug2: channel_input_open_confirmation: channel 0: callback done >> debug2: channel 0: open confirm rwindow 0 rmax 32768 >> debug2: channel 0: rcvd adjust 2097152 >> debug3: receive packet: type 99 >> debug2: channel_input_status_confirm: type 99 id 0 >> debug2: exec request accepted on channel 0 >> channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1 >> >> and (separate run) with DEBUG_CHANNEL_POLL set: >> >> debug2: channel_input_status_confirm: type 99 id 0 >> debug2: exec request accepted on channel 0 >> debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x00 >> debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[3].fd=7 want 0x01 ev 0x00 ready 0x00 rev 0x00 >> debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[4].fd=8 want 0x01 ev 0x00 ready 0x00 rev 0x00 >> debug1: channel_after_poll: pfd[2].fd 4 rev 0x0020 >> debug3: dump_channel_poll: channel 0: rfd r4 w7 e8 s-1 pfd[2].fd=4 want 0x01 ev 0x01 ready 0x00 rev 0x20 >> channel 0: invalid rfd pollfd[2].fd 4 r4 w7 e8 s-1 >> FAIL: exit code (with sleep) mismatch for: 255 != 0 >> >> It looks like it fails with HAVE_POLL set to 0 too. >> >> Still looking... > > Wow, it looks like Darwin's poll(2) is completely broken for character- > special devices (at least). E.g. the attached program spins shows similar > behaviour when run on /dev/null - it spins, returning revents=POLLNVAL.Is using kqueue(2) an option? That?s faster, and it doesn?t have the problem with large file descriptors that poll(2) has. -- Sincerely, Demi Marie Obenour (she/her/hers) -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_0xB288B55FFF9C22C1.asc Type: application/pgp-keys Size: 4885 bytes Desc: OpenPGP public key URL: <http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20220111/45651b0b/attachment.bin> -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_signature Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: <http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20220111/45651b0b/attachment.asc>