Doug Graham
2019-Sep-16 03:41 UTC
ssh client is setting O_NONBLOCK on a pipe shared with other processes
> So the make process gets an EAGAIN on the write syscall and doesn't
> retry? That sounds like a bug in whatever make you're using, since
> that could potentially occur in other circumstances too.

What other circumstances? EAGAIN means that something put the device into non-blocking mode, and normally, that should only happen if the program calling write had itself previously set O_NONBLOCK. I don't think programs that don't set O_NONBLOCK are required to handle EAGAIN or short counts. They *may* need to deal with EINTR, but signals don't come out of nowhere either, and many programs run in an environment where EINTR is also unexpected.

We are using GNU make 3.81, but newer versions of gmake do the same thing:

    void
    close_stdout (void)
    {
      int prev_fail = ferror (stdout);
      int fclose_fail = fclose (stdout);

      if (prev_fail || fclose_fail)
        {
          if (fclose_fail)
            error (NILF, _("write error: %s"), strerror (errno));
          else
            error (NILF, _("write error"));
          exit (EXIT_FAILURE);
        }
    }

On Sun, Sep 15, 2019 at 11:16 PM Darren Tucker <dtucker at dtucker.net> wrote:
>
> On Mon, 16 Sep 2019 at 01:18, Doug Graham <edougra at gmail.com> wrote:
> [...]
> > Make occasionally gets an unexpected EAGAIN error and fails the build
> > with the error message "make: write error".
>
> So the make process gets an EAGAIN on the write syscall and doesn't
> retry? That sounds like a bug in whatever make you're using, since
> that could potentially occur in other circumstances too.
>
> (Same goes for EWOULDBLOCK, as well as EINTR if you don't have
> restartable syscalls).
>
> --
> Darren Tucker (dtucker at dtucker.net)
> GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new)
> Good judgement comes with experience. Unfortunately, the experience
> usually comes from bad judgement.
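For readers following along, here is a minimal sketch of why a program that never sets O_NONBLOCK can still see it: the flag lives on the open file description, which is shared by every descriptor duplicated or inherited from it. The helper names (`is_nonblocking`, `set_nonblocking`, `flag_shared_across_dup`) are made up for illustration, and `dup()` is used here only as a stand-in for a descriptor handed to another process.

```c
#include <assert.h>   /* only needed for the usage check mentioned below */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: 1 if O_NONBLOCK is set on fd, 0 if not, -1 on error. */
int is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return (flags & O_NONBLOCK) != 0;
}

/* Hypothetical helper: turn on O_NONBLOCK, keeping the other status flags. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Returns 1 if setting the flag through a duplicate descriptor also
 * changes what the original descriptor reports, -1 on setup failure. */
int flag_shared_across_dup(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* Stand-in for the copy of the pipe that another process would hold. */
    int copy = dup(fds[1]);
    if (copy < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    int before = is_nonblocking(fds[1]);   /* original, untouched */
    int rc = set_nonblocking(copy);        /* "someone else" flips the flag */
    int after = is_nonblocking(fds[1]);    /* original again */

    close(copy);
    close(fds[0]);
    close(fds[1]);

    if (rc < 0 || before < 0 || after < 0)
        return -1;
    return before == 0 && after == 1;
}
```

A caller could check this with `assert(flag_shared_across_dup() == 1);`. The same sharing applies to descriptors inherited across fork(), which is the situation described in this thread.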
Peter Stuge
2019-Sep-16 08:30 UTC
ssh client is setting O_NONBLOCK on a pipe shared with other processes
Doug Graham wrote:
> > So the make process gets an EAGAIN on the write syscall and doesn't
> > retry? That sounds like a bug in whatever make you're using, since
> > that could potentially occur in other circumstances too.
>
> What other circumstances? EAGAIN means that something put the
> device into non-blocking mode, and normally, that should only happen
> if the program calling write had itself previously set O_NONBLOCK.

Any program which makes assumptions about fds that have been passed to other programs risks that those assumptions no longer hold.

> I don't think programs that don't set O_NONBLOCK are required to handle
> EAGAIN or short counts.

Please think about that some more.

Case in point; EAGAIN can come if you give your fd to another process and continue using it yourself.

Short counts; It is documented behavior that read() and write() may return short counts. It is not documented why, so you can not make any assumptions.

With Linux, the kernel code of the particular device driver determines what read() and write() calls return, and because the userspace API is documented to allow short counts, the different drivers may and do have different semantics for return counts: sometimes the way that fits the particular device, sometimes such that the kernel driver is much simpler, thus more reliable.

I ignored short counts because it was convenient, until it caused me a problem. ;) Now I write a looping function called wr(), rd(), do_write() or do_read().

> We are using GNU make 3.81 but newer versions of gmake do the same thing:

GNU programs are like other programs in that they aren't necessarily correct.

//Peter
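A looping writer of the kind mentioned above might look roughly like the following. This is only a sketch, not Peter's actual wr() or do_write(): the name `write_all` is hypothetical, and waiting with poll() on EAGAIN is one possible choice among several (select() or a short sleep would also work).

```c
#include <assert.h>   /* only needed for the usage check mentioned below */
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes are out.
 * Returns len on success, or -1 with errno set on an unrecoverable error. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n >= 0) {
            /* Possibly a short count: advance past what was accepted. */
            p += n;
            left -= (size_t)n;
            continue;
        }
        if (errno == EINTR)
            continue;               /* interrupted by a signal, just retry */

        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* fd is non-blocking and full: sleep until it is writable
             * again instead of spinning in a tight loop. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                return -1;
            continue;
        }
        return -1;                  /* EBADF, EPIPE, EIO, ... */
    }
    return (ssize_t)len;
}
```

Typical use would be something like `if (write_all(STDOUT_FILENO, msg, strlen(msg)) < 0) perror("write");`. Note that this only helps code that calls write() directly; it does nothing for output that goes through stdio.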
Doug Graham
2019-Sep-16 14:06 UTC
ssh client is setting O_NONBLOCK on a pipe shared with other processes
> Case in point; EAGAIN can come if you give your fd to another process
> and continue using it yourself.

> Short counts; It is documented behavior that read() and write() may
> return short counts. It is not documented why, so you can not make
> any assumptions.

You might be right about short counts, but if you're right about EAGAIN, there are bugs everywhere. My first attempt at working around my "make: write error" failure was to pipe make into cat or tee, eg: "make | tee make.log". But that caused both cat and tee to fail with EAGAIN. So they have the same "bug" as make.

Also note that make is just calling printf normally and then just before exiting, it calls ferror(stdout) to see if any error occurred when it previously wrote to stdout. ferror() is returning true. So now the bug has moved into the C library.

Also note that EAGAIN is not a transient error like EINTR that will probably go away on a retry. Retrying the write in a tight loop would probably just burn some extra cpu and then fail anyway. You'd have to call select() first, or put a delay in the loop. Are you suggesting that every program that writes to stdout should implement such contortions?

Not to mention that if the error occurred in stdout or some other C library routine, I don't think the calling program has any way of telling how much output was sent successfully and how much should be retried. I could write

    if (printf("hello world") < 0 && errno == EAGAIN)
        <what here?>

but can I safely assume that none of my string was written before the error occurred?