bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-07 15:39 UTC
[Bug 2492] New: Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 Bug ID: 2492 Summary: Incomplete output of child process Product: Portable OpenSSH Version: 7.1p1 Hardware: amd64 OS: Linux Status: NEW Severity: major Priority: P5 Component: sshd Assignee: unassigned-bugs at mindrot.org Reporter: openssh at volth.com Created attachment 2747 --> https://bugzilla.mindrot.org/attachment.cgi?id=2747&action=edit using openssh-client I noticed that on Ubuntu-15.10 read() inside channel_handle_rfd() may return EAGAIN after the child process is exited. OpenSSH server considers this as an error and breaks the connection without reading the whole output. It looks like: http://i.imgur.com/mLMk66c.png <-- using openssh-client http://i.imgur.com/LNdNSRq.png <-- using Apache Mina ssh client It happens only if pty is allocated. I managed to fix the problem with the following patch: volth at 4d284a3 Would be nice to see this bug fixed one way or another as it is a very annoing bug, it breaks the workflow of automated deployment tools. PS. If you need a playground to reproduce the bug, you may try Ubuntu-15.10 instances on Google Cloud or DigitalOcean. The bug happens more often on the cloud machines. It happens on dedicated hardware as well but not so often. -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-07 15:40 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #1 from Volth <openssh at volth.com> --- Created attachment 2748 --> https://bugzilla.mindrot.org/attachment.cgi?id=2748&action=edit using Apache Mina ssh-client -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-07 15:41 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #2 from Volth <openssh at volth.com> --- Fixed link to patch: https://github.com/volth/openssh-portable/commit/4d284a3024f38726a8b8c53ac39ca5bf037fe0d0 -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-09 02:05 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |djm at mindrot.org --- Comment #3 from Damien Miller <djm at mindrot.org> --- This is problematic because it relates to a slightly-hacky fix for the hang on exit described in bug #52. https://bugzilla.mindrot.org/show_bug.cgi?id=52 The heuristic being used is described in this comment https://bugzilla.mindrot.org/show_bug.cgi?id=52#c23 Unfortunately, your patch reverses this fix and AFAIK would reintroduce the hang on exit. There are reproduction cases attached to the above bug. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-09 05:38 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #4 from Volth <openssh at volth.com> --- Hm. I am not sure if the bug #52 still relevant to the modern Linuxes where systemd puts all children of sshd forming one session into a single cgroup. I am not able to reproduce the bug #52 with the patched openssh. Even if it is still reproducible, using cgroups would be the less hacky way to fix it than aborting the session on seeing EAGAIN. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-16 21:05 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 Volth <openssh at volth.com> changed: What |Removed |Added ---------------------------------------------------------------------------- See Also| |https://bugzilla.mindrot.or | |g/show_bug.cgi?id=52 -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-17 18:59 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 Marc Aurele La France <tuyoix at gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |tuyoix at gmail.com -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-17 21:35 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #5 from Marc Aurele La France <tsi at tuyoix.net> --- I'm with Damien on this one. Your "fix" can't possibly be correct. For one, EAGAIN == EWOULDBLOCK. Yes, even on your "modern Linuxes" with systemd, cgroups and other bloat. Try as I might I can't reproduce this problem with 7.1p1. I'd say more research needed to determine if this is really a problem with openssh. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-17 22:08 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #6 from Volth <openssh at volth.com> --- I just showed how I solved the annoying problem - by applying this patch, building a custom .deb and installing everywhere on my premises on the machines with Ubuntu 15.10. I do not insist (and cannot insist) on including my patch into openssh. It is completely up to you. Yes, the fresh github's master is affected by the bug #2492. And Ubuntu 15.10 with my patch is unaffected by the bug #52. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-17 22:28 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #7 from Marc Aurele La France <tsi at tuyoix.net> --- ... and I'm saying something else is causing the read() to return EAGAIN when it shouldn't (i.e. there is still data available). That's what needs further investigation in your case. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-17 22:38 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #8 from Volth <openssh at volth.com> --- I noticed that read() returns EAGAIN many times, openssh call read() again until it returns success (or other error). I does not cause any problem unless the heuristic is applied, I mean the line "force = c->isatty && c->detach_close && c->istate !CHAN_INPUT_CLOSED;" If there is no pty allocated - there is no problem with EAGAIN, read() returns EAGAIN, openssd retries. Until openssh get the exitcode of the child process - there is no problem with EAGAIN, read() returns EAGAIN, openssd retries. The problem is only when openssh gets the the exitcode (thus set c->detach_close to true) and only since this moment EAGAIN becomes illegal. But it happens. It happens before openssh gets the the exitcode as well it happens after openssh gets the the exitcode. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-17 23:01 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #9 from Marc Aurele La France <tsi at tuyoix.net> --- So the child is posting the last of its data AFTER its parent has been notified of its death? That makes no sense. There's a timing (or buffering) issue here. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-18 00:18 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #10 from Volth <openssh at volth.com> --- Yes. Even of you cannot reproduce the issue in a very visible form, if you log the exchange between ssh client and server, you may notice that almost always the server sends "exit-status" message before the last piece of the data (even without pty). If that last read() returns success, this reordering makes no harm (I tend not to see the reordering as a bug). You can consider the reordering is a bug and try to fix it, then it will automatically fix this bug. On Ubuntu 15.04 there is no reordering, it is something introduced in Ubuntu 15.10. Perhaps a bigger buffer. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-18 16:16 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #11 from Marc Aurele La France <tsi at tuyoix.net> --- I'd expect the ordering of events within the server/client conversation to vary. I wouldn't read too much into it. But that's not the issue here. The problem is that the kernel is telling the parent there is no more data from the child when in fact there is. So does 7.1p1 behave the same on 15.04? If not, I'd say you have some kernel digging to do. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-18 17:10 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #12 from Volth <openssh at volth.com> --- EAGAIN does not mean "there is no more data" it means there is data, please try to read it a bit later. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-18 21:55 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #13 from Marc Aurele La France <tsi at tuyoix.net> --- EGAIN means there is no data available at the time of the call. But, in the case here, this contradicts the fact that the child has previously written the last of its data, closed the tty and signalled the parent about its demise. Therefore, there >is< data available, but for whatever reason 15.10's kernel decides not to return it. So, I'll ask (e)again: does 7.1p1 exhibit the same behaviour on 15.04 (which runs a much older kernel) or not? -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-19 01:19 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #14 from Volth <openssh at volth.com> --- I have not tried this combination. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-19 01:23 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #15 from Volth <openssh at volth.com> --- (In reply to Marc Aurele La France from comment #13)> EGAIN means there is no data available at the time of the call. > But, in the case here, this contradicts the fact that the child has > previously written the last of its data, closed the tty and > signalled the parent about its demise. Therefore, there >is< data > available, but for whatever reason 15.10's kernel decides not to > return it.I am not sure the data must be available. Note that /bin/ping is not spawned by openssh directly, the data to be copied (perhaps lazily) through the chain of stdouts.> So, I'll ask (e)again: does 7.1p1 exhibit the same behaviour on > 15.04 (which runs a much older kernel) or not?I have not tried this combination. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-19 01:30 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #16 from Volth <openssh at volth.com> --- Among all Ubuntus, Debians and CoreOSes on Google Cloud Engine: 1. With the regular version of openssh the problem is only on Ubuntu 15.10 (one may say "only with Kernel 4.xx") 2. On Ubuntu 15.10 the problem is reproducible not only with regular version but also with github master and older versions (random version from github master 1, 2 and 3 years ago) There was no reason for trying different openssh version on Ubuntu 15.04 -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-19 03:18 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #17 from Marc Aurele La France <tsi at tuyoix.net> --- 1) If you are using this Bugzilla to report a problem with openssh, then it must be against a stock version of it, not one polluted with Ubuntu-isms. 2) You state the problem only occurs on 15.10, regardless of openssh version. This makes it quite conclusive where the real issue lies, not in openssh. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-19 13:58 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #18 from Volth <openssh at volth.com> --- The machinations between open-source companies is not what I am engaged in nor willing to. I had a problem, solved it and shared the solution to the people who my also experience. All the rest is not my business. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-19 14:01 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #19 from Volth <openssh at volth.com> ---> 2) You state the problem only occurs on 15.10, regardless of openssh > version. This makes it quite conclusive where the real issue lies, > not in openssh.I would agrue that your 9-years old heuristic eventually stopped working. You could try to make a kernel patch to keep it supported, but I am not in it. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-20 14:00 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #20 from Marc Aurele La France <tsi at tuyoix.net> --- I'm still investigating this, but, in the meantime, I think that, even with your change, you might find that ... ssh -t root at localhost '/bin/bash -c "/bin/ping -c3 8.8.8.8" >&2' ... (i.e. redirecting the child's output to stderr) is also problematic. Your change is still incorrect, for a number of reasons, but I'm not too sure yet what to replace it with. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-24 02:58 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #21 from Marc Aurele La France <tsi at tuyoix.net> --- I've verified that your suggested fix does indeed reintroduce the hang-on-exit bug. Log into your patched sshd and do ... sleep 30 & logout This will cause your session to hang until the sleep finishes, which is essentially the original problem reported in comment #1 of bug #52. I still think the problem you reported here is a kernel issue and have been doing some research. I'm curious what `uname -a` says on your system. If this is earlier than 4.0.5, please update to at least 4.0.5 (wily's git currently sits at above 4.2.3) and see if the partial output persists. If you're already at or above 4.0.5, then more research needs to be done. 4.0.5 is where http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/drivers/tty/pty.c?id=1a48632ffed61352a7810ce089dc5a8bcd505a60 first appeared. This changes reportedly fixes all known issues with the PTY code that were introduced in 3.18 and before. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-24 04:27 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #22 from Volth <openssh at volth.com> --- 4.2.0-17 You may want to register on https://cloud.google.com/ There is 60-day free trial and it is possible to start virtual machines with different Linuxes with default sessings. When I named different versions of Ubuntu and Debian I meant virtual machine on Google Cloud. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-24 15:20 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #23 from Marc Aurele La France <tsi at tuyoix.net> --- No. What does `uname -a` actually say. Anyting else, I've found, is smoke and mirrors. Thanks. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-24 15:44 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #24 from Volth <openssh at volth.com> --- (In reply to Marc Aurele La France from comment #23)> No. What does `uname -a` actually say. Anyting else, I've found, > is smoke and mirrors. Thanks.Linux g1.example.com 4.2.0-16-generic #19-Ubuntu SMP Thu Oct 8 15:35:06 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2015-Nov-24 15:52 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #25 from Volth <openssh at volth.com> --- (In reply to Volth from comment #24)> (In reply to Marc Aurele La France from comment #23) > > No. What does `uname -a` actually say. Anyting else, I've found, > > is smoke and mirrors. Thanks. > > Linux g1.example.com 4.2.0-16-generic #19-Ubuntu SMP Thu Oct 8 > 15:35:06 UTC 2015 x86_64 x86_64 x86_64 GNU/LinuxLinux s-tserver12.lan 4.2.0-17-generic #21-Ubuntu SMP Fri Oct 23 19:56:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux Linux q-datanode8.lan 4.2.0-18-generic #22-Ubuntu SMP Fri Nov 6 18:25:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Apr-24 10:11 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #26 from Volth <openssh at volth.com> --- Reproducible with Ubuntu 16.04 LTS (kernel-4.4.0-21 openssh-7.2p2-4) -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Apr-25 23:39 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #27 from Marc Aurele La France <tsi at tuyoix.net> --- This issue is recognised as a kernel regression and is still being discussed on LKML. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-May-23 22:33 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #28 from Marc Aurele La France <tsi at tuyoix.net> --- The fix for this has now made it into the mainline Linux kernel. It'll be included in 4.7. See http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0f40fbbcc34e093255a2b2d70b6b0fb48c3f39aa -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-May-24 03:50 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #29 from Damien Miller <djm at mindrot.org> --- Marc, thank you very much indeed for chasing this all the way down. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Jul-20 04:59 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 Darren Tucker <dtucker at zip.com.au> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED CC| |dtucker at zip.com.au Resolution|--- |WORKSFORME --- Comment #30 from Darren Tucker <dtucker at zip.com.au> --- Since this has been fixed in the kernel closing this bug. Thanks for your work! -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Jul-25 23:08 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 --- Comment #31 from Marc Aurele La France <tsi at tuyoix.net> --- The fix has now been incorporated into the following Linux mainline kernels: 4.1.26, 4.4.12, 4.5.6, 4.6.1 and the newly released 4.7. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Aug-02 00:40 UTC
[Bug 2492] Incomplete output of child process
https://bugzilla.mindrot.org/show_bug.cgi?id=2492 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED --- Comment #32 from Damien Miller <djm at mindrot.org> --- Close all resolved bugs after 7.3p1 release -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.