Hong Cho
2019-Aug-20 02:56 UTC
[8.0p1] After the remote ssh command execution, the session does not terminate until an active disconnect
Hi,
We recently upgraded OpenSSH from 7.5p1 to 8.0p1, and received a report
that there was a behavioral differences where the ssh session did not
disconnect after a remote background command execution. For example, "ssh
user at example.com 'tail -f /var/xxx &'
After looking at the debug logs and comparing the code, it seems there was
a bug in the portable version during the adaptation of the 1.367 change
(refactoring) of channels.c.
The OpenBSD version looks correct, but for the portable version, the exit
condition of channel_handle_efd_read() didn't get converted correctly, but
somehow got copy-and-pasted from the rfd version. I couldn't find the
change history for the portable version, so I can't when/how this happened.
This should make the "force" condition the same as before the
refactoring
of the portable version of channels.c.
--- /tmp/tmp.48415.39 2019-08-19 19:24:29.000000000 -0700
+++ crypto/openssh/channels.c 2019-08-19 19:19:39.000000000 -0700
@@ -2103,7 +2103,15 @@ channel_handle_efd_read(struct ssh *ssh,
ssize_t len;
int r, force;
+ force = c->detach_close;
- force = c->isatty && c->detach_close && c->istate !=
CHAN_INPUT_CLOSED;
if (c->efd == -1 || (!force && !FD_ISSET(c->efd, readset)))
return 1;
---
Thanks.
Hong.
Damien Miller
2019-Aug-20 06:27 UTC
[8.0p1] After the remote ssh command execution, the session does not terminate until an active disconnect
On Tue, 20 Aug 2019, Hong Cho wrote:> Hi, > > We recently upgraded OpenSSH from 7.5p1 to 8.0p1, and received a report > that there was a behavioral differences where the ssh session did not > disconnect after a remote background command execution. For example, "ssh > user at example.com 'tail -f /var/xxx &' > > After looking at the debug logs and comparing the code, it seems there was > a bug in the portable version during the adaptation of the 1.367 change > (refactoring) of channels.c. > > The OpenBSD version looks correct, but for the portable version, the exit > condition of channel_handle_efd_read() didn't get converted correctly, but > somehow got copy-and-pasted from the rfd version. I couldn't find the > change history for the portable version, so I can't when/how this happened. > > This should make the "force" condition the same as before the refactoring > of the portable version of channels.c. > > --- /tmp/tmp.48415.39 2019-08-19 19:24:29.000000000 -0700 > +++ crypto/openssh/channels.c 2019-08-19 19:19:39.000000000 -0700 > @@ -2103,7 +2103,15 @@ channel_handle_efd_read(struct ssh *ssh, > ssize_t len; > int r, force; > > + force = c->detach_close; > - force = c->isatty && c->detach_close && c->istate != CHAN_INPUT_CLOSED;This change wasn't a refactor or a mistake, but an intentional bugfix for https://bugzilla.mindrot.org/show_bug.cgi?id=2071 Anyway, I don't see any change in behaviour for your testcase between OpenSSH 7.2 and 8.0 on Linux - they both wait for the tail process to terminate before dropping the connection. Applying your suggested change doesn't make any difference either. The behaviour at close has always been a tricky area and there are divergences between portable OpenSSH and OpenBSD for a long time, driven by how more SysV-ish systems like Linux differ from BSDs. BTW line-by-line change history can be extracted using "git blame" on a checkout of the portable OpenSSH git tree. -d
Hong Cho
2019-Aug-20 07:00 UTC
[8.0p1] After the remote ssh command execution, the session does not terminate until an active disconnect
Thank you for the reply.
I changed the remote command to a generic example because in our case, we
are using an internal binary that I didn't want to make public. I thought
it wouldn't matter, but apparently, it does. I am sorry I can't share
more
about this tool.
BTW, our environment is based on FreeBSD, not on Linux. In our case,
because "isatty" is set to 0, it never closes.
------
debug2: channel 0: output open -> closed
debug2: channel 0: read<=0 rfd 8 len 0
debug2: channel 0: read failed
debug2: channel 0: Fssh_chan_shutdown_read (i0 o3 sock -1 wfd 8 efd 10
[read])
debug2: channel 0: input open -> drain
debug3: channel_handle_efd_read: channel 0: efd 10 isatty 0 detach_close 1
istate 1 force 0 FD_ISSET 0
debug2: channel 0: ibuf_empty delayed efd 10/(0)
debug2: notify_done: reading
debug3: channel_handle_efd_read: channel 0: efd 10 isatty 0 detach_close 1
istate 1 force 0 FD_ISSET 0
debug2: channel 0: ibuf_empty delayed efd 10/(0)
debug3: channel_handle_efd_read: channel 0: efd 10 isatty 0 detach_close 1
istate 1 force 0 FD_ISSET 0
debug3: receive packet: type 1
Received disconnect from example.com port 29940:11: disconnected by user
Disconnected from user user example.com port 29940
------
The "channel_handle_efd_read" message is added by me, to see what was
going
on. Before 8.0p1, we were using 7.5p1, and it used to "closing
read-efd"
after "input open -> drain".
However, with 8.0p1, because "isatty" is 0, channel_handle_efd_read()
just
returns and the channel remains open ("ibuf_empty delayed efd").
Because of
this behavioral change, our automatic tests have been blocked.
If this is really a bug fix, does it mean that we have to somehow set
"isatty" to 1? How can it be done on the command-line from the ssh
client?
Hong.
On Tue, Aug 20, 2019 at 3:27 PM Damien Miller <djm at mindrot.org> wrote:
> On Tue, 20 Aug 2019, Hong Cho wrote:
>
> > Hi,
> >
> > We recently upgraded OpenSSH from 7.5p1 to 8.0p1, and received a
report
> > that there was a behavioral differences where the ssh session did not
> > disconnect after a remote background command execution. For example,
"ssh
> > user at example.com 'tail -f /var/xxx &'
> >
> > After looking at the debug logs and comparing the code, it seems there
> was
> > a bug in the portable version during the adaptation of the 1.367
change
> > (refactoring) of channels.c.
> >
> > The OpenBSD version looks correct, but for the portable version, the
exit
> > condition of channel_handle_efd_read() didn't get converted
correctly,
> but
> > somehow got copy-and-pasted from the rfd version. I couldn't find
the
> > change history for the portable version, so I can't when/how this
> happened.
> >
> > This should make the "force" condition the same as before
the refactoring
> > of the portable version of channels.c.
> >
> > --- /tmp/tmp.48415.39 2019-08-19 19:24:29.000000000 -0700
> > +++ crypto/openssh/channels.c 2019-08-19 19:19:39.000000000 -0700
> > @@ -2103,7 +2103,15 @@ channel_handle_efd_read(struct ssh *ssh,
> > ssize_t len;
> > int r, force;
> >
> > + force = c->detach_close;
> > - force = c->isatty && c->detach_close &&
c->istate != CHAN_INPUT_CLOSED;
>
> This change wasn't a refactor or a mistake, but an intentional bugfix
for
> https://bugzilla.mindrot.org/show_bug.cgi?id=2071
>
> Anyway, I don't see any change in behaviour for your testcase between
> OpenSSH 7.2 and 8.0 on Linux - they both wait for the tail process to
> terminate before dropping the connection. Applying your suggested change
> doesn't make any difference either.
>
> The behaviour at close has always been a tricky area and there are
> divergences between portable OpenSSH and OpenBSD for a long time, driven
> by how more SysV-ish systems like Linux differ from BSDs.
>
> BTW line-by-line change history can be extracted using "git
blame"
> on a checkout of the portable OpenSSH git tree.
>
> -d
>