thr3ads.net - openssh bugs - [Bug 3067] New: Fails to unlink ControlMaster socket early enough, confuses other clients [Sep 2019]

bugzilla-daemon at bugzilla.mindrot.org

2019-Sep-06 18:32 UTC

[Bug 3067] New: Fails to unlink ControlMaster socket early enough, confuses other clients

https://bugzilla.mindrot.org/show_bug.cgi?id=3067

            Bug ID: 3067
           Summary: Fails to unlink ControlMaster socket early enough,
                    confuses other clients
           Product: Portable OpenSSH
           Version: 7.9p1
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P5
         Component: ssh
          Assignee: unassigned-bugs at mindrot.org
          Reporter: leonerd at leonerd.org.uk

(from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=877020)

TL;DR: ssh(1) must unlink the local control socket _before_ attempting
  any more network traffic; otherwise broken TCP sockets will stall the
  entire thing.

-

I make heavy use of the shared control sockets to multiplex multiple
shells, sftp, and other commands down a single TCP connection to remote
servers.

  ControlPath ~/var/run/ssh-master-%r@%h:%p.sock
  ControlPersist 1s
  ControlMaster auto

In this setup, under stable networking all works nicely.

However, my machine is a laptop, and sometimes due to mobile data, wifi,
ethernet cable swapping, or other issues my IP address and hence routing
change. After such a change, all existing TCP sockets are unusable and
must be closed and reopened.

Simply closing all ssh clients is insufficient here, because the client
tries to perform a controlled shutdown of the TCP socket *first* and
will only unlink(2) the control master socket from the local filesystem
after it has done this. By ordering the operations thus, the client
stalls trying to perform this controlled TCP shutdown over now-invalid
networking, and never gets around to removing the local unix socket.
New ssh clients then try to use this stale socket and stall in the same
way.

The correct order of operation ought to be that the control master local
socket is unlinked *before* trying to send any traffic, thus restoring
the user's "turn it off and on again" approach to fixing the problem -
namely by just killing all their clients and making a new one.
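
The proposed ordering can be sketched in shell. This is only an
illustration and is not taken from the OpenSSH source: `close_master` is
a hypothetical helper, and the command passed after the socket path
stands in for the network shutdown step that may block.

```shell
# Hypothetical sketch of the proposed ordering; not OpenSSH code.
# Usage: close_master SOCKET_PATH COMMAND [ARGS...]
close_master() {
    sock_path=$1
    shift
    # Step 1: remove the local control socket first, so that new clients
    # find nothing to connect to and start their own master.
    rm -f -- "$sock_path"
    # Step 2: only now attempt the network shutdown, which may block
    # for a long time if the TCP connection is dead.
    "$@"
}
```

For example, `close_master /tmp/demo.sock sleep 5` removes the path
immediately, even though the stand-in shutdown step takes five seconds.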

---

Additionally, I should add that a workaround for this is to simply run

  $ rm ~/var/run/ssh-master-*.sock

after closing the previous master clients, before starting them up
afresh. With no existing unix socket, the first new client takes master
control again and everything resumes fine.
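
A slightly more defensive form of this workaround, as a sketch. The
function name `clear_control_sockets` is hypothetical; the
`ssh-master-*.sock` pattern is the one from the ControlPath above. Unlike
a bare `rm` with a glob, it stays quiet when nothing matches, and it
prints how many paths it removed.

```shell
# Hypothetical helper: remove leftover control sockets from one directory.
# Usage: clear_control_sockets DIRECTORY
clear_control_sockets() {
    dir=$1
    removed=0
    for sock in "$dir"/ssh-master-*.sock; do
        # When nothing matches, the loop sees the unexpanded pattern;
        # skip it instead of passing it to rm.
        [ -e "$sock" ] || [ -L "$sock" ] || continue
        rm -f -- "$sock" && removed=$((removed + 1))
    done
    # Report the number of removed paths on standard output.
    echo "$removed"
}
```

With the configuration above it would be called as
`clear_control_sockets ~/var/run`, after the old master clients have
been closed.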

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

bugzilla-daemon at mindrot.org

2020-Jul-10 04:01 UTC


[Bug 3067] Fails to unlink ControlMaster socket early enough, confuses other clients

https://bugzilla.mindrot.org/show_bug.cgi?id=3067

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |djm at mindrot.org

--- Comment #1 from Damien Miller <djm at mindrot.org> ---
I think you have two options here:

First, you can explicitly shut down multiplexed sessions whose
underlying TCP connection has stalled. E.g. I use

for x in ~/.ssh/ctl-* ; do
    test -r "$x" && ssh -Fnone -Ostop -oControlPath="$x" dummy
done

Second, you can set a protocol-level health check to automatically kill
unresponsive sessions. E.g. adding the following to ~/.ssh/config

ServerAliveInterval 2m
ServerAliveCountMax 3

This will terminate any unresponsive connection after about six minutes.
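
The six-minute figure comes from multiplying the two settings: ssh gives
up after roughly ServerAliveInterval times ServerAliveCountMax without a
reply from the server. A small sketch of that arithmetic, where
`alive_timeout_seconds` is a hypothetical helper and the interval is
given in seconds:

```shell
# Hypothetical helper: approximate time before ssh drops an unresponsive
# connection, from ServerAliveInterval (in seconds) and
# ServerAliveCountMax.
alive_timeout_seconds() {
    interval=$1
    count_max=$2
    echo $((interval * count_max))
}

# 2m is 120 seconds; with 3 unanswered probes this prints 360,
# i.e. six minutes.
alive_timeout_seconds 120 3
```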


bugzilla-daemon at mindrot.org

2023-Oct-11 06:33 UTC


[Bug 3067] Fails to unlink ControlMaster socket early enough, confuses other clients

https://bugzilla.mindrot.org/show_bug.cgi?id=3067

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WORKSFORME
             Status|NEW                         |RESOLVED

--- Comment #2 from Damien Miller <djm at mindrot.org> ---
closing for lack of followup


bugzilla-daemon at mindrot.org

2023-Oct-11 09:52 UTC


[Bug 3067] Fails to unlink ControlMaster socket early enough, confuses other clients

https://bugzilla.mindrot.org/show_bug.cgi?id=3067

--- Comment #3 from Paul Evans <leonerd at leonerd.org.uk> ---
Ah - I didn't follow up because those both sound like other
workarounds, of a similar nature to the `rm` based one I already
suggested. Those require the user to actively do something to reset the
problem.

My suggested fix, of performing unlink() on the control socket before
attempting shutdown() on the TCP connection, requires no further action
on the part of the user. It just transparently works.

I therefore don't consider either of these to be a solution.

