Christoph Anton Mitterer
2014-Dec-02 04:44 UTC
SSH via redundant login-nodes (with and without control channel multiplexing)
Hi.

I've recently been playing a lot with control channel multiplexing and how it can be used to improve our local setup (ideally safely and automatically for all users).

What we have here at the faculty are many nodes (thousands), none of which are directly reachable via SSH; one always hops over a login node, a setup which brings several advantages. Of these login nodes we have several (for availability reasons), e.g.:
login-1.example.org
login-2.example.org
and all of them are reachable via a round-robin domain name containing all the A and AAAA RRs of the above nodes: login.example.org. Because of the round-robin domain name, all these nodes share the same SSH host key pair.

What I now ideally want is that SSH automatically picks one of the login nodes (ideally also in a round-robin fashion), and that all this just works gracefully if one of them isn't reachable or becomes unresponsive.

I'd give something like this to the user's ssh_config:
----------------------
Host login.example.org login-1.example.org login-2.example.org
    ProxyCommand none
    ControlMaster auto
    ControlPersist 1m

Host *.example.org
    ControlPath ~/.ssh/control-mux/%h
    # the four ProxyCommand variants discussed below (only one would be active):
    #1# ProxyCommand sh -c "ssh -W %h:%p login-1.example.org || ssh -W %h:%p login-2.example.org"
    #2# ProxyCommand ssh -W %h:%p login.example.org
    #3# ProxyCommand sh -c "ssh -o ConnectTimeout=10 -W %h:%p login-1.example.org || ssh -o ConnectTimeout=10 -W %h:%p login-2.example.org"
    #4# ProxyCommand ssh -o ConnectTimeout=10 -W %h:%p login.example.org
----------------------

So I played around a bit with all that (both with and without control channel multiplexing), and here are the results, questions and issues I've encountered:

1) Without control channel multiplexing (just strip any Control* options from the config above)

At first I used #1#, i.e.
ProxyCommand sh -c "ssh -W %h:%p login-1.example.org || ssh -W %h:%p login-2.example.org"
which works more or less fine: if SSH to login-1 doesn't work (for whatever reason - node down, authentication issue, sshd not running), login-2 is tried. Great, but the downsides are that one always has to list all the login nodes in the command, that there is no load balancing due to the strict ordering, and that an extra sh is run.

By chance I found out that it actually also works with the round-robin domain name (#2#, i.e. ProxyCommand ssh -W %h:%p login.example.org) - well, at least I've tried it with 2 A RRs (and in my tests I passed -4 to ssh). I tested this by DROPing or REJECTing[1] packets to one or the other login node via iptables. Apparently, ssh picks the first A RR given by the resolver, and if that "doesn't work", it tries the next. One can see in the iptables counters that on some connections the DROP or REJECT rule was hit (i.e. ssh tried the "down" node first) and sometimes not (i.e. it immediately chose the "up" node).

Fine, but: this behaviour (of trying more than one A/AAAA RR) is not really documented anywhere in OpenSSH (which would be really nice):
- Does it work only for 2 RRs (as in my test)? Does it really try all A and AAAA RRs?
- In which cases does it try the other address RRs? Only when the node wasn't reachable (i.e. a negative ICMP answer), or also on timeouts, authentication failures or any other errors?
- Doesn't this somehow contradict the default of ConnectionAttempts=1, since it actually makes more than one attempt? I mean, what if some domain name contains a million A RRs? Actually it seems that it even sends two packets *per* address - is this simply needed for the attempted handshake, or is it a bug?
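Just for reference, the address order I'm talking about is simply whatever the resolver / getaddrinfo() hands back to ssh; one can look at it for example like this (and it appears to match the order in which ssh tries the addresses):
----------------------
# goes through getaddrinfo(), i.e. shows the addresses in the order the
# resolver returns them to applications (on glibc systems):
getent ahosts login.example.org

# the raw RRsets as delivered by DNS, before any resolver-side reordering:
dig +short login.example.org A
dig +short login.example.org AAAA
----------------------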
Another open question is whether using the round-robin name can be made to work even if the login-* nodes do *not* use the same host key pair. What one would want is something like this:
----------------------
Host login.example.org
    HostKeyAlias login-1.example.org
    HostKeyAlias login-2.example.org
----------------------
in the sense that either key would be accepted. Does that work, or could it be implemented?

2) With control channel multiplexing

Here things get of course much more tricky.

The first thing one notices is that the control socket is always created based on the host's name. In the case of the round-robin domain this means that again only one login node will actually be used - the one the socket was opened to - so all load-balancing efforts are basically destroyed again. Any ideas how to solve that? Perhaps by adding a %X token that expands not to the hostname but to the v4/v6 address that was used to connect? This would have the additional advantage that it would also work for the same host reached via different names (CNAMEs and the like).

Apart from that, the variants above (#1# and #2#) work just as one would expect... if I REJECT access, it immediately tries the other node; if I DROP access, it takes ages until TCP times out.

Another question one could ask is: how does all that behave if an existing socket becomes unresponsive? The first thing I noted is that if I use REJECT to block any further access to the remote sshd, the socket/mux process isn't terminated immediately (even though that should probably be the way to go?). If one uses DROP, it takes whatever time is needed to time out, depending on TCP keepalives and/or ServerAlive*. The mux connection seems to behave just like a normal SSH connection with respect to ServerAlive* - i.e. after the timeout the mux is killed, and any ssh processes using it as well. I've disabled TCP keepalives and my ServerAlive* settings allow at most 2 minutes without a reply (which is desired in order not to kill off hanging connections too early). So basically, lowering that timeout is not an option if one wants to give hanging sessions a chance to recover.

Another thing I observed while DROPing/REJECTing the connection of an already existing mux: OpenSSH's documentation basically says "if there is a ControlPath configured and the socket exists, we try to use it; if that doesn't work, we connect normally". But what apparently happens is: as soon as the socket exists and ssh can connect to it, it won't fall back to "normal", even if the socket's connection is already dead. So what I did was: ssh to the same host using the existing socket (whose connection is, however, blocked via iptables, either with DROP or REJECT)... the new ssh happily connects to the socket, and after it (or the mux process) times out... it fails and does *not* connect normally :-(
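By the way, for the manual case there are at least crude workarounds (login-1 here just as an example), but they obviously don't give the automatic behaviour I describe below:
----------------------
# make a single new connection ignore the configured (possibly dead)
# control socket and set up a fresh connection instead:
ssh -S none login-1.example.org
# (the same can be done with -o ControlPath=none)

# or tear down a suspect master explicitly - which of course also kills the
# sessions still running over it, i.e. exactly what I'd like to avoid:
ssh -O exit login-1.example.org
----------------------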
Now, even though I want to keep my (probably just hanging) muxes and their sessions around for my long timeout period, I still want any *new* connection to try the other login nodes first (maybe they work immediately). If the old one recovers - fine, then continue to use that one for the old connections and the new one for the new connections.[0]

Of course I cannot solve this with ServerAlive* or TCP keepalive timeouts... even if it worked technically, any such new connection would then have a lower timeout (which I no longer want once the connection is established).

I hoped ConnectTimeout could do the job for me, so I tried #3# and #4# from the config example above... but unfortunately, ConnectTimeout does not seem to apply when an existing control mux socket is used :-(

The question here is basically: could ConnectTimeout be made to also apply to control sockets - in the sense of the time it takes to open the socket, talk to its socket server (the mux process) and finally get the okay from the remote sshd that a new session is there? Because if that worked (and also for the round-robin thingy), one would basically have a way for *completely established* connections to retain their long timeouts (via ServerAlive*), while establishing such a connection would be subject to the short ConnectTimeout - thus, if my existing socket to login-1 just hangs for a while, I get a new one to login-2 (which may not be hanging).

Obviously, a tricky part of the whole thing is still how to use a round-robin name with multiple sockets, as described in [0] - especially without accidentally opening up any subtle ways to exploit this in terms of security.

Cheers,
Chris.

[0] Here a problem with my suggestion to use the v4/v6 address as the socket name becomes clear: as the resolver gives back different addresses, both would sooner or later be used, which somewhat defeats the idea of muxing... not sure whether there is an easy (and especially secure) way around this. Maybe ssh could check whether a socket already exists that matches one of the hostname's addresses... but this seems security-prone (what if DNS changes in the meantime - then perhaps ssh tricks itself into using the wrong host). So maybe another way would be to use not the address but a hash of the host's host key + the address family?

[1] DROP / REJECT in the sense of the Linux netfilter / iptables targets. DROP just silently discards packets (i.e. one can only run into the (possibly long) timeouts of SSH); REJECT sends an ICMP error back to the client (i.e. one fails quite fast).
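Concretely, the rules I mean in [1] look along these lines, one or the other (192.0.2.11 just being a placeholder for one of the login nodes' addresses):
----------------------
# e.g. on the client; 192.0.2.11 stands in for a login node's address

# simulate a "hard down" node: packets silently disappear, ssh runs into timeouts
iptables -A OUTPUT -p tcp -d 192.0.2.11 --dport 22 -j DROP

# simulate an active refusal: an ICMP error is sent back, ssh fails quickly
iptables -A OUTPUT -p tcp -d 192.0.2.11 --dport 22 -j REJECT
----------------------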