On 11.11.21 13:31, Stuart Henderson wrote:
> On 2021/11/11 12:49, Konrad Bucheli wrote:
>> Hi Jochen
>>
>> We run a few thousand hosts with varying quality of internet lines.
>> It is a fallback procedure to try to only use ed25519 crypto if the
>> connection fails half-way through. The reason is that it needs only smaller
>> packets which can help if there is (more) trouble with bigger network
>> packets.
>
> This often indicates problems where some links have smaller than usual
> MTUs, in combination with missing ICMP fragmentation-needed messages
> (usually due to incorrect firewall configuration somewhere on the path).
> The handshake won't be the only place where you run into problems though,
> using ed25519 to sidestep this just pushes the problem deeper and you're
> likely to run into stalls during either file transfers or with large
> amounts of output. Reducing MTU (or clamping the TCP MSS) might be a
> better idea if you know you have to work over broken networks.
Thanks for the pointers, but after running into the problem again today,
I'm afraid that the actual crypto algorithms do not play a role in the
problem ...
This time, the problem appeared between a devel test VM (OpenSSH config
tightened according to lynis guidelines, no agent involved, just one
4096-bit RSA and one ed25519 keypair in ~/.ssh) and one of our jump
hosts (serving as a test server in this instance, sshd config hardened
manually). The -vvv outputs from the commands that I also tried a week
ago differ as follows:
> $ diff nok.client.log ok.client.log
> 1c1
> < $ ssh -vvv binect-support
> ---
>> $ ssh -vvv -o "KexAlgorithms diffie-hellman-group-exchange-sha256" binect-support
> 40c40
> < debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1,ext-info-c
> ---
>> debug2: KEX algorithms: diffie-hellman-group-exchange-sha256,ext-info-c
> 65c65
> < debug1: kex: algorithm: curve25519-sha256@libssh.org
> ---
>> debug1: kex: algorithm: diffie-hellman-group-exchange-sha256
> 69,74c69,230
> < debug1: kex: curve25519-sha256@libssh.org need=16 dh_need=16
> < debug1: kex: curve25519-sha256@libssh.org need=16 dh_need=16
> < debug3: send packet: type 30
> < debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
> <
> < [ ... wait for timeout ... ]
> ---
>> debug1: kex: diffie-hellman-group-exchange-sha256 need=16 dh_need=16
>> debug1: kex: diffie-hellman-group-exchange-sha256 need=16 dh_need=16
>> debug3: send packet: type 34
>> debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<3072<8192) sent
>> debug3: receive packet: type 31
>> debug1: got SSH2_MSG_KEX_DH_GEX_GROUP
>> [...]
Upon seeing the difference in sheer *length* of the KEX algorithm list
offered by the client, I started experimenting with lists shortened from
either end ... the result being that I could cut out *entirely disjoint*
sets of algorithms to make the connection work.
Then I tried *this*:
> $ ssh -vvv -o "KexAlgorithms curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org" binect-support
> OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
[...]
> debug1: kex: algorithm: curve25519-sha256@libssh.org
> debug1: kex: host key algorithm: ecdsa-sha2-nistp256
> debug1: kex: server->client cipher: aes128-ctr MAC: umac-64-etm@openssh.com compression: none
> debug1: kex: client->server cipher: aes128-ctr MAC: umac-64-etm@openssh.com compression: none
> debug1: kex: curve25519-sha256@libssh.org need=16 dh_need=16
> debug1: kex: curve25519-sha256@libssh.org need=16 dh_need=16
> debug3: send packet: type 30
> debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
[ ... waiting until timeout ...]
Yes, that's eight times the *same* algorithm (the one that would get
picked if there were no problem at all). Now let's try giving it only
*seven* thumbs-up:
> [bongo@cube-ng-06 ~]$ ssh -vvv -o "KexAlgorithms curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org,curve25519-sha256@libssh.org" binect-support
> OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
[...]
> debug1: kex: algorithm: curve25519-sha256@libssh.org
> debug1: kex: host key algorithm: ecdsa-sha2-nistp256
> debug1: kex: server->client cipher: aes128-ctr MAC: umac-64-etm@openssh.com compression: none
> debug1: kex: client->server cipher: aes128-ctr MAC: umac-64-etm@openssh.com compression: none
> debug1: kex: curve25519-sha256@libssh.org need=16 dh_need=16
> debug1: kex: curve25519-sha256@libssh.org need=16 dh_need=16
> debug3: send packet: type 30
> debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
> debug3: receive packet: type 31
[ ... continue to successful connection]
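
For reference, the size difference between the eight-entry and the
seven-entry list can be worked out with a few lines of shell. This is
only a sketch: the helper names kex_list and kex_len are made up for
illustration, and the script counts just the bytes of the KexAlgorithms
value as typed on the command line, not the complete KEXINIT packet
(ssh appends ext-info-c, as the debug2 output further up shows, and the
packet also carries the other algorithm lists).

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.
name='curve25519-sha256@libssh.org'

# Build a KexAlgorithms value made of N comma-separated copies of $name.
kex_list() {
    n=$1
    out=$name
    i=1
    while [ "$i" -lt "$n" ]; do
        out="$out,$name"
        i=$((i + 1))
    done
    printf '%s' "$out"
}

# Print the length in bytes of the list with N copies.
kex_len() {
    list=$(kex_list "$1")
    printf '%s\n' "${#list}"
}

echo "7 copies: $(kex_len 7) bytes"
echo "8 copies: $(kex_len 8) bytes"
```

The two values differ by 29 bytes, i.e. one algorithm name plus its
separating comma.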
It is still possible that it's a pMTU detection problem or something like
it, though; I will have to look into the tcpdumps I now have to see whether
that's the case ...
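
One way to cross-check a path MTU suspicion besides reading the dumps
would be to send pings with the don't-fragment bit set at a few sizes.
The sketch below only prints candidate commands, it does not run them;
the function names and the MTU candidates are assumptions made up for
illustration, and the arithmetic assumes plain IPv4 without IP options.
The -M do option is the one from Linux iputils ping.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits a given IPv4 path MTU unfragmented:
# MTU minus 20 bytes IPv4 header minus 8 bytes ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Print (not run) one DF-bit ping command per candidate MTU.
# Usage: print_probes HOST MTU...
print_probes() {
    host=$1
    shift
    for mtu in "$@"; do
        echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$mtu") $host"
    done
}

# Placeholder host and candidate MTUs.
print_probes binect-support 1500 1400 1300
```

A probe size that gets neither a reply nor an error back would be
consistent with the missing fragmentation-needed messages Stuart
describes. While probing, a capture filter such as
'icmp[icmptype] == icmp-unreach and icmp[icmpcode] == 4' in tcpdump
should show whether any such ICMP messages arrive at all.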
(Both VMs are CentOS 7.9, the client a "free-range" one, the server a
cloud provider's sub-flavor. There's a handful of VLANs, leased line
uplink to a colo, then an IPsec VPN through the Internet into the cloud,
and finally the usual cloud networking between the two.)
Thanks again,
--
Jochen Bern
Systemingenieur
Binect GmbH