Adrian Pascalau
2018-Jan-26 19:50 UTC
issue with openssh-server running in a libvirt based centos virtual machine
On Fri, Jan 26, 2018 at 5:02 AM, Darren Tucker <dtucker at dtucker.net> wrote:
> On 26 January 2018 at 06:44, Adrian Pascalau <adrian27oradea at gmail.com>
> wrote:
>>
>> [...]
>> debug1: SSH2_MSG_KEXINIT sent [preauth]
>>
>> Here the debug mode stops, and there is no login prompt shown on the
>> Putty window.
>
>
> This behaviour is often caused by a path MTU/fragmentation problem.
>
> If you run "netstat" on both client and server and find the SSH TCP
> connection you should see a "Send-Q" column. This column is the number of
> bytes in the TCP socket buffer that the other end has not acknowledged. If
> it is non-zero number that remains above zero or increases then IP
> fragmentation is likely your problem and you need to fix whatever in your
> environment is causing that.

Darren, thanks for your reply.

In the ssh client side, I have a windows host that runs Putty. So I
installed a new centos to be able to see the "Send-Q" column, since
the windows netstat does not have it. Surprise, the ssh connection
works like a charm. I went back to the windows box to try again with
Putty, and I have the same issue. So could this be because of windows?
I cannot suspect Putty, since I tried this with another windows based
ssh client (MobaXterm), and the same issue happens.
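
The Send-Q check suggested in the quoted advice could be scripted roughly as follows. This is only a sketch: it assumes Linux-style `netstat -tn` output (columns Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), the function name `stuck_ssh_connections` is made up for illustration, and the sample text uses invented documentation addresses rather than output from the hosts in this thread.

```python
def stuck_ssh_connections(netstat_output, port=22):
    """Return (local, remote, send_q) for TCP rows on `port` with Send-Q > 0.

    Hypothetical helper; expects text shaped like Linux `netstat -tn` output.
    """
    stuck = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banner and column-header lines; data rows start with tcp/tcp6.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        send_q = int(fields[2])
        local, remote = fields[3], fields[4]
        # The port is whatever follows the last ':' in each address.
        ports = {local.rsplit(":", 1)[-1], remote.rsplit(":", 1)[-1]}
        if str(port) in ports and send_q > 0:
            stuck.append((local, remote, send_q))
    return stuck


# Invented sample data, for illustration only.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0   1080 192.0.2.10:22           192.0.2.20:50514        ESTABLISHED
tcp        0      0 192.0.2.10:22           192.0.2.30:41022        ESTABLISHED
"""

for local, remote, send_q in stuck_ssh_connections(SAMPLE):
    print(f"{local} -> {remote}: {send_q} unacknowledged bytes")
```

A single non-zero reading is not conclusive. As the advice says, what matters is a Send-Q value that stays above zero or keeps growing across repeated runs.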
Nico Kadel-Garcia
2018-Jan-28 00:11 UTC
issue with openssh-server running in a libvirt based centos virtual machine
On Fri, Jan 26, 2018 at 2:50 PM, Adrian Pascalau <adrian27oradea at gmail.com> wrote:
> On Fri, Jan 26, 2018 at 5:02 AM, Darren Tucker <dtucker at dtucker.net> wrote:
>> On 26 January 2018 at 06:44, Adrian Pascalau <adrian27oradea at gmail.com>
>> wrote:
>>>
>>> [...]
>>> debug1: SSH2_MSG_KEXINIT sent [preauth]
>>>
>>> Here the debug mode stops, and there is no login prompt shown on the
>>> Putty window.
>>
>>
>> This behaviour is often caused by a path MTU/fragmentation problem.
>>
>> If you run "netstat" on both client and server and find the SSH TCP
>> connection you should see a "Send-Q" column. This column is the number of
>> bytes in the TCP socket buffer that the other end has not acknowledged. If
>> it is non-zero number that remains above zero or increases then IP
>> fragmentation is likely your problem and you need to fix whatever in your
>> environment is causing that.
>
> Darren, thanks for your reply.
>
> In the ssh client side, I have a windows host that runs Putty. So I
> installed a new centos to be able to see the "Send-Q" column, since
> the windows netstat does not have it. Surprise, the ssh connection
> works like a charm. I went back to the windows box to try again with
> Putty, and I have the same issue. So could this be because of windows?
> I cannot suspect Putty, since I tried this with another windows based
> ssh client (MobaXterm), and the same issue happens.

As much as I love Putty as a well-built tool, it's gotten a bit long
in the tooth. May I suggest that you test it from a Cygwin shell on
your client host?
Darren Tucker
2018-Jan-28 01:17 UTC
issue with openssh-server running in a libvirt based centos virtual machine
On 27 January 2018 at 06:50, Adrian Pascalau <adrian27oradea at gmail.com> wrote:
> So could this be because of windows?
> I cannot suspect Putty, since I tried this with another windows based
> ssh client (MobaXterm), and the same issue happens.
>

It's probably not Windows per se, but rather something different about how
it behaves in that situation that is standards compliant but tickles a
pre-existing path MTU/fragmentation issue in your network (e.g. maybe its
ethernet device drivers use jumbo frames by default or something).

You would be best served by finding that problem and fixing it, but you can
try working around it by reducing the MTU on either server or client. 1500
is the typical value for ethernet; I'd try 1496 (typical 802.1Q overhead),
1492 (typical PPPoE overhead), maybe 1400, and if all else fails 576 (the
minimum value the spec says an implementation should be able to handle).

--
Darren Tucker (dtucker at dtucker.net)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new)
    Good judgement comes with experience. Unfortunately, the experience
    usually comes from bad judgement.
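
To put numbers on the MTU values suggested above, here is a small sketch that works out what each one leaves for TCP data and for a test ping. It assumes plain IPv4 with no IP or TCP options (20-byte headers each) and an 8-byte ICMP echo header. The function names are made up for illustration.

```python
# Header sizes for IPv4 without options (assumption: no IP/TCP options in use).
IPV4_HEADER = 20
TCP_HEADER = 20
ICMP_HEADER = 8

# Candidate MTUs from the message above.
CANDIDATE_MTUS = [1500, 1496, 1492, 1400, 576]


def tcp_mss(mtu):
    """Largest TCP segment payload that fits in one IP packet of size `mtu`."""
    return mtu - IPV4_HEADER - TCP_HEADER


def ping_payload(mtu):
    """Largest ICMP echo payload that fits in one IP packet of size `mtu`."""
    return mtu - IPV4_HEADER - ICMP_HEADER


for mtu in CANDIDATE_MTUS:
    print(f"MTU {mtu}: TCP MSS {tcp_mss(mtu)}, ping payload {ping_payload(mtu)}")
```

The ping payload column is the size to use when probing the path with fragmentation prohibited, for instance `ping -M do -s 1472 <host>` with the Linux iputils ping or `ping -f -l 1472 <host>` on Windows. If the largest size that gets through is smaller than the interface MTU implies, something on the path has a lower MTU.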
Adrian Pascalau
2018-Jan-28 19:35 UTC
issue with openssh-server running in a libvirt based centos virtual machine
On Sun, Jan 28, 2018 at 2:11 AM, Nico Kadel-Garcia <nkadel at gmail.com> wrote:
> As much as I love Putty as a well-built tool, it's gotten a bit long
> in the tooth. May I suggest that you test it from a Cygwin shell on
> your client host?

Thanks Nico, for your reply.

I have tested with MobaXterm ssh client as well, and this one is based
on Cygwin, with the ssh client version OpenSSH_7.1p2, OpenSSL 1.0.1g
7 Apr 2014. Still the same issue.
Adrian Pascalau
2018-Jan-28 19:45 UTC
issue with openssh-server running in a libvirt based centos virtual machine
On Sun, Jan 28, 2018 at 3:17 AM, Darren Tucker <dtucker at dtucker.net> wrote:
> On 27 January 2018 at 06:50, Adrian Pascalau <adrian27oradea at gmail.com>
> wrote:
>>
>> So could this be because of windows?
>> I cannot suspect Putty, since I tried this with another windows based
>> ssh client (MobaXterm), and the same issue happens.
>
>
> It's probably not Windows per se, but rather something different about how
> it behaves in that situation that is standards compliant but tickles a
> pre-existing path MTU/fragmentation issue in your network. (eg maybe its
> ethernet device drivers use jumbo frames by default or something).
>
> You would be best served by finding that problem and fixing it, but you can
> try working around it by reducing the MTU on either server or client. 1500
> is the typical value for ethernet, I'd try 1496 (typical 802.1Q overhead),
> 1492 (typical PPPoE overhead) maybe 1400 and if all else fails 576 (the
> minimum value the spec says an implementation should be able to handle).

I have tested this one more time with all the hosts (ssh client and ssh
server) in the same subnet, with no routers or VPNs in between. All hosts
are connected to the same switch and the same problem persists, so it is
not an MTU issue.

I took several tcpdump traces and compared the working ssh sessions with
the non-working ones, and this is what I found: when an Ethernet frame
that is less than 60 bytes in size goes through the network, it is padded
with 0x00 bytes until it is 60 bytes long (64 with the frame check
sequence). In my network I have a Linux bridge that connects the CentOS VM
to the external network. When such padded frames go through the Linux
bridge, the 0x00 padding bytes somehow end up being treated as part of the
user data carried in the IP and TCP packets, so the upper-layer protocol
(SSH in this case) tries to interpret them, and this is why Putty hangs.

Those 0x00 padding bytes belong to the layer 2 Ethernet frame and should
not be treated as user data by the higher-level protocols. I think I
should take this to the Linux bridge mailing lists.
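
The point about padding can be illustrated with a short sketch. It builds a small TCP/IPv4 frame, pads it to the 60-byte Ethernet minimum, and shows that a receiver which honours the IPv4 total-length field recovers the original payload, while one that takes everything after the headers also picks up the 0x00 bytes. The frame contents (addresses, ports, payload) and the helper names are invented for illustration; this is not a model of what the Linux bridge does internally.

```python
import struct

ETH_HEADER_LEN = 14
MIN_FRAME_LEN = 60  # minimum Ethernet frame size, excluding the 4-byte FCS


def build_frame(payload):
    """Build an Ethernet + IPv4 + TCP frame (hypothetical addresses, no checksums)."""
    eth = bytes(6) + bytes(6) + struct.pack("!H", 0x0800)  # dst, src, EtherType IPv4
    total_length = 20 + 20 + len(payload)  # IPv4 header + TCP header + data
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_length, 0, 0, 64, 6, 0,
                     bytes([192, 0, 2, 10]), bytes([192, 0, 2, 20]))
    tcp = struct.pack("!HHIIHHHH", 50514, 22, 0, 0, (5 << 12) | 0x18, 1024, 0, 0)
    return eth + ip + tcp + payload


def pad_frame(frame):
    """Pad a short frame with 0x00 bytes up to the Ethernet minimum."""
    return frame + b"\x00" * max(0, MIN_FRAME_LEN - len(frame))


def tcp_payload(frame):
    """Extract TCP data, using the IPv4 total length to discard link-layer padding."""
    ip = frame[ETH_HEADER_LEN:]
    ihl = (ip[0] & 0x0F) * 4
    total_length = struct.unpack("!H", ip[2:4])[0]
    ip = ip[:total_length]  # anything past this is Ethernet padding
    tcp = ip[ihl:]
    data_offset = (tcp[12] >> 4) * 4
    return tcp[data_offset:]


def naive_tcp_payload(frame):
    """Incorrect extraction: treats everything after fixed-size headers as data."""
    return frame[ETH_HEADER_LEN + 20 + 20:]


frame = pad_frame(build_frame(b"hi"))
print(len(frame), tcp_payload(frame), naive_tcp_payload(frame))
```

In a capture, the equivalent check is to compare the IPv4 total-length field with the captured frame length. Any bytes past the end of the IP packet are Ethernet padding and should never reach the SSH layer.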