Stephen Hemminger
2018-Jan-28 22:28 UTC
[Bridge] ssh connection not working when ssh server is behind a linux bridge
On Sun, 28 Jan 2018 22:15:34 +0200 Adrian Pascalau <adrian27oradea at gmail.com> wrote:
> Hi,
>
> I have a strange issue with a linux bridge and an openssh server
> running in a VM connected to that bridge. Basically when I ssh with
> Putty or any other windows based ssh client to an openssh server
> running in a centos VM connected to the external network through a
> linux bridge, the ssh connection hangs before the login prompt is
> shown.
>
> What I have learned is that when an Ethernet frame that is less than
> 60 bytes in size goes through the network, it is padded with 0x00
> bytes until it has 60 bytes in length (64 with the frame check
> sequence). When this kind of padded Ethernet frame goes from the
> openssh server in my VM through the linux bridge to a windows host
> where the Putty ssh client is, the IP and TCP headers in those frames
> wrongly consider the 0x00 padded bytes as part of the IP / TCP user
> data, therefore the upstream protocol (SSH in my case) tries to
> interpret them, and the ssh session hangs.
>
> My understanding is that those 0x00 padded bytes are at the layer2
> Ethernet frame level, and should not be considered in the user data of
> the higher level protocols. About the padding bytes I have found some
> info here: https://wiki.wireshark.org/Ethernet#Allowed_Packet_Lengths
>
> So I suspect this behavior is caused by the linux bridge, because
> without the linux bridge, the ssh connection works without any issue.
> What I mean is that if I ssh to the host operating system, the ssh
> connection works without any issue, however if I ssh to the centos VM
> that is running in that host operating system and that uses the linux
> bridge for external network access, then I have this behavior that I
> describe above. In other words, only when the linux bridge is in the
> path of the ssh packets, this issue happens.
>
> Now, my centos VM where I have this ssh issue is managed by libvirt,
> and the bridge is in forwarding mode and created by hand. If the same
> centos VM is migrated into an all-in-one OpenStack Pike host, where
> the bridge is managed by neutron, the ssh connection works again
> without any issue. In all cases I am talking about the latest centos
> openssh server, and the default ssh server configuration file.
>
> So, what do you think? Where should I look further to understand what
> exactly causes this behavior?
>
> Many thanks,
> Adrian

These symptoms sound like an MTU mismatch.
The padding is not related. More likely, the issue is that one side
is sending a larger frame than the MTU of the underlying interface.
Since the bridge is a pure layer 2 interface, it has no choice
but to drop any frame where the size is greater than the MTU.
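
One way to check for the kind of MTU mismatch described here is to compare the MTU of the bridge with the MTU of each of its ports. The following is only a minimal sketch, assuming a Linux host that exposes the usual sysfs files (/sys/class/net/<iface>/mtu and /sys/class/net/<bridge>/brif/). The helper names and the bridge name "br0" in the usage note are made up for illustration and do not come from the thread.

```python
from pathlib import Path

SYSFS_NET = "/sys/class/net"  # standard sysfs location on Linux


def read_mtu(iface, sysfs_root=SYSFS_NET):
    """Return the MTU of a network interface as an integer."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def bridge_ports(bridge, sysfs_root=SYSFS_NET):
    """Return the names of the interfaces attached to a bridge.

    The kernel lists them as entries of <bridge>/brif; an interface
    that is not a bridge has no such directory, so the result is empty.
    """
    brif = Path(sysfs_root) / bridge / "brif"
    if not brif.is_dir():
        return []
    return sorted(entry.name for entry in brif.iterdir())


def mtu_report(bridge, sysfs_root=SYSFS_NET):
    """Map the bridge and each of its ports to the configured MTU."""
    names = [bridge] + bridge_ports(bridge, sysfs_root)
    return {name: read_mtu(name, sysfs_root) for name in names}


def has_mismatch(report):
    """True if the interfaces in the report do not all share one MTU."""
    return len(set(report.values())) > 1
```

Calling mtu_report("br0") on the host and passing the result to has_mismatch() would show whether the bridge and its ports disagree. It says nothing about the MTU inside the VM or on the remote client, which would have to be checked separately.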
Adrian Pascalau
2018-Jan-29 07:51 UTC
[Bridge] ssh connection not working when ssh server is behind a linux bridge
On Mon, Jan 29, 2018 at 12:28 AM, Stephen Hemminger <stephen at networkplumber.org> wrote:
> These symptoms sound like an MTU mismatch.
> The padding is not related. More likely, the issue is that one side
> is sending a larger frame than the MTU of the underlying interface.
> Since the bridge is a pure layer 2 interface, it has no choice
> but to drop any frame where the size is greater than the MTU.

I took traces on both the ssh client and the ssh server side, and I can find all the frames on both sides; none are missing.

Here is an example of the padding issue I see. On the ssh server side, there is a 54-byte TCP ACK frame sent by the server to acknowledge that it has received the initial ssh client protocol version advertisement:

x.x.x.250  x.x.x.115  SSHv2  82  Client: Protocol (SSH-2.0-PuTTY_Release_0.70)
x.x.x.115  x.x.x.250  TCP    54  22 → 49810 [ACK] Seq=1 Ack=29 Win=29312 Len=0

In this 54-byte TCP ACK frame, the Ethernet II header is 14 bytes long, the IP header is 20 bytes long, the TCP header is another 20 bytes long, and there is no TCP payload, so 54 bytes in total. When this frame arrives on the client side, it is 60 bytes long, and Wireshark interprets it as an SSHv2 frame because of the 6 additional 0x00 padding bytes in the TCP payload. I find this exact frame in a working ssh session, and there those 6 additional 0x00 padding bytes are correctly shown at the Ethernet II frame level.

x.x.x.250  x.x.x.115  SSHv2  82  Client: Protocol (SSH-2.0-PuTTY_Release_0.70)
x.x.x.115  x.x.x.250  SSHv2  60  Server: Encrypted packet (len=6)

Frame 5: 60 bytes on wire (480 bits), 60 bytes captured (480 bits) on interface 0
Ethernet II, Src: RealtekU_40:31:85 (52:54:00:40:31:85), Dst: Vmware_a6:ba:8a (00:50:56:a6:ba:8a)
    Destination: Vmware_a6:ba:8a (00:50:56:a6:ba:8a)
    Source: RealtekU_40:31:85 (52:54:00:40:31:85)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: x.x.x.115, Dst: x.x.x.250
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
    Total Length: 46
    Identification: 0xf6f1 (63217)
    Flags: 0x02 (Don't Fragment)
    Fragment offset: 0
    Time to live: 64
    Protocol: TCP (6)
    Header checksum: 0x76d4 [validation disabled]
    [Header checksum status: Unverified]
    Source: x.x.x.115
    Destination: x.x.x.250
    [Source GeoIP: Unknown]
    [Destination GeoIP: Unknown]
Transmission Control Protocol, Src Port: 22, Dst Port: 49810, Seq: 1, Ack: 29, Len: 6
    Source Port: 22
    Destination Port: 49810
    [Stream index: 0]
    [TCP Segment Len: 6]
    Sequence number: 1    (relative sequence number)
    [Next sequence number: 7    (relative sequence number)]
    Acknowledgment number: 29    (relative ack number)
    0101 .... = Header Length: 20 bytes (5)
    Flags: 0x010 (ACK)
    Window size value: 229
    [Calculated window size: 29312]
    [Window size scaling factor: 128]
    Checksum: 0x9e76 [correct]
    [Checksum Status: Good]
    [Calculated Checksum: 0x9e76]
    Urgent pointer: 0
    [SEQ/ACK analysis]
    TCP payload (6 bytes)
SSH Protocol
    SSH Version 2
        Packet Length (encrypted): 00000000
        Encrypted Packet: 0000
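
The length arithmetic in the traces above can be written out as a small sketch. It assumes the usual 14-byte Ethernet II header and the 60-byte minimum frame length (without the FCS) mentioned earlier in the thread. The function and constant names are only illustrative. The value 40 for the server-side IP Total Length is not shown in the trace; it is derived here as 54 - 14.

```python
ETH_HEADER_LEN = 14      # Ethernet II header (dst MAC, src MAC, type), no FCS
ETH_MIN_FRAME_LEN = 60   # minimum frame length without the 4-byte FCS


def ethernet_padding(ip_total_length):
    """Padding bytes appended after the IP packet to reach 60 bytes."""
    return max(0, ETH_MIN_FRAME_LEN - (ETH_HEADER_LEN + ip_total_length))


def tcp_payload_len(ip_total_length, ip_header_len=20, tcp_header_len=20):
    """TCP payload length as derived from the IP Total Length field."""
    return ip_total_length - ip_header_len - tcp_header_len


# Server side: 54-byte frame, so IP Total Length = 54 - 14 = 40 (derived).
server_total_length = 54 - ETH_HEADER_LEN
print(tcp_payload_len(server_total_length))   # no TCP payload in a pure ACK
print(ethernet_padding(server_total_length))  # bytes needed to reach 60

# Client side: the dissection reports "Total Length: 46".
print(tcp_payload_len(46))    # bytes counted as TCP payload
print(ethernet_padding(46))   # nothing left over to count as padding
```

With a Total Length of 40, the six extra bytes sit outside the IP packet and count as Ethernet padding. With the Total Length of 46 shown in the client-side dissection, the same six bytes fall inside the IP packet, which is consistent with Wireshark decoding them as SSH data.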
Seweryn Niemiec
2018-Jan-29 09:42 UTC
[Bridge] ssh connection not working when ssh server is behind a linux bridge
On 28.01.2018 23:28, Stephen Hemminger wrote:
> These symptoms sound like an MTU mismatch.
> The padding is not related. More likely, the issue is that one side
> is sending a larger frame than the MTU of the underlying interface.
> Since the bridge is a pure layer 2 interface, it has no choice
> but to drop any frame where the size is greater than the MTU.

I have a similar problem (same infrastructure and same symptoms) but with random occurrence. The ssh session hangs randomly: usually before the login prompt is shown, sometimes a bit later, sometimes after a few days.

I have MTU 1500 on all interfaces taking part in the communication. Ping of any size works and, as far as I have tested, HTTP communication does too, but there are problems with HTTPS.

-- 
Best regards,
Seweryn Niemiec