Dennis Jacobfeuerborn
2016-Mar-04  18:17 UTC
[libvirt-users] network checksum offloading broken using virtio
Hi,

with recent guest installs (both CentOS 5 and 7) on CentOS 7 hosts I seem to have to disable checksum offloading using "ethtool -K eth0 tx off" in order to allow traffic to flow over a specific route.

Basically the guest is installed with IP 192.168.21.10 and a default gateway of 192.168.21.254. Up to that point I can ssh into the system normally. There is an OpenVPN system with the IP 192.168.21.1 that chooses client IPs for the VPN connections from the pool 192.168.20.0/24. In order to pass the responses back to the OpenVPN system I installed the route "192.168.20.0/24 via 192.168.21.1 dev eth0".

When I now ping the IP 192.168.21.10 through the VPN connection this works fine, but when I try to ssh into that system the connection just hangs. Looking at a tcpdump I noticed that the checksums of the packets weren't quite right, so I issued "ethtool -K eth0 tx off" and suddenly everything worked as expected.

What is strange here is that I'm seeing this with both CentOS 5 and CentOS 7 guests, and only for the routed traffic, not the regular traffic.

Does anyone have an idea what is going on here? Could this be an issue with the virtio driver?

Regards,
Dennis
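To make the routing setup in this message concrete, here is a minimal sketch of the guest's route lookup using longest-prefix match. The /24 netmask on the guest's own subnet and the sample destination addresses are assumptions for illustration; the message only gives the guest IP, the default gateway, and the added route.

```python
import ipaddress

# Guest routing table as described in the message.
# The /24 on the local subnet is an assumption; only the addresses are given.
ROUTES = [
    (ipaddress.ip_network("192.168.21.0/24"), None),            # on-link (assumed /24)
    (ipaddress.ip_network("192.168.20.0/24"), "192.168.21.1"),  # VPN pool via the OpenVPN system
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.21.254"),      # default gateway
]


def next_hop(dst):
    """Return the gateway for dst by longest-prefix match (None means on-link)."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, gw) for net, gw in ROUTES if addr in net]
    _, gateway = max(matches, key=lambda m: m[0].prefixlen)
    return gateway


# Hypothetical destinations, chosen only to exercise each route.
print(next_hop("192.168.20.6"))   # VPN client: goes via 192.168.21.1
print(next_hop("192.168.21.50"))  # local subnet: on-link, no gateway
print(next_hop("8.8.8.8"))        # everything else: default gateway
```

Under these assumptions, only replies to VPN clients are handed to 192.168.21.1, which is the "routed traffic" that fails in the report.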
Daniel P. Berrange
2016-Mar-04  18:27 UTC
Re: [libvirt-users] network checksum offloading broken using virtio
On Fri, Mar 04, 2016 at 07:17:57PM +0100, Dennis Jacobfeuerborn wrote:
> Hi,
> with recent guest installs (both CentOS 5 and 7) on CentOS 7 hosts I
> seem to have to disable checksum offloading using "ethtool -K eth0 tx
> off" in order to allow traffic to flow over a specific route.
>
> Basically the guest is installed with IP 192.168.21.10 and a default
> gateway of 192.168.21.254. Up to that point I can ssh into the system
> normally.
> There is an OpenVPN system with the IP 192.168.21.1 that chooses
> client IPs for the VPN connections from the pool 192.168.20.0/24.
> In order to pass the responses back to the OpenVPN system I installed the
> route "192.168.20.0/24 via 192.168.21.1 dev eth0".
>
> When I now ping the IP 192.168.21.10 through the VPN connection this
> works fine, but when I try to ssh into that system the connection just
> hangs. Looking at a tcpdump I noticed that the checksums of the packets
> weren't quite right, so I issued "ethtool -K eth0 tx off" and suddenly
> everything worked as expected.
>
> What is strange here is that I'm seeing this with both CentOS 5 and
> CentOS 7 guests, and only for the routed traffic, not the
> regular traffic.
>
> Does anyone have an idea what is going on here? Could this be an issue
> with the virtio driver?

Things are optimized with virtio-net so that no checksum is ever written until the packet reaches a physical NIC. So for communication between 2 guests on the same physical host no checksums will ever be computed, and it is expected that tcpdump would show corrupt checksums in that case.

Normally this is just fine, as almost no applications in the guest OS operate directly on the ethernet packets, so they never even realize that no checksum was computed. There has been one notable problem in the past though, where DHCP clients would get upset by the missing checksum.

When using libvirt virtual networks, we actually create a firewall rule on the host OS that explicitly adds valid checksums for packets on the DHCP port to avoid this problem.

I guess it is conceivable that some other applications may get upset by the missing checksums if they operate at the ethernet layer instead of the IP layer, which might explain what you see.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
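As an illustration of why tcpdump flags such packets, the following sketch computes the standard Internet checksum (RFC 1071) over a TCP pseudo-header and segment, and checks two variants of the same SYN,ACK: one with the checksum fully computed and one where the field holds only the pseudo-header sum. The latter is an assumption about what a sender leaves in the field when it expects a later stage to finish the checksum; the ports, sequence numbers, window size, and the VPN client address 192.168.20.6 are all hypothetical values.

```python
import socket
import struct


def ones_complement_sum(data):
    """16-bit one's-complement sum of data (RFC 1071), without the final inversion."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src, dst, tcp_len):
    """IPv4 pseudo-header that is covered by the TCP checksum."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!BBH", 0, socket.IPPROTO_TCP, tcp_len))


def build_segment(checksum):
    """A bare 20-byte SYN,ACK header with hypothetical ports and numbers."""
    offset_flags = (5 << 12) | 0x12  # data offset 5 words, SYN+ACK
    return struct.pack("!HHIIHHHH", 22, 40000, 1, 2,
                       offset_flags, 29200, checksum, 0)


def tcp_checksum_ok(src, dst, segment):
    """True if the segment validates: the sum over everything must be 0xFFFF."""
    data = pseudo_header(src, dst, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF


SRC, DST = "192.168.21.10", "192.168.20.6"  # DST is a hypothetical VPN client

# Fully computed checksum, as a receiver expects to see it on the wire.
unchecked = pseudo_header(SRC, DST, 20) + build_segment(0)
full = ~ones_complement_sum(unchecked) & 0xFFFF

# Assumed offload case: only the pseudo-header sum is in the field.
partial = ones_complement_sum(pseudo_header(SRC, DST, 20))

print(tcp_checksum_ok(SRC, DST, build_segment(full)))     # True
print(tcp_checksum_ok(SRC, DST, build_segment(partial)))  # False
```

A packet in the second state is harmless as long as something later completes the checksum, but any hop that validates it first will treat it as corrupt.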
Dennis Jacobfeuerborn
2016-Mar-05  00:05 UTC
Re: [libvirt-users] network checksum offloading broken using virtio
On 04.03.2016 19:27, Daniel P. Berrange wrote:
> On Fri, Mar 04, 2016 at 07:17:57PM +0100, Dennis Jacobfeuerborn wrote:
>> Hi,
>> with recent guest installs (both CentOS 5 and 7) on CentOS 7 hosts I
>> seem to have to disable checksum offloading using "ethtool -K eth0 tx
>> off" in order to allow traffic to flow over a specific route.
>>
>> Basically the guest is installed with IP 192.168.21.10 and a default
>> gateway of 192.168.21.254. Up to that point I can ssh into the system
>> normally.
>> There is an OpenVPN system with the IP 192.168.21.1 that chooses
>> client IPs for the VPN connections from the pool 192.168.20.0/24.
>> In order to pass the responses back to the OpenVPN system I installed the
>> route "192.168.20.0/24 via 192.168.21.1 dev eth0".
>>
>> When I now ping the IP 192.168.21.10 through the VPN connection this
>> works fine, but when I try to ssh into that system the connection just
>> hangs. Looking at a tcpdump I noticed that the checksums of the packets
>> weren't quite right, so I issued "ethtool -K eth0 tx off" and suddenly
>> everything worked as expected.
>>
>> What is strange here is that I'm seeing this with both CentOS 5 and
>> CentOS 7 guests, and only for the routed traffic, not the
>> regular traffic.
>>
>> Does anyone have an idea what is going on here? Could this be an issue
>> with the virtio driver?
>
> Things are optimized with virtio-net so that no checksum is ever
> written until the packet reaches a physical NIC. So for communication
> between 2 guests on the same physical host no checksums will ever be
> computed, and it is expected that tcpdump would show corrupt checksums
> in that case.
>
> Normally this is just fine, as almost no applications in the guest OS
> operate directly on the ethernet packets, so they never even realize that
> no checksum was computed. There has been one notable problem in the past
> though, where DHCP clients would get upset by the missing checksum.
>
> When using libvirt virtual networks, we actually create a firewall rule
> on the host OS that explicitly adds valid checksums for packets on the
> DHCP port to avoid this problem.
>
> I guess it is conceivable that some other applications may get upset by
> the missing checksums if they operate at the ethernet layer instead of
> the IP layer, which might explain what you see.

So apparently the issue is that the SYN,ACK for the ssh connection gets sent out by the 192.168.21.10 system with a wrong checksum, arrives on eth0 at 192.168.21.1, and then gets dropped when it is sent out the tun0 device created by OpenVPN.

What is the proper fix to deal with this? This behavior apparently changed only recently, and it seems rather cumbersome to now have to disable TCP checksum offloading for every guest I install in the future.

Regards,
Dennis