Scott Garron
2010-Mar-25 18:53 UTC
[Xen-users] When using domU as router, packets have incorrect checksum
Before I run this by the devs, I thought I'd pose the question here first, in case it's just something I'm overlooking. I do suspect that it's a bug, though...

I was creating a test scenario in a lab environment, which was a bit more complicated (just more variables) than what I'm about to describe here, but I managed to remove as many variables as I could, for the purpose of tracking down a problem I was having with it. What I determined was that consistently, any packets forwarded through the domU's kernel would show up at their destination with the wrong checksum. Not only was it wrong, every packet has the same checksum (0x9e85). Using tcpdump to watch the packets through all of the virtual interfaces, bridges and even the dom0's physical interface, they show the correct checksum. It's only at the destination's interface where it was incorrect (and thus dropped there, causing communications to fail). I still suspect Xen's networking to be the source of the problem because if I use the dom0's kernel to do the same, exact forwarding, it works just fine.

Another peculiarity is that ICMP forwarded through the domU is dropped by the dom0's kernel and every packet causes this message to appear in dmesg:

Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet

Here's the set-up:

I have two, generic desktop machines, running Debian Linux (not Xen): One with IP address 192.168.15.3/24, the other with IP address 192.168.14.3/24.

I have a Xen server, also running Debian Linux for the privileged domain (dom0), with three ethernet interfaces. There are no IP addresses bound to eth1, eth2, peth1, or peth2, but their links are in the "UP" state. The Xen server is hosting a Debian Linux domU that has all three of the dom0's ethernet interfaces bridged to them.

The first desktop is connected to eth1 on the Xen server and the second is connected to eth2 on the Xen server (eth0 is not really used in this test).
I'm using cross-over cables for these connections in order to eliminate ethernet switches as being culprits.

The domU is configured with 192.168.15.1/24 on eth1 and 192.168.14.1/24 on eth2. The desktops are using those as their respective default gateways. /proc/sys/net/ipv4/ip_forward shows "1" on the domU and there's nothing in iptables.

One of the two desktop machines (the one at 192.168.15.3) has SSH listening on port 22120. From the domU, both 192.168.15.3 and 192.168.14.3 are pingable. Connecting to port 22120 on the 192.168.15.3 machine is also possible from the domU.

Attempting a connection from the 192.168.14.3 machine to the 192.168.15.3 machine on that port yields the following results in tcpdump:

****************
[eth1 on the domU]
192.168.14.3.42062 > 192.168.15.3.22120: Flags [S], cksum 0x6c4b (correct), seq 2616249367, win 5840, options [mss 1460,sackOK,TS val 15712992 ecr 0,nop,wscale 6], length 0

[vif6.1 on the dom0]
192.168.14.3.42062 > 192.168.15.3.22120: Flags [S], cksum 0x6c4b (correct), seq 2616249367, win 5840, options [mss 1460,sackOK,TS val 15712992 ecr 0,nop,wscale 6], length 0

[eth1 on the dom0] (the bridge)
192.168.14.3.42062 > 192.168.15.3.22120: Flags [S], cksum 0x6c4b (correct), seq 2616249367, win 5840, options [mss 1460,sackOK,TS val 15712992 ecr 0,nop,wscale 6], length 0

[peth1 on the dom0]
192.168.14.3.42062 > 192.168.15.3.22120: Flags [S], cksum 0x6c4b (correct), seq 2616249367, win 5840, options [mss 1460,sackOK,TS val 15712992 ecr 0,nop,wscale 6], length 0

[eth0 on the destination desktop machine]
192.168.14.3.42062 > 192.168.15.3.22120: Flags [S], cksum 0x9e85 (incorrect -> 0x6c4b), seq 2616249367, win 5840, options [mss 1460,sackOK,TS val 15712992 ecr 0,nop,wscale 6], length 0
****************

From this, I know that the packet is being correctly forwarded by the domU kernel because it's arriving on eth2 from one desktop machine and it's showing up on eth1, headed toward the second desktop machine.
The checksum shown on the tcpdump from eth1 on the domU (the interface it is being forwarded to) is correct. I tried the test with the domU's checksum offloading turned on and with it turned off. Both yield the same result. It's as though, just as it's physically going out on the wire, the checksum gets changed, and it always seems to be set to 0x9e85, regardless of what's in the packet.

The version of Xen is 3.4.3 (from debian [testing] package xen-hypervisor-3.4-amd64_3.4.3~rc3-1_amd64.deb). The kernel is 2.6.32.3 from Jeremy Fitzhardinge's stable-2.6.32.x git branch.

The server has the following hardware:

Tyan Thunder K8S Pro S2882
2 AMD Opteron 240 1.4GHz
4GB RAM
LSI MegaRAID MRSCSI320-2X

The on-board ethernet controllers are:

Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03) (tigon)
Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03) (tigon)
Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 10) (e100)

Has anyone else run across this problem? It only seems to be affecting packets forwarded through the domU kernel. All other communications seem to be behaving normally.

--
Scott Garron

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
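Editorial sketch: the two checksum values in the capture can be reproduced by hand. The Python below is a minimal sketch, not anything posted in the thread. It rebuilds the SYN from the fields tcpdump printed and applies the standard Internet checksum (RFC 1071) over the TCP pseudo-header and segment. The helper names `ones_complement_sum` and `pseudo_header` are hypothetical, and the option byte layout is assumed to follow the order tcpdump lists them in.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit one's-complement sum of data (RFC 1071), not inverted."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header for TCP: addresses, zero byte, protocol 6, length."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!BBH", 0, 6, tcp_len))


# Fixed 20-byte TCP header, taken from the tcpdump line; checksum field zeroed.
tcp_header = struct.pack(
    "!HHIIBBHHH",
    42062, 22120,   # source port, destination port
    2616249367, 0,  # sequence number, acknowledgement number
    10 << 4, 0x02,  # data offset (10 words = 40 bytes), SYN flag
    5840, 0, 0,     # window, checksum placeholder, urgent pointer
)

# Options in the order tcpdump printed them (assumed byte layout).
options = (
    struct.pack("!BBH", 2, 4, 1460)              # mss 1460
    + struct.pack("!BB", 4, 2)                   # sackOK
    + struct.pack("!BBII", 8, 10, 15712992, 0)   # TS val 15712992 ecr 0
    + struct.pack("!B", 1)                       # nop
    + struct.pack("!BBB", 3, 3, 6)               # wscale 6
)

segment = tcp_header + options
pseudo = pseudo_header("192.168.14.3", "192.168.15.3", len(segment))

# Sum over the pseudo-header alone, without the final inversion.
partial = ones_complement_sum(pseudo)
# Complete checksum: inverted sum over pseudo-header plus segment.
full = ~ones_complement_sum(pseudo + segment) & 0xFFFF

print(f"pseudo-header only: {partial:#06x}")  # 0x9e85
print(f"complete checksum:  {full:#06x}")     # 0x6c4b
```

The complete checksum matches the 0x6c4b that tcpdump reports as correct, and the sum over the pseudo-header alone matches the 0x9e85 seen at the destination. That is consistent with the checksum field having been left in the partially computed state that Linux uses when it expects a later stage to finish the checksum, although nothing in this thread confirms that as the cause. It would also explain why the value looks constant: it depends only on the two addresses, the protocol and the TCP length, so same-sized segments between the same hosts share it.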
Markus Schuster
2010-Mar-31 17:48 UTC
[Xen-users] Re: When using domU as router, packets have incorrect checksum
Scott Garron wrote:
> [..]
> Has anyone else run across this problem? It only seems to be
> affecting packets forwarded through the domU kernel. All other
> communications seem to be behaving normally.

Yes, there are some people here on xen-users and xen-devel (including me) that suffer from this problem. AFAIK nobody has a solution currently.

I've installed an "oldstyle" (non pv_ops) kernel to work around this problem because of a massive lack of time in the last months to dig into this problem.

Regards,
Markus