Daniel Goertzen
2005-Dec-27 23:03 UTC
[Xen-users] tcp checksum errors across dom0-domu bridge
My xen setup is having trouble communicating between dom0 and domu. The dom0 and domUs have no problem communicating outside the physical machine, and the domUs can even talk to each other with no problems, but dom0-domU comms is a no-go.

I am running gentoo x86 and xen-3.0.0. The xend network-bridge script didn't seem to work at all, so I disabled it and use a bridge configured by gentoo (see dom0 network info below). The domUs use vanilla xend bridge networking (see dom1 network info below).

Now the dom0 and domU can ping each other, and even establish a tcp connection, but dom0->domU tcp packets always seem to have a failing checksum. (see tcpdump trace of a telnet session below)

Any thoughts?

Thanks,
Dan.

#######################################
# dom0 network information

dom0 / # brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.005004652f84       no              eth0
                                                        vif3.0

dom0 / # ip a l dev br0
20: br0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
    link/ether 00:50:04:65:2f:84 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.5/24 brd 192.168.1.255 scope global br0
    <inet6 stuff omitted>

dom0 / # ip a l dev eth0
19: eth0: <BROADCAST,MULTICAST,PROMISC,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:50:04:65:2f:84 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::250:4ff:fe65:2f84/64 scope link
       valid_lft forever preferred_lft forever

dom0 / # ip a l vif3.0
23: vif3.0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fcff:ffff:feff:ffff/64 scope link
       valid_lft forever preferred_lft forever

dom0 / # ip r l
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.5
127.0.0.0/8 dev lo  scope link
default via 192.168.1.1 dev br0

dom0 / # arp -a
? (192.168.1.6) at 00:16:3E:6C:24:FC [ether] on br0
? (192.168.1.103) at 00:11:D8:5C:01:98 [ether] on br0
? (192.168.1.103) at 00:11:D8:5C:01:98 [ether] on br0

dom0 / # brctl showmacs br0
port no mac addr                is local?       ageing timer
  1     00:11:95:e2:7d:72       no                11.08
  1     00:11:d8:5c:01:98       no                 0.00
  2     00:16:3e:6c:24:fc       no                15.33
  1     00:20:af:50:15:9c       no                 8.32
  1     00:50:04:65:2f:84       yes                0.00
  1     00:90:4b:4b:6d:f6       no                99.14
  2     fe:ff:ff:ff:ff:ff       yes                0.00

#######################################
# dom1 network information

dom1 / # ip a l eth0
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:16:3e:6c:24:fc brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.6/24 brd 192.168.1.255 scope global eth0
    <inet6 stuff omitted>

dom1 / # ip r l
192.168.1.0/24 dev eth0  proto kernel  scope link  src 192.168.1.6
127.0.0.0/8 dev lo  scope link
default via 192.168.1.1 dev eth0

dom1 / # arp -a
? (192.168.1.5) at 00:50:04:65:2F:84 [ether] on eth0
? (192.168.1.103) at 00:11:D8:5C:01:98 [ether] on eth0
? (192.168.1.1) at 00:20:AF:50:15:9C [ether] on eth0

#######################################
# tcpdump of domU->dom0 telnet session
#
# Note that tcpdump running in dom0 and dom1 will both show the tcp checksum errors.
#

dom1 ~ # tcpdump -v -i eth0 port 23
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
17:54:21.182760 IP (tos 0x10, ttl 64, id 48589, offset 0, flags [DF], length: 60) 192.168.1.6.33778 > 192.168.1.5.telnet: S [tcp sum ok] 1720050423:1720050423(0) win 5840 <mss 1460,sackOK,timestamp 1728314 0,nop,wscale 2>
17:54:21.280506 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], length: 60) 192.168.1.5.telnet > 192.168.1.6.33778: S [tcp sum ok] 2896578887:2896578887(0) ack 1720050424 win 5792 <mss 1460,sackOK,timestamp 11502518 1728314,nop,wscale 2>
17:54:21.280528 IP (tos 0x10, ttl 64, id 48591, offset 0, flags [DF], length: 52) 192.168.1.6.33778 > 192.168.1.5.telnet: . [tcp sum ok] ack 1 win 1460 <nop,nop,timestamp 1728315 11502518>
17:54:21.183846 IP (tos 0x10, ttl 64, id 48593, offset 0, flags [DF], length: 76) 192.168.1.6.33778 > 192.168.1.5.telnet: P [tcp sum ok] 1:25(24) ack 1 win 1460 <nop,nop,timestamp 1728315 11502518> [telnet DO SUPPRESS GO AHEAD, WILL TERMINAL TYPE, WILL NAWS, WILL TSPEED, WILL LFLOW, WILL LINEMODE, WILL NEW-ENVIRON, DO STATUS]
17:54:21.183905 IP (tos 0x10, ttl 64, id 51966, offset 0, flags [DF], length: 52) 192.168.1.5.telnet > 192.168.1.6.33778: . [tcp sum ok] ack 25 win 1448 <nop,nop,timestamp 11502518 1728315>
17:54:21.238656 IP (tos 0x10, ttl 64, id 51968, offset 0, flags [DF], length: 64) 192.168.1.5.telnet > 192.168.1.6.33778: P [bad tcp cksum 838e (->f14)!] 1:13(12) ack 25 win 1448 <nop,nop,timestamp 11502523 1728315> [telnet DO TERMINAL TYPE, DO TSPEED, DO XDISPLOC, DO NEW-ENVIRON]
17:54:21.439151 IP (tos 0x10, ttl 64, id 51970, offset 0, flags [DF], length: 64) 192.168.1.5.telnet > 192.168.1.6.33778: P [bad tcp cksum 838e (->eff)!] 1:13(12) ack 25 win 1448 <nop,nop,timestamp 11502544 1728315> [telnet DO TERMINAL TYPE, DO TSPEED, DO XDISPLOC, DO NEW-ENVIRON]
17:54:21.859153 IP (tos 0x10, ttl 64, id 51972, offset 0, flags [DF], length: 64) 192.168.1.5.telnet > 192.168.1.6.33778: P [bad tcp cksum 838e (->ed5)!] 1:13(12) ack 25 win 1448 <nop,nop,timestamp 11502586 1728315> [telnet DO TERMINAL TYPE, DO TSPEED, DO XDISPLOC, DO NEW-ENVIRON]

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
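
A note on the trace above: all three failing packets carry the same checksum value, 838e, even though their TCP timestamps differ. The sketch below (function names are illustrative, not from the thread) implements the standard Internet checksum over the IPv4 pseudo-header and the TCP segment. Assuming a 20-byte IP header with no options, the failing packets hold a 44-byte TCP segment, and the folded pseudo-header sum on its own comes out to 0x838e. That would be consistent with the checksum field having been left for a later offload stage to complete rather than being corrupted in transit, but this reading is an inference and is not stated anywhere in the thread.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Fold 16-bit big-endian words into a 16-bit one's-complement sum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, tcp_length: int) -> bytes:
    """IPv4 pseudo-header: source, destination, zero, protocol 6, TCP length."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!BBH", 0, socket.IPPROTO_TCP, tcp_length))


def tcp_checksum(src: str, dst: str, segment: bytes) -> int:
    """Full TCP checksum; `segment` must have its checksum field zeroed."""
    total = ones_complement_sum(pseudo_header(src, dst, len(segment)) + segment)
    return ~total & 0xFFFF


# Failing packets in the trace: IP length 64 minus a 20-byte IP header
# (assumed: no IP options) leaves a 44-byte TCP segment.
partial = ones_complement_sum(pseudo_header("192.168.1.5", "192.168.1.6", 44))
print(hex(partial))  # 0x838e
```

A receiver verifies a segment by summing the pseudo-header and the whole segment, checksum field included; a correct packet folds to 0xffff.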
Ian Pratt
2005-Dec-27 23:12 UTC
RE: [Xen-users] tcp checksum errors across dom0-domu bridge
> My xen setup is having trouble communicating between dom0 and
> domu. The dom0 and domUs have no problem communicating
> outside the physical machine, and the domUs can even talk to
> each other with no problems, but dom0-domU comms is a no-go.
>
> I am running gentoo x86 and xen-3.0.0. The xend
> network-bridge script didn't seem to work at all, so I
> disabled it and use a bridge configured by gentoo (see dom0
> network info below). The domUs use vanilla xend bridge
> networking (see dom1 network info below).
>
> Now the dom0 and domU can ping each other, and even establish
> a tcp connection, but dom0->domU tcp packets always seem to
> have a failing checksum. (see tcpdump trace of a telnet session below)

Are you getting 'insufficient headroom' message in the domU? (see dmesg).

If so, upgrade to a newer 3.0.0 build number - the RPMs on the web site should already be updated.

Ian
Daniel Goertzen
2005-Dec-28 04:09 UTC
Re: [Xen-users] tcp checksum errors across dom0-domu bridge
Ian Pratt wrote:

>> Now the dom0 and domU can ping each other, and even establish
>> a tcp connection, but dom0->domU tcp packets always seem to
>> have a failing checksum. (see tcpdump trace of a telnet session below)
>
> Are you getting 'insufficient headroom' message in the domU? (see
> dmesg).
>
> If so, upgrade to a newer 3.0.0 build number - the RPMs on the web site
> should already be updated.

No, dmesg looks clean in both dom0 and domU. Same results for xen-3.0-testing-20051206 and xen-3.0-testing-20051227. (I only upgraded the hypervisor to 20051227.)

Dan.
Ian Pratt
2005-Dec-28 13:46 UTC
RE: [Xen-users] tcp checksum errors across dom0-domu bridge
> No, dmesg looks clean in both dom0 and domU. Same results for
> xen-3.0-testing-20051206 and xen-3.0-testing-20051227. (I
> only upgraded the hypervisor to 20051227.)

It's the dom0 and domU kernels that would need the upgrade as the hypervisor isn't involved with this.

The other thing to try is to use ethtool to turn checksum offload off on eth0.

Ian
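
For readers who want to try the ethtool suggestion, here is a minimal sketch. The helper names are hypothetical, and the interface name is whatever applies on your system. `ethtool -k` shows the current offload settings and `ethtool -K <iface> tx off` asks the driver to disable transmit checksum offload; some drivers reject the change. The script only prints the commands unless `dry_run` is set to False, since the real calls need root.

```python
import subprocess
from typing import List


def offload_commands(iface: str) -> List[List[str]]:
    """Return ethtool invocations: show offload settings, then disable TX checksumming."""
    return [
        ["ethtool", "-k", iface],               # lower-case -k: show offload settings
        ["ethtool", "-K", iface, "tx", "off"],  # upper-case -K: change them
    ]


def apply(iface: str, dry_run: bool = True) -> None:
    """Print the commands, or run them when dry_run is False."""
    for argv in offload_commands(iface):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)  # needs root and an installed ethtool


if __name__ == "__main__":
    apply("eth0")  # dry run: only prints the commands
```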
Daniel Goertzen
2005-Dec-28 20:12 UTC
Re: [Xen-users] tcp checksum errors across dom0-domu bridge
Ian Pratt wrote:

>> No, dmesg looks clean in both dom0 and domU. Same results for
>> xen-3.0-testing-20051206 and xen-3.0-testing-20051227. (I
>> only upgraded the hypervisor to 20051227.)
>
> It's the dom0 and domU kernels that would need the upgrade as the
> hypervisor isn't involved with this.
>
> The other thing to try is to use ethtool to turn checksum offload off on
> eth0.
>
> Ian

No progress:

- I used the dom0 and domU kernels from xen-3.0-testing-20051227, same results.
- I tried turning off the checksum offload for eth0, but it wouldn't let me.
- I removed my physical eth0 from the bridge so that the whole network was purely virtual... the tcp checksum problem still occurred.
- I wrote a tiny python tcp client/server program that also fails because of the tcp checksum problem.

I'm willing to try more things if you can think of them.

Cheers,
Dan.
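
The thread does not include Dan's test program. A minimal sketch of that kind of check, with hypothetical function names, is a one-shot echo server and a client that expects its payload back. On a real setup the server would run in one domain and the client would connect to that domain's address. Because a receiving kernel drops segments with a bad checksum, a failure would most likely show up as a client timeout rather than as corrupted data.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection and echo back whatever arrives until the peer closes."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def echo_roundtrip(host: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Send payload, then read until the same number of bytes comes back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        received = b""
        while len(received) < len(payload):
            chunk = sock.recv(4096)
            if not chunk:
                break
            received += chunk
        return received


if __name__ == "__main__":
    # Loopback self-test; replace the address to test between two domains.
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    threading.Thread(target=serve_once, args=(listener,), daemon=True).start()
    print(echo_roundtrip("127.0.0.1", port, b"hello from dom0"))
```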
Ian Pratt
2005-Dec-28 21:54 UTC
RE: [Xen-users] tcp checksum errors across dom0-domu bridge
> - I used the dom0 and domU kernels from
> xen-3.0-testing-20051227, same results.
> - I tried turning off the checksum offload for eth0, but it
> wouldn't let me.

(it would be peth0 you'd need to turn csum off on, but this isn't your problem)

> - I removed my physical eth0 from the bridge so that the
> whole network was purely virtual... the tcp checksum problem
> still occurred.
> - I wrote a tiny python tcp client/server program that also
> fails because of the tcp checksum problem.

Have you changed your kernel config from the default -xen or -xen0/U ?

Ian
Daniel Goertzen
2005-Dec-29 16:22 UTC
Re: [Xen-users] tcp checksum errors across dom0-domu bridge
>> - I used the dom0 and domU kernels from
>> xen-3.0-testing-20051227, same results.
>
> Have you changed your kernel config from the default -xen or -xen0/U ?
>
> Ian

I used the default kernel configurations because they happened to have everything I needed.

Dan.
Ian Pratt
2005-Dec-29 22:19 UTC
RE: [Xen-users] tcp checksum errors across dom0-domu bridge
> - I used the dom0 and domU kernels from
> xen-3.0-testing-20051227, same results.
> - I tried turning off the checksum offload for eth0, but it
> wouldn't let me.
> - I removed my physical eth0 from the bridge so that the
> whole network was purely virtual... the tcp checksum problem
> still occurred.
> - I wrote a tiny python tcp client/server program that also
> fails because of the tcp checksum problem.
>
> I'm willing to try more things if you can think of them.

Are you 100% sure you're using the kernels you think you are? Please check the changeset id using dmesg in both dom0 and the domU.

Are you using the normal network-bridge script? Are you sure you're using the latest version?

How are you detecting that checksum errors are occurring?

It's bizarre that you should be having problems that absolutely no-one else is seeing.

Ian
Daniel Goertzen
2005-Dec-30 17:29 UTC
Re: [Xen-users] tcp checksum errors across dom0-domu bridge
Okay, I think I've discovered where the problem lies. As I indicated in the original email, the network-bridge script did not work for me so I disabled it and rolled my own solution which consisted of completely ignoring veth0/vif0.0 and just assigning an address to a bridge and using it directly in dom0. I am assuming that this bypasses some xen checksum trickery and caused my problems. Is this correct?

My working dom0 networking setup is listed below. Note that I am still using the standard vif-bridge script.

Cheers,
Dan.

/etc/conf.d/net (gentoo linux)

# bridge settings
bridge_br0="eth0 vif0.0"
config_br0=( "null" ) #disable dhcp

# physical adapter will be a dumb port on our bridge
config_eth0=( "null" )
mac_eth0="fe:ff:ff:ff:ff:ff"

# dom0 backend interface will also be a dumb port on our bridge
config_vif0_0=( "null" ) #note that we have to say vif0_0 instead of vif0.0

# dom0 frontend config
config_veth0=("192.168.1.5/24")
routes_veth0=("default via 192.168.1.1")
mac_veth0="random-anykind"