S.H. Verbrugge
2010-Mar-17 12:34 UTC
[Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
Hello,

I seem to be having some trouble with the latest 2.6.31.6 and 2.6.32.9 Xen dom0 pv_ops trees.

Our platform:
- Xen 3.4.3-rc3 (also tried 3.4.2 on the 2.6.31.6 pv_ops dom0)
- 2.6.32.9 pv_ops dom0 kernel, a checkout from xen/stable git that is perhaps a week old (can provide the changeset if requested)
- 100+ domUs, all PV

Ever since we switched to a pv_ops dom0 kernel (we were using the 2.6.26 xenkernel from the Debian repo before, with Xen 3.2), we started to have some problems when attempting to route packets on a domU.

The following message appears in dmesg on the dom0:

"Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet"

We sniffed the virtual interface on the dom0 (nothing ever leaves the domU), and we do see the ICMP echo requests inside the domU. The reply, however, never reaches its destination, or anywhere outside the domU for that matter. It seems that, for some reason, the dom0 kernel drops the packet as soon as it leaves the domU, as shown in the dmesg log.

Some background info:
We're using normal virtual interfaces on a named bridge, br-internet. This bridge contains the hardware interface 'eth0'. This is a tg3 interface; we already tried turning off both RX and TX checksumming for it. Setting 'ethtool tx off' in the domU itself doesn't help either.
The different interfaces are VLANed through a Cisco 2927 switch.

Since this problem did not occur with the earlier xenkernels, it is most likely related to a netback patch in the pv_ops dom0. Perhaps somebody could provide me with some more info or insights.

-- 
 /\/\  Hostingvereniging Soleus         | Community-driven
< ** > http://soleus.nu                 | Virtual Private Servers
 \/\/  Sen (IEF) Verbrugge (CT ProLead) | & more ...

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
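For readers following along: protocol 1 in the dmesg line above is ICMP. Its checksum is the 16-bit ones' complement sum described in RFC 1071, computed over the ICMP message alone (TCP and UDP additionally cover a pseudo-header). The Python sketch below only illustrates that calculation. It is not netback code, and the helper names `internet_checksum` and `build_echo_request` are made up for this example.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Return the RFC 1071 checksum of data as a 16-bit integer."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    # Sum the data as big-endian 16-bit words.
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    # The checksum field is zero while the checksum is being computed.
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, ident, seq) + payload
```

A receiver can check a message by running the same function over the whole ICMP message, checksum field included; a valid message yields 0.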
Stefan Kuhne
2010-Mar-17 12:48 UTC
Re: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
On 17.03.2010 13:34, S.H. Verbrugge wrote:

Hello,

> Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26 xenkernel from Debian repo before, with Xen 3.2),
> we started to have some problems when attempting to route packets on a domU.
>
> The following message appears in dmesg on the dom0:
>
> "Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet"

I know someone who has the same problem.
I have a similar problem myself.

I can ping the Internet from the routing DomU and from Dom0, but no one else can.

See also:
Post "2.6.31.6 pv_ops and routing DomU" from 10/03/09 on Xen-Devel

Regards,
Stefan Kuhne
S.H. Verbrugge
2010-Mar-17 15:48 UTC
Re: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
On Wed, Mar 17, 2010 at 01:48:33PM +0100, Stefan Kuhne wrote:
> On 17.03.2010 13:34, S.H. Verbrugge wrote:
>
> Hello,
>
> > Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26 xenkernel from Debian repo before, with Xen 3.2),
> > we started to have some problems when attempting to route packets on a domU.
> >
> > The following message appears in dmesg on the dom0:
> >
> > "Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet"
>
> I know someone who has the same problem.
> I have a similar problem myself.
>
> I can ping the Internet from the routing DomU and from Dom0, but no one else can.
>
> See also:
> Post "2.6.31.6 pv_ops and routing DomU" from 10/03/09 on Xen-Devel
>
> Regards,
> Stefan Kuhne

Yeah, I've seen that. It does not give much information, however.

More specifically, I've tested with two dom0 pv_ops kernels now, and since it is still reproducible on the latest xen/stable platform, I'm guessing it has to do with either VLANs or the tg3 driver, in combination with netback.

I was hoping some Xen developers could shed some light on this.
I already discussed this with Jeremy, and he pointed me to this mailing list.
James Harper
2010-Mar-17 23:25 UTC
RE: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
> Yeah, I've seen that. It does not give that much information however.
>
> More specifically, I've tested with two dom0 pv_ops kernels now, and since it's still
> reproducible on the latest xen/stable platform, I'm guessing it either has to do with vlan'ing or
> the tg3 driver, in combination with netback.
>
> I was hoping some Xen developers could shed some light on this.
> I already conversed with Jeremy about this, and he pointed me to this mailing list.

This may not be relevant, but I have seen problems with the following combination:

br0:
  eth0
  <netback devices>

br1:
  eth0.2
  <netback devices>

Some (most?) network hardware cannot provide checksum/large send offload functions for packets that use VLAN tagging, but Linux doesn't quite understand that and gets confused. So when such a packet comes off netback and is sent to eth0.2, the LSO/checksum function should be performed in software but isn't.

I haven't yet figured out whether the problem is that the driver incorrectly reports that offload is supported on the VLAN device, or that the rest of Linux isn't taking the appropriate action...

James
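One way to see what each device in a setup like the one above reports is to compare the offload flags printed by `ethtool -k <device>` for the physical and the VLAN interface. The Python sketch below assumes the usual "feature: on/off" line layout of that output. `parse_offload_flags` and `read_offload_flags` are hypothetical helper names, and the sample text is illustrative only, not taken from any machine in this thread.

```python
import subprocess


def parse_offload_flags(text: str) -> dict:
    """Map each 'feature: on|off' line of `ethtool -k` output to a bool."""
    flags = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Skip the heading line and anything that is not an on/off setting.
        if sep and words and words[0] in ("on", "off"):
            flags[name.strip()] = words[0] == "on"
    return flags


def read_offload_flags(device: str) -> dict:
    """Run `ethtool -k DEVICE` (ethtool must be installed) and parse the result."""
    result = subprocess.run(
        ["ethtool", "-k", device], capture_output=True, text=True, check=True
    )
    return parse_offload_flags(result.stdout)


# Illustrative sample only; real output varies by driver and ethtool version.
SAMPLE = """Offload parameters for eth0.2:
rx-checksumming: on
tx-checksumming: on
tcp-segmentation-offload: off
"""
```

Comparing `read_offload_flags("eth0")` with `read_offload_flags("eth0.2")` would show which offloads the VLAN device advertises; whether the hardware actually honours them for tagged packets is a separate question that this sketch cannot answer.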
Stefan Kuhne
2010-Mar-17 23:37 UTC
Re: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
On 18.03.2010 00:25, James Harper wrote:

Hello,

> This may not be relevant, but I have seen problems with the following
> combination:
>
> br0:
>   eth0
>   <netback devices>
>
> br1:
>   eth0.2
>   <netback devices>
>
> Some (most?) network hardware cannot provide checksum/large send offload
> functions for packets that use vlan tagging, but Linux doesn't quite
> understand that and gets confused, so when such a packet comes off of
> netback and is sent to eth0.2, the LSO/checksum function should be
> performed in software but isn't.

I have no VLAN running.
But then, I only have a similar problem.

Regards,
Stefan Kuhne
S.H. Verbrugge
2010-Mar-17 23:48 UTC
Re: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
On Thu, Mar 18, 2010 at 10:25:34AM +1100, James Harper wrote:
> > Yeah, I've seen that. It does not give that much information however.
> >
> > More specifically, I've tested with two dom0 pv_ops kernels now, and since it's still
> > reproducible on the latest xen/stable platform, I'm guessing it either has to do with vlan'ing or
> > the tg3 driver, in combination with netback.
> >
> > I was hoping some Xen developers could shed some light on this.
> > I already conversed with Jeremy about this, and he pointed me to this mailing list.
>
> This may not be relevant, but I have seen problems with the following
> combination:
>
> br0:
>   eth0
>   <netback devices>
>
> br1:
>   eth0.2
>   <netback devices>
>
> Some (most?) network hardware cannot provide checksum/large send offload
> functions for packets that use vlan tagging, but Linux doesn't quite
> understand that and gets confused, so when such a packet comes off of
> netback and is sent to eth0.2, the LSO/checksum function should be
> performed in software but isn't.
>
> I haven't yet figured out if the problem is that the driver is
> incorrectly reporting that offload is supported on the vlan device or if
> the rest of Linux isn't taking the appropriate action...

Hmm, just to set the record straight: there is VLANing on the switch (several ports grouped into a single physical network), but no tagging as far as I know.

However, I have read about earlier problems with checksum offloading in the tg3 driver. Perhaps the two are related (netback / tg3 checksums), but I have no way of determining that.

Isn't there some way of patching the netback driver so that it does not support checksumming?
Maybe I'm way off base here...
James Harper
2010-Mar-18 00:04 UTC
RE: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
> Isn't there some way of patching the netback driver, so it does not support checksumming?
> Maybe I'm way off base here..

Yes, use ethtool to turn checksum offload off on the DomU interface, the bridge, and the Dom0 physical interface.

James
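A sketch of that suggestion in Python, assuming ethtool's `-K <device> rx off tx off` form for switching checksum offload off. The names `eth0` and `br-internet` come from earlier in this thread; `vif1.0` is a hypothetical placeholder for the domU's virtual interface in dom0, and the helper names are made up. The interface inside the domU has to be handled separately, from within the domU.

```python
import subprocess


def checksum_off_commands(devices):
    """Return one `ethtool -K` command per device, disabling RX/TX checksum offload."""
    return [["ethtool", "-K", dev, "rx", "off", "tx", "off"] for dev in devices]


def apply_commands(commands, dry_run=True):
    """Print each command; run it only when dry_run is False (needs root)."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# dom0 side: physical NIC, bridge, and the domU's vif (vif1.0 is a placeholder).
DOM0_DEVICES = ["eth0", "br-internet", "vif1.0"]

if __name__ == "__main__":
    apply_commands(checksum_off_commands(DOM0_DEVICES))
```

By default the script only prints the commands, so they can be reviewed before anything is changed on a live host.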
S.H. Verbrugge
2010-Mar-18 00:17 UTC
Re: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
On Thu, Mar 18, 2010 at 11:04:41AM +1100, James Harper wrote:
> > Isn't there some way of patching the netback driver, so it does not support checksumming?
> > Maybe I'm way off base here..
>
> Yes, use ethtool on the DomU interface, the bridge, and the Dom0
> physical interface.
>
> James

Yeah, I already tried that. I did it in the domU on eth0 and all other interfaces, as well as on the dom0 physical interface, the bridge, and the virtual interface (vif) for the domU.

No luck, unfortunately.
Scott Garron
2010-Mar-30 22:58 UTC
Re: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
S.H. Verbrugge wrote:
> Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26
> xenkernel from Debian repo before, with Xen 3.2), we started to have
> some problems when attempting to route packets on a domU.

I'm having the same problem. All TCP packets that are forwarded through a domU are somehow getting a static checksum (0x9e85) just as they're being put out on the wire. ICMP is dropped by the dom0, as you describe, with the "Attempting to checksum a non UDP/TCP packet" message in dmesg. More detail about my situation is in my post to xen-users, here:

http://lists.xensource.com/archives/html/xen-users/2010-03/msg00846.html

It doesn't include a solution, though.

> This is a tg3 interface

I'm also running the tg3 ethernet driver, which may be of significance.

-- 
Scott Garron
Pasi Kärkkäinen
2010-Apr-14 17:27 UTC
Re: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
On Tue, Mar 30, 2010 at 06:58:02PM -0400, Scott Garron wrote:
> S.H. Verbrugge wrote:
>> Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26
>> xenkernel from Debian repo before, with Xen 3.2), we started to have
>> some problems when attempting to route packets on a domU.
>
> I'm having the same problem. All TCP packets that are forwarded
> through a domU are somehow getting a static checksum (0x9e85) just as
> they're being put out on the wire. ICMP is dropped by the dom0, as you
> describe, with the "Attempting to checksum a non UDP/TCP packet" message
> in dmesg. More detail about my situation is in my post to xen-users, here:
>
> http://lists.xensource.com/archives/html/xen-users/2010-03/msg00846.html
>
> It doesn't include a solution, though.
>
>> This is a tg3 interface
>
> I'm also running the tg3 ethernet driver, which may be of
> significance.

I CC'd someone else who has the same problem. Did you guys ever resolve this?

For him the problem was solved when he replaced the pvops _domU_ kernel with 2.6.18.8 (still running pvops in dom0).

-- Pasi
Ian Campbell
2010-Apr-15 08:20 UTC
Re: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
On Tue, 2010-03-30 at 23:58 +0100, Scott Garron wrote:
> S.H. Verbrugge wrote:
> > Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26
> > xenkernel from Debian repo before, with Xen 3.2), we started to have
> > some problems when attempting to route packets on a domU.
>
> I'm having the same problem. All TCP packets that are forwarded
> through a domU are somehow getting a static checksum (0x9e85) just as
> they're being put out on the wire. ICMP is dropped by the dom0, as you
> describe, with the "Attempting to checksum a non UDP/TCP packet" message
> in dmesg. More detail about my situation is in my post to xen-users, here:
>
> http://lists.xensource.com/archives/html/xen-users/2010-03/msg00846.html
>
> It doesn't include a solution, though.
>
> > This is a tg3 interface
>
> I'm also running the tg3 ethernet driver, which may be of
> significance.

According to the driver source, some tg3 chipsets are known to have broken checksumming hardware, in particular 5700 B0 silicon. The workaround seems to have been present in the driver forever, though, so that may be a red herring.

Ian.
Pasi Kärkkäinen
2010-May-22 11:51 UTC
Re: [Xen-devel] Checksumming problem in pv_ops dom0 kernel / netback
On Thu, Apr 15, 2010 at 09:20:39AM +0100, Ian Campbell wrote:
> On Tue, 2010-03-30 at 23:58 +0100, Scott Garron wrote:
> > S.H. Verbrugge wrote:
> > > Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26
> > > xenkernel from Debian repo before, with Xen 3.2), we started to have
> > > some problems when attempting to route packets on a domU.
> >
> > I'm having the same problem. All TCP packets that are forwarded
> > through a domU are somehow getting a static checksum (0x9e85) just as
> > they're being put out on the wire. ICMP is dropped by the dom0, as you
> > describe, with the "Attempting to checksum a non UDP/TCP packet" message
> > in dmesg. More detail about my situation is in my post to xen-users, here:
> >
> > http://lists.xensource.com/archives/html/xen-users/2010-03/msg00846.html
> >
> > It doesn't include a solution, though.
> >
> > > This is a tg3 interface
> >
> > I'm also running the tg3 ethernet driver, which may be of
> > significance.
>
> According to the driver source some tg3 chipsets are known to have
> broken checksumming hardware, in particular 5700 B0 silicon. The
> workaround seems to have been present in the driver forever though so
> that may be a red-herring.

Recently there was a fix for a bug in netback, so you might want to update to the latest pvops dom0 kernel from the xen/stable-2.6.32.x branch and see if that fixes the problem.

-- Pasi