Looks like the GSO is involved?

I got this while running Dom0 only (no guests), with a
BOINC/Rosetta@home application running on all 4 cores.

changeset: 10649:8e55c5c11475

Build: x86_32p (pae).

------------[ cut here ]------------
kernel BUG at net/core/dev.c:1133!
invalid opcode: 0000 [#1]
SMP
CPU: 0
EIP: 0061:[<c04dceb0>] Not tainted VLI
EFLAGS: 00210297 (2.6.16.13-xen #12)
EIP is at skb_gso_segment+0xf0/0x110
eax: 00000000 ebx: 00000003 ecx: 00000002 edx: c06e2e00
esi: 00000008 edi: cd9e32e0 ebp: c63a7900 esp: c0de5ad0
ds: 007b es: 007b ss: 0069
Process rosetta_5.25_i6 (pid: 8826, threadinfo=c0de4000 task=cb019560)
Stack: <0>c8f69060 00000000 ffffffa3 00000003 cd9e32e0 00000002 c63a7900 c04dcfb0
       cd9e32e0 00000003 00000000 cd9e32e0 cf8e3000 cf8e3140 c04dd07e cd9e32e0
       cf8e3000 00000000 cd9e32e0 cf8e3000 c04ec07e cd9e32e0 cf8e3000 c0895140
Call Trace:
 [<c04dcfb0>] dev_gso_segment+0x30/0xb0
 [<c04dd07e>] dev_hard_start_xmit+0x4e/0x110
 [<c04ec07e>] __qdisc_run+0xbe/0x280
 [<c04dd4b9>] dev_queue_xmit+0x379/0x380
 [<c05bbe44>] br_dev_queue_push_xmit+0xa4/0x140
 [<c05c2402>] br_nf_post_routing+0x102/0x1d0
 [<c05c22b0>] br_nf_dev_queue_xmit+0x0/0x50
 [<c05bbda0>] br_dev_queue_push_xmit+0x0/0x140
 [<c04f0eab>] nf_iterate+0x6b/0xa0
 [<c05bbda0>] br_dev_queue_push_xmit+0x0/0x140
 [<c05bbda0>] br_dev_queue_push_xmit+0x0/0x140
 [<c04f0f4e>] nf_hook_slow+0x6e/0x120
 [<c05bbda0>] br_dev_queue_push_xmit+0x0/0x140
 [<c05bbf40>] br_forward_finish+0x60/0x70
 [<c05bbda0>] br_dev_queue_push_xmit+0x0/0x140
 [<c05c1b71>] br_nf_forward_finish+0x71/0x130
 [<c05bbee0>] br_forward_finish+0x0/0x70
 [<c05c1d20>] br_nf_forward_ip+0xf0/0x1a0
 [<c05c1b00>] br_nf_forward_finish+0x0/0x130
 [<c05bbee0>] br_forward_finish+0x0/0x70
 [<c04f0eab>] nf_iterate+0x6b/0xa0
 [<c05bbee0>] br_forward_finish+0x0/0x70
 [<c05bbee0>] br_forward_finish+0x0/0x70
 [<c04f0f4e>] nf_hook_slow+0x6e/0x120
 [<c05bbee0>] br_forward_finish+0x0/0x70
 [<c05bc044>] __br_forward+0x74/0x80
 [<c05bbee0>] br_forward_finish+0x0/0x70
 [<c05bceb1>] br_handle_frame_finish+0xd1/0x160
 [<c05bcde0>] br_handle_frame_finish+0x0/0x160
 [<c05c0e0b>] br_nf_pre_routing_finish+0xfb/0x480
 [<c05bcde0>] br_handle_frame_finish+0x0/0x160
 [<c05c0d10>] br_nf_pre_routing_finish+0x0/0x480
 [<c054fe13>] ip_nat_in+0x43/0xc0
 [<c05c0d10>] br_nf_pre_routing_finish+0x0/0x480
 [<c04f0eab>] nf_iterate+0x6b/0xa0
 [<c05c0d10>] br_nf_pre_routing_finish+0x0/0x480
 [<c05c0d10>] br_nf_pre_routing_finish+0x0/0x480
 [<c04f0f4e>] nf_hook_slow+0x6e/0x120
 [<c05c0d10>] br_nf_pre_routing_finish+0x0/0x480
 [<c05c1914>] br_nf_pre_routing+0x404/0x580
 [<c05c0d10>] br_nf_pre_routing_finish+0x0/0x480
 [<c04f0eab>] nf_iterate+0x6b/0xa0
 [<c05bcde0>] br_handle_frame_finish+0x0/0x160
 [<c05bcde0>] br_handle_frame_finish+0x0/0x160
 [<c04f0f4e>] nf_hook_slow+0x6e/0x120
 [<c05bcde0>] br_handle_frame_finish+0x0/0x160
 [<c05bd124>] br_handle_frame+0x1e4/0x250
 [<c05bcde0>] br_handle_frame_finish+0x0/0x160
 [<c04ddae5>] netif_receive_skb+0x165/0x2a0
 [<c04ddcdf>] process_backlog+0xbf/0x180
 [<c04ddebf>] net_rx_action+0x11f/0x1d0
 [<c01262e6>] __do_softirq+0x86/0x120
 [<c01263f5>] do_softirq+0x75/0x90
 [<c0106cef>] do_IRQ+0x1f/0x30
 [<c04271d0>] evtchn_do_upcall+0x90/0x100
 [<c0105315>] hypervisor_callback+0x3d/0x48
Code: c2 2b 57 24 29 d0 8d 14 2a 89 87 94 00 00 00 89 57 60 8b 44 24 08 83 c4 0c 5b 5e 5f 5d c3 0f 0b 69 03 fe 8c 66 c0 e9 69 ff ff ff <0f> 0b 6d 04 e8 ab 6c c0 e9 3a ff ff ff 0f 0b 6c 04 e8 ab 6c c0
 <0>Kernel panic - not syncing: Fatal exception in interrupt

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Some more info:

If I force BOINC to access the network, it crashes instantly. If I don't
start xend, it seems to work fine...

--
Mats

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com
> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of
> Petersson, Mats
> Sent: 06 July 2006 21:28
> To: xen-devel@lists.xensource.com
> Subject: [Xen-devel] kernel BUG at net/core/dev.c:1133!
>
> Looks like the GSO is involved?
>
> I got this while running Dom0 only (no guests), with a
> BOINC/Rosetta@home application running on all 4 cores.
Petersson, Mats <Mats.Petersson@amd.com> wrote:
> Looks like the GSO is involved?

It's certainly what crashed your machine :) It's probably not the
guilty party though. Someone is passing through a TSO packet with
checksum set to something other than CHECKSUM_HW.

I bet it's netfilter and we just never noticed before because real
NICS would simply corrupt the checksum silently.

Could you confirm that you have netfilter rules (in particular NAT
rules) and that this goes away if you flush all your netfilter tables?

Patrick, do we really have to zap the checksum on outbound NAT? Could
we update it instead?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Thu, Jul 06, 2006 at 08:31:45PM +0000, Petersson, Mats wrote:
>
> If I force BOINC to access the network, it crashes instantly. If I don't
> start xend, it seems to work fine...

Makes sense because your NIC doesn't support TSO so it won't do TSO
direct to the NIC (so the bug isn't triggered). Once xend starts it's
talking to the virtual netloop device which does support TSO.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> -----Original Message-----
> From: Herbert Xu [mailto:herbert@gondor.apana.org.au]
> Sent: 07 July 2006 15:40
> To: Petersson, Mats
> Cc: xen-devel@lists.xensource.com; netdev@vger.kernel.org;
> kaber@trash.net; davem@davemloft.net
> Subject: Re: [Xen-devel] kernel BUG at net/core/dev.c:1133!
>
> Petersson, Mats <Mats.Petersson@amd.com> wrote:
> > Looks like the GSO is involved?
>
> It's certainly what crashed your machine :) It's probably not the
> guilty party though. Someone is passing through a TSO packet with
> checksum set to something other than CHECKSUM_HW.
>
> I bet it's netfilter and we just never noticed before because real
> NICS would simply corrupt the checksum silently.
>
> Could you confirm that you have netfilter rules (in particular NAT
> rules) and that this goes away if you flush all your netfilter tables?

If by netfilter, you mean "iptables", it says:

[root@cheetah ~]# iptables --list
Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

So, nothing going on there... I certainly haven't got NAT on my machine,
as my machine is within the AMD network, and doesn't need NAT. AMD
probably uses NAT as part of its external communications, but I doubt
it's used at all internally.

I have also noticed that the crash happens when I try to access another
machine within my local switch - if that makes any difference... But not
instantly. I can do some communication with the machine next to it [like
I did "ssh cheetah" from my machine "quad" to get the iptables above,
and it works just fine - but when I did "xm dmesg" from "cheetah"
through ssh on "quad", it didn't work - presumably because it's a bit
more data being pushed - but I can't say for sure, as I have made no
attempt to really debug it].

I hope this info helps in analyzing the situation, and please feel
free to ask for further info.

--
Mats

> Patrick, do we really have to zap the checksum on outbound NAT? Could
> we update it instead?
I got the exact same thing when attempting to use BOINC on a single node
supporting a 5 node open SSI cluster, (5 guests) and yes the problem
went away when I flushed the rules.

I attributed this to a quirk with the cluster CVIP, because I had also
assigned each node its own outbound IP in addition to the incoming CVIP.

Since I felt it was due to my tendency to over-tinker, I didn't mention
it on the lists, was a few months ago.

Thought I would chime in as it sounds like the same experience, up to
and including BOINC.

HTH
--Tim

On Sat, 2006-07-08 at 00:39 +1000, Herbert Xu wrote:
> Petersson, Mats <Mats.Petersson@amd.com> wrote:
> > Looks like the GSO is involved?
>
> It's certainly what crashed your machine :) It's probably not the
> guilty party though. Someone is passing through a TSO packet with
> checksum set to something other than CHECKSUM_HW.
>
> I bet it's netfilter and we just never noticed before because real
> NICS would simply corrupt the checksum silently.
>
> Could you confirm that you have netfilter rules (in particular NAT
> rules) and that this goes away if you flush all your netfilter tables?
>
> Patrick, do we really have to zap the checksum on outbound NAT? Could
> we update it instead?
> -----Original Message-----
> From: Tim Post [mailto:tim.post@netkinetics.net]
> Sent: 07 July 2006 16:06
> To: Herbert Xu
> Cc: Petersson, Mats; netdev@vger.kernel.org;
> xen-devel@lists.xensource.com; kaber@trash.net; davem@davemloft.net
> Subject: Re: [Xen-devel] kernel BUG at net/core/dev.c:1133!
>
> I got the exact same thing when attempting to use BOINC on a single node
> supporting a 5 node open SSI cluster, (5 guests) and yes the problem
> went away when I flushed the rules.
>
> I attributed this to a quirk with the cluster CVIP, because I had also
> assigned each node its own outbound IP in addition to the incoming CVIP.
>
> Since I felt it was due to my tendency to over-tinker, I didn't mention
> it on the lists, was a few months ago.
>
> Thought I would chime in as it sounds like the same experience, up to
> and including BOINC.

I haven't been tinkering with anything [on purpose, at least] - the
system is a default installation of FC4, with the latest Xen-unstable
[bar the last dozen or so changesets - I don't pull the latest every
half-hour].

--
Mats
Herbert Xu wrote:
> Patrick, do we really have to zap the checksum on outbound NAT? Could
> we update it instead?

Are you referring to this code in ip_nat_fn()?

	/* If we had a hardware checksum before, it's now invalid */
	if ((*pskb)->ip_summed == CHECKSUM_HW)
		if (skb_checksum_help(*pskb, (out == NULL)))
			return NF_DROP;

Doing incremental updates should work fine. This is something
I wanted to take care of at some point, but didn't get to it
yet.
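The "incremental update" idea mentioned here can be illustrated in user space. What follows is only a sketch of the general technique from RFC 1624 (HC' = ~(~HC + ~m + m')), not the netfilter code and not the fix that eventually went in; the helper names fold, csum_full and csum_update16 are made up for this example, and words are treated as host-order 16-bit values to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full Internet checksum over n 16-bit words (hypothetical helper). */
static uint16_t csum_full(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;

	assert(words != NULL);
	for (size_t i = 0; i < n; i++)
		sum += words[i];
	return (uint16_t)~fold(sum);
}

/*
 * Incremental update after one 16-bit word changes from old_word to
 * new_word, following RFC 1624: HC' = ~(~HC + ~m + m').
 * Only the old checksum and the two word values are needed; the rest
 * of the packet is never touched.
 */
static uint16_t csum_update16(uint16_t check, uint16_t old_word,
			      uint16_t new_word)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old_word;
	sum += new_word;
	return (uint16_t)~fold(sum);
}
```

Rewriting one word of a header (as NAT does with an address or port) and calling csum_update16() with the old checksum, the old word and the new word gives the same value as running csum_full() over the modified data again, which is why the existing checksum state would not need to be thrown away.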
On Fri, Jul 07, 2006 at 05:03:36PM +0200, Petersson, Mats wrote:
>
> So, nothing going on there... I certainly haven't got NAT on my machine,
> as my machine is within the AMD network, and doesn't need NAT. AMD
> probably uses NAT as part of its external communications, but I doubt
> it's used at all internally.

Actually, just having it loaded is enough to break TSO. So for all this
time anyone who had ip_nat loaded was silently corrupting all their
TSO checksums!

I'll send a patch soon once I've tested it.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
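To make the chain of events in this thread easier to follow, here is a toy user-space model. It is not kernel code: every toy_* name is hypothetical, and it assumes (as the discussion above suggests) that the NAT hook leaves a packet that arrived as CHECKSUM_HW marked as something else, while the GSO segmenter insists on CHECKSUM_HW for any packet that still has to be segmented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's ip_summed states. */
enum toy_summed { TOY_CHECKSUM_NONE, TOY_CHECKSUM_HW };

/* Minimal stand-in for the two sk_buff properties that matter here. */
struct toy_skb {
	enum toy_summed ip_summed;
	unsigned int gso_size;	/* non-zero: packet still needs segmenting */
};

/*
 * Models the NAT hook: a packet carrying a partial (hardware) checksum
 * has it finished in software and is no longer marked CHECKSUM_HW.
 * This runs for every packet once the hook is registered, whether or
 * not any NAT rule exists.
 */
static void toy_nat_hook(struct toy_skb *skb)
{
	assert(skb != NULL);
	if (skb->ip_summed == TOY_CHECKSUM_HW)
		skb->ip_summed = TOY_CHECKSUM_NONE;
}

/*
 * Models the precondition the segmenter checks: a packet that is
 * going to be segmented must still carry a partial checksum.
 */
static bool toy_gso_precondition_holds(const struct toy_skb *skb)
{
	return skb->gso_size == 0 || skb->ip_summed == TOY_CHECKSUM_HW;
}
```

In this model a TSO-sized packet satisfies the precondition before toy_nat_hook() runs and violates it afterwards, while an ordinary packet (gso_size of 0) is unaffected. That matches the observation that small transfers such as an interactive ssh session worked and a larger burst of data did not.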
On Fri, Jul 07, 2006 at 10:06:08PM +0200, Patrick McHardy wrote:
>
> Are you referring to this code in ip_nat_fn()?
>
> 	/* If we had a hardware checksum before, it's now invalid */
> 	if ((*pskb)->ip_summed == CHECKSUM_HW)
> 		if (skb_checksum_help(*pskb, (out == NULL)))
> 			return NF_DROP;

Yep that's the one.

> Doing incremental updates should work fine. This is something
> I wanted to take care of at some point, but didn't get to it
> yet.

No worries. I'm going to do a workaround to fix the checksums in GSO
for now.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Sat, Jul 08, 2006 at 12:03:42PM +1000, herbert wrote:
>
> I'll send a patch soon once I've tested it.

Here you go. This should fix the problem.

[NET] linux: Import net-gso1.patch

Here is the original changelog:

   [NET] gso: Fix up GSO packets with broken checksums

   Certain subsystems in the stack (e.g., netfilter) can break the partial
   checksum on GSO packets. Until they're fixed, this patch allows this to
   work by recomputing the partial checksums through the GSO mechanism.

   Once they've all been converted to update the partial checksum instead of
   clearing it, this workaround can be removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff -r 51252bc644da -r 521480565732 linux-2.6-xen-sparse/include/linux/skbuff.h
--- a/linux-2.6-xen-sparse/include/linux/skbuff.h	Fri Jul 07 23:38:57 2006 +1000
+++ b/linux-2.6-xen-sparse/include/linux/skbuff.h	Sat Jul 08 14:30:02 2006 +1000
@@ -1412,5 +1412,10 @@ static inline void nf_reset(struct sk_bu
 static inline void nf_reset(struct sk_buff *skb) {}
 #endif /* CONFIG_NETFILTER */
 
+static inline int skb_is_gso(const struct sk_buff *skb)
+{
+	return skb_shinfo(skb)->gso_size;
+}
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_SKBUFF_H */
diff -r 51252bc644da -r 521480565732 linux-2.6-xen-sparse/net/core/dev.c
--- a/linux-2.6-xen-sparse/net/core/dev.c	Fri Jul 07 23:38:57 2006 +1000
+++ b/linux-2.6-xen-sparse/net/core/dev.c	Sat Jul 08 14:30:02 2006 +1000
@@ -1089,9 +1089,17 @@ int skb_checksum_help(struct sk_buff *sk
 	unsigned int csum;
 	int ret = 0, offset = skb->h.raw - skb->data;
 
-	if (inward) {
-		skb->ip_summed = CHECKSUM_NONE;
-		goto out;
+	if (inward)
+		goto out_set_summed;
+
+	if (unlikely(skb_shinfo(skb)->gso_size)) {
+		static int warned;
+
+		WARN_ON(!warned);
+		warned = 1;
+
+		/* Let GSO fix up the
checksum. */ + goto out_set_summed; } if (skb_cloned(skb)) { @@ -1108,6 +1116,8 @@ int skb_checksum_help(struct sk_buff *sk BUG_ON(skb->csum + 2 > offset); *(u16*)(skb->h.raw + skb->csum) = csum_fold(csum); + +out_set_summed: skb->ip_summed = CHECKSUM_NONE; out: return ret; @@ -1128,17 +1138,35 @@ struct sk_buff *skb_gso_segment(struct s struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT); struct packet_type *ptype; int type = skb->protocol; + int err; BUG_ON(skb_shinfo(skb)->frag_list); - BUG_ON(skb->ip_summed != CHECKSUM_HW); skb->mac.raw = skb->data; skb->mac_len = skb->nh.raw - skb->data; __skb_pull(skb, skb->mac_len); + if (unlikely(skb->ip_summed != CHECKSUM_HW)) { + static int warned; + + WARN_ON(!warned); + warned = 1; + + if (skb_header_cloned(skb) && + (err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))) + return ERR_PTR(err); + } + rcu_read_lock(); list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type) & 15], list) { if (ptype->type == type && !ptype->dev && ptype->gso_segment) { + if (unlikely(skb->ip_summed != CHECKSUM_HW)) { + err = ptype->gso_send_check(skb); + segs = ERR_PTR(err); + if (err || skb_gso_ok(skb, features)) + break; + __skb_push(skb, skb->data - skb->nh.raw); + } segs = ptype->gso_segment(skb, features); break; } diff -r 51252bc644da -r 521480565732 patches/linux-2.6.16.13/net-gso2.patch --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/patches/linux-2.6.16.13/net-gso2.patch Sat Jul 08 14:30:02 2006 +1000 @@ -0,0 +1,473 @@ +diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c +index b5e39a1..29d9218 100644 +--- a/drivers/net/bnx2.c ++++ b/drivers/net/bnx2.c +@@ -1593,7 +1593,7 @@ bnx2_tx_int(struct bnx2 *bp) + skb = tx_buf->skb; + #ifdef BCM_TSO + /* partial BD completions possible with TSO packets */ +- if (skb_shinfo(skb)->gso_size) { ++ if (skb_is_gso(skb)) { + u16 last_idx, last_ring_idx; + + last_idx = sw_cons + +diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c +index 7b7d360..7d72e16 100644 +--- 
a/drivers/net/chelsio/sge.c ++++ b/drivers/net/chelsio/sge.c +@@ -1419,7 +1419,7 @@ int t1_start_xmit(struct sk_buff *skb, s + struct cpl_tx_pkt *cpl; + + #ifdef NETIF_F_TSO +- if (skb_shinfo(skb)->gso_size) { ++ if (skb_is_gso(skb)) { + int eth_type; + struct cpl_tx_pkt_lso *hdr; + +diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c +index 681d284..96ddc24 100644 +--- a/drivers/net/e1000/e1000_main.c ++++ b/drivers/net/e1000/e1000_main.c +@@ -2526,7 +2526,7 @@ #ifdef NETIF_F_TSO + uint8_t ipcss, ipcso, tucss, tucso, hdr_len; + int err; + +- if (skb_shinfo(skb)->gso_size) { ++ if (skb_is_gso(skb)) { + if (skb_header_cloned(skb)) { + err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC); + if (err) +@@ -2651,7 +2651,7 @@ #ifdef NETIF_F_TSO + * tso gets written back prematurely before the data is fully + * DMAd to the controller */ + if (!skb->data_len && tx_ring->last_tx_tso && +- !skb_shinfo(skb)->gso_size) { ++ !skb_is_gso(skb)) { + tx_ring->last_tx_tso = 0; + size -= 4; + } +@@ -2934,8 +2934,7 @@ #endif + + #ifdef NETIF_F_TSO + /* Controller Erratum workaround */ +- if (!skb->data_len && tx_ring->last_tx_tso && +- !skb_shinfo(skb)->gso_size) ++ if (!skb->data_len && tx_ring->last_tx_tso && !skb_is_gso(skb)) + count++; + #endif + +diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c +index c35f16e..c6ca459 100644 +--- a/drivers/net/forcedeth.c ++++ b/drivers/net/forcedeth.c +@@ -1105,7 +1105,7 @@ static int nv_start_xmit(struct sk_buff + np->tx_skbuff[nr] = skb; + + #ifdef NETIF_F_TSO +- if (skb_shinfo(skb)->gso_size) ++ if (skb_is_gso(skb)) + tx_flags_extra = NV_TX2_TSO | (skb_shinfo(skb)->gso_size << NV_TX2_TSO_SHIFT); + else + #endif +diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c +index bdab369..7d187d0 100644 +--- a/drivers/net/ixgb/ixgb_main.c ++++ b/drivers/net/ixgb/ixgb_main.c +@@ -1163,7 +1163,7 @@ #ifdef NETIF_F_TSO + uint16_t ipcse, tucse, mss; + int err; + +- if(likely(skb_shinfo(skb)->gso_size)) { 
++ if (likely(skb_is_gso(skb))) { + if (skb_header_cloned(skb)) { + err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC); + if (err) +diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c +index 9bcaa80..3843e0a 100644 +--- a/drivers/net/loopback.c ++++ b/drivers/net/loopback.c +@@ -139,7 +139,7 @@ #ifndef LOOPBACK_MUST_CHECKSUM + #endif + + #ifdef LOOPBACK_TSO +- if (skb_shinfo(skb)->gso_size) { ++ if (skb_is_gso(skb)) { + BUG_ON(skb->protocol != htons(ETH_P_IP)); + BUG_ON(skb->nh.iph->protocol != IPPROTO_TCP); + +diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c +index 2a55eb3..aa06a82 100644 +--- a/drivers/net/sky2.c ++++ b/drivers/net/sky2.c +@@ -1125,7 +1125,7 @@ static unsigned tx_le_req(const struct s + count = sizeof(dma_addr_t) / sizeof(u32); + count += skb_shinfo(skb)->nr_frags * count; + +- if (skb_shinfo(skb)->gso_size) ++ if (skb_is_gso(skb)) + ++count; + + if (skb->ip_summed == CHECKSUM_HW) +diff --git a/drivers/net/typhoon.c b/drivers/net/typhoon.c +index 30c48c9..3d62abc 100644 +--- a/drivers/net/typhoon.c ++++ b/drivers/net/typhoon.c +@@ -805,7 +805,7 @@ typhoon_start_tx(struct sk_buff *skb, st + * If problems develop with TSO, check this first. 
+ */ + numDesc = skb_shinfo(skb)->nr_frags + 1; +- if(skb_tso_size(skb)) ++ if (skb_is_gso(skb)) + numDesc++; + + /* When checking for free space in the ring, we need to also +@@ -845,7 +845,7 @@ typhoon_start_tx(struct sk_buff *skb, st + TYPHOON_TX_PF_VLAN_TAG_SHIFT); + } + +- if(skb_tso_size(skb)) { ++ if (skb_is_gso(skb)) { + first_txd->processFlags |= TYPHOON_TX_PF_TCP_SEGMENT; + first_txd->numDesc++; + +diff --git a/drivers/s390/net/qeth_main.c b/drivers/s390/net/qeth_main.c +index d9cc997..a3ea8e0 100644 +--- a/drivers/s390/net/qeth_main.c ++++ b/drivers/s390/net/qeth_main.c +@@ -4454,7 +4454,7 @@ qeth_send_packet(struct qeth_card *card, + queue = card->qdio.out_qs + [qeth_get_priority_queue(card, skb, ipv, cast_type)]; + +- if (skb_shinfo(skb)->gso_size) ++ if (skb_is_gso(skb)) + large_send = card->options.large_send; + + /*are we able to do TSO ? If so ,prepare and send it from here */ +@@ -4501,8 +4501,7 @@ qeth_send_packet(struct qeth_card *card, + card->stats.tx_packets++; + card->stats.tx_bytes += skb->len; + #ifdef CONFIG_QETH_PERF_STATS +- if (skb_shinfo(skb)->gso_size && +- !(large_send == QETH_LARGE_SEND_NO)) { ++ if (skb_is_gso(skb) && !(large_send == QETH_LARGE_SEND_NO)) { + card->perf_stats.large_send_bytes += skb->len; + card->perf_stats.large_send_cnt++; + } +diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h +index 47b0965..9865736 100644 +--- a/include/linux/netdevice.h ++++ b/include/linux/netdevice.h +@@ -541,6 +541,7 @@ struct packet_type { + struct net_device *); + struct sk_buff *(*gso_segment)(struct sk_buff *skb, + int features); ++ int (*gso_send_check)(struct sk_buff *skb); + void *af_packet_priv; + struct list_head list; + }; +@@ -1001,14 +1002,15 @@ extern void linkwatch_run_queue(void); + + static inline int skb_gso_ok(struct sk_buff *skb, int features) + { +- int feature = skb_shinfo(skb)->gso_size ? 
+- skb_shinfo(skb)->gso_type << NETIF_F_GSO_SHIFT : 0; ++ int feature = skb_shinfo(skb)->gso_type << NETIF_F_GSO_SHIFT; + return (features & feature) == feature; + } + + static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb) + { +- return !skb_gso_ok(skb, dev->features); ++ return skb_is_gso(skb) && ++ (!skb_gso_ok(skb, dev->features) || ++ unlikely(skb->ip_summed != CHECKSUM_HW)); + } + + #endif /* __KERNEL__ */ +diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h +index b19d45d..adfe3a8 100644 +--- a/include/linux/skbuff.h ++++ b/include/linux/skbuff.h +@@ -1403,5 +1403,10 @@ #else /* CONFIG_NETFILTER */ + static inline void nf_reset(struct sk_buff *skb) {} + #endif /* CONFIG_NETFILTER */ + ++static inline int skb_is_gso(const struct sk_buff *skb) ++{ ++ return skb_shinfo(skb)->gso_size; ++} ++ + #endif /* __KERNEL__ */ + #endif /* _LINUX_SKBUFF_H */ +diff --git a/include/net/protocol.h b/include/net/protocol.h +index 0d2dcdb..d516c58 100644 +--- a/include/net/protocol.h ++++ b/include/net/protocol.h +@@ -37,6 +37,7 @@ #define MAX_INET_PROTOS 256 /* Must be + struct net_protocol { + int (*handler)(struct sk_buff *skb); + void (*err_handler)(struct sk_buff *skb, u32 info); ++ int (*gso_send_check)(struct sk_buff *skb); + struct sk_buff *(*gso_segment)(struct sk_buff *skb, + int features); + int no_policy; +diff --git a/include/net/tcp.h b/include/net/tcp.h +index 70e1d5f..22dbbac 100644 +--- a/include/net/tcp.h ++++ b/include/net/tcp.h +@@ -1063,6 +1063,7 @@ extern struct request_sock_ops tcp_reque + + extern int tcp_v4_destroy_sock(struct sock *sk); + ++extern int tcp_v4_gso_send_check(struct sk_buff *skb); + extern struct sk_buff *tcp_tso_segment(struct sk_buff *skb, int features); + + #ifdef CONFIG_PROC_FS +diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c +index 00b1128..b34e76f 100644 +--- a/net/bridge/br_forward.c ++++ b/net/bridge/br_forward.c +@@ -32,7 +32,7 @@ static inline int should_deliver(const s + int 
br_dev_queue_push_xmit(struct sk_buff *skb) + { + /* drop mtu oversized packets except tso */ +- if (skb->len > skb->dev->mtu && !skb_shinfo(skb)->gso_size) ++ if (skb->len > skb->dev->mtu && !skb_is_gso(skb)) + kfree_skb(skb); + else { + #ifdef CONFIG_BRIDGE_NETFILTER +diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c +index 588207f..b2dba74 100644 +--- a/net/bridge/br_netfilter.c ++++ b/net/bridge/br_netfilter.c +@@ -743,7 +743,7 @@ static int br_nf_dev_queue_xmit(struct s + { + if (skb->protocol == htons(ETH_P_IP) && + skb->len > skb->dev->mtu && +- !skb_shinfo(skb)->gso_size) ++ !skb_is_gso(skb)) + return ip_fragment(skb, br_dev_queue_push_xmit); + else + return br_dev_queue_push_xmit(skb); +diff --git a/net/core/dev.c b/net/core/dev.c +index 32e1056..e814a89 100644 +--- a/net/core/dev.c ++++ b/net/core/dev.c +@@ -1083,9 +1083,17 @@ int skb_checksum_help(struct sk_buff *sk + unsigned int csum; + int ret = 0, offset = skb->h.raw - skb->data; + +- if (inward) { +- skb->ip_summed = CHECKSUM_NONE; +- goto out; ++ if (inward) ++ goto out_set_summed; ++ ++ if (unlikely(skb_shinfo(skb)->gso_size)) { ++ static int warned; ++ ++ WARN_ON(!warned); ++ warned = 1; ++ ++ /* Let GSO fix up the checksum. 
*/ ++ goto out_set_summed; + } + + if (skb_cloned(skb)) { +@@ -1102,6 +1110,8 @@ int skb_checksum_help(struct sk_buff *sk + BUG_ON(skb->csum + 2 > offset); + + *(u16*)(skb->h.raw + skb->csum) = csum_fold(csum); ++ ++out_set_summed: + skb->ip_summed = CHECKSUM_NONE; + out: + return ret; +@@ -1122,17 +1132,35 @@ struct sk_buff *skb_gso_segment(struct s + struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT); + struct packet_type *ptype; + int type = skb->protocol; ++ int err; + + BUG_ON(skb_shinfo(skb)->frag_list); +- BUG_ON(skb->ip_summed != CHECKSUM_HW); + + skb->mac.raw = skb->data; + skb->mac_len = skb->nh.raw - skb->data; + __skb_pull(skb, skb->mac_len); + ++ if (unlikely(skb->ip_summed != CHECKSUM_HW)) { ++ static int warned; ++ ++ WARN_ON(!warned); ++ warned = 1; ++ ++ if (skb_header_cloned(skb) && ++ (err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))) ++ return ERR_PTR(err); ++ } ++ + rcu_read_lock(); + list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type) & 15], list) { + if (ptype->type == type && !ptype->dev && ptype->gso_segment) { ++ if (unlikely(skb->ip_summed != CHECKSUM_HW)) { ++ err = ptype->gso_send_check(skb); ++ segs = ERR_PTR(err); ++ if (err || skb_gso_ok(skb, features)) ++ break; ++ __skb_push(skb, skb->data - skb->nh.raw); ++ } + segs = ptype->gso_segment(skb, features); + break; + } +diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c +index 5ba719e..0a8c559 100644 +--- a/net/ipv4/af_inet.c ++++ b/net/ipv4/af_inet.c +@@ -1085,6 +1085,40 @@ int inet_sk_rebuild_header(struct sock * + + EXPORT_SYMBOL(inet_sk_rebuild_header); + ++static int inet_gso_send_check(struct sk_buff *skb) ++{ ++ struct iphdr *iph; ++ struct net_protocol *ops; ++ int proto; ++ int ihl; ++ int err = -EINVAL; ++ ++ if (unlikely(!pskb_may_pull(skb, sizeof(*iph)))) ++ goto out; ++ ++ iph = skb->nh.iph; ++ ihl = iph->ihl * 4; ++ if (ihl < sizeof(*iph)) ++ goto out; ++ ++ if (unlikely(!pskb_may_pull(skb, ihl))) ++ goto out; ++ ++ skb->h.raw = __skb_pull(skb, ihl); ++ iph = 
skb->nh.iph; ++ proto = iph->protocol & (MAX_INET_PROTOS - 1); ++ err = -EPROTONOSUPPORT; ++ ++ rcu_read_lock(); ++ ops = rcu_dereference(inet_protos[proto]); ++ if (likely(ops && ops->gso_send_check)) ++ err = ops->gso_send_check(skb); ++ rcu_read_unlock(); ++ ++out: ++ return err; ++} ++ + static struct sk_buff *inet_gso_segment(struct sk_buff *skb, int features) + { + struct sk_buff *segs = ERR_PTR(-EINVAL); +@@ -1142,6 +1176,7 @@ #endif + static struct net_protocol tcp_protocol = { + .handler = tcp_v4_rcv, + .err_handler = tcp_v4_err, ++ .gso_send_check = tcp_v4_gso_send_check, + .gso_segment = tcp_tso_segment, + .no_policy = 1, + }; +@@ -1188,6 +1223,7 @@ static int ipv4_proc_init(void); + static struct packet_type ip_packet_type = { + .type = __constant_htons(ETH_P_IP), + .func = ip_rcv, ++ .gso_send_check = inet_gso_send_check, + .gso_segment = inet_gso_segment, + }; + +diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c +index 19c3c73..2de887c 100644 +--- a/net/ipv4/ip_output.c ++++ b/net/ipv4/ip_output.c +@@ -210,7 +210,7 @@ #if defined(CONFIG_NETFILTER) && defined + return dst_output(skb); + } + #endif +- if (skb->len > dst_mtu(skb->dst) && !skb_shinfo(skb)->gso_size) ++ if (skb->len > dst_mtu(skb->dst) && !skb_is_gso(skb)) + return ip_fragment(skb, ip_finish_output2); + else + return ip_finish_output2(skb); +@@ -1095,7 +1095,7 @@ ssize_t ip_append_page(struct sock *sk, + while (size > 0) { + int i; + +- if (skb_shinfo(skb)->gso_size) ++ if (skb_is_gso(skb)) + len = size; + else { + +diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c +index 233bdf2..b4240b4 100644 +--- a/net/ipv4/tcp_ipv4.c ++++ b/net/ipv4/tcp_ipv4.c +@@ -495,6 +495,24 @@ void tcp_v4_send_check(struct sock *sk, + } + } + ++int tcp_v4_gso_send_check(struct sk_buff *skb) ++{ ++ struct iphdr *iph; ++ struct tcphdr *th; ++ ++ if (!pskb_may_pull(skb, sizeof(*th))) ++ return -EINVAL; ++ ++ iph = skb->nh.iph; ++ th = skb->h.th; ++ ++ th->check = 0; ++ th->check = ~tcp_v4_check(th, 
skb->len, iph->saddr, iph->daddr, 0); ++ skb->csum = offsetof(struct tcphdr, check); ++ skb->ip_summed = CHECKSUM_HW; ++ return 0; ++} ++ + /* + * This routine will send an RST to the other tcp. + * +diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c +index 737c1db..62ead52 100644 +--- a/net/ipv4/xfrm4_output.c ++++ b/net/ipv4/xfrm4_output.c +@@ -189,7 +189,7 @@ #ifdef CONFIG_NETFILTER + } + #endif + +- if (!skb_shinfo(skb)->gso_size) ++ if (!skb_is_gso(skb)) + return xfrm4_output_finish2(skb); + + skb->protocol = htons(ETH_P_IP); +diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c +index cf5d17e..33a5850 100644 +--- a/net/ipv6/ip6_output.c ++++ b/net/ipv6/ip6_output.c +@@ -147,7 +147,7 @@ static int ip6_output2(struct sk_buff *s + + int ip6_output(struct sk_buff *skb) + { +- if ((skb->len > dst_mtu(skb->dst) && !skb_shinfo(skb)->gso_size) || ++ if ((skb->len > dst_mtu(skb->dst) && !skb_is_gso(skb)) || + dst_allfrag(skb->dst)) + return ip6_fragment(skb, ip6_output2); + else +diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c +index 39bdeec..e9ea338 100644 +--- a/net/ipv6/xfrm6_output.c ++++ b/net/ipv6/xfrm6_output.c +@@ -179,7 +179,7 @@ static int xfrm6_output_finish(struct sk + { + struct sk_buff *segs; + +- if (!skb_shinfo(skb)->gso_size) ++ if (!skb_is_gso(skb)) + return xfrm6_output_finish2(skb); + + skb->protocol = htons(ETH_P_IP); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
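As a reading aid for the patch above: tcp_v4_gso_send_check() re-seeds the TCP checksum field with the pseudo-header sum and marks the packet CHECKSUM_HW again, so that whoever finishes the checksum later (the NIC or the software GSO path) only has to sum the TCP header and payload. The following is a user-space sketch of that idea only, not the kernel code; the names fold32, sum_bytes, pseudo_seed and finish_csum are hypothetical, and addresses are passed as plain byte arrays for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones'-complement sum. */
static uint16_t fold32(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Add len bytes, read as big-endian 16-bit words, to a running sum. */
static uint32_t sum_bytes(const uint8_t *p, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)(p[i] << 8 | p[i + 1]);
	if (i < len)
		sum += (uint32_t)p[i] << 8;	/* odd trailing byte */
	return sum;
}

/*
 * The "partial checksum": the folded, uninverted sum of the IPv4
 * pseudo-header. This is the value stored in the checksum field
 * before the rest of the work is handed off.
 */
static uint16_t pseudo_seed(const uint8_t saddr[4], const uint8_t daddr[4],
			    uint8_t proto, uint16_t l4_len)
{
	uint32_t sum = 0;

	sum = sum_bytes(saddr, 4, sum);
	sum = sum_bytes(daddr, 4, sum);
	sum += proto;
	sum += l4_len;
	return fold32(sum);
}

/*
 * What the finishing side does: sum the whole segment, whose checksum
 * field already holds the seed, and invert the result.
 */
static uint16_t finish_csum(const uint8_t *seg, size_t len)
{
	return (uint16_t)~fold32(sum_bytes(seg, len, 0));
}
```

If something in the stack clears or overwrites the seed (as the NAT code quoted earlier in the thread does), finish_csum() no longer produces a valid checksum, which is the silent corruption described above; re-running the seeding step restores the invariant.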
On Sat, Jul 08, 2006 at 02:31:35PM +1000, herbert wrote:
>
> [NET] linux: Import net-gso1.patch

This patch is now upstream. So here is a new patch for Xen:

[NET] net-gso.patch: Fix up GSO packets with broken checksums

Here is the original changelog:

   [NET] gso: Fix up GSO packets with broken checksums

   Certain subsystems in the stack (e.g., netfilter) can break the partial
   checksum on GSO packets. Until they're fixed, this patch allows this to
   work by recomputing the partial checksums through the GSO mechanism.

   Once they've all been converted to update the partial checksum instead of
   clearing it, this workaround can be removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff -r 51252bc644da -r 135b1ff1ead6 linux-2.6-xen-sparse/include/linux/skbuff.h
--- a/linux-2.6-xen-sparse/include/linux/skbuff.h	Fri Jul 07 23:38:57 2006 +1000
+++ b/linux-2.6-xen-sparse/include/linux/skbuff.h	Sun Jul 09 18:13:52 2006 +1000
@@ -1412,5 +1412,10 @@ static inline void nf_reset(struct sk_bu
 static inline void nf_reset(struct sk_buff *skb) {}
 #endif /* CONFIG_NETFILTER */
 
+static inline int skb_is_gso(const struct sk_buff *skb)
+{
+	return skb_shinfo(skb)->gso_size;
+}
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_SKBUFF_H */
diff -r 51252bc644da -r 135b1ff1ead6 linux-2.6-xen-sparse/net/core/dev.c
--- a/linux-2.6-xen-sparse/net/core/dev.c	Fri Jul 07 23:38:57 2006 +1000
+++ b/linux-2.6-xen-sparse/net/core/dev.c	Sun Jul 09 18:13:52 2006 +1000
@@ -1089,9 +1089,17 @@ int skb_checksum_help(struct sk_buff *sk
 	unsigned int csum;
 	int ret = 0, offset = skb->h.raw - skb->data;
 
-	if (inward) {
-		skb->ip_summed = CHECKSUM_NONE;
-		goto out;
+	if (inward)
+		goto out_set_summed;
+
+	if (unlikely(skb_shinfo(skb)->gso_size)) {
+		static int warned;
+
+		WARN_ON(!warned);
+		warned
= 1; + + /* Let GSO fix up the checksum. */ + goto out_set_summed; } if (skb_cloned(skb)) { @@ -1108,6 +1116,8 @@ int skb_checksum_help(struct sk_buff *sk BUG_ON(skb->csum + 2 > offset); *(u16*)(skb->h.raw + skb->csum) = csum_fold(csum); + +out_set_summed: skb->ip_summed = CHECKSUM_NONE; out: return ret; @@ -1128,17 +1138,35 @@ struct sk_buff *skb_gso_segment(struct s struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT); struct packet_type *ptype; int type = skb->protocol; + int err; BUG_ON(skb_shinfo(skb)->frag_list); - BUG_ON(skb->ip_summed != CHECKSUM_HW); skb->mac.raw = skb->data; skb->mac_len = skb->nh.raw - skb->data; __skb_pull(skb, skb->mac_len); + if (unlikely(skb->ip_summed != CHECKSUM_HW)) { + static int warned; + + WARN_ON(!warned); + warned = 1; + + if (skb_header_cloned(skb) && + (err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))) + return ERR_PTR(err); + } + rcu_read_lock(); list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type) & 15], list) { if (ptype->type == type && !ptype->dev && ptype->gso_segment) { + if (unlikely(skb->ip_summed != CHECKSUM_HW)) { + err = ptype->gso_send_check(skb); + segs = ERR_PTR(err); + if (err || skb_gso_ok(skb, features)) + break; + __skb_push(skb, skb->data - skb->nh.raw); + } segs = ptype->gso_segment(skb, features); break; } diff -r 51252bc644da -r 135b1ff1ead6 patches/linux-2.6.16.13/net-gso.patch --- a/patches/linux-2.6.16.13/net-gso.patch Fri Jul 07 23:38:57 2006 +1000 +++ b/patches/linux-2.6.16.13/net-gso.patch Sun Jul 09 18:13:52 2006 +1000 @@ -104,7 +104,7 @@ index dd41049..6615583 100644 if (skb_shinfo(skb)->nr_frags == 0) { struct cp_desc *txd = &cp->tx_ring[entry]; diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c -index a24200d..b5e39a1 100644 +index a24200d..29d9218 100644 --- a/drivers/net/bnx2.c +++ b/drivers/net/bnx2.c @@ -1593,7 +1593,7 @@ bnx2_tx_int(struct bnx2 *bp) @@ -112,7 +112,7 @@ index a24200d..b5e39a1 100644 #ifdef BCM_TSO /* partial BD completions possible with TSO packets */ - if 
(skb_shinfo(skb)->tso_size) { -+ if (skb_shinfo(skb)->gso_size) { ++ if (skb_is_gso(skb)) { u16 last_idx, last_ring_idx; last_idx = sw_cons + @@ -178,7 +178,7 @@ index bcf9f17..e970921 100644 bond_dev->features |= NETIF_F_LLTX; diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c -index 30ff8ea..7b7d360 100644 +index 30ff8ea..7d72e16 100644 --- a/drivers/net/chelsio/sge.c +++ b/drivers/net/chelsio/sge.c @@ -1419,7 +1419,7 @@ int t1_start_xmit(struct sk_buff *skb, s @@ -186,7 +186,7 @@ index 30ff8ea..7b7d360 100644 #ifdef NETIF_F_TSO - if (skb_shinfo(skb)->tso_size) { -+ if (skb_shinfo(skb)->gso_size) { ++ if (skb_is_gso(skb)) { int eth_type; struct cpl_tx_pkt_lso *hdr; @@ -200,7 +200,7 @@ index 30ff8ea..7b7d360 100644 cpl = (struct cpl_tx_pkt *)hdr; sge->stats.tx_lso_pkts++; diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c -index fa29402..681d284 100644 +index fa29402..96ddc24 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -2526,7 +2526,7 @@ #ifdef NETIF_F_TSO @@ -208,7 +208,7 @@ index fa29402..681d284 100644 int err; - if (skb_shinfo(skb)->tso_size) { -+ if (skb_shinfo(skb)->gso_size) { ++ if (skb_is_gso(skb)) { if (skb_header_cloned(skb)) { err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC); if (err) @@ -226,7 +226,7 @@ index fa29402..681d284 100644 * DMAd to the controller */ if (!skb->data_len && tx_ring->last_tx_tso && - !skb_shinfo(skb)->tso_size) { -+ !skb_shinfo(skb)->gso_size) { ++ !skb_is_gso(skb)) { tx_ring->last_tx_tso = 0; size -= 4; } @@ -239,17 +239,18 @@ index fa29402..681d284 100644 /* The controller does a simple calculation to * make sure there is enough room in the FIFO before * initiating the DMA for each buffer. 
The calc is: -@@ -2935,7 +2935,7 @@ #endif +@@ -2934,8 +2934,7 @@ #endif + #ifdef NETIF_F_TSO /* Controller Erratum workaround */ - if (!skb->data_len && tx_ring->last_tx_tso && +- if (!skb->data_len && tx_ring->last_tx_tso && - !skb_shinfo(skb)->tso_size) -+ !skb_shinfo(skb)->gso_size) ++ if (!skb->data_len && tx_ring->last_tx_tso && !skb_is_gso(skb)) count++; #endif diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c -index 3682ec6..c35f16e 100644 +index 3682ec6..c6ca459 100644 --- a/drivers/net/forcedeth.c +++ b/drivers/net/forcedeth.c @@ -482,9 +482,9 @@ #define LPA_1000HALF 0x0400 @@ -279,7 +280,7 @@ index 3682ec6..c35f16e 100644 #ifdef NETIF_F_TSO - if (skb_shinfo(skb)->tso_size) - tx_flags_extra = NV_TX2_TSO | (skb_shinfo(skb)->tso_size << NV_TX2_TSO_SHIFT); -+ if (skb_shinfo(skb)->gso_size) ++ if (skb_is_gso(skb)) + tx_flags_extra = NV_TX2_TSO | (skb_shinfo(skb)->gso_size << NV_TX2_TSO_SHIFT); else #endif @@ -450,7 +451,7 @@ index a9f49f0..339d4a7 100644 } diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c -index f9f77e4..bdab369 100644 +index f9f77e4..7d187d0 100644 --- a/drivers/net/ixgb/ixgb_main.c +++ b/drivers/net/ixgb/ixgb_main.c @@ -1163,7 +1163,7 @@ #ifdef NETIF_F_TSO @@ -458,7 +459,7 @@ index f9f77e4..bdab369 100644 int err; - if(likely(skb_shinfo(skb)->tso_size)) { -+ if(likely(skb_shinfo(skb)->gso_size)) { ++ if (likely(skb_is_gso(skb))) { if (skb_header_cloned(skb)) { err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC); if (err) @@ -472,7 +473,7 @@ index f9f77e4..bdab369 100644 skb->nh.iph->check = 0; skb->h.th->check = ~csum_tcpudp_magic(skb->nh.iph->saddr, diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c -index 690a1aa..9bcaa80 100644 +index 690a1aa..3843e0a 100644 --- a/drivers/net/loopback.c +++ b/drivers/net/loopback.c @@ -74,7 +74,7 @@ static void emulate_large_send_offload(s @@ -489,7 +490,7 @@ index 690a1aa..9bcaa80 100644 #ifdef LOOPBACK_TSO - if (skb_shinfo(skb)->tso_size) { -+ if 
(skb_shinfo(skb)->gso_size) { ++ if (skb_is_gso(skb)) { BUG_ON(skb->protocol != htons(ETH_P_IP)); BUG_ON(skb->nh.iph->protocol != IPPROTO_TCP); @@ -600,7 +601,7 @@ index b7f00d6..439f45f 100644 writeq(val64, &tx_fifo->List_Control); diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c -index 0618cd5..2a55eb3 100644 +index 0618cd5..aa06a82 100644 --- a/drivers/net/sky2.c +++ b/drivers/net/sky2.c @@ -1125,7 +1125,7 @@ static unsigned tx_le_req(const struct s @@ -608,7 +609,7 @@ index 0618cd5..2a55eb3 100644 count += skb_shinfo(skb)->nr_frags * count; - if (skb_shinfo(skb)->tso_size) -+ if (skb_shinfo(skb)->gso_size) ++ if (skb_is_gso(skb)) ++count; if (skb->ip_summed == CHECKSUM_HW) @@ -667,7 +668,7 @@ index 5b1af39..11de5af 100644 np->stats.rx_missed_errors += ioread32(ioaddr + RxMissed) & 0xffff; diff --git a/drivers/net/typhoon.c b/drivers/net/typhoon.c -index 4c76cb7..30c48c9 100644 +index 4c76cb7..3d62abc 100644 --- a/drivers/net/typhoon.c +++ b/drivers/net/typhoon.c @@ -340,7 +340,7 @@ #define typhoon_synchronize_irq(x) synch @@ -679,6 +680,24 @@ index 4c76cb7..30c48c9 100644 #define TSO_NUM_DESCRIPTORS 2 #define TSO_OFFLOAD_ON TYPHOON_OFFLOAD_TCP_SEGMENT #else +@@ -805,7 +805,7 @@ typhoon_start_tx(struct sk_buff *skb, st + * If problems develop with TSO, check this first. 
+ */ + numDesc = skb_shinfo(skb)->nr_frags + 1; +- if(skb_tso_size(skb)) ++ if (skb_is_gso(skb)) + numDesc++; + + /* When checking for free space in the ring, we need to also +@@ -845,7 +845,7 @@ typhoon_start_tx(struct sk_buff *skb, st + TYPHOON_TX_PF_VLAN_TAG_SHIFT); + } + +- if(skb_tso_size(skb)) { ++ if (skb_is_gso(skb)) { + first_txd->processFlags |= TYPHOON_TX_PF_TCP_SEGMENT; + first_txd->numDesc++; + diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c index ed1f837..2eb6b5f 100644 --- a/drivers/net/via-velocity.c @@ -769,7 +788,7 @@ index 82cb4af..57cec40 100644 static inline struct qeth_eddp_context * diff --git a/drivers/s390/net/qeth_main.c b/drivers/s390/net/qeth_main.c -index dba7f7f..d9cc997 100644 +index dba7f7f..a3ea8e0 100644 --- a/drivers/s390/net/qeth_main.c +++ b/drivers/s390/net/qeth_main.c @@ -4454,7 +4454,7 @@ qeth_send_packet(struct qeth_card *card, @@ -777,19 +796,20 @@ index dba7f7f..d9cc997 100644 [qeth_get_priority_queue(card, skb, ipv, cast_type)]; - if (skb_shinfo(skb)->tso_size) -+ if (skb_shinfo(skb)->gso_size) ++ if (skb_is_gso(skb)) large_send = card->options.large_send; /*are we able to do TSO ? 
If so ,prepare and send it from here */ -@@ -4501,7 +4501,7 @@ qeth_send_packet(struct qeth_card *card, +@@ -4501,8 +4501,7 @@ qeth_send_packet(struct qeth_card *card, card->stats.tx_packets++; card->stats.tx_bytes += skb->len; #ifdef CONFIG_QETH_PERF_STATS - if (skb_shinfo(skb)->tso_size && -+ if (skb_shinfo(skb)->gso_size && - !(large_send == QETH_LARGE_SEND_NO)) { +- !(large_send == QETH_LARGE_SEND_NO)) { ++ if (skb_is_gso(skb) && !(large_send == QETH_LARGE_SEND_NO)) { card->perf_stats.large_send_bytes += skb->len; card->perf_stats.large_send_cnt++; + } diff --git a/drivers/s390/net/qeth_tso.h b/drivers/s390/net/qeth_tso.h index 1286dde..89cbf34 100644 --- a/drivers/s390/net/qeth_tso.h @@ -817,7 +837,7 @@ index 93535f0..9269df7 100644 /* compatibility with older code */ #define SPARC_ETH_GSET ETHTOOL_GSET diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h -index 7fda03d..47b0965 100644 +index 7fda03d..9865736 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -230,7 +230,8 @@ enum netdev_state_t @@ -869,16 +889,17 @@ index 7fda03d..47b0965 100644 /* cpu id of processor entered to hard_start_xmit or -1, if nobody entered there. 
*/ -@@ -527,6 +539,8 @@ struct packet_type { +@@ -527,6 +539,9 @@ struct packet_type { struct net_device *, struct packet_type *, struct net_device *); + struct sk_buff *(*gso_segment)(struct sk_buff *skb, + int features); ++ int (*gso_send_check)(struct sk_buff *skb); void *af_packet_priv; struct list_head list; }; -@@ -693,7 +707,8 @@ extern int dev_change_name(struct net_d +@@ -693,7 +708,8 @@ extern int dev_change_name(struct net_d extern int dev_set_mtu(struct net_device *, int); extern int dev_set_mac_address(struct net_device *, struct sockaddr *); @@ -888,7 +909,7 @@ index 7fda03d..47b0965 100644 extern void dev_init(void); -@@ -900,11 +915,43 @@ static inline void __netif_rx_complete(s +@@ -900,11 +916,43 @@ static inline void __netif_rx_complete(s clear_bit(__LINK_STATE_RX_SCHED, &dev->state); } @@ -934,7 +955,7 @@ index 7fda03d..47b0965 100644 } /* These functions live elsewhere (drivers/net/net_init.c, but related) */ -@@ -932,6 +979,7 @@ extern int netdev_max_backlog; +@@ -932,6 +980,7 @@ extern int netdev_max_backlog; extern int weight_p; extern int netdev_set_master(struct net_device *dev, struct net_device *master); extern int skb_checksum_help(struct sk_buff *skb, int inward); @@ -942,27 +963,28 @@ index 7fda03d..47b0965 100644 #ifdef CONFIG_BUG extern void netdev_rx_csum_fault(struct net_device *dev); #else -@@ -951,6 +999,18 @@ #endif +@@ -951,6 +1000,19 @@ #endif extern void linkwatch_run_queue(void); +static inline int skb_gso_ok(struct sk_buff *skb, int features) +{ -+ int feature = skb_shinfo(skb)->gso_size ? 
-+ skb_shinfo(skb)->gso_type << NETIF_F_GSO_SHIFT : 0; ++ int feature = skb_shinfo(skb)->gso_type << NETIF_F_GSO_SHIFT; + return (features & feature) == feature; +} + +static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb) +{ -+ return !skb_gso_ok(skb, dev->features); ++ return skb_is_gso(skb) && ++ (!skb_gso_ok(skb, dev->features) || ++ unlikely(skb->ip_summed != CHECKSUM_HW)); +} + #endif /* __KERNEL__ */ #endif /* _LINUX_DEV_H */ diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h -index ad7cc22..b19d45d 100644 +index ad7cc22..adfe3a8 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -134,9 +134,10 @@ struct skb_frag_struct { @@ -1041,6 +1063,17 @@ index ad7cc22..b19d45d 100644 static inline void *skb_header_pointer(const struct sk_buff *skb, int offset, int len, void *buffer) +@@ -1377,5 +1403,10 @@ #else /* CONFIG_NETFILTER */ + static inline void nf_reset(struct sk_buff *skb) {} + #endif /* CONFIG_NETFILTER */ + ++static inline int skb_is_gso(const struct sk_buff *skb) ++{ ++ return skb_shinfo(skb)->gso_size; ++} ++ + #endif /* __KERNEL__ */ + #endif /* _LINUX_SKBUFF_H */ diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index b94d1ad..75b5b93 100644 --- a/include/net/pkt_sched.h @@ -1063,13 +1096,14 @@ index b94d1ad..75b5b93 100644 extern int tc_classify(struct sk_buff *skb, struct tcf_proto *tp, diff --git a/include/net/protocol.h b/include/net/protocol.h -index 6dc5970..0d2dcdb 100644 +index 6dc5970..d516c58 100644 --- a/include/net/protocol.h +++ b/include/net/protocol.h -@@ -37,6 +37,8 @@ #define MAX_INET_PROTOS 256 /* Must be +@@ -37,6 +37,9 @@ #define MAX_INET_PROTOS 256 /* Must be struct net_protocol { int (*handler)(struct sk_buff *skb); void (*err_handler)(struct sk_buff *skb, u32 info); ++ int (*gso_send_check)(struct sk_buff *skb); + struct sk_buff *(*gso_segment)(struct sk_buff *skb, + int features); int no_policy; @@ -1094,7 +1128,7 @@ index f63d0d5..a8e8d21 100644 } diff --git 
a/include/net/tcp.h b/include/net/tcp.h -index 77f21c6..70e1d5f 100644 +index 77f21c6..22dbbac 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -552,13 +552,13 @@ #include <net/tcp_ecn.h> @@ -1113,10 +1147,11 @@ index 77f21c6..70e1d5f 100644 } static inline void tcp_dec_pcount_approx(__u32 *count, -@@ -1063,6 +1063,8 @@ extern struct request_sock_ops tcp_reque +@@ -1063,6 +1063,9 @@ extern struct request_sock_ops tcp_reque extern int tcp_v4_destroy_sock(struct sock *sk); ++extern int tcp_v4_gso_send_check(struct sk_buff *skb); +extern struct sk_buff *tcp_tso_segment(struct sk_buff *skb, int features); + #ifdef CONFIG_PROC_FS @@ -1170,7 +1205,7 @@ index 0b33a7b..180e79b 100644 + NETIF_F_TSO | NETIF_F_NO_CSUM | NETIF_F_GSO_ROBUST; } diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c -index 2d24fb4..00b1128 100644 +index 2d24fb4..b34e76f 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -32,7 +32,7 @@ static inline int should_deliver(const s @@ -1178,7 +1213,7 @@ index 2d24fb4..00b1128 100644 { /* drop mtu oversized packets except tso */ - if (skb->len > skb->dev->mtu && !skb_shinfo(skb)->tso_size) -+ if (skb->len > skb->dev->mtu && !skb_shinfo(skb)->gso_size) ++ if (skb->len > skb->dev->mtu && !skb_is_gso(skb)) kfree_skb(skb); else { #ifdef CONFIG_BRIDGE_NETFILTER @@ -1222,7 +1257,7 @@ index f36b35e..0617146 100644 /* called with RTNL */ diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c -index 9e27373..588207f 100644 +index 9e27373..b2dba74 100644 --- a/net/bridge/br_netfilter.c +++ b/net/bridge/br_netfilter.c @@ -743,7 +743,7 @@ static int br_nf_dev_queue_xmit(struct s @@ -1230,12 +1265,12 @@ index 9e27373..588207f 100644 if (skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu && - !(skb_shinfo(skb)->ufo_size || skb_shinfo(skb)->tso_size)) -+ !skb_shinfo(skb)->gso_size) ++ !skb_is_gso(skb)) return ip_fragment(skb, br_dev_queue_push_xmit); else return br_dev_queue_push_xmit(skb); diff --git 
a/net/core/dev.c b/net/core/dev.c -index 12a214c..32e1056 100644 +index 12a214c..e814a89 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -115,6 +115,7 @@ #include <linux/wireless.h> /* Note : w @@ -1255,7 +1290,35 @@ index 12a214c..32e1056 100644 { struct packet_type *ptype; -@@ -1106,6 +1107,45 @@ out: +@@ -1082,9 +1083,17 @@ int skb_checksum_help(struct sk_buff *sk + unsigned int csum; + int ret = 0, offset = skb->h.raw - skb->data; + +- if (inward) { +- skb->ip_summed = CHECKSUM_NONE; +- goto out; ++ if (inward) ++ goto out_set_summed; ++ ++ if (unlikely(skb_shinfo(skb)->gso_size)) { ++ static int warned; ++ ++ WARN_ON(!warned); ++ warned = 1; ++ ++ /* Let GSO fix up the checksum. */ ++ goto out_set_summed; + } + + if (skb_cloned(skb)) { +@@ -1101,11 +1110,70 @@ int skb_checksum_help(struct sk_buff *sk + BUG_ON(skb->csum + 2 > offset); + + *(u16*)(skb->h.raw + skb->csum) = csum_fold(csum); ++ ++out_set_summed: + skb->ip_summed = CHECKSUM_NONE; + out: return ret; } @@ -1274,17 +1337,35 @@ index 12a214c..32e1056 100644 + struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT); + struct packet_type *ptype; + int type = skb->protocol; ++ int err; + + BUG_ON(skb_shinfo(skb)->frag_list); -+ BUG_ON(skb->ip_summed != CHECKSUM_HW); + + skb->mac.raw = skb->data; + skb->mac_len = skb->nh.raw - skb->data; + __skb_pull(skb, skb->mac_len); + ++ if (unlikely(skb->ip_summed != CHECKSUM_HW)) { ++ static int warned; ++ ++ WARN_ON(!warned); ++ warned = 1; ++ ++ if (skb_header_cloned(skb) && ++ (err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))) ++ return ERR_PTR(err); ++ } ++ + rcu_read_lock(); + list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type) & 15], list) { + if (ptype->type == type && !ptype->dev && ptype->gso_segment) { ++ if (unlikely(skb->ip_summed != CHECKSUM_HW)) { ++ err = ptype->gso_send_check(skb); ++ segs = ERR_PTR(err); ++ if (err || skb_gso_ok(skb, features)) ++ break; ++ __skb_push(skb, skb->data - skb->nh.raw); ++ } + segs = ptype->gso_segment(skb, features); + 
break; + } @@ -1301,7 +1382,7 @@ index 12a214c..32e1056 100644 /* Take action when hardware reception checksum errors are detected. */ #ifdef CONFIG_BUG void netdev_rx_csum_fault(struct net_device *dev) -@@ -1142,75 +1182,108 @@ #else +@@ -1142,75 +1210,108 @@ #else #define illegal_highdma(dev, skb) (0) #endif @@ -1469,7 +1550,7 @@ index 12a214c..32e1056 100644 } \ } -@@ -1246,9 +1319,13 @@ int dev_queue_xmit(struct sk_buff *skb) +@@ -1246,9 +1347,13 @@ int dev_queue_xmit(struct sk_buff *skb) struct Qdisc *q; int rc = -ENOMEM; @@ -1484,7 +1565,7 @@ index 12a214c..32e1056 100644 goto out_kfree_skb; /* Fragmented skb is linearized if device does not support SG, -@@ -1257,25 +1334,26 @@ int dev_queue_xmit(struct sk_buff *skb) +@@ -1257,25 +1362,26 @@ int dev_queue_xmit(struct sk_buff *skb) */ if (skb_shinfo(skb)->nr_frags && (!(dev->features & NETIF_F_SG) || illegal_highdma(dev, skb)) && @@ -1514,7 +1595,7 @@ index 12a214c..32e1056 100644 /* Updates of qdisc are serialized by queue_lock. * The struct Qdisc which is pointed to by qdisc is now a -@@ -1309,8 +1387,8 @@ #endif +@@ -1309,8 +1415,8 @@ #endif /* The device has no queue. Common case for software devices: loopback, all the sorts of tunnels... @@ -1525,7 +1606,7 @@ index 12a214c..32e1056 100644 counters.) However, it is possible, that they rely on protection made by us here. 
-@@ -1326,11 +1404,8 @@ #endif +@@ -1326,11 +1432,8 @@ #endif HARD_TX_LOCK(dev, cpu); if (!netif_queue_stopped(dev)) { @@ -1538,7 +1619,7 @@ index 12a214c..32e1056 100644 HARD_TX_UNLOCK(dev); goto out; } -@@ -1349,13 +1424,13 @@ #endif +@@ -1349,13 +1452,13 @@ #endif } rc = -ENETDOWN; @@ -1554,7 +1635,7 @@ index 12a214c..32e1056 100644 return rc; } -@@ -2670,7 +2745,7 @@ int register_netdevice(struct net_device +@@ -2670,7 +2773,7 @@ int register_netdevice(struct net_device BUG_ON(dev->reg_state != NETREG_UNINITIALIZED); spin_lock_init(&dev->queue_lock); @@ -1563,7 +1644,7 @@ index 12a214c..32e1056 100644 dev->xmit_lock_owner = -1; #ifdef CONFIG_NET_CLS_ACT spin_lock_init(&dev->ingress_lock); -@@ -2714,9 +2789,7 @@ #endif +@@ -2714,9 +2817,7 @@ #endif /* Fix illegal SG+CSUM combinations. */ if ((dev->features & NETIF_F_SG) && @@ -1574,7 +1655,7 @@ index 12a214c..32e1056 100644 printk("%s: Dropping NETIF_F_SG since no checksum feature.\n", dev->name); dev->features &= ~NETIF_F_SG; -@@ -3268,7 +3341,6 @@ subsys_initcall(net_dev_init); +@@ -3268,7 +3369,6 @@ subsys_initcall(net_dev_init); EXPORT_SYMBOL(__dev_get_by_index); EXPORT_SYMBOL(__dev_get_by_name); EXPORT_SYMBOL(__dev_remove_pack); @@ -2042,7 +2123,7 @@ index 3407f19..a0a25e0 100644 switch(flags & DN_RT_CNTL_MSK) { diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c -index 97c276f..5ba719e 100644 +index 97c276f..0a8c559 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -68,6 +68,7 @@ @@ -2053,10 +2134,44 @@ index 97c276f..5ba719e 100644 #include <linux/errno.h> #include <linux/types.h> #include <linux/socket.h> -@@ -1084,6 +1085,54 @@ int inet_sk_rebuild_header(struct sock * +@@ -1084,6 +1085,88 @@ int inet_sk_rebuild_header(struct sock * EXPORT_SYMBOL(inet_sk_rebuild_header); ++static int inet_gso_send_check(struct sk_buff *skb) ++{ ++ struct iphdr *iph; ++ struct net_protocol *ops; ++ int proto; ++ int ihl; ++ int err = -EINVAL; ++ ++ if (unlikely(!pskb_may_pull(skb, sizeof(*iph)))) ++ goto out; 
++ ++ iph = skb->nh.iph; ++ ihl = iph->ihl * 4; ++ if (ihl < sizeof(*iph)) ++ goto out; ++ ++ if (unlikely(!pskb_may_pull(skb, ihl))) ++ goto out; ++ ++ skb->h.raw = __skb_pull(skb, ihl); ++ iph = skb->nh.iph; ++ proto = iph->protocol & (MAX_INET_PROTOS - 1); ++ err = -EPROTONOSUPPORT; ++ ++ rcu_read_lock(); ++ ops = rcu_dereference(inet_protos[proto]); ++ if (likely(ops && ops->gso_send_check)) ++ err = ops->gso_send_check(skb); ++ rcu_read_unlock(); ++ ++out: ++ return err; ++} ++ +static struct sk_buff *inet_gso_segment(struct sk_buff *skb, int features) +{ + struct sk_buff *segs = ERR_PTR(-EINVAL); @@ -2108,24 +2223,26 @@ index 97c276f..5ba719e 100644 #ifdef CONFIG_IP_MULTICAST static struct net_protocol igmp_protocol = { .handler = igmp_rcv, -@@ -1093,6 +1142,7 @@ #endif +@@ -1093,6 +1176,8 @@ #endif static struct net_protocol tcp_protocol = { .handler = tcp_v4_rcv, .err_handler = tcp_v4_err, ++ .gso_send_check = tcp_v4_gso_send_check, + .gso_segment = tcp_tso_segment, .no_policy = 1, }; -@@ -1138,6 +1188,7 @@ static int ipv4_proc_init(void); +@@ -1138,6 +1223,8 @@ static int ipv4_proc_init(void); static struct packet_type ip_packet_type = { .type = __constant_htons(ETH_P_IP), .func = ip_rcv, ++ .gso_send_check = inet_gso_send_check, + .gso_segment = inet_gso_segment, }; static int __init inet_init(void) diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c -index 8dcba38..19c3c73 100644 +index 8dcba38..2de887c 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -210,8 +210,7 @@ #if defined(CONFIG_NETFILTER) && defined @@ -2134,7 +2251,7 @@ index 8dcba38..19c3c73 100644 #endif - if (skb->len > dst_mtu(skb->dst) && - !(skb_shinfo(skb)->ufo_size || skb_shinfo(skb)->tso_size)) -+ if (skb->len > dst_mtu(skb->dst) && !skb_shinfo(skb)->gso_size) ++ if (skb->len > dst_mtu(skb->dst) && !skb_is_gso(skb)) return ip_fragment(skb, ip_finish_output2); else return ip_finish_output2(skb); @@ -2182,7 +2299,7 @@ index 8dcba38..19c3c73 100644 int i; - if 
(skb_shinfo(skb)->ufo_size) -+ if (skb_shinfo(skb)->gso_size) ++ if (skb_is_gso(skb)) len = size; else { @@ -2372,6 +2489,35 @@ index e9a54ae..defe77a 100644 break; pcount = tcp_skb_pcount(skb); } +diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c +index 233bdf2..b4240b4 100644 +--- a/net/ipv4/tcp_ipv4.c ++++ b/net/ipv4/tcp_ipv4.c +@@ -495,6 +495,24 @@ void tcp_v4_send_check(struct sock *sk, + } + } + ++int tcp_v4_gso_send_check(struct sk_buff *skb) ++{ ++ struct iphdr *iph; ++ struct tcphdr *th; ++ ++ if (!pskb_may_pull(skb, sizeof(*th))) ++ return -EINVAL; ++ ++ iph = skb->nh.iph; ++ th = skb->h.th; ++ ++ th->check = 0; ++ th->check = ~tcp_v4_check(th, skb->len, iph->saddr, iph->daddr, 0); ++ skb->csum = offsetof(struct tcphdr, check); ++ skb->ip_summed = CHECKSUM_HW; ++ return 0; ++} ++ + /* + * This routine will send an RST to the other tcp. + * diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 310f2e6..ee01f69 100644 --- a/net/ipv4/tcp_output.c @@ -2492,7 +2638,7 @@ index 310f2e6..ee01f69 100644 /* Use a previous sequence. This should cause the other * end to send an ack. 
Don''t queue or clone SKB, just diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c -index 32ad229..737c1db 100644 +index 32ad229..62ead52 100644 --- a/net/ipv4/xfrm4_output.c +++ b/net/ipv4/xfrm4_output.c @@ -9,6 +9,8 @@ @@ -2546,7 +2692,7 @@ index 32ad229..737c1db 100644 + } +#endif + -+ if (!skb_shinfo(skb)->gso_size) ++ if (!skb_is_gso(skb)) + return xfrm4_output_finish2(skb); + + skb->protocol = htons(ETH_P_IP); @@ -2581,7 +2727,7 @@ index 32ad229..737c1db 100644 { return NF_HOOK_COND(PF_INET, NF_IP_POST_ROUTING, skb, NULL, skb->dst->dev, diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c -index 5bf70b1..cf5d17e 100644 +index 5bf70b1..33a5850 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -147,7 +147,7 @@ static int ip6_output2(struct sk_buff *s @@ -2589,7 +2735,7 @@ index 5bf70b1..cf5d17e 100644 int ip6_output(struct sk_buff *skb) { - if ((skb->len > dst_mtu(skb->dst) && !skb_shinfo(skb)->ufo_size) || -+ if ((skb->len > dst_mtu(skb->dst) && !skb_shinfo(skb)->gso_size) || ++ if ((skb->len > dst_mtu(skb->dst) && !skb_is_gso(skb)) || dst_allfrag(skb->dst)) return ip6_fragment(skb, ip6_output2); else @@ -2644,7 +2790,7 @@ index d511a88..ef56d5d 100644 /* compression */ plen = skb->len - hdr_len; diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c -index 8024217..39bdeec 100644 +index 8024217..e9ea338 100644 --- a/net/ipv6/xfrm6_output.c +++ b/net/ipv6/xfrm6_output.c @@ -151,7 +151,7 @@ error_nolock: @@ -2673,7 +2819,7 @@ index 8024217..39bdeec 100644 +{ + struct sk_buff *segs; + -+ if (!skb_shinfo(skb)->gso_size) ++ if (!skb_is_gso(skb)) + return xfrm6_output_finish2(skb); + + skb->protocol = htons(ETH_P_IP); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 6 Jul 2006, at 21:27, Petersson, Mats wrote:

> Looks like the GSO is involved?

The new GSO code perturbs the netback code quite a bit, even if GSO isn't being used (which it isn't yet because I haven't enabled it). Looks like there might be one or two lurking issues.

-- Keir
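For readers following the patch, here is a minimal userspace sketch of the revised netif_needs_gso() test it introduces: a packet is sent to software segmentation only if it is a GSO packet at all, and then only if the device lacks the matching feature bit or the checksum state is not CHECKSUM_HW. The struct, the model_* names and the constant values are assumptions made up for this illustration; they are not the kernel's definitions.

```c
#include <assert.h>

/* Illustrative constants. The values are assumptions for this model,
 * not copied from the kernel headers. */
#define MODEL_GSO_SHIFT 16
#define MODEL_GSO_TCPV4 1
#define MODEL_CSUM_NONE 0
#define MODEL_CSUM_HW   1

/* Hypothetical stand-in for the few sk_buff fields the checks read. */
struct model_skb {
    int gso_size;   /* 0 means "not a GSO packet" */
    int gso_type;   /* e.g. MODEL_GSO_TCPV4 */
    int ip_summed;  /* MODEL_CSUM_NONE or MODEL_CSUM_HW */
};

/* Mirrors skb_is_gso() from the patch: non-zero gso_size marks a GSO packet. */
int model_is_gso(const struct model_skb *skb)
{
    return skb->gso_size != 0;
}

/* Mirrors skb_gso_ok(): the device must advertise every bit the packet needs. */
int model_gso_ok(const struct model_skb *skb, int features)
{
    int feature = skb->gso_type << MODEL_GSO_SHIFT;

    return (features & feature) == feature;
}

/* Mirrors the revised netif_needs_gso(): non-GSO packets never qualify. */
int model_needs_gso(int dev_features, const struct model_skb *skb)
{
    return model_is_gso(skb) &&
           (!model_gso_ok(skb, dev_features) ||
            skb->ip_summed != MODEL_CSUM_HW);
}

/* Small self-check covering the four interesting combinations. */
void model_self_check(void)
{
    int tso = MODEL_GSO_TCPV4 << MODEL_GSO_SHIFT;
    struct model_skb plain = { 0, 0, MODEL_CSUM_NONE };
    struct model_skb gso = { 1448, MODEL_GSO_TCPV4, MODEL_CSUM_HW };
    struct model_skb gso_nocsum = { 1448, MODEL_GSO_TCPV4, MODEL_CSUM_NONE };

    assert(!model_needs_gso(0, &plain));        /* not GSO: left alone */
    assert(!model_needs_gso(tso, &gso));        /* device handles it */
    assert(model_needs_gso(0, &gso));           /* device lacks feature */
    assert(model_needs_gso(tso, &gso_nocsum));  /* checksum state unusual */
}
```

Calling model_self_check() from a main() exercises all four cases. The first case is the one relevant to a Dom0 where GSO is not enabled: a packet with gso_size of zero is never handed to the segmentation path in this model.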