James Dykman
2006-Apr-13 22:29 UTC
[Xen-devel] [PATCH] Fix checksum errors when using network-bridge over VLANs
I set up a config similar to:

http://lists.xensource.com/archives/html/xen-users/2006-04/msg00164.html

and found that pings worked fine but TCP/UDP traffic would get checksum
errors.

A strategically placed dump_stack() shows dev_queue_xmit() getting called
twice for the same skb:

Apr 12 16:32:16 twofish kernel: [<c0105081>] show_trace+0x21/0x30
Apr 12 16:32:16 twofish kernel: [<c01051ee>] dump_stack+0x1e/0x20
Apr 12 16:32:16 twofish kernel: [<c04545a5>] vlan_dev_hwaccel_hard_start_xmit+0x105/0x140
Apr 12 16:32:16 twofish kernel: [<c03eee72>] dev_queue_xmit+0x192/0x350 <-----------------------------------
Apr 12 16:32:16 twofish kernel: [<c043abac>] br_dev_queue_push_xmit+0x9c/0x140
Apr 12 16:32:16 twofish kernel: [<c0440ed6>] br_nf_post_routing+0xf6/0x1c0
Apr 12 16:32:16 twofish kernel: [<c03ffeee>] nf_iterate+0x5e/0x90
Apr 12 16:32:16 twofish kernel: [<c03fff8d>] nf_hook_slow+0x6d/0x110
Apr 12 16:32:16 twofish kernel: [<c043acaf>] br_forward_finish+0x5f/0x70
Apr 12 16:32:16 twofish kernel: [<c044068e>] br_nf_forward_finish+0x6e/0x140
Apr 12 16:32:16 twofish kernel: [<c0440847>] br_nf_forward_ip+0xe7/0x1a0
Apr 12 16:32:16 twofish kernel: [<c03ffeee>] nf_iterate+0x5e/0x90
Apr 12 16:32:16 twofish kernel: [<c03fff8d>] nf_hook_slow+0x6d/0x110
Apr 12 16:32:16 twofish kernel: [<c043ada6>] __br_forward+0x76/0x80
Apr 12 16:32:16 twofish kernel: [<c043ae4c>] br_forward+0x3c/0x60
Apr 12 16:32:16 twofish kernel: [<c043bbc9>] br_handle_frame_finish+0xc9/0x160
Apr 12 16:32:16 twofish kernel: [<c043f9c0>] br_nf_pre_routing_finish+0xf0/0x3a0
Apr 12 16:32:16 twofish kernel: [<c044041b>] br_nf_pre_routing+0x3fb/0x580
Apr 12 16:32:16 twofish kernel: [<c03ffeee>] nf_iterate+0x5e/0x90
Apr 12 16:32:16 twofish kernel: [<c03fff8d>] nf_hook_slow+0x6d/0x110
Apr 12 16:32:16 twofish kernel: [<c043be33>] br_handle_frame+0x1d3/0x240
Apr 12 16:32:16 twofish kernel: [<c03ef492>] netif_receive_skb+0x152/0x280
Apr 12 16:32:16 twofish kernel: [<c03ef65f>] process_backlog+0x9f/0x140
Apr 12 16:32:16 twofish kernel: [<c03ef7da>] net_rx_action+0xda/0x150
Apr 12 16:32:16 twofish kernel: [<c011e522>] __do_softirq+0x62/0xd0
Apr 12 16:32:16 twofish kernel: [<c011e5d8>] do_softirq+0x48/0x60
Apr 12 16:32:16 twofish kernel: [<c011e664>] local_bh_enable+0x74/0x80
Apr 12 16:32:16 twofish kernel: [<c03eef95>] dev_queue_xmit+0x2b5/0x350 <-----------------------------------------
Apr 12 16:32:16 twofish kernel: [<c0408d4f>] ip_output+0x13f/0x2c0
Apr 12 16:32:16 twofish kernel: [<c040b0ca>] ip_push_pending_frames+0x3fa/0x4c0
Apr 12 16:32:16 twofish kernel: [<c042473d>] raw_sendmsg+0x48d/0x4f0
Apr 12 16:32:16 twofish kernel: [<c042d1e6>] inet_sendmsg+0x46/0x50
Apr 12 16:32:16 twofish kernel: [<c03e47db>] sock_sendmsg+0xbb/0xf0
Apr 12 16:32:16 twofish kernel: [<c03e61d1>] sys_sendmsg+0x1b1/0x250
Apr 12 16:32:16 twofish kernel: [<c03e64f7>] sys_socketcall+0x87/0x240
Apr 12 16:32:16 twofish kernel: [<c0104be9>] syscall_call+0x7/0xb

Since we don't reset the proto_csum_blank flag in the skb, the checksum
calculation gets done twice, which is not twice as good as once.

With this patch, TCP/UDP checksum errors from dom0 are fixed, and domUs
can use TCP/UDP without turning off TX checksum offload.
Normal non-VLAN bridged configs still work fine, tested with xm-test.

Jim

Signed-off-by: Jim Dykman <dykman@us.ibm.com>

diff -r 4ed269ac7d84 linux-2.6-xen-sparse/net/core/dev.c
--- a/linux-2.6-xen-sparse/net/core/dev.c	Mon Apr 10 12:24:58 2006
+++ b/linux-2.6-xen-sparse/net/core/dev.c	Thu Apr 13 17:30:45 2006
@@ -1294,6 +1294,7 @@
 		if ((skb->h.raw + skb->csum + 2) > skb->tail)
 			goto out_kfree_skb;
 		skb->ip_summed = CHECKSUM_HW;
+		skb->proto_csum_blank = 0;
 	}
 #endif

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jean-Francois Stenuit
2006-May-05 08:21 UTC
Re: [Xen-devel] [PATCH] Fix checksum errors when using network-bridge over VLANs
On Thu, 13 Apr 2006 18:29:51 -0400, James Dykman wrote:

> I set up a config similar to:
>
> http://lists.xensource.com/archives/html/xen-users/2006-04/msg00164.html
>
> and found that pings worked fine but TCP/UDP traffic would get checksum
> errors.

Just noticed that. Pretty hard to troubleshoot.

> With this patch, TCP/UDP checksum errors from dom0 are fixed, and domUs
> can use TCP/UDP without turning off TX checksum offload.
> Normal non-VLAN bridged configs still work fine, tested with xm-test.

Just a quick question: is there a work-around available without a full
recompile of Xen kernels?

--
|--- Jean-Francois "Jef" Stenuit
James Dykman
2006-May-05 14:28 UTC
Re: [Xen-devel] [PATCH] Fix checksum errors when using network-bridge over VLANs
Jean-Francois Stenuit <jfs@skynet.be> wrote on 05/05/2006 04:21:28 AM:

> On Thu, 13 Apr 2006 18:29:51 -0400, James Dykman wrote:
>
> > With this patch, TCP/UDP checksum errors from dom0 are fixed, and domUs
> > can use TCP/UDP without turning off TX checksum offload.
> > Normal non-VLAN bridged configs still work fine, tested with xm-test.
>
> Just a quick question: is there a work-around available without a full
> recompile of Xen kernels?

You can use ethtool to turn off the TX checksum offload, so that the
TCP/UDP protocols calculate the checksums themselves in software. Do this
in the domUs, and possibly even in dom0 (I haven't tried it myself):

ethtool -K <ethX> tx off

Jim