James Dykman
2006-May-09 19:22 UTC
[Xen-devel] [PATCH] Fix checksum errors when firewalling in domU
Another checksum offload problem was reported on xen-users, when using a domU as a firewall: http://lists.xensource.com/archives/html/xen-users/2006-04/msg01150.html It also fails without VLANs. The path from dom0->domU with ip_summed==CHECKSUM_HW/proto_csum_blank==1 is broken. - skb_checksum_setup() assumes that a checksum will definitely be calculated in dev_queue_xmit(), but the destination device is netback, which is advertising NETIF_F_IP_CSUM ---> no checksum. So, added reset_proto_csum_blank() to skbuff.h, and call it in dev_queue_xmit() and xfrm4_output() AFTER the checksum is calculated. - The CHECKSUM_HW/proto_csum_blank vars are getting lost/stomped in various places. Tested on -unstable changeset 9925: - The network xm-tests on -bridge and -route - A domU firewalling between two bridges (similar to the error report, but without VLANs) - The IPSec tunnel config from a previous checksum bug. (#143?) I had to update the patch for the renamed patches/linux-2.6.16.13 directory this morning. It compiles, but I haven''t retested. Signed-off-by: Jim Dykman <dykman@us.ibm.com> Fix checksum errors when firewalling in domU diff -r 1e3977e029fd linux-2.6-xen-sparse/drivers/xen/netback/netback.c --- a/linux-2.6-xen-sparse/drivers/xen/netback/netback.c Mon May 8 18:21:41 2006 +++ b/linux-2.6-xen-sparse/drivers/xen/netback/netback.c Tue May 9 13:38:56 2006 @@ -172,6 +172,7 @@ BUG_ON(ret); nskb->dev = skb->dev; nskb->proto_data_valid = skb->proto_data_valid; + nskb->proto_csum_blank = skb->proto_csum_blank; dev_kfree_skb(skb); skb = nskb; } @@ -340,6 +341,8 @@ flags |= NETRXF_csum_blank | NETRXF_data_validated; else if (skb->proto_data_valid) /* remote but checksummed? */ flags |= NETRXF_data_validated; + else if (skb->proto_csum_blank) /* forwarded, !checksummed */ + flags |= NETRXF_csum_blank; if (make_rx_response(netif, id, status, (unsigned long)skb->data & ~PAGE_MASK, size, flags) && @@ -669,7 +672,10 @@ * can infer it from csum_blank so test both flags. */ if (txreq.flags & (NETTXF_data_validated|NETTXF_csum_blank)) { - skb->ip_summed = CHECKSUM_UNNECESSARY; + if (txreq.flags & NETTXF_csum_blank) + skb->ip_summed = CHECKSUM_HW; + else + skb->ip_summed = CHECKSUM_UNNECESSARY; skb->proto_data_valid = 1; } else { skb->ip_summed = CHECKSUM_NONE; diff -r 1e3977e029fd linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c --- a/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c Mon May 8 18:21:41 2006 +++ b/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c Tue May 9 13:38:56 2006 @@ -819,7 +819,10 @@ * can infer it from csum_blank so test both flags. */ if (rx->flags & (NETRXF_data_validated|NETRXF_csum_blank)) { - skb->ip_summed = CHECKSUM_UNNECESSARY; + if (rx->flags & NETRXF_csum_blank) + skb->ip_summed = CHECKSUM_HW; + else + skb->ip_summed = CHECKSUM_UNNECESSARY; skb->proto_data_valid = 1; } else { skb->ip_summed = CHECKSUM_NONE; diff -r 1e3977e029fd linux-2.6-xen-sparse/include/linux/skbuff.h --- a/linux-2.6-xen-sparse/include/linux/skbuff.h Mon May 8 18:21:41 2006 +++ b/linux-2.6-xen-sparse/include/linux/skbuff.h Tue May 9 13:38:56 2006 @@ -1281,6 +1281,14 @@ extern void skb_init(void); extern void skb_add_mtu(int mtu); +#ifdef CONFIG_XEN +static inline void reset_proto_csum_blank(struct sk_buff *skb) +{ + skb->proto_csum_blank=0; +} +#else +static inline void reset_proto_csum_blank(const struct sk_buff *skb) { } +#endif /** * skb_get_timestamp - get timestamp from a skb * @skb: skb to get stamp from diff -r 1e3977e029fd linux-2.6-xen-sparse/net/core/dev.c --- a/linux-2.6-xen-sparse/net/core/dev.c Mon May 8 18:21:41 2006 +++ b/linux-2.6-xen-sparse/net/core/dev.c Tue May 9 13:38:56 2006 @@ -1246,7 +1246,6 @@ if ((skb->h.raw + skb->csum + 2) > skb->tail) goto out; skb->ip_summed = CHECKSUM_HW; - skb->proto_csum_blank = 0; } return 0; out: @@ -1315,9 +1314,11 @@ if (skb->ip_summed == CHECKSUM_HW && (!(dev->features & (NETIF_F_HW_CSUM | NETIF_F_NO_CSUM)) && (!(dev->features & NETIF_F_IP_CSUM) || - skb->protocol != htons(ETH_P_IP)))) + skb->protocol != htons(ETH_P_IP)))) { if (skb_checksum_help(skb, 0)) goto out_kfree_skb; + reset_proto_csum_blank(skb); + } spin_lock_prefetch(&dev->queue_lock); diff -r 1e3977e029fd patches/linux-2.6.16.13/net-csum.patch --- a/patches/linux-2.6.16.13/net-csum.patch Mon May 8 18:21:41 2006 +++ b/patches/linux-2.6.16.13/net-csum.patch Tue May 9 13:38:56 2006 @@ -40,8 +40,8 @@ return 1; } diff -pruN ../pristine-linux-2.6.16.13/net/ipv4/xfrm4_output.c ./net/ipv4/xfrm4_output.c ---- ../pristine-linux-2.6.16.13/net/ipv4/xfrm4_output.c 2006-05-02 22:38:44.000000000 +0100 -+++ ./net/ipv4/xfrm4_output.c 2006-05-04 17:41:37.000000000 +0100 +--- ../pristine-linux-2.6.16.13/net/ipv4/xfrm4_output.c 2006-05-02 17:38:44.000000000 -0400 ++++ ./net/ipv4/xfrm4_output.c 2006-05-09 13:08:15.000000000 -0400 @@ -17,6 +17,8 @@ #include <net/xfrm.h> #include <net/icmp.h> @@ -51,7 +51,7 @@ /* Add encapsulation header. * * In transport mode, the IP header will be moved forward to make space -@@ -103,6 +105,10 @@ static int xfrm4_output_one(struct sk_bu +@@ -103,10 +105,15 @@ static int xfrm4_output_one(struct sk_bu struct xfrm_state *x = dst->xfrm; int err; @@ -62,3 +62,8 @@ if (skb->ip_summed == CHECKSUM_HW) { err = skb_checksum_help(skb, 0); if (err) + goto error_nolock; ++ reset_proto_csum_blank(skb); + } + + if (x->props.mode) { _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-May-09 20:05 UTC
Re: [Xen-devel] [PATCH] Fix checksum errors when firewalling in domU
On 9 May 2006, at 20:22, James Dykman wrote:> @@ -819,7 +819,10 @@ > * can infer it from csum_blank so test both flags. > */ > if (rx->flags & > (NETRXF_data_validated|NETRXF_csum_blank)) > { > - skb->ip_summed = CHECKSUM_UNNECESSARY; > + if (rx->flags & NETRXF_csum_blank) > + skb->ip_summed = CHECKSUM_HW; > + else > + skb->ip_summed = CHECKSUM_UNNECESSARY; > skb->proto_data_valid = 1; > } else { > skb->ip_summed = CHECKSUM_NONE;This hunk seems dodgy to me. According to the comment in linux/skbuff.h we shouldn''t be passing up CHECKSUM_HW unless we have set skb->csum to the 1s-complement sum of the packet contents. You added code to do this in both netfront and netback, and it doesn''t seem right in either case. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
James Dykman
2006-May-10 14:51 UTC
Re: [Xen-devel] [PATCH] Fix checksum errors when firewalling in domU
xen-devel-bounces@lists.xensource.com wrote on 05/09/2006 04:05:30 PM:> > On 9 May 2006, at 20:22, James Dykman wrote: > > > @@ -819,7 +819,10 @@ > > * can infer it from csum_blank so test both flags. > > */ > > if (rx->flags & > > (NETRXF_data_validated|NETRXF_csum_blank)) > > { > > - skb->ip_summed = CHECKSUM_UNNECESSARY; > > + if (rx->flags & NETRXF_csum_blank) > > + skb->ip_summed = CHECKSUM_HW; > > + else > > + skb->ip_summed = CHECKSUM_UNNECESSARY; > > skb->proto_data_valid = 1; > > } else { > > skb->ip_summed = CHECKSUM_NONE; > > This hunk seems dodgy to me. According to the comment in linux/skbuff.h > we shouldn''t be passing up CHECKSUM_HW unless we have set skb->csum to > the 1s-complement sum of the packet contents. You added code to do this > in both netfront and netback, and it doesn''t seem right in either case. > > -- Keir >Right, that was hiding more brokenness. Fixed below. Signed-off-by: Jim Dykman <dykman@us.ibm.com> Fix checksum errors when firewalling in domU diff -r 1e3977e029fd linux-2.6-xen-sparse/drivers/xen/netback/netback.c --- a/linux-2.6-xen-sparse/drivers/xen/netback/netback.c Mon May 8 18:21:41 2006 +++ b/linux-2.6-xen-sparse/drivers/xen/netback/netback.c Wed May 10 09:57:42 2006 @@ -172,6 +172,7 @@ BUG_ON(ret); nskb->dev = skb->dev; nskb->proto_data_valid = skb->proto_data_valid; + nskb->proto_csum_blank = skb->proto_csum_blank; dev_kfree_skb(skb); skb = nskb; } @@ -338,8 +339,10 @@ flags = 0; if (skb->ip_summed == CHECKSUM_HW) /* local packet? */ flags |= NETRXF_csum_blank | NETRXF_data_validated; - else if (skb->proto_data_valid) /* remote but checksummed? */ + if (skb->proto_data_valid) /* remote but checksummed? */ flags |= NETRXF_data_validated; + if (skb->proto_csum_blank) /* forwarded, !checksummed */ + flags |= NETRXF_csum_blank; if (make_rx_response(netif, id, status, (unsigned long)skb->data & ~PAGE_MASK, size, flags) && diff -r 1e3977e029fd linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c --- a/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c Mon May 8 18:21:41 2006 +++ b/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c Wed May 10 09:57:42 2006 @@ -699,6 +699,8 @@ tx->flags |= NETTXF_csum_blank | NETTXF_data_validated; if (skb->proto_data_valid) /* remote but checksummed? */ tx->flags |= NETTXF_data_validated; + if (skb->proto_csum_blank) /* remote, not checksummed */ + tx->flags |= NETTXF_csum_blank; np->tx.req_prod_pvt = i + 1; RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->tx, notify); @@ -1024,6 +1026,8 @@ tx->flags |= NETTXF_csum_blank | NETTXF_data_validated; if (skb->proto_data_valid) /* remote but checksummed? */ tx->flags |= NETTXF_data_validated; + if (skb->proto_csum_blank) /* remote, not checksummed */ + tx->flags |= NETTXF_csum_blank; np->stats.tx_bytes += skb->len; np->stats.tx_packets++; diff -r 1e3977e029fd linux-2.6-xen-sparse/include/linux/skbuff.h --- a/linux-2.6-xen-sparse/include/linux/skbuff.h Mon May 8 18:21:41 2006 +++ b/linux-2.6-xen-sparse/include/linux/skbuff.h Wed May 10 09:57:42 2006 @@ -1281,6 +1281,14 @@ extern void skb_init(void); extern void skb_add_mtu(int mtu); +#ifdef CONFIG_XEN +static inline void reset_proto_csum_blank(struct sk_buff *skb) +{ + skb->proto_csum_blank=0; +} +#else +static inline void reset_proto_csum_blank(const struct sk_buff *skb) { } +#endif /** * skb_get_timestamp - get timestamp from a skb * @skb: skb to get stamp from diff -r 1e3977e029fd linux-2.6-xen-sparse/net/core/dev.c --- a/linux-2.6-xen-sparse/net/core/dev.c Mon May 8 18:21:41 2006 +++ b/linux-2.6-xen-sparse/net/core/dev.c Wed May 10 09:57:42 2006 @@ -1246,7 +1246,6 @@ if ((skb->h.raw + skb->csum + 2) > skb->tail) goto out; skb->ip_summed = CHECKSUM_HW; - skb->proto_csum_blank = 0; } return 0; out: @@ -1315,9 +1314,11 @@ if (skb->ip_summed == CHECKSUM_HW && (!(dev->features & (NETIF_F_HW_CSUM | NETIF_F_NO_CSUM)) && (!(dev->features & NETIF_F_IP_CSUM) || - skb->protocol != htons(ETH_P_IP)))) + skb->protocol != htons(ETH_P_IP)))) { if (skb_checksum_help(skb, 0)) goto out_kfree_skb; + reset_proto_csum_blank(skb); + } spin_lock_prefetch(&dev->queue_lock); diff -r 1e3977e029fd patches/linux-2.6.16.13/net-csum.patch --- a/patches/linux-2.6.16.13/net-csum.patch Mon May 8 18:21:41 2006 +++ b/patches/linux-2.6.16.13/net-csum.patch Wed May 10 09:57:42 2006 @@ -40,8 +40,8 @@ return 1; } diff -pruN ../pristine-linux-2.6.16.13/net/ipv4/xfrm4_output.c ./net/ipv4/xfrm4_output.c ---- ../pristine-linux-2.6.16.13/net/ipv4/xfrm4_output.c 2006-05-02 22:38:44.000000000 +0100 -+++ ./net/ipv4/xfrm4_output.c 2006-05-04 17:41:37.000000000 +0100 +--- ../pristine-linux-2.6.16.13/net/ipv4/xfrm4_output.c 2006-05-02 17:38:44.000000000 -0400 ++++ ./net/ipv4/xfrm4_output.c 2006-05-09 13:08:15.000000000 -0400 @@ -17,6 +17,8 @@ #include <net/xfrm.h> #include <net/icmp.h> @@ -51,7 +51,7 @@ /* Add encapsulation header. * * In transport mode, the IP header will be moved forward to make space -@@ -103,6 +105,10 @@ static int xfrm4_output_one(struct sk_bu +@@ -103,10 +105,15 @@ static int xfrm4_output_one(struct sk_bu struct xfrm_state *x = dst->xfrm; int err; @@ -62,3 +62,8 @@ if (skb->ip_summed == CHECKSUM_HW) { err = skb_checksum_help(skb, 0); if (err) + goto error_nolock; ++ reset_proto_csum_blank(skb); + } + + if (x->props.mode) { _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-May-10 15:39 UTC
Re: [Xen-devel] [PATCH] Fix checksum errors when firewalling in domU
On 10 May 2006, at 15:51, James Dykman wrote:> Right, that was hiding more brokenness. Fixed below. > > Signed-off-by: Jim Dykman <dykman@us.ibm.com> > Fix checksum errors when firewalling in domUTwo more questions: 1. Why do you need to check proto_csum_blank in the start_xmit() handling paths of netfront and netback? skb_checksum_setup() will have set ip_summed=CHECKSUM_HW in all cases won''t it? 2. The change to not aggressively clear proto_csum_blank appears to make sense. However, why not clear it inside skb_checksum_help(), rather than modifying two of its callers to do so? Because it''ll be in one place we can wrap in ifdef XEN rather than needing to add yet another wrapper function to do the clearing. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-May-10 16:35 UTC
Re: [Xen-devel] [PATCH] Fix checksum errors when firewalling in domU
On 10 May 2006, at 16:39, Keir Fraser wrote:>> Right, that was hiding more brokenness. Fixed below. >> >> Signed-off-by: Jim Dykman <dykman@us.ibm.com> >> Fix checksum errors when firewalling in domU > > Two more questions: > 1. Why do you need to check proto_csum_blank in the start_xmit() > handling paths of netfront and netback? skb_checksum_setup() will have > set ip_summed=CHECKSUM_HW in all cases won''t it? > > 2. The change to not aggressively clear proto_csum_blank appears to > make sense. However, why not clear it inside skb_checksum_help(), > rather than modifying two of its callers to do so? Because it''ll be in > one place we can wrap in ifdef XEN rather than needing to add yet > another wrapper function to do the clearing.Ah, now I *think* I understand where all these problems are emanating from. You added code to copy proto_csum_blank when we copy a packet in the start_xmit handler of netback. However, what you *really* want to copy is ip_summed, as it''s that which we use to determine if checksum-offload is required or not (not proto_csum_blank). I bet that, with copying of ip_summed added to netback, the rest of your patch will not be required. A similar patch is also needed in two(!) places in netfront. I''ve applied a patch to fix all of them. Let''s see how far that gets us. Thanks, Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel