Paul Durrant
2013-Dec-10 15:39 UTC
[RFC PATCH net-next 0/3] unify netfront and netback checksum setup code
I''m posting this as RFC to xen-devel before going to netdev to get feedback. In patch 1 I''ve basically taken the latest checksum setup code from netback, with all the recent fixes, and pushed it into net/core/dev.c (which seems like the right home for such code). Patch 2 and 3 then just modify netback and netfront to use the new code. Does this look like a reasonable approach? I think it would be good to make the checksum setup code available to all network drivers. There''s nothing Xen specific about it.
Paul Durrant
2013-Dec-10 15:39 UTC
[RFC PATCH net-next 1/3] net: add skb checksum setup functions for TCP and UDP
This patch adds a function to set up partial checksum offsets for both v4 and v6 TCP and UDP packets into the core network code to make it widely available to any network driver. The code is taken from xen-netback and both that driver and xen-netfront will be modified to use the new function. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> --- include/linux/netdevice.h | 1 + net/core/dev.c | 256 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 257 insertions(+) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 9d55e51..e26ee7b 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2919,6 +2919,7 @@ void *netdev_lower_dev_get_private_rcu(struct net_device *dev, void *netdev_lower_dev_get_private(struct net_device *dev, struct net_device *lower_dev); int skb_checksum_help(struct sk_buff *skb); +int skb_checksum_setup(struct sk_buff *skb, bool recalculate_partial); struct sk_buff *__skb_gso_segment(struct sk_buff *skb, netdev_features_t features, bool tx_path); struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb, diff --git a/net/core/dev.c b/net/core/dev.c index 355df36..e565656 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2282,6 +2282,262 @@ out: } EXPORT_SYMBOL(skb_checksum_help); +static inline int maybe_pull_tail(struct sk_buff *skb, unsigned int len, + unsigned int max) +{ + if (skb_headlen(skb) >= len) + return 0; + + /* If we need to pullup then pullup to the max, so we + * won''t need to do it again. + */ + if (max > skb->len) + max = skb->len; + + if (__pskb_pull_tail(skb, max - skb_headlen(skb)) == NULL) + return -ENOMEM; + + if (skb_headlen(skb) < len) + return -EPROTO; + + return 0; +} + +/* This value should be large enough to cover a tagged ethernet header plus + * maximally sized IP and TCP or UDP headers. + */ +#define MAX_IP_HDR_LEN 128 + +static int checksum_setup_ip(struct sk_buff *skb, bool recalculate) +{ + unsigned int off; + bool fragment; + int err; + + fragment = false; + + err = maybe_pull_tail(skb, + sizeof(struct iphdr), + MAX_IP_HDR_LEN); + if (err < 0) + goto out; + + if (ip_hdr(skb)->frag_off & htons(IP_OFFSET | IP_MF)) + fragment = true; + + off = ip_hdrlen(skb); + + err = -EPROTO; + + switch (ip_hdr(skb)->protocol) { + case IPPROTO_TCP: + if (!skb_partial_csum_set(skb, off, + offsetof(struct tcphdr, check))) + goto out; + + if (recalculate) { + err = maybe_pull_tail(skb, + off + sizeof(struct tcphdr), + MAX_IP_HDR_LEN); + if (err < 0) + goto out; + + tcp_hdr(skb)->check + ~csum_tcpudp_magic(ip_hdr(skb)->saddr, + ip_hdr(skb)->daddr, + skb->len - off, + IPPROTO_TCP, 0); + } + break; + case IPPROTO_UDP: + if (!skb_partial_csum_set(skb, off, + offsetof(struct udphdr, check))) + goto out; + + if (recalculate) { + err = maybe_pull_tail(skb, + off + sizeof(struct udphdr), + MAX_IP_HDR_LEN); + if (err < 0) + goto out; + + udp_hdr(skb)->check + ~csum_tcpudp_magic(ip_hdr(skb)->saddr, + ip_hdr(skb)->daddr, + skb->len - off, + IPPROTO_UDP, 0); + } + break; + default: + goto out; + } + + err = 0; + +out: + return err; +} + +/* This value should be large enough to cover a tagged ethernet header plus + * an IPv6 header, all options, and a maximal TCP or UDP header. + */ +#define MAX_IPV6_HDR_LEN 256 + +#define OPT_HDR(type, skb, off) \ + (type *)(skb_network_header(skb) + (off)) + +static int checksum_setup_ipv6(struct sk_buff *skb, bool recalculate) +{ + int err; + u8 nexthdr; + unsigned int off; + unsigned int len; + bool fragment; + bool done; + + fragment = false; + done = false; + + off = sizeof(struct ipv6hdr); + + err = maybe_pull_tail(skb, off, MAX_IPV6_HDR_LEN); + if (err < 0) + goto out; + + nexthdr = ipv6_hdr(skb)->nexthdr; + + len = sizeof(struct ipv6hdr) + ntohs(ipv6_hdr(skb)->payload_len); + while (off <= len && !done) { + switch (nexthdr) { + case IPPROTO_DSTOPTS: + case IPPROTO_HOPOPTS: + case IPPROTO_ROUTING: { + struct ipv6_opt_hdr *hp; + + err = maybe_pull_tail(skb, + off + + sizeof(struct ipv6_opt_hdr), + MAX_IPV6_HDR_LEN); + if (err < 0) + goto out; + + hp = OPT_HDR(struct ipv6_opt_hdr, skb, off); + nexthdr = hp->nexthdr; + off += ipv6_optlen(hp); + break; + } + case IPPROTO_AH: { + struct ip_auth_hdr *hp; + + err = maybe_pull_tail(skb, + off + + sizeof(struct ip_auth_hdr), + MAX_IPV6_HDR_LEN); + if (err < 0) + goto out; + + hp = OPT_HDR(struct ip_auth_hdr, skb, off); + nexthdr = hp->nexthdr; + off += ipv6_authlen(hp); + break; + } + case IPPROTO_FRAGMENT: { + struct frag_hdr *hp; + + err = maybe_pull_tail(skb, + off + + sizeof(struct frag_hdr), + MAX_IPV6_HDR_LEN); + if (err < 0) + goto out; + + hp = OPT_HDR(struct frag_hdr, skb, off); + + if (hp->frag_off & htons(IP6_OFFSET | IP6_MF)) + fragment = true; + + nexthdr = hp->nexthdr; + off += sizeof(struct frag_hdr); + break; + } + default: + done = true; + break; + } + } + + err = -EPROTO; + + if (!done || fragment) + goto out; + + switch (nexthdr) { + case IPPROTO_TCP: + if (!skb_partial_csum_set(skb, off, + offsetof(struct tcphdr, check))) + goto out; + + if (recalculate) { + err = maybe_pull_tail(skb, + off + sizeof(struct tcphdr), + MAX_IPV6_HDR_LEN); + if (err < 0) + goto out; + + tcp_hdr(skb)->check + ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr, + &ipv6_hdr(skb)->daddr, + skb->len - off, + IPPROTO_TCP, 0); + } + break; + case IPPROTO_UDP: + if (!skb_partial_csum_set(skb, off, + offsetof(struct udphdr, check))) + goto out; + + if (recalculate) { + err = maybe_pull_tail(skb, + off + sizeof(struct udphdr), + MAX_IPV6_HDR_LEN); + if (err < 0) + goto out; + + udp_hdr(skb)->check + ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr, + &ipv6_hdr(skb)->daddr, + skb->len - off, + IPPROTO_UDP, 0); + } + break; + default: + goto out; + } + + err = 0; + +out: + return err; +} + + +/* Determine correct offset for TCP or UDP checksum, as appropriate, and + * optionally re-calculate the pseudo header checksum. + */ +int skb_checksum_setup(struct sk_buff *skb, bool recalculate) +{ + if (skb->ip_summed != CHECKSUM_PARTIAL) + return -EINVAL; + + if (skb->protocol == htons(ETH_P_IP)) + return checksum_setup_ip(skb, recalculate); + else if (skb->protocol == htons(ETH_P_IPV6)) + return checksum_setup_ipv6(skb, recalculate); + else + return -EPROTO; +} +EXPORT_SYMBOL(skb_checksum_setup); + __be16 skb_network_protocol(struct sk_buff *skb) { __be16 type = skb->protocol; -- 1.7.10.4
Paul Durrant
2013-Dec-10 15:39 UTC
[RFC PATCH net-next 2/3] xen-netback: use new core skb_checksum_setup function
This patch removes the private implementation of partial checksum setup in favour of the core implementation. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> --- drivers/net/xen-netback/netback.c | 248 +------------------------------------ 1 file changed, 1 insertion(+), 247 deletions(-) diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 43341b8..a7a00cd 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1051,249 +1051,8 @@ static int xenvif_set_skb_gso(struct xenvif *vif, return 0; } -static inline int maybe_pull_tail(struct sk_buff *skb, unsigned int len, - unsigned int max) -{ - if (skb_headlen(skb) >= len) - return 0; - - /* If we need to pullup then pullup to the max, so we - * won''t need to do it again. - */ - if (max > skb->len) - max = skb->len; - - if (__pskb_pull_tail(skb, max - skb_headlen(skb)) == NULL) - return -ENOMEM; - - if (skb_headlen(skb) < len) - return -EPROTO; - - return 0; -} - -/* This value should be large enough to cover a tagged ethernet header plus - * maximally sized IP and TCP or UDP headers. - */ -#define MAX_IP_HDR_LEN 128 - -static int checksum_setup_ip(struct xenvif *vif, struct sk_buff *skb, - int recalculate_partial_csum) -{ - unsigned int off; - bool fragment; - int err; - - fragment = false; - - err = maybe_pull_tail(skb, - sizeof(struct iphdr), - MAX_IP_HDR_LEN); - if (err < 0) - goto out; - - if (ip_hdr(skb)->frag_off & htons(IP_OFFSET | IP_MF)) - fragment = true; - - off = ip_hdrlen(skb); - - err = -EPROTO; - - switch (ip_hdr(skb)->protocol) { - case IPPROTO_TCP: - if (!skb_partial_csum_set(skb, off, - offsetof(struct tcphdr, check))) - goto out; - - if (recalculate_partial_csum) { - err = maybe_pull_tail(skb, - off + sizeof(struct tcphdr), - MAX_IP_HDR_LEN); - if (err < 0) - goto out; - - tcp_hdr(skb)->check - ~csum_tcpudp_magic(ip_hdr(skb)->saddr, - ip_hdr(skb)->daddr, - skb->len - off, - IPPROTO_TCP, 0); - } - break; - case IPPROTO_UDP: - if (!skb_partial_csum_set(skb, off, - offsetof(struct udphdr, check))) - goto out; - - if (recalculate_partial_csum) { - err = maybe_pull_tail(skb, - off + sizeof(struct udphdr), - MAX_IP_HDR_LEN); - if (err < 0) - goto out; - - udp_hdr(skb)->check - ~csum_tcpudp_magic(ip_hdr(skb)->saddr, - ip_hdr(skb)->daddr, - skb->len - off, - IPPROTO_UDP, 0); - } - break; - default: - goto out; - } - - err = 0; - -out: - return err; -} - -/* This value should be large enough to cover a tagged ethernet header plus - * an IPv6 header, all options, and a maximal TCP or UDP header. - */ -#define MAX_IPV6_HDR_LEN 256 - -#define OPT_HDR(type, skb, off) \ - (type *)(skb_network_header(skb) + (off)) - -static int checksum_setup_ipv6(struct xenvif *vif, struct sk_buff *skb, - int recalculate_partial_csum) -{ - int err; - u8 nexthdr; - unsigned int off; - unsigned int len; - bool fragment; - bool done; - - fragment = false; - done = false; - - off = sizeof(struct ipv6hdr); - - err = maybe_pull_tail(skb, off, MAX_IPV6_HDR_LEN); - if (err < 0) - goto out; - - nexthdr = ipv6_hdr(skb)->nexthdr; - - len = sizeof(struct ipv6hdr) + ntohs(ipv6_hdr(skb)->payload_len); - while (off <= len && !done) { - switch (nexthdr) { - case IPPROTO_DSTOPTS: - case IPPROTO_HOPOPTS: - case IPPROTO_ROUTING: { - struct ipv6_opt_hdr *hp; - - err = maybe_pull_tail(skb, - off + - sizeof(struct ipv6_opt_hdr), - MAX_IPV6_HDR_LEN); - if (err < 0) - goto out; - - hp = OPT_HDR(struct ipv6_opt_hdr, skb, off); - nexthdr = hp->nexthdr; - off += ipv6_optlen(hp); - break; - } - case IPPROTO_AH: { - struct ip_auth_hdr *hp; - - err = maybe_pull_tail(skb, - off + - sizeof(struct ip_auth_hdr), - MAX_IPV6_HDR_LEN); - if (err < 0) - goto out; - - hp = OPT_HDR(struct ip_auth_hdr, skb, off); - nexthdr = hp->nexthdr; - off += ipv6_authlen(hp); - break; - } - case IPPROTO_FRAGMENT: { - struct frag_hdr *hp; - - err = maybe_pull_tail(skb, - off + - sizeof(struct frag_hdr), - MAX_IPV6_HDR_LEN); - if (err < 0) - goto out; - - hp = OPT_HDR(struct frag_hdr, skb, off); - - if (hp->frag_off & htons(IP6_OFFSET | IP6_MF)) - fragment = true; - - nexthdr = hp->nexthdr; - off += sizeof(struct frag_hdr); - break; - } - default: - done = true; - break; - } - } - - err = -EPROTO; - - if (!done || fragment) - goto out; - - switch (nexthdr) { - case IPPROTO_TCP: - if (!skb_partial_csum_set(skb, off, - offsetof(struct tcphdr, check))) - goto out; - - if (recalculate_partial_csum) { - err = maybe_pull_tail(skb, - off + sizeof(struct tcphdr), - MAX_IPV6_HDR_LEN); - if (err < 0) - goto out; - - tcp_hdr(skb)->check - ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr, - &ipv6_hdr(skb)->daddr, - skb->len - off, - IPPROTO_TCP, 0); - } - break; - case IPPROTO_UDP: - if (!skb_partial_csum_set(skb, off, - offsetof(struct udphdr, check))) - goto out; - - if (recalculate_partial_csum) { - err = maybe_pull_tail(skb, - off + sizeof(struct udphdr), - MAX_IPV6_HDR_LEN); - if (err < 0) - goto out; - - udp_hdr(skb)->check - ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr, - &ipv6_hdr(skb)->daddr, - skb->len - off, - IPPROTO_UDP, 0); - } - break; - default: - goto out; - } - - err = 0; - -out: - return err; -} - static int checksum_setup(struct xenvif *vif, struct sk_buff *skb) { - int err = -EPROTO; int recalculate_partial_csum = 0; /* A GSO SKB must be CHECKSUM_PARTIAL. However some buggy @@ -1311,12 +1070,7 @@ static int checksum_setup(struct xenvif *vif, struct sk_buff *skb) if (skb->ip_summed != CHECKSUM_PARTIAL) return 0; - if (skb->protocol == htons(ETH_P_IP)) - err = checksum_setup_ip(vif, skb, recalculate_partial_csum); - else if (skb->protocol == htons(ETH_P_IPV6)) - err = checksum_setup_ipv6(vif, skb, recalculate_partial_csum); - - return err; + return skb_checksum_setup(skb, recalculate_partial_csum); } static bool tx_credit_exceeded(struct xenvif *vif, unsigned size) -- 1.7.10.4
Paul Durrant
2013-Dec-10 15:39 UTC
[RFC PATCH net-next 3/3] xen-netfront: use new core skb_checksum_setup function
This patch removes the private implementation of partial checksum setup in favour of the core implementation. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> --- drivers/net/xen-netfront.c | 44 +------------------------------------------- 1 file changed, 1 insertion(+), 43 deletions(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index e59acb1..70f9808 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -859,8 +859,6 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np, static int checksum_setup(struct net_device *dev, struct sk_buff *skb) { - struct iphdr *iph; - int err = -EPROTO; int recalculate_partial_csum = 0; /* @@ -880,47 +878,7 @@ static int checksum_setup(struct net_device *dev, struct sk_buff *skb) if (skb->ip_summed != CHECKSUM_PARTIAL) return 0; - if (skb->protocol != htons(ETH_P_IP)) - goto out; - - iph = (void *)skb->data; - - switch (iph->protocol) { - case IPPROTO_TCP: - if (!skb_partial_csum_set(skb, 4 * iph->ihl, - offsetof(struct tcphdr, check))) - goto out; - - if (recalculate_partial_csum) { - struct tcphdr *tcph = tcp_hdr(skb); - tcph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, - skb->len - iph->ihl*4, - IPPROTO_TCP, 0); - } - break; - case IPPROTO_UDP: - if (!skb_partial_csum_set(skb, 4 * iph->ihl, - offsetof(struct udphdr, check))) - goto out; - - if (recalculate_partial_csum) { - struct udphdr *udph = udp_hdr(skb); - udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, - skb->len - iph->ihl*4, - IPPROTO_UDP, 0); - } - break; - default: - if (net_ratelimit()) - pr_err("Attempting to checksum a non-TCP/UDP packet, dropping a protocol %d packet\n", - iph->protocol); - goto out; - } - - err = 0; - -out: - return err; + return skb_checksum_setup(skb, recalculate_partial_csum); } static int handle_incoming_queue(struct net_device *dev, -- 1.7.10.4
Wei Liu
2013-Dec-10 18:09 UTC
Re: [RFC PATCH net-next 0/3] unify netfront and netback checksum setup code
On Tue, Dec 10, 2013 at 03:39:49PM +0000, Paul Durrant wrote:> I''m posting this as RFC to xen-devel before going to netdev to get feedback. > In patch 1 I''ve basically taken the latest checksum setup code from netback, > with all the recent fixes, and pushed it into net/core/dev.c (which seems > like the right home for such code). Patch 2 and 3 then just modify netback > and netfront to use the new code. > > Does this look like a reasonable approach? I think it would be good to make > the checksum setup code available to all network drivers. There''s nothing > Xen specific about it. > >I agree it is generally good to consolidate duplicated code so that we don''t need to maintain two copies of idential code. But the acceptance of this series will mainly reuqire acknowledge from core driver maintainers. :-) Wei.> _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Paul Durrant
2013-Dec-11 10:32 UTC
Re: [RFC PATCH net-next 0/3] unify netfront and netback checksum setup code
> -----Original Message----- > From: Wei Liu [mailto:wei.liu2@citrix.com] > Sent: 10 December 2013 18:09 > To: Paul Durrant > Cc: xen-devel@lists.xen.org; Wei Liu > Subject: Re: [Xen-devel] [RFC PATCH net-next 0/3] unify netfront and > netback checksum setup code > > On Tue, Dec 10, 2013 at 03:39:49PM +0000, Paul Durrant wrote: > > I''m posting this as RFC to xen-devel before going to netdev to get > feedback. > > In patch 1 I''ve basically taken the latest checksum setup code from > netback, > > with all the recent fixes, and pushed it into net/core/dev.c (which seems > > like the right home for such code). Patch 2 and 3 then just modify netback > > and netfront to use the new code. > > > > Does this look like a reasonable approach? I think it would be good to make > > the checksum setup code available to all network drivers. There''s nothing > > Xen specific about it. > > > > > > I agree it is generally good to consolidate duplicated code so that we > don''t need to maintain two copies of idential code. > > But the acceptance of this series will mainly reuqire acknowledge from > core driver maintainers. :-) >Indeed. I just wanted a sanity check before I put my head above the parapet :-/ Paul
Ian Campbell
2013-Dec-12 12:48 UTC
Re: [RFC PATCH net-next 1/3] net: add skb checksum setup functions for TCP and UDP
On Tue, 2013-12-10 at 15:39 +0000, Paul Durrant wrote: [...] I assume all that was basically unchanged from the netback version? You may get feedback based on what Eric said about pulling up too much being bad for coalescing now that this is more global and if not the #define MAX_XXX might be requested to go in a header somewhere.> + > +/* Determine correct offset for TCP or UDP checksum, as appropriate, and > + * optionally re-calculate the pseudo header checksum.Not just determine, but also pull up, I think?> + */ > +int skb_checksum_setup(struct sk_buff *skb, bool recalculate) > +{ > + if (skb->ip_summed != CHECKSUM_PARTIAL) > + return -EINVAL; > + > + if (skb->protocol == htons(ETH_P_IP)) > + return checksum_setup_ip(skb, recalculate); > + else if (skb->protocol == htons(ETH_P_IPV6)) > + return checksum_setup_ipv6(skb, recalculate); > + else > + return -EPROTO; > +} > +EXPORT_SYMBOL(skb_checksum_setup); > + > __be16 skb_network_protocol(struct sk_buff *skb) > { > __be16 type = skb->protocol;
Ian Campbell
2013-Dec-12 12:49 UTC
Re: [RFC PATCH net-next 2/3] xen-netback: use new core skb_checksum_setup function
On Tue, 2013-12-10 at 15:39 +0000, Paul Durrant wrote:> This patch removes the private implementation of partial checksum setup > in favour of the core implementation. > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>Bit of a no brainer I think: Acked-by: Ian Campbell <ian.campbell@citrix.com> and 3/3 likewise.
Ian Campbell
2013-Dec-12 12:50 UTC
Re: [RFC PATCH net-next 0/3] unify netfront and netback checksum setup code
On Wed, 2013-12-11 at 10:32 +0000, Paul Durrant wrote:> > -----Original Message----- > > From: Wei Liu [mailto:wei.liu2@citrix.com] > > Sent: 10 December 2013 18:09 > > To: Paul Durrant > > Cc: xen-devel@lists.xen.org; Wei Liu > > Subject: Re: [Xen-devel] [RFC PATCH net-next 0/3] unify netfront and > > netback checksum setup code > > > > On Tue, Dec 10, 2013 at 03:39:49PM +0000, Paul Durrant wrote: > > > I''m posting this as RFC to xen-devel before going to netdev to get > > feedback. > > > In patch 1 I''ve basically taken the latest checksum setup code from > > netback, > > > with all the recent fixes, and pushed it into net/core/dev.c (which seems > > > like the right home for such code). Patch 2 and 3 then just modify netback > > > and netfront to use the new code. > > > > > > Does this look like a reasonable approach? I think it would be good to make > > > the checksum setup code available to all network drivers. There''s nothing > > > Xen specific about it. > > > > > > > > > > I agree it is generally good to consolidate duplicated code so that we > > don''t need to maintain two copies of idential code. > > > > But the acceptance of this series will mainly reuqire acknowledge from > > core driver maintainers. :-) > > > > Indeed. I just wanted a sanity check before I put my head above the parapet :-/I don''t see any obvious red flags... Ian.