Li Yang
2009-Mar-20 09:04 UTC
[Bridge] [PATCH] net/bridge: use the maximum hard_header_len of ports for bridging device
The bridging device used a constant hard_header_len. This will cause headroom shortage for ports with additional hardware header. The patch makes bridging device to use the maximum value of all ports. Signed-off-by: Li Yang <leoli at freescale.com> --- Fixes the following BUG when using bridging with gianfar driver: skb_under_panic: text:c0224b84 len:122 put:8 head:dfb81800 data:dfb817fa tail:0xdfb81874 end:0xdfb818a0 dev:eth1 ------------[ cut here ]------------ Kernel BUG at c02d9444 [verbose debug info unavailable] Oops: Exception in kernel mode, sig: 5 [#1] Call Trace: [df2dbb20] [c02d9444] skb_under_panic+0x48/0x5c (unreliable) [df2dbb30] [c0224b94] gfar_start_xmit+0x384/0x400 [df2dbb60] [c02e1c8c] dev_hard_start_xmit+0x258/0x2cc [df2dbba0] [c02f264c] __qdisc_run+0x5c/0x1f8 [df2dbbd0] [c02e4bf4] dev_queue_xmit+0x264/0x2d0 [df2dbbf0] [c036fdc8] br_dev_queue_push_xmit+0x90/0xf8 [df2dbc00] [c036fcc8] br_flood+0xc8/0x120 [df2dbc30] [c036ebe0] br_dev_xmit+0xbc/0xc0 [df2dbc40] [c02e1c8c] dev_hard_start_xmit+0x258/0x2cc [df2dbc80] [c02e4c04] dev_queue_xmit+0x274/0x2d0 [df2dbca0] [c02ebaa8] neigh_resolve_output+0xfc/0x25c ............ net/bridge/br_if.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 727c5c5..d34303d 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -348,6 +348,7 @@ void br_features_recompute(struct net_bridge *br) { struct net_bridge_port *p; unsigned long features, mask; + unsigned short max_hard_header_len = ETH_HLEN; features = mask = br->feature_mask; if (list_empty(&br->port_list)) @@ -358,7 +359,10 @@ void br_features_recompute(struct net_bridge *br) list_for_each_entry(p, &br->port_list, list) { features = netdev_increment_features(features, p->dev->features, mask); + if (p->dev->hard_header_len > max_hard_header_len) + max_hard_header_len = p->dev->hard_header_len; } + br->dev->hard_header_len = max_hard_header_len; done: br->dev->features = netdev_fix_features(features, NULL); -- 1.5.5.1.248.g4b17
Li Yang
2009-Mar-23 07:59 UTC
[Bridge] [PATCH] net/bridge: use the maximum hard_header_len of ports for bridging device
On Fri, Mar 20, 2009 at 5:04 PM, Li Yang <leoli at freescale.com> wrote:> The bridging device used a constant hard_header_len. ?This will cause > headroom shortage for ports with additional hardware header. ?The patch > makes bridging device to use the maximum value of all ports. > > Signed-off-by: Li Yang <leoli at freescale.com> > --- > Fixes the following BUG when using bridging with gianfar driver: > > skb_under_panic: text:c0224b84 len:122 put:8 head:dfb81800 data:dfb817fa tail:0xdfb81874 end:0xdfb818a0 dev:eth1 > ------------[ cut here ]------------ > Kernel BUG at c02d9444 [verbose debug info unavailable] > Oops: Exception in kernel mode, sig: 5 [#1] > Call Trace: > [df2dbb20] [c02d9444] skb_under_panic+0x48/0x5c (unreliable) > [df2dbb30] [c0224b94] gfar_start_xmit+0x384/0x400 > [df2dbb60] [c02e1c8c] dev_hard_start_xmit+0x258/0x2cc > [df2dbba0] [c02f264c] __qdisc_run+0x5c/0x1f8 > [df2dbbd0] [c02e4bf4] dev_queue_xmit+0x264/0x2d0 > [df2dbbf0] [c036fdc8] br_dev_queue_push_xmit+0x90/0xf8 > [df2dbc00] [c036fcc8] br_flood+0xc8/0x120 > [df2dbc30] [c036ebe0] br_dev_xmit+0xbc/0xc0 > [df2dbc40] [c02e1c8c] dev_hard_start_xmit+0x258/0x2cc > [df2dbc80] [c02e4c04] dev_queue_xmit+0x274/0x2d0 > [df2dbca0] [c02ebaa8] neigh_resolve_output+0xfc/0x25c > ............Any comment about this? Is it possible to be included in 2.6.29? - Leo
Stephen Hemminger
2009-Mar-23 15:51 UTC
[Bridge] [PATCH] net/bridge: use the maximum hard_header_len of ports for bridging device
On Fri, 20 Mar 2009 17:04:29 +0800 Li Yang <leoli at freescale.com> wrote:> The bridging device used a constant hard_header_len. This will cause > headroom shortage for ports with additional hardware header. The patch > makes bridging device to use the maximum value of all ports. > > Signed-off-by: Li Yang <leoli at freescale.com> > ---That ensures big enough header for locally generated packets, but any drivers that need bigger headroom still must handle bridged packets that come in with smaller space. When bridging packets, the skb comes from the allocation by the receiving driver. Almost all drivers will use dev_alloc_skb() which will allocate NET_SKB_PAD (16) bytes of additional headroom. This is used to hold copy of ethernet header for the bridge/netfilter code. So your patch is fine as an optimization but a driver can not safely depend on any additional headroom. The driver must check if there is space, and if no space is available, reallocate and copy.
David Miller
2009-Mar-23 22:20 UTC
[Bridge] [PATCH] net/bridge: use the maximum hard_header_len of ports for bridging device
From: Stephen Hemminger <shemminger at linux-foundation.org> Date: Mon, 23 Mar 2009 08:51:22 -0700> That ensures big enough header for locally generated packets, but > any drivers that need bigger headroom still must handle bridged packets > that come in with smaller space. When bridging packets, the skb comes > from the allocation by the receiving driver. Almost all drivers will > use dev_alloc_skb() which will allocate NET_SKB_PAD (16) bytes of > additional headroom. This is used to hold copy of ethernet header for > the bridge/netfilter code. > > So your patch is fine as an optimization but a driver can not safely > depend on any additional headroom. The driver must check if there > is space, and if no space is available, reallocate and copy.We had some plans to deal with this kind of issue for wireless too. Let me see if I can find the RFC patch from that discussion... Here it is, similar code would be added to the ipv4/ipv6 forwarding paths: diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 7c1d446..6c06fba 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -600,6 +600,7 @@ struct net_device * Cache line mostly used on receive path (including eth_type_trans()) */ unsigned long last_rx; /* Time of last Rx */ + unsigned int rx_alloc_extra; /* Interface address info used in eth_type_trans() */ unsigned char dev_addr[MAX_ADDR_LEN]; /* hw address, (before bcast because most packets are unicast) */ diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c index bdd7c35..531e483 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -42,6 +42,22 @@ int br_dev_queue_push_xmit(struct sk_buff *skb) if (nf_bridge_maybe_copy_header(skb)) kfree_skb(skb); else { + unsigned int headroom = skb_headroom(skb); + unsigned int hh_len = LL_RESERVED_SPACE(skb->dev); + + if (headroom < hh_len) { + struct net_device *in_dev; + unsigned int extra; + + in_dev = __dev_get_by_index(dev_net(skb->dev), + skb->iif); + BUG_ON(!in_dev); + + extra = hh_len - headroom; + if (extra >= in_dev->rx_alloc_extra) + in_dev->rx_alloc_extra = extra; + } + skb_push(skb, ETH_HLEN); dev_queue_xmit(skb); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 5c459f2..74a2515 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -255,11 +255,12 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int length, gfp_t gfp_mask) { int node = dev->dev.parent ? dev_to_node(dev->dev.parent) : -1; + unsigned int extra = dev->rx_alloc_extra + NET_SKB_PAD; struct sk_buff *skb; - skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask, 0, node); + skb = __alloc_skb(length + extra, gfp_mask, 0, node); if (likely(skb)) { - skb_reserve(skb, NET_SKB_PAD); + skb_reserve(skb, extra); skb->dev = dev; } return skb; diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index f35eaea..86f0e36 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -1562,13 +1562,13 @@ int ieee80211_subif_start_xmit(struct sk_buff *skb, * be cloned. This could happen, e.g., with Linux bridge code passing * us broadcast frames. */ - if (head_need > 0 || skb_cloned(skb)) { + if (head_need > 0 || skb_header_cloned(skb)) { #if 0 printk(KERN_DEBUG "%s: need to reallocate buffer for %d bytes " "of headroom\n", dev->name, head_need); #endif - if (skb_cloned(skb)) + if (skb_header_cloned(skb)) I802_DEBUG_INC(local->tx_expand_skb_head_cloned); else I802_DEBUG_INC(local->tx_expand_skb_head);
David Miller
2009-Mar-25 06:40 UTC
[Bridge] [PATCH] net/bridge: use the maximum hard_header_len of ports for bridging device
From: Li Yang <leoli at freescale.com> Date: Fri, 20 Mar 2009 17:04:29 +0800> The bridging device used a constant hard_header_len. This will cause > headroom shortage for ports with additional hardware header. The patch > makes bridging device to use the maximum value of all ports. > > Signed-off-by: Li Yang <leoli at freescale.com>Your driver must be able to cope with any amount of available headroom, no matter what hacks we put into the bridging layer. Please fix your driver, I'm not applying this patch.
Li Yang
2009-Mar-25 08:43 UTC
[Bridge] [PATCH] net/bridge: use the maximum hard_header_len of ports for bridging device
On Tue, Mar 24, 2009 at 6:20 AM, David Miller <davem at davemloft.net> wrote:> From: Stephen Hemminger <shemminger at linux-foundation.org> > Date: Mon, 23 Mar 2009 08:51:22 -0700 > >> That ensures big enough header for locally generated packets, but >> any drivers that need bigger headroom still must handle bridged packets >> that come in with smaller space. When bridging packets, the skb comes >> from the allocation by the receiving driver. Almost all drivers will >> use dev_alloc_skb() which will allocate NET_SKB_PAD (16) bytes of >> additional headroom. This is used to hold copy of ethernet header for >> the bridge/netfilter code. >> >> So your patch is fine as an optimization but a driver can not safely >> depend on any additional headroom. The driver must check if there >> is space, and if no space is available, reallocate and copy. > > We had some plans to deal with this kind of issue for wireless > too. ?Let me see if I can find the RFC patch from that discussion... > > Here it is, similar code would be added to the ipv4/ipv6 forwarding > paths: > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > index 7c1d446..6c06fba 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -600,6 +600,7 @@ struct net_device > ?* Cache line mostly used on receive path (including eth_type_trans()) > ?*/ > ? ? ? ?unsigned long ? ? ? ? ? last_rx; ? ? ? ?/* Time of last Rx ? ? ?*/ > + ? ? ? unsigned int ? ? ? ? ? ?rx_alloc_extra; > ? ? ? ?/* Interface address info used in eth_type_trans() */ > ? ? ? ?unsigned char ? ? ? ? ? dev_addr[MAX_ADDR_LEN]; /* hw address, (before bcast > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?because most packets are unicast) */ > diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c > index bdd7c35..531e483 100644 > --- a/net/bridge/br_forward.c > +++ b/net/bridge/br_forward.c > @@ -42,6 +42,22 @@ int br_dev_queue_push_xmit(struct sk_buff *skb) > ? ? ? ? ? ? ? ?if (nf_bridge_maybe_copy_header(skb)) > ? ? ? ? ? ? ? ? ? ? ? ?kfree_skb(skb); > ? ? ? ? ? ? ? ?else { > + ? ? ? ? ? ? ? ? ? ? ? unsigned int headroom = skb_headroom(skb); > + ? ? ? ? ? ? ? ? ? ? ? unsigned int hh_len = LL_RESERVED_SPACE(skb->dev); > + > + ? ? ? ? ? ? ? ? ? ? ? if (headroom < hh_len) { > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct net_device *in_dev; > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? unsigned int extra; > + > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? in_dev = __dev_get_by_index(dev_net(skb->dev), > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? skb->iif); > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? BUG_ON(!in_dev); > + > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? extra = hh_len - headroom; > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? if (extra >= in_dev->rx_alloc_extra) > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? in_dev->rx_alloc_extra = extra; > + ? ? ? ? ? ? ? ? ? ? ? } > + > ? ? ? ? ? ? ? ? ? ? ? ?skb_push(skb, ETH_HLEN); > > ? ? ? ? ? ? ? ? ? ? ? ?dev_queue_xmit(skb);Dynamically adjusting is a good idea, but the rx_alloc_extra can only go up not the other way down in your code. Another thought is that if you re-allocate skb here the driver would be saved from checking the headroom in the fastpath, am I right? - Leo