Hi Ian,

Today I fired up an old VM with a bittorrent client, trying to download some torrents. I seem to be hitting the unlikely case of "xennet: skb rides the rocket: xx slots", and this results in some dropped packets in domU; I don't see any warnings in dom0. I have added some extra info, but I don't know enough to tell whether this could or should be prevented from happening?

[16798.629141] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:106 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:4 page_size:4096 prot:0800
[16800.575182] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:2154 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:4 page_size:4096 prot:0800
[16801.589166] xennet: skb rides the rocket: 20 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:19 offset:2154 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:5 page_size:4096 prot:0800
[16803.279039] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:106 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:3 page_size:4096 prot:0800
[16809.973268] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:2154 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:4 page_size:4096 prot:0800
[16811.420048] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:2154 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:6 page_size:4096 prot:0800
[16814.872686] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:106 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:5 page_size:4096 prot:0800
[16815.359099] xennet: skb rides the rocket: 21 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:20 offset:2154 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:5 page_size:4096 prot:0800
[16825.851906] xennet: skb rides the rocket: 20 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:19 offset:106 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:4 page_size:4096 prot:0800
[16828.295083] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:2154 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:5 page_size:4096 prot:0800
[16837.386684] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:106 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:4 page_size:4096 prot:0800
[16838.609683] xennet: skb rides the rocket: 20 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:19 offset:106 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:4 page_size:4096 prot:0800
[16841.783729] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:106 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:6 page_size:4096 prot:0800
[16843.841678] xennet: skb rides the rocket: 20 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:19 offset:2154 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:4 page_size:4096 prot:0800
[16849.847614] xennet: skb rides the rocket: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:18 offset:3690 skb_headlen:54 skb->len:64294, skb->data_len:64240 skb->truesize:65008 nr_frags:4 page_size:4096 prot:0800
[16853.787128] xennet: skb rides the rocket: 20 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:19 offset:2154 skb_headlen:1622 skb->len:64294, skb->data_len:62672 skb->truesize:64976 nr_frags:4 page_size:4096 prot:0800

# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3e:c4:20:46
          inet addr:192.168.1.12  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9090214 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5902090 errors:0 dropped:304 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:19757934770 (18.4 GiB)  TX bytes:16855238200 (15.6 GiB)
          Interrupt:25

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index c26e28b..50ac403 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -552,7 +552,11 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		xennet_count_skb_frag_slots(skb);
 	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
 		net_alert_ratelimited(
-			"xennet: skb rides the rocket: %d slots\n", slots);
+			"xennet: skb rides the rocket: %d slots MAX_SKB_FRAGS: %d div_roundup:%d xennet_count_skb_frag_slots:%d offset:%d skb_headlen:%d skb->len:%d, skb->data_len:%d skb->truesize:%d nr_frags:%d page_size:%d prot:%04x\n",
+			slots, MAX_SKB_FRAGS, DIV_ROUND_UP(offset + len, PAGE_SIZE),
+			xennet_count_skb_frag_slots(skb), offset, len, skb->len,
+			skb->data_len, skb->truesize, skb_shinfo(skb)->nr_frags,
+			PAGE_SIZE, ntohs(skb->protocol));
 		goto drop;
 	}
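For reference, the arithmetic behind those numbers can be reproduced with a small standalone program: the linear area and each frag contribute DIV_ROUND_UP(offset + size, PAGE_SIZE) slots, so a handful of compound-page frags with nonzero offsets can need 19+ slots while MAX_SKB_FRAGS is 17. The frag offsets and sizes below are made-up values chosen to be consistent with the first log line, since the real per-frag layout is not in the log.

#include <stdio.h>

#define PAGE_SIZE 4096
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

struct frag { unsigned int offset, size; };

int main(void)
{
	/* Hypothetical frag layout summing to skb->data_len = 62672. */
	struct frag frags[] = {
		{  106, 16384 },	/* straddles 5 pages */
		{ 2048, 16384 },	/* straddles 5 pages */
		{    0, 16384 },	/* exactly 4 pages   */
		{    0, 13520 },	/* rounds up to 4    */
	};
	unsigned int head_offset = 106, headlen = 1622;	/* skb_headlen */
	unsigned int i, slots;

	/* Linear area: the "div_roundup:1" in the log lines above. */
	slots = DIV_ROUND_UP(head_offset + headlen, PAGE_SIZE);

	/* Paged area: the "xennet_count_skb_frag_slots:18" in the log. */
	for (i = 0; i < sizeof(frags) / sizeof(frags[0]); i++)
		slots += DIV_ROUND_UP(frags[i].offset + frags[i].size,
				      PAGE_SIZE);

	printf("slots: %u (limit is MAX_SKB_FRAGS + 1 = 18)\n", slots);
	return 0;
}

Run as-is this prints "slots: 19": four frags are enough to overflow the ring-slot limit even though nr_frags is well below MAX_SKB_FRAGS.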
On Fri, 2013-01-04 at 16:28 +0000, Sander Eikelenboom wrote:
> Hi Ian,

I'm not actually the netfront maintainer. Adding Konrad. Annie too since
IIRC she was at one point looking at various buffer coalescing schemes.

> Today I fired up an old VM with a bittorrent client, trying to
> download some torrents.

How old? Which kernel?

> I seem to be hitting the unlikely case of "xennet: skb rides the
> rocket: xx slots", and this results in some dropped packets in domU; I
> don't see any warnings in dom0.
>
> I have added some extra info, but I don't know enough to tell whether
> this could or should be prevented from happening?

MAX_SKB_FRAGS has never, AFAIK, been as big as 19 or 20, so I'm not sure
this can have ever worked.

These SKBs seem to be pretty big (not quite 64KB), seemingly most of the
data is contained in a smallish number of frags, which suggests compound
pages and therefore a reasonably modern kernel?

This probably relates somewhat to the issues described in the
"netchannel vs MAX_SKB_FRAGS" thread last year, or at least the solution
to that would necessarily involve fixing this issue.

In the meantime you could try disabling sg and/or various offloads for
that domain's vif.

> [quoted log lines, ifconfig output and debug patch snipped]
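One possible way to experiment with that suggestion, assuming ethtool is available; the interface names are examples (eth0 inside the guest, vif1.0 for that domain's vif in dom0). With sg off the stack must hand the driver linear skbs, and TSO/GSO are disabled so packets stay near MTU size:

# inside the guest:
ethtool -K eth0 sg off tso off gso off

# or in dom0, on that domain's vif:
ethtool -K vif1.0 sg off tso off gso off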
Monday, January 7, 2013, 11:55:15 AM, you wrote:

> On Fri, 2013-01-04 at 16:28 +0000, Sander Eikelenboom wrote:
>> Hi Ian,

> I'm not actually the netfront maintainer. Adding Konrad. Annie too since
> IIRC she was at one point looking at various buffer coalescing schemes.

Ah ok, I thought you were :-) sorry.

>> Today I fired up an old VM with a bittorrent client, trying to
>> download some torrents.

> How old? Which kernel?

3.8.0-rc2 at the moment; it's a PV guest.

>> I seem to be hitting the unlikely case of "xennet: skb rides the
>> rocket: xx slots", and this results in some dropped packets in domU; I
>> don't see any warnings in dom0.
>>
>> I have added some extra info, but I don't know enough to tell whether
>> this could or should be prevented from happening?

> MAX_SKB_FRAGS has never, AFAIK, been as big as 19 or 20, so I'm not sure
> this can have ever worked.

> These SKBs seem to be pretty big (not quite 64KB), seemingly most of the
> data is contained in a smallish number of frags, which suggests compound
> pages and therefore a reasonably modern kernel?

> This probably relates somewhat to the issues described in the
> "netchannel vs MAX_SKB_FRAGS" thread last year, or at least the solution
> to that would necessarily involve fixing this issue.

> In the meantime you could try disabling sg and/or various offloads for
> that domain's vif.

Will do.

>> [quoted log lines, ifconfig output and debug patch snipped]
On Mon, 2013-01-07 at 12:30 +0000, Sander Eikelenboom wrote:
> Monday, January 7, 2013, 11:55:15 AM, you wrote:
>
> > On Fri, 2013-01-04 at 16:28 +0000, Sander Eikelenboom wrote:
> >> Hi Ian,
>
> > I'm not actually the netfront maintainer. Adding Konrad. Annie too since
> > IIRC she was at one point looking at various buffer coalescing schemes.
>
> Ah ok, I thought you were :-) sorry.

I'm netback ;-)

> >> Today I fired up an old VM with a bittorrent client, trying to
> >> download some torrents.
>
> > How old? Which kernel?
>
> 3.8.0-rc2 at the moment; it's a PV guest.

Are you sure? That's pretty new, not old.

Ian.
Monday, January 7, 2013, 2:27:50 PM, you wrote:

> On Mon, 2013-01-07 at 12:30 +0000, Sander Eikelenboom wrote:
>> Monday, January 7, 2013, 11:55:15 AM, you wrote:
>>
>> > On Fri, 2013-01-04 at 16:28 +0000, Sander Eikelenboom wrote:
>> >> Hi Ian,
>>
>> > I'm not actually the netfront maintainer. Adding Konrad. Annie too since
>> > IIRC she was at one point looking at various buffer coalescing schemes.
>>
>> Ah ok, I thought you were :-) sorry.

> I'm netback ;-)

Ah, I thought of front and back as sides of the same coin :-p

>> >> Today I fired up an old VM with a bittorrent client, trying to
>> >> download some torrents.
>>
>> > How old? Which kernel?
>>
>> 3.8.0-rc2 at the moment; it's a PV guest.

> Are you sure? That's pretty new, not old.

Yes, I'm sure; with "old" I meant not booted for some time.
All my PV guests use the same dom0-supplied kernel, which at the moment happens to be the same as dom0's too.

> Ian.
On Mon, 2013-01-07 at 14:05 +0000, Sander Eikelenboom wrote:
> Monday, January 7, 2013, 2:27:50 PM, you wrote:
>
> > I'm netback ;-)
>
> Ah, I thought of front and back as sides of the same coin :-p

IMHO having them be somewhat separate encourages greater transparency
when evolving the protocol, not least because there are front- and
backends other than just the Linux ones.

> >> >> Today I fired up an old VM with a bittorrent client, trying to
> >> >> download some torrents.
> >>
> >> > How old? Which kernel?
> >>
> >> 3.8.0-rc2 at the moment; it's a PV guest.
>
> > Are you sure? That's pretty new, not old.
>
> Yes, I'm sure; with "old" I meant not booted for some time.
> All my PV guests use the same dom0-supplied kernel, which at the moment
> happens to be the same as dom0's too.

Ah, I see. I assumed you meant an old kernel.

Ian.
On 2013-1-7 18:55, Ian Campbell wrote:
> On Fri, 2013-01-04 at 16:28 +0000, Sander Eikelenboom wrote:
>> Hi Ian,
> I'm not actually the netfront maintainer. Adding Konrad. Annie too since
> IIRC she was at one point looking at various buffer coalescing schemes.
>
>> Today I fired up an old VM with a bittorrent client, trying to
>> download some torrents.
> How old? Which kernel?
>
>> I seem to be hitting the unlikely case of "xennet: skb rides the
>> rocket: xx slots", and this results in some dropped packets in domU; I
>> don't see any warnings in dom0.

If debug is enabled, the following netback code should print messages in
netbk_count_requests:

	if (unlikely(frags >= MAX_SKB_FRAGS)) {
		netdev_dbg(vif->dev, "Too many frags\n");
		return -frags;
	}

>> I have added some extra info, but I don't know enough to tell whether
>> this could or should be prevented from happening?
> MAX_SKB_FRAGS has never, AFAIK, been as big as 19 or 20, so I'm not sure
> this can have ever worked.
>
> These SKBs seem to be pretty big (not quite 64KB), seemingly most of the
> data is contained in a smallish number of frags, which suggests compound
> pages and therefore a reasonably modern kernel?
>
> This probably relates somewhat to the issues described in the
> "netchannel vs MAX_SKB_FRAGS" thread last year, or at least the solution
> to that would necessarily involve fixing this issue.

If netback complains about "Too many frags", then it would be the
MAX_SKB_FRAGS limitation in netback that results in dropped packets in
netfront. It is possible that other netfronts (windows?) also hit this.

Thanks
Annie

> In the meantime you could try disabling sg and/or various offloads for
> that domain's vif.
>
>> [quoted log lines, ifconfig output and debug patch snipped]
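As an aside, netdev_dbg() output is compiled in but silent by default. With CONFIG_DYNAMIC_DEBUG it can be enabled at runtime, for example as below (assuming netback is built as a module, so its dynamic-debug name is xen_netback, and that debugfs is available):

# in dom0:
mount -t debugfs none /sys/kernel/debug 2>/dev/null
echo 'module xen_netback +p' > /sys/kernel/debug/dynamic_debug/control
dmesg | grep -i 'too many frags'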
On Tue, 2013-01-08 at 02:12 +0000, ANNIE LI wrote:
> >> I have added some extra info, but I don't know enough to tell whether
> >> this could or should be prevented from happening?
> > MAX_SKB_FRAGS has never, AFAIK, been as big as 19 or 20, so I'm not sure
> > this can have ever worked.
> >
> > These SKBs seem to be pretty big (not quite 64KB), seemingly most of the
> > data is contained in a smallish number of frags, which suggests compound
> > pages and therefore a reasonably modern kernel?
> >
> > This probably relates somewhat to the issues described in the
> > "netchannel vs MAX_SKB_FRAGS" thread last year, or at least the solution
> > to that would necessarily involve fixing this issue.
>
> If netback complains about "Too many frags", then it would be the
> MAX_SKB_FRAGS limitation in netback that results in dropped packets in
> netfront. It is possible that other netfronts (windows?) also hit this.

It's very possible. I rather suspect that non-Linux frontends have
workarounds (e.g. manual resegmenting etc) for this case.

Ian.
> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Ian Campbell
> Sent: 08 January 2013 10:06
> To: ANNIE LI
> Cc: Sander Eikelenboom; Konrad Rzeszutek Wilk; xen-devel
> Subject: Re: [Xen-devel] xennet: skb rides the rocket: 20 slots
>
> On Tue, 2013-01-08 at 02:12 +0000, ANNIE LI wrote:
> > If netback complains about "Too many frags", then it would be the
> > MAX_SKB_FRAGS limitation in netback that results in dropped packets in
> > netfront. It is possible that other netfronts (windows?) also hit this.
>
> It's very possible. I rather suspect that non-Linux frontends have
> workarounds (e.g. manual resegmenting etc) for this case.

Citrix Windows PV drivers certainly do; packets will be copied if they
have more than MAX_SKB_FRAGS fragments. We also choose a maximal TSO size
carefully to try to avoid Windows' network stack sending down more than
MAX_SKB_FRAGS fragments in the first place.

Paul
Tuesday, January 8, 2013, 3:12:33 AM, you wrote:

> On 2013-1-7 18:55, Ian Campbell wrote:
>> [earlier discussion snipped]
>
> If debug is enabled, the following netback code should print messages in
> netbk_count_requests:
>
> 	if (unlikely(frags >= MAX_SKB_FRAGS)) {
> 		netdev_dbg(vif->dev, "Too many frags\n");
> 		return -frags;
> 	}

I have added some rate-limited warnings in this function. However, none
seems to be triggered while the PV guest reports the "skb rides the
rocket"...

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index f2d6b78..2f02da9 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -902,11 +902,13 @@ static int netbk_count_requests(struct xenvif *vif,
 	do {
 		if (frags >= work_to_do) {
 			netdev_dbg(vif->dev, "Need more frags\n");
+			net_warn_ratelimited("%s: Need more frags:%d, work_to_do:%d\n", vif->dev->name, frags, work_to_do);
 			return -frags;
 		}

 		if (unlikely(frags >= MAX_SKB_FRAGS)) {
 			netdev_dbg(vif->dev, "Too many frags\n");
+			net_warn_ratelimited("%s: Too many frags:%d, MAX_SKB_FRAGS:%d\n", vif->dev->name, frags, MAX_SKB_FRAGS);
 			return -frags;
 		}

@@ -914,6 +916,7 @@ static int netbk_count_requests(struct xenvif *vif,
 			       sizeof(*txp));
 		if (txp->size > first->size) {
 			netdev_dbg(vif->dev, "Frags galore\n");
+			net_warn_ratelimited("%s: Frags galore:%d, txp->size:%d first->size:%d\n", vif->dev->name, frags, txp->size, first->size);
 			return -frags;
 		}

@@ -923,6 +926,7 @@ static int netbk_count_requests(struct xenvif *vif,
 		if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
 			netdev_dbg(vif->dev, "txp->offset: %x, size: %u\n",
 				   txp->offset, txp->size);
+			net_warn_ratelimited("%s: Hmm:%d, (txp->offset + txp->size):%d PAGE_SIZE:%d\n", vif->dev->name, frags, (txp->offset + txp->size), PAGE_SIZE);
 			return -frags;
 		}
 	} while ((txp++)->flags & XEN_NETTXF_more_data);

> [remainder of quoted message, log lines and netfront patch snipped]
> > If netback complains about "Too many frags", then it would be the
> > MAX_SKB_FRAGS limitation in netback that results in dropped packets in
> > netfront. It is possible that other netfronts (windows?) also hit this.
>
> It's very possible. I rather suspect that non-Linux frontends have
> workarounds (e.g. manual resegmenting etc) for this case.

GPLPV gathers fragments together if there are too many. Mostly this is
just the header, but there could be a whole lot of copying going on in a
worst-case packet.

It would be nice if a max-frags value was written to xenstore... do
different backends (Solaris? BSD?) have different requirements? Or maybe
that's already been discussed - I haven't followed this thread closely.

James
On Tue, Jan 08, 2013 at 08:57:19PM +0000, James Harper wrote:
> GPLPV gathers fragments together if there are too many. Mostly this is
> just the header, but there could be a whole lot of copying going on in a
> worst-case packet.
>
> It would be nice if a max-frags value was written to xenstore... do
> different backends (Solaris? BSD?) have different requirements? Or maybe
> that's already been discussed - I haven't followed this thread closely.

I think we discussed it last year as something that should be exported
via XenBus, but we never got down to figuring out what it entails, I
think. This was pre-holidays.

> James
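For concreteness, the backend half of what James suggests might look something like the sketch below, run during vif setup. This is purely hypothetical: the "feature-max-skb-frags" key, the helper name and its placement are all assumptions, as nothing like it exists yet.

/* Hypothetical: advertise the backend's frag limit in xenstore so
 * frontends can adapt; dev is the backend's xenbus_device. */
static int netback_advertise_max_frags(struct xenbus_device *dev)
{
	return xenbus_printf(XBT_NIL, dev->nodename,
			     "feature-max-skb-frags", "%u",
			     (unsigned int)MAX_SKB_FRAGS);
}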
On 2013-1-9 4:55, Sander Eikelenboom wrote:
>> 	if (unlikely(frags >= MAX_SKB_FRAGS)) {
>> 		netdev_dbg(vif->dev, "Too many frags\n");
>> 		return -frags;
>> 	}
> I have added some rate-limited warnings in this function. However, none
> seems to be triggered while the PV guest reports the "skb rides the
> rocket"...

Oh, yes, "skb rides the rocket" is a protection mechanism in netfront,
and it is not caused by the netback checking code, but they are both
concerned about the same thing (frags >= MAX_SKB_FRAGS). I thought those
packets were dropped by the backend check, sorry for the confusion.

In netfront, the following code checks whether the required slots exceed
MAX_SKB_FRAGS, and directly drops skbs which do not meet this
requirement:

	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
		net_alert_ratelimited(
			"xennet: skb rides the rocket: %d slots\n", slots);
		goto drop;
	}

In netback, the following code also compares frags with MAX_SKB_FRAGS,
and creates an error response for a netfront which does not meet this
requirement. In that case, netfront will also drop the corresponding
skbs:

	if (unlikely(frags >= MAX_SKB_FRAGS)) {
		netdev_dbg(vif->dev, "Too many frags\n");
		return -frags;
	}

So it is correct that the netback log was not printed, because those
packets are dropped directly by the frontend check, not by the backend
check. Without the frontend check, it is likely that the netback check
would block these skbs and create error responses for netfront.

So two ways are available: a workaround in netfront for those packets,
doing re-fragmentation by copying (though I am not sure how much the
copying hurts performance), or an implementation in netback, as discussed
in "netchannel vs MAX_SKB_FRAGS". Maybe both mechanisms are necessary?

Thanks
Annie

> [quoted netback debug patch snipped]
On Wed, Jan 09, 2013 at 03:10:56PM +0800, ANNIE LI wrote:
> Oh, yes, "skb rides the rocket" is a protection mechanism in netfront,
> and it is not caused by the netback checking code, but they are both
> concerned about the same thing (frags >= MAX_SKB_FRAGS). I thought those
> packets were dropped by the backend check, sorry for the confusion.
>
> [netfront and netback checks snipped]
>
> So it is correct that the netback log was not printed, because those
> packets are dropped directly by the frontend check, not by the backend
> check. Without the frontend check, it is likely that the netback check
> would block these skbs and create error responses for netfront.
>
> So two ways are available: a workaround in netfront for those packets,
> doing re-fragmentation by copying (though I am not sure how much the
> copying hurts performance), or an implementation in netback, as
> discussed in

There is already some copying done (the copying of the socket data from
userspace to the kernel) - so the extra copy might not be that bad, as
the data can be in the cache. This would probably be a way to deal with
old backends that cannot deal with this new feature-flag.

> "netchannel vs MAX_SKB_FRAGS". Maybe both mechanisms are necessary?

Let's see first if this is indeed the problem. Perhaps a simple debug
patch that just does:

	s/MAX_SKB_FRAGS/DEBUG_MAX_FRAGS/
	#define DEBUG_MAX_FRAGS 21

in both netback and netfront, to set the maximum number of frags we can
handle to 21? If that works with Sander's test - then yes, it looks like
we really need to get this 'feature-max-skb-frags' done.
On Wed, 2013-01-09 at 15:08 +0000, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 09, 2013 at 03:10:56PM +0800, ANNIE LI wrote:
> > So two ways are available: a workaround in netfront for those packets,
> > doing re-fragmentation by copying (though I am not sure how much the
> > copying hurts performance), or an implementation in netback, as
> > discussed in
>
> There is already some copying done (the copying of the socket data from
> userspace to the kernel) - so the extra copy might not be that bad, as
> the data can be in the cache. This would probably be a way to deal with
> old backends that cannot deal with this new feature-flag.

Or for any backend which doesn't handle enough slots for the current
skb. Essentially you need to do a fragmentation in software in the
frontend.

For backends which don't support the flag we should just hardcode some
number (some historically common value) as the default unless
negotiation says otherwise; no need to copy everything...

> > "netchannel vs MAX_SKB_FRAGS". Maybe both mechanisms are necessary?
>
> Let's see first if this is indeed the problem. Perhaps a simple debug
> patch that just does:
>
> 	s/MAX_SKB_FRAGS/DEBUG_MAX_FRAGS/
> 	#define DEBUG_MAX_FRAGS 21

Do you mean to do this globally or just in the netfront (or back)
driver?

Doing it globally would just be a case of changing the define, without
the s//, but there would be performance implications to growing the
shinfo.

Just changing netfront won't work because there will only be
MAX_SKB_FRAGS in the skb's shinfo, so you won't be able to add any more.

Ian.
On Wed, Jan 09, 2013 at 04:34:01PM +0000, Ian Campbell wrote:
> On Wed, 2013-01-09 at 15:08 +0000, Konrad Rzeszutek Wilk wrote:
> > Let's see first if this is indeed the problem. Perhaps a simple debug
> > patch that just does:
> >
> > 	s/MAX_SKB_FRAGS/DEBUG_MAX_FRAGS/
> > 	#define DEBUG_MAX_FRAGS 21
>
> Do you mean to do this globally or just in the netfront (or back)
> driver?

Our frontend and backend driver.

> Doing it globally would just be a case of changing the define, without
> the s//, but there would be performance implications to growing the
> shinfo.
>
> Just changing netfront won't work because there will only be
> MAX_SKB_FRAGS in the skb's shinfo, so you won't be able to add any more.

Oh, then doing the negotiation for the "slots" is gated to a maximum of
MAX_SKB_FRAGS. Hmmm.. that presents a problem for this bug, as just
negotiating the max_skb_frags feature won't solve the "riding the rocket"
problem.
On Wed, 2013-01-09 at 17:05 +0000, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 09, 2013 at 04:34:01PM +0000, Ian Campbell wrote:
> > Just changing netfront won't work because there will only be
> > MAX_SKB_FRAGS in the skb's shinfo, so you won't be able to add any more.
>
> Oh, then doing the negotiation for the "slots" is gated to a maximum of
> MAX_SKB_FRAGS. Hmmm.. that presents a problem for this bug, as just
> negotiating the max_skb_frags feature won't solve the "riding the rocket"
> problem.

Right, I think a corollary to the negotiation is that, having agreed on a
maximum, you are responsible for ensuring you only send out requests
which meet it. IOW if you advertise this feature you also have to be
prepared to resegment if the OS above you gives you something larger
(either too many frags or via compound pages).

e.g. if the dom0 OS can support 16 frags and the domU 18, then netfront
and netback would agree on 16 slots maximum. If the domU OS passes an SKB
which requests more slots to netfront, then netfront would have to
segment it before sending.

Ian.
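To make that negotiation concrete, a hypothetical read side in netfront could look like the sketch below; the "feature-max-skb-frags" key, the fallback default of 17 and the helper name are all assumptions, since no such key exists yet.

/* Hypothetical helper, e.g. called from netfront's connect path. */
static unsigned int xennet_negotiated_max_frags(struct xenbus_device *dev)
{
	unsigned int backend_max;

	/* xenbus_scanf() returns the number of values matched. */
	if (xenbus_scanf(XBT_NIL, dev->otherend,
			 "feature-max-skb-frags", "%u", &backend_max) != 1)
		backend_max = 17;	/* assumed legacy default */

	/* Never agree to more than the local skb layout can hold. */
	return min_t(unsigned int, backend_max, MAX_SKB_FRAGS);
}

With something like this in place, netfront would then have to resegment (or copy) any skb needing more slots than the agreed value, exactly as Ian describes above.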
On 2013-1-9 23:08, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 09, 2013 at 03:10:56PM +0800, ANNIE LI wrote:
> > [snip: the netfront/netback slot-check discussion quoted in full above]
> >
> > So two ways are available: a workaround in netfront for those packets, doing re-fragment copying, though I am not sure how much the copying hurts performance. Another is to implement it in netback, as discussed in
>
> There is already some copying done (the copying of the socket data from userspace to the kernel) - so the extra copy might not be that bad as the data can be in the cache. This would probably be a way to deal with old backends that cannot deal with this new feature-flag.

I am thinking of doing the re-fragmenting in netfront for these skbs as follows:

Create a new skb, copy the linear data and frag data from the original skb into this one, and make every frag's data size PAGE_SIZE except for the last fragment (it is possible that the last fragment's length is less than PAGE_SIZE), then free the original skb. The skb packet is large, so there will be lots of copies.

struct sk_buff *xennet_refrag_skb(skb)
{
        create newskb
        copy data and do fragmentation
        return newskb
}

.......

if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
        net_alert_ratelimited(
                "xennet: skb rides the rocket: %d slots\n", slots);
        skb = xennet_refrag_skb(skb);
}
.....

Thanks
Annie

> > "netchannel vs MAX_SKB_FRAGS". Maybe these two mechanisms are both necessary?
>
> Lets see first if this is indeed the problem. Perhaps a simple debug patch that just does:
>
> s/MAX_SKB_FRAGS/DEBUG_MAX_FRAGS/
> #define DEBUG_MAX_FRAGS 21
>
> in both netback and netfront to set the maximum number of frags we can handle to 21? If that works with Sander's test - then yes, it looks like we really need to get this 'feature-max-skb-frags' done.
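As a rough idea of what such a copy-based helper could look like, here is a self-contained sketch assuming GFP_ATOMIC allocation in the transmit path; the function name, the NET_SKB_PAD headroom choice, and the set of metadata fields carried over are illustrative, and this is not the patch that was eventually posted:

    #include <linux/skbuff.h>
    #include <linux/netdevice.h>
    #include <linux/gfp.h>

    /* Copy skb into a new skb whose frags are freshly allocated,
     * page-aligned pages, so each frag costs exactly one ring slot. */
    static struct sk_buff *xennet_refrag_skb_sketch(struct sk_buff *skb)
    {
            struct sk_buff *nskb;
            unsigned int offset = skb_headlen(skb);

            nskb = alloc_skb(NET_SKB_PAD + skb_headlen(skb), GFP_ATOMIC);
            if (!nskb)
                    return NULL;
            skb_reserve(nskb, NET_SKB_PAD);

            /* The linear part is copied as-is. */
            skb_put(nskb, skb_headlen(skb));
            skb_copy_from_linear_data(skb, nskb->data, skb_headlen(skb));

            /* Paged data is copied PAGE_SIZE at a time; only the last
             * frag may be shorter. */
            while (offset < skb->len) {
                    unsigned int chunk = min_t(unsigned int,
                                               skb->len - offset, PAGE_SIZE);
                    struct page *page = alloc_page(GFP_ATOMIC);

                    if (!page)
                            goto err;
                    if (skb_copy_bits(skb, offset, page_address(page), chunk)) {
                            __free_page(page);
                            goto err;
                    }
                    skb_fill_page_desc(nskb, skb_shinfo(nskb)->nr_frags,
                                       page, 0, chunk);
                    nskb->len += chunk;
                    nskb->data_len += chunk;
                    nskb->truesize += PAGE_SIZE;
                    offset += chunk;
            }

            /* Carry over what the ring cares about (a real version would
             * copy more fields), then free the original. */
            nskb->protocol = skb->protocol;
            nskb->ip_summed = skb->ip_summed;
            skb_shinfo(nskb)->gso_size = skb_shinfo(skb)->gso_size;
            skb_shinfo(nskb)->gso_type = skb_shinfo(skb)->gso_type;
            dev_kfree_skb_any(skb);
            return nskb;
    err:
            kfree_skb(nskb);
            return NULL;
    }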
Thursday, January 10, 2013, 12:22:22 PM, you wrote:

> [snip: full quote of Annie's re-fragmenting proposal above]

What I'm wondering about (without any knowledge of the inner workings of the network stack):

- Why can't netfront handle the larger packets in the first place?

- Would it be right to do something like xennet_get_responses() does for rx:

        int max = MAX_SKB_FRAGS + (rx->status <= RX_COPY_THRESHOLD);

        if (unlikely(frags > max)) {
                net_warn_ratelimited("%s: Too many frags max:%d, frags:%d\n",
                                     skb->dev->name, max, frags);
                err = -E2BIG;
        }

  instead of dropping the packet? I don't know if this propagates via the network protocols and would cause clients to reduce the packet size, thereby avoiding the need for copying in the first place?

--
Sander
On Thu, 2013-01-10 at 11:22 +0000, ANNIE LI wrote:
> I am thinking of doing the re-fragmenting in netfront for these skbs as follows:
>
> Create a new skb, copy the linear data and frag data from the original skb into this one, and make every frag's data size PAGE_SIZE except for the last fragment. It is possible that the last fragment's length is less than PAGE_SIZE; then free the original skb. The skb packet is large, and there will be lots of copies.

You don't need (or I suspect want) to copy, you can directly add pages from the source skb's frags to the destination skb's frags, with appropriate refcount frobbing. You can also share a page between two (or more) skbs in the case where the boundary between two skbs happens to be in the middle of a page.

But more importantly than all that, you need to do more than just refragment, you actually need to resegment, i.e. you need to duplicate the headers (Ethernet, IP, TCP) at the front of each new skb and adjust the (pseudo-)checksums as appropriate (which will depend on whether the skb is GSO, or just has checksum offload, or nothing).

So you need to go from

        <Ether><IP><TCP><...data... ...data...>

to

        <Ether><IP><TCP><...data...> <Ether><IP><TCP><...data...>

where the headers are adjusted to cope with this. Some of the data might be in frags but equally some might be in skb->data.

I'm guessing that Linux already has code which can do this for you, since it has a software fallback for GSO.

Ian.

> [snip: pseudocode and debug-patch discussion trimmed]
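That software fallback is reachable via skb_gso_segment(), which hands back a singly-linked list of skbs, each carrying its own adjusted Ethernet/IP/TCP headers and checksums. A hedged sketch of how a transmit path might use it; xmit_one is a placeholder for queuing one segment on the ring, and passing 0 as the feature mask (to force full software segmentation) is an assumption about intent, not the real driver's code:

    #include <linux/err.h>
    #include <linux/netdevice.h>
    #include <linux/skbuff.h>

    static int xmit_via_gso_fallback(struct sk_buff *skb,
                                     struct net_device *dev,
                                     int (*xmit_one)(struct sk_buff *,
                                                     struct net_device *))
    {
            struct sk_buff *segs, *nskb;

            segs = skb_gso_segment(skb, 0);
            if (IS_ERR_OR_NULL(segs))
                    return -EIO;    /* caller decides whether to drop */

            consume_skb(skb);       /* original is now superseded */

            while (segs) {
                    nskb = segs;
                    segs = segs->next;
                    nskb->next = NULL;
                    xmit_one(nskb, dev);    /* error handling elided */
            }
            return 0;
    }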
On Thu, Jan 10, 2013 at 12:26:56PM +0000, Ian Campbell wrote:
> On Thu, 2013-01-10 at 11:22 +0000, ANNIE LI wrote:
> > [snip: re-fragmenting proposal quoted in full above]
>
> You don't need (or I suspect want) to copy, you can directly add pages from the source skb's frags to the destination skb's frags, with appropriate refcount frobbing. You can also share a page between two (or more) skbs in the case where the boundary between two skbs happens to be in the middle of a page.

That is allowed by the networking subsystem? I guess it will allow it, as the refcount keeps the users of the skb sane so that you don't inadvertently pull the page out of the skb (or, for highmem pages, unmap it).

> But more importantly than all that, you need to do more than just refragment, you actually need to resegment, i.e. duplicate the headers (Ethernet, IP, TCP) at the front of each new skb and adjust the (pseudo-)checksums as appropriate.
>
> I'm guessing that Linux already has code which can do this for you, since it has a software fallback for GSO.

Hm, my fear is that global extra copying will mean that we lose the benefit of having this in the cache, as the payload from user-space will not land on the same cache lines anymore with this adjustment.

This will require a bit more thinking on how to solve this particular problem.
On Thu, 2013-01-10 at 15:39 +0000, Konrad Rzeszutek Wilk wrote:
> On Thu, Jan 10, 2013 at 12:26:56PM +0000, Ian Campbell wrote:
> > You don't need (or I suspect want) to copy, you can directly add pages from the source skb's frags to the destination skb's frags, with appropriate refcount frobbing. You can also share a page between two (or more) skbs in the case where the boundary between two skbs happens to be in the middle of a page.
>
> That is allowed by the networking subsystem?

Yes, happens all the time (well, not quite, but it happens).

> I guess it will allow it, as the refcount keeps the users of the skb sane so that you don't inadvertently pull the page out of the skb (or, for highmem pages, unmap it).

Correct.

> > I'm guessing that Linux already has code which can do this for you, since it has a software fallback for GSO.
>
> Hm, my fear is that global extra copying will mean that we lose the benefit of having this in the cache, as the payload from user-space will not land on the same cache lines anymore with this adjustment.

Remember that this only happens at all in extreme or unusual cases. The common case should be that an skb fits into the negotiated number of slots. Also, there isn't that much copying: you need to copy the heads from skb->data to new_skb->data, but the frags you can just reference directly.

> This will require a bit more thinking on how to solve this particular problem.
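The "refcount frobbing" amounts to taking an extra reference on the page before pointing a second skb's frag at it, so neither skb can free (or, for highmem, unmap) the page under the other. A sketch, using the frag field names of that kernel era:

    #include <linux/skbuff.h>

    /* Attach part of frag i of src to dst without copying the payload. */
    static void share_frag_page(struct sk_buff *src, int i,
                                struct sk_buff *dst,
                                unsigned int off, unsigned int len)
    {
            skb_frag_t *frag = &skb_shinfo(src)->frags[i];
            struct page *page = skb_frag_page(frag);

            get_page(page); /* both skbs now hold a reference */
            skb_fill_page_desc(dst, skb_shinfo(dst)->nr_frags,
                               page, frag->page_offset + off, len);
            dst->len += len;
            dst->data_len += len;
            dst->truesize += len;
    }

This is also how the boundary-in-the-middle-of-a-page case works: two destination skbs simply reference the same page with different offsets and lengths.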
On 2013-1-10 20:26, Ian Campbell wrote:
> On Thu, 2013-01-10 at 11:22 +0000, ANNIE LI wrote:
> > [snip: re-fragmenting proposal quoted in full above]
>
> You don't need (or I suspect want) to copy, you can directly add pages from the source skb's frags to the destination skb's frags, with appropriate refcount frobbing. [...]
>
> But more importantly than all that, you need to do more than just refragment, you actually need to resegment [...]
>
> I'm guessing that Linux already has code which can do this for you, since it has a software fallback for GSO.

There is an existing function, skb_gso_segment, that performs segmentation on an skb and returns a list of segments. From the code, it seems this function is specific to GSO packets; it may not be usable for other non-GSO large packets whose required slots exceed MAX_SKB_FRAGS + 1. So for non-GSO offload packets, I will go ahead and write a function to do the resegmenting.

Thanks
Annie

> Ian.
>
> > [snip: pseudocode and debug-patch discussion trimmed]
On Fri, 2013-01-11 at 07:34 +0000, ANNIE LI wrote:
> > I'm guessing that Linux already has code which can do this for you, since it has a software fallback for GSO.
>
> There is an existing function, skb_gso_segment, that performs segmentation on an skb and returns a list of segments. From the code, it seems this function is specific to GSO packets; it may not be usable for other non-GSO large packets whose required slots exceed MAX_SKB_FRAGS + 1. So for non-GSO offload packets, I will go ahead and write a function to do the resegmenting.

Without GSO I don't think you should be seeing packets larger than the MTU, which would normally be either ~1500 or ~9000 and fit easily within any sensible negotiation for the max frags. I don't think you should worry unduly about this case.

Ian.
> -----Original Message-----
> Without GSO I don't think you should be seeing packets larger than the MTU, which would normally be either ~1500 or ~9000 and fit easily within any sensible negotiation for the max frags. I don't think you should worry unduly about this case.

A stack could still send down a packet with one byte per frag though, right? A copy-and-coalesce path would still be needed in this case.

Paul
On Fri, 2013-01-11 at 10:09 +0000, Paul Durrant wrote:
> > -----Original Message-----
> > Without GSO I don't think you should be seeing packets larger than the MTU, which would normally be either ~1500 or ~9000 and fit easily within any sensible negotiation for the max frags. I don't think you should worry unduly about this case.
>
> A stack could still send down a packet with one byte per frag though, right? A copy-and-coalesce path would still be needed in this case.

True. In that case skb_linearize would probably do the job on Linux.

Ian
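A sketch of that coalesce path; count_ring_slots() is a hypothetical stand-in for the driver's slot-counting logic, not a real function:

    #include <linux/skbuff.h>

    static int coalesce_if_needed(struct sk_buff *skb, unsigned int max_slots)
    {
            if (count_ring_slots(skb) <= max_slots) /* hypothetical helper */
                    return 0;

            /* skb_linearize() copies all paged data into skb->data, so a
             * one-byte-per-frag MTU-sized packet collapses to a handful
             * of slots afterwards. Returns 0 on success, -ENOMEM on
             * allocation failure. */
            return skb_linearize(skb);
    }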
This is a forwarded message
From: ANNIE LI <annie.li@oracle.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Date: Monday, January 14, 2013, 10:39:53 AM
Subject: [Xen-devel] xennet: skb rides the rocket: 20 slots

Resend because xen-devel wasn't copied on the original ...

===8<==============Original message text==============

Hi

I created a patch for this, but I failed to reproduce this issue and verify it. The patch is attached.

Thanks
Annie

On 2013-1-11 18:16, Ian Campbell wrote:
> On Fri, 2013-01-11 at 10:09 +0000, Paul Durrant wrote:
> > > -----Original Message-----
> > > Without GSO I don't think you should be seeing packets larger than the MTU, which would normally be either ~1500 or ~9000 and fit easily within any sensible negotiation for the max frags. I don't think you should worry unduly about this case.
> >
> > A stack could still send down a packet with one byte per frag though, right? A copy-and-coalesce path would still be needed in this case.
>
> True. In that case skb_linearize would probably do the job on Linux.
>
> Ian

===8<===========End of original message text==========
This is a forwarded message
From: Sander Eikelenboom <linux@eikelenboom.it>
To: ANNIE LI <annie.li@oracle.com>
Date: Thursday, January 24, 2013, 9:45:42 AM
Subject: [Xen-devel] xennet: skb rides the rocket: 20 slots

Resend because xen-devel wasn't copied on the original ...

===8<==============Original message text==============

Monday, January 14, 2013, 10:39:53 AM, you wrote:

> Hi
>
> I created a patch for this, but I failed to reproduce this issue and verify it. The patch is attached.

Hi Annie,

I finally had time to seriously test the patch. I put in some more warns and made the bracing a bit more explicit (I hope I did the bracing right).

The problem is that the current code crashes with:

[ 4189.815911] nf_conntrack: automatic helper assignment is deprecated and it will be removed soon. Use the iptables CT target to attach helpers instead.
[29601.932324] xennet: xennet_xmit_skb err_end: 19 slots MAX_SKB_FRAGS: 17 div_roundup:1 xennet_count_skb_frag_slots:0 offset:106 skb_headlen:54 skb->len:64454, skb->data_len:0 skb->truesize:65168 nr_frags:0 page_size:4096 prot:0800 gso:1 linearize:0 gso_segs:46 dev:eth0 transp:0006
[29601.932426] BUG: unable to handle kernel NULL pointer dereference at (null)
[29601.932461] IP: [<ffffffff816551f4>] xennet_xmit_skb+0x204/0x370
[29601.932498] PGD 2d497067 PUD 2dd94067 PMD 0
[29601.932526] Oops: 0000 [#1] PREEMPT SMP
[29601.932549] Modules linked in:
[29601.932566] CPU 0
[29601.932581] Pid: 2948, comm: deluged Not tainted 3.8.0-rc4-20130123-netpatched-rocketscience-radeon-qmax-new-a #1
[29601.932615] RIP: e030:[<ffffffff816551f4>] [<ffffffff816551f4>] xennet_xmit_skb+0x204/0x370
[29601.932650] RSP: e02b:ffff88002cd95698 EFLAGS: 00010207
[29601.932669] RAX: 0000000000000000 RBX: ffff8800049574e8 RCX: 0000000000000036
[29601.932691] RDX: ffff88000398006a RSI: ffff88002ce4b8c0 RDI: 0000000000000000
[29601.932712] RBP: ffff88002cd95758 R08: 0000000000000000 R09: 0000000000000000
[29601.932734] R10: 0000000000000000 R11: 0000000000000002 R12: ffff88002cc98000
[29601.932755] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000036
[29601.932786] FS: 00007fa9ca911700(0000) GS:ffff88002fc00000(0000) knlGS:0000000000000000
[29601.932811] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[29601.932830] CR2: 0000000000000000 CR3: 000000002cf0d000 CR4: 0000000000000660
[29601.932853] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[29601.932877] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[29601.932900] Process deluged (pid: 2948, threadinfo ffff88002cd94000, task ffff88002ce4b150)
[29601.932923] Stack:
[29601.932934] ffff880000000036 000000000000fbc6 ffff880000000000 24e008890000fe90
[29601.932970] ffff880000000000 0000000000001000 ffff880000000800 ffff880000000001
[29601.933004] 0000000000000000 ffff88000000002e ffff88002cc98000 ffff880000000006
[29601.933795] Call Trace:
[29601.933795] [<ffffffff818570b9>] dev_hard_start_xmit+0x219/0x480
[29601.933795] [<ffffffff81873856>] sch_direct_xmit+0xf6/0x290
[29601.933795] [<ffffffff818574c6>] dev_queue_xmit+0x1a6/0x5a0
[29601.933795] [<ffffffff81857320>] ? dev_hard_start_xmit+0x480/0x480
[29601.933795] [<ffffffff810af255>] ? trace_softirqs_off+0x85/0x1b0
[29601.933795] [<ffffffff818f2596>] ip_finish_output+0x226/0x530
[29601.933795] [<ffffffff818f243d>] ? ip_finish_output+0xcd/0x530
[29601.933795] [<ffffffff818f28f9>] ip_output+0x59/0xe0
[29601.933795] [<ffffffff818f1438>] ip_local_out+0x28/0x90
[29601.933795] [<ffffffff818f19df>] ip_queue_xmit+0x17f/0x490
[29601.933795] [<ffffffff818f1860>] ? ip_send_unicast_reply+0x330/0x330
[29601.933795] [<ffffffff810a55f7>] ? getnstimeofday+0x47/0xe0
[29601.933795] [<ffffffff818471e9>] ? __skb_clone+0x29/0x120
[29601.933795] [<ffffffff81907c2d>] tcp_transmit_skb+0x3fd/0x8d0
[29601.933795] [<ffffffff8190ac9a>] tcp_write_xmit+0x22a/0xa80
[29601.933795] [<ffffffff81137ebe>] ? alloc_pages_current+0xde/0x1c0
[29601.933795] [<ffffffff8190b51b>] tcp_push_one+0x2b/0x40
[29601.933795] [<ffffffff818fbdc4>] tcp_sendmsg+0x8d4/0xe10
[29601.933795] [<ffffffff819225d6>] inet_sendmsg+0xa6/0x100
[29601.933795] [<ffffffff81922530>] ? inet_autobind+0x60/0x60
[29601.933795] [<ffffffff8183eb52>] sock_sendmsg+0x82/0xb0
[29601.933795] [<ffffffff810b5d87>] ? lock_release+0x117/0x250
[29601.933795] [<ffffffff81118a34>] ? might_fault+0x84/0x90
[29601.933795] [<ffffffff811189eb>] ? might_fault+0x3b/0x90
[29601.933795] [<ffffffff8114082b>] ? __kmalloc+0xfb/0x160
[29601.933795] [<ffffffff8184c9ad>] ? verify_iovec+0x7d/0xf0
[29601.933795] [<ffffffff8183fb73>] __sys_sendmsg+0x393/0x3a0
[29601.933795] [<ffffffff819bafc5>] ? _raw_spin_unlock_irqrestore+0x75/0xa0
[29601.933795] [<ffffffff810b58f8>] ? lock_acquire+0xd8/0x100
[29601.933795] [<ffffffff810b5d87>] ? lock_release+0x117/0x250
[29601.933795] [<ffffffff81168777>] ? fget_light+0xd7/0x140
[29601.933795] [<ffffffff811686da>] ? fget_light+0x3a/0x140
[29601.933795] [<ffffffff8183fd34>] sys_sendmsg+0x44/0x80
[29601.933795] [<ffffffff819bbe29>] system_call_fastpath+0x16/0x1b
[29601.933795] Code: e9 ca fe ff ff 49 8b b4 24 80 00 00 00 48 89 df 45 31 ed e8 1f 16 20 00 48 3d 00 f0 ff ff 48 89 c7 76 08 e9 17 01 00 00 4c 89 f7 <4c> 8b 37 4c 89 e6 48 c7 07 00 00 00 00 41 ff c5 e8 b7 f3 ff ff
[29601.933795] RIP [<ffffffff816551f4>] xennet_xmit_skb+0x204/0x370
[29601.933795] RSP <ffff88002cd95698>
[29601.933795] CR2: 0000000000000000
[29602.018741] ---[ end trace 5ec54203e8f81a1b ]---
[29602.018747] Kernel panic - not syncing: Fatal exception in interrupt

Which according to addr2line is:

        segs = segs->next;

I have attached the resulting patch.

--
Sander

===8<===========End of original message text==========
On 2013-3-9 20:57, Sander Eikelenboom wrote:
> This is a forwarded message
> From: Sander Eikelenboom <linux@eikelenboom.it>
> To: ANNIE LI <annie.li@oracle.com>
> Date: Thursday, January 24, 2013, 9:45:42 AM
> Subject: [Xen-devel] xennet: skb rides the rocket: 20 slots
>
> [snip: crash log and backtrace quoted in full above]
>
> Which according to addr2line is:
>
>         segs = segs->next;
>
> I have attached the resulting patch.

Hi Sander,

Thanks for testing this. Maybe something went wrong when doing the segmenting...

Ian,

It is likely that the coalescing work in netback you mentioned in the thread "Is: SKB_MAX_LEN bites again. Was: Re: bug disabling guest interface" is not enough for some extremely large packets. So I assume the segmenting work in netfront is also necessary.

Thanks
Annie
On Sun, 2013-03-10 at 05:22 +0000, ANNIE LI wrote:
> It is likely that the coalescing work in netback you mentioned in the thread "Is: SKB_MAX_LEN bites again. Was: Re: bug disabling guest interface" is not enough for some extremely large packets. So I assume the segmenting work in netfront is also necessary.

I expect so too, although the segmentation thing seems like it is lower priority, at least right now.

Ian.
Hi Sander,

I checked this patch, and I think skb_gso_segment may return NULL (see the description of skb_gso_segment). Maybe that causes the NULL pointer dereference issue.

BTW, how did you send out those skbs with 64454 size? Only copying files, or doing an iperf/netperf test? I hope I can verify it locally.

Thanks
Annie

On 2013-3-9 20:57, Sander Eikelenboom wrote:
> This is a forwarded message
> From: Sander Eikelenboom <linux@eikelenboom.it>
> To: ANNIE LI <annie.li@oracle.com>
> Date: Thursday, January 24, 2013, 9:45:42 AM
> Subject: [Xen-devel] xennet: skb rides the rocket: 20 slots
>
> [snip: crash log and backtrace quoted in full above]
>
> Which according to addr2line is:
>
>         segs = segs->next;
>
> I have attached the resulting patch.
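If skb_gso_segment() can indeed come back NULL, or an ERR_PTR on failure, then the oops at "segs = segs->next" is exactly what unchecked use of the result would produce. A guarded sketch of the spot in question, under that assumption:

    #include <linux/err.h>
    #include <linux/skbuff.h>

    /* Segment skb, dropping it if the stack hands back nothing usable. */
    static struct sk_buff *segment_or_drop(struct sk_buff *skb)
    {
            struct sk_buff *segs = skb_gso_segment(skb, 0);

            if (IS_ERR_OR_NULL(segs)) {
                    kfree_skb(skb); /* never walk a bogus list */
                    return NULL;
            }
            return segs;    /* caller may now safely follow segs->next */
    }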
From: Sander Eikelenboom
Date: 2013-Mar-15 21:29 UTC
Subject: Re: Fwd: Re: xennet: skb rides the rocket: 20 slots
Friday, March 15, 2013, 6:14:47 AM, you wrote:

> Hi Sander,
>
> I checked this patch, and I think skb_gso_segment may return NULL (see the description of skb_gso_segment). Maybe that causes the NULL pointer dereference issue.

Will adjust the patch and see what it does ..

> BTW, how did you send out those skbs with 64454 size? Only copying files, or doing an iperf/netperf test? I hope I can verify it locally.

Unfortunately I wasn't able to determine a simple and reliable test based on iperf/netperf and the like. I can quite reliably trigger it in a PV guest with the deluge torrent client. Since the VM uses network-based storage for the data (glusterfs), that also causes some traffic, but I think it's the torrent traffic that triggers it, since bittorrent can stretch networks to their limits in bandwidth, packet size, and number of connections.

> Thanks
> Annie
>
> [snip: quoted crash log and backtrace trimmed]

--
Sander