The `size'' field of Xen network wired format is uint16_t, anything bigger than 65535 will cause overflow. The punishment introduced by XSA-39 is quite harsh - DomU is disconnected when it''s discovered to be sending corrupted skbs. However, it looks like Linux kernel will generate some bad skbs sometimes, so drop those skbs before sending to over netback to avoid being disconnected. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/net/xen-netfront.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 5527663..284059b 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -547,6 +547,18 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) unsigned int len = skb_headlen(skb); unsigned long flags; + /* + * wired format of xen_netif_tx_request only supports skb->len + * < 64K, because size field in xen_netif_tx_request is + * uint16_t. + */ + if (unlikely(skb->len > (uint16_t)(~((uint16_t)0)))) { + net_alert_ratelimited( + "xennet: skb->len = %d, too big for wired format\n", + skb->len); + goto drop; + } + slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) + xennet_count_skb_frag_slots(skb); if (unlikely(slots > MAX_SKB_FRAGS + 1)) { -- 1.7.10.4
Oops, I made a mistake in Konrad''s address. Please remove the wrong address in future CC. Wei. On Fri, 2013-03-01 at 16:31 +0000, Wei Liu wrote:> The `size'' field of Xen network wired format is uint16_t, anything bigger than > 65535 will cause overflow. > > The punishment introduced by XSA-39 is quite harsh - DomU is disconnected when > it''s discovered to be sending corrupted skbs. However, it looks like Linux > kernel will generate some bad skbs sometimes, so drop those skbs before > sending to over netback to avoid being disconnected. > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/net/xen-netfront.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index 5527663..284059b 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -547,6 +547,18 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) > unsigned int len = skb_headlen(skb); > unsigned long flags; > > + /* > + * wired format of xen_netif_tx_request only supports skb->len > + * < 64K, because size field in xen_netif_tx_request is > + * uint16_t. > + */ > + if (unlikely(skb->len > (uint16_t)(~((uint16_t)0)))) { > + net_alert_ratelimited( > + "xennet: skb->len = %d, too big for wired format\n", > + skb->len); > + goto drop; > + } > + > slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) + > xennet_count_skb_frag_slots(skb); > if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
Jan Beulich
2013-Mar-01 16:48 UTC
Re: [PATCH] xen-netfront: drop skb when skb->len > 65535
>>> On 01.03.13 at 17:31, Wei Liu <wei.liu2@citrix.com> wrote: > The `size'' field of Xen network wired format is uint16_t, anything bigger > than > 65535 will cause overflow. > > The punishment introduced by XSA-39 is quite harsh - DomU is disconnected when > it''s discovered to be sending corrupted skbs. However, it looks like Linux > kernel will generate some bad skbs sometimes, so drop those skbs before > sending to over netback to avoid being disconnected.While fixing the frontend is certainly desirable, we can''t expect everyone to deploy fixed netfronts in all their VMs - some OS versions used in there may even be out of service. So we ought to find a way to also more gracefully deal with the situation in netback, without re-opening the security issue that prompted those changes. Jan
On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote:> >>> On 01.03.13 at 17:31, Wei Liu <wei.liu2@citrix.com> wrote: > > The `size'' field of Xen network wired format is uint16_t, anything bigger > > than > > 65535 will cause overflow. > > > > The punishment introduced by XSA-39 is quite harsh - DomU is disconnected when > > it''s discovered to be sending corrupted skbs. However, it looks like Linux > > kernel will generate some bad skbs sometimes, so drop those skbs before > > sending to over netback to avoid being disconnected. > > While fixing the frontend is certainly desirable, we can''t expect > everyone to deploy fixed netfronts in all their VMs - some OS > versions used in there may even be out of service. So we > ought to find a way to also more gracefully deal with the > situation in netback, without re-opening the security issue > that prompted those changes. >Regarding the punishment bit, I think its worth discussing it a bit. But the bug is always there, it drew no attention until revealed by XSA-39. It ought to be fixed anyway. :-) Wei.> Jan >
Ian Campbell
2013-Mar-02 02:54 UTC
Re: [PATCH] xen-netfront: drop skb when skb->len > 65535
On Fri, 2013-03-01 at 17:00 +0000, Wei Liu wrote:> On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote: > > >>> On 01.03.13 at 17:31, Wei Liu <wei.liu2@citrix.com> wrote: > > > The `size'' field of Xen network wired format is uint16_t, anything bigger > > > than > > > 65535 will cause overflow. > > > > > > The punishment introduced by XSA-39 is quite harsh - DomU is disconnected when > > > it''s discovered to be sending corrupted skbs. However, it looks like Linux > > > kernel will generate some bad skbs sometimes, so drop those skbs before > > > sending to over netback to avoid being disconnected. > > > > While fixing the frontend is certainly desirable, we can''t expect > > everyone to deploy fixed netfronts in all their VMs - some OS > > versions used in there may even be out of service. So we > > ought to find a way to also more gracefully deal with the > > situation in netback, without re-opening the security issue > > that prompted those changes. > > > > Regarding the punishment bit, I think its worth discussing it a bit.Yes, the trick is figuring out what to do without reintroducing the softlockup which XSA-39 fixed. Perhaps we should allow silently consume (and drop) oversize skbs and only shutdown the rings if they also consume too many (FSVO too many) slots?> But the bug is always there, it drew no attention until revealed by > XSA-39. It ought to be fixed anyway. :-)I would have sworn that skb->len was also limited to 64k, but looking at the header I see it is actually an int and the only limit of that sort is related to MAX_SKB_FRAGS (which doesn''t actually limit the total size). OOI how big were the skbs you were seeing? Not that it really matters but do we have a handle on why the prexisting bug didn''t already cause connectivity issues? Does the retransmit (which I suppose must be happening) somehow end up using a smaller skb size? BTW you mean "wire protocol" not "wired protocol" in the comments etc. Ian.
On Sat, Mar 02, 2013 at 02:54:17AM +0000, Ian Campbell wrote:> On Fri, 2013-03-01 at 17:00 +0000, Wei Liu wrote: > > On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote: > > > >>> On 01.03.13 at 17:31, Wei Liu <wei.liu2@citrix.com> wrote: > > > > The `size'' field of Xen network wired format is uint16_t, anything bigger > > > > than > > > > 65535 will cause overflow. > > > > > > > > The punishment introduced by XSA-39 is quite harsh - DomU is disconnected when > > > > it''s discovered to be sending corrupted skbs. However, it looks like Linux > > > > kernel will generate some bad skbs sometimes, so drop those skbs before > > > > sending to over netback to avoid being disconnected. > > > > > > While fixing the frontend is certainly desirable, we can''t expect > > > everyone to deploy fixed netfronts in all their VMs - some OS > > > versions used in there may even be out of service. So we > > > ought to find a way to also more gracefully deal with the > > > situation in netback, without re-opening the security issue > > > that prompted those changes. > > > > > > > Regarding the punishment bit, I think its worth discussing it a bit. > > Yes, the trick is figuring out what to do without reintroducing the > softlockup which XSA-39 fixed. > > Perhaps we should allow silently consume (and drop) oversize skbs and > only shutdown the rings if they also consume too many (FSVO too many) > slots? > > > But the bug is always there, it drew no attention until revealed by > > XSA-39. It ought to be fixed anyway. :-) > > I would have sworn that skb->len was also limited to 64k, but looking at > the header I see it is actually an int and the only limit of that sort > is related to MAX_SKB_FRAGS (which doesn''t actually limit the total > size).I had the impression that skb->len was limited to 64k, too. But it turned out I was wrong.> > OOI how big were the skbs you were seeing?As Nick (npegg@linode.com) pointed out in his email, he saw size 65538. I can reproduce this as well by setting vif''s mtu to 100 then run iperf. 100 was just a random number I came up with when I played with fragmentation.> > Not that it really matters but do we have a handle on why the prexisting > bug didn''t already cause connectivity issues? Does the retransmit (which > I suppose must be happening) somehow end up using a smaller skb size? >Not sure. I didn''t have enough time to look into this yesterday. :-(> BTW you mean "wire protocol" not "wired protocol" in the comments etc. >Yes. Wei.> Ian. >
On 2013-3-2 10:54, Ian Campbell wrote:> On Fri, 2013-03-01 at 17:00 +0000, Wei Liu wrote: >> On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote: >>>>>> On 01.03.13 at 17:31, Wei Liu <wei.liu2@citrix.com> wrote: >>>> The `size'' field of Xen network wired format is uint16_t, anything bigger >>>> than >>>> 65535 will cause overflow. >>>> >>>> The punishment introduced by XSA-39 is quite harsh - DomU is disconnected when >>>> it''s discovered to be sending corrupted skbs. However, it looks like Linux >>>> kernel will generate some bad skbs sometimes, so drop those skbs before >>>> sending to over netback to avoid being disconnected. >>> While fixing the frontend is certainly desirable, we can''t expect >>> everyone to deploy fixed netfronts in all their VMs - some OS >>> versions used in there may even be out of service. So we >>> ought to find a way to also more gracefully deal with the >>> situation in netback, without re-opening the security issue >>> that prompted those changes. >>> >> Regarding the punishment bit, I think its worth discussing it a bit. > Yes, the trick is figuring out what to do without reintroducing the > softlockup which XSA-39 fixed. > > Perhaps we should allow silently consume (and drop) oversize skbs and > only shutdown the rings if they also consume too many (FSVO too many) > slots? > >> But the bug is always there, it drew no attention until revealed by >> XSA-39. It ought to be fixed anyway. :-) > I would have sworn that skb->len was also limited to 64k, but looking at > the header I see it is actually an int and the only limit of that sort > is related to MAX_SKB_FRAGS (which doesn''t actually limit the total > size).If compound page enabled, it is very possible that netback runs into following code. if (unlikely(frags >= MAX_SKB_FRAGS)) { netdev_err(vif->dev, "Too many frags\n"); netbk_fatal_tx_err(vif); return -frags; } MAX_SKB_FRAGS does not actually limit the total size in theory. But it does limit the total size in netback since netback only supports one-page-sized fragments. Thanks Annie
On 2013-3-2 21:32, Wei Liu wrote:> On Sat, Mar 02, 2013 at 02:54:17AM +0000, Ian Campbell wrote: >> On Fri, 2013-03-01 at 17:00 +0000, Wei Liu wrote: >>> On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote: >>>>>>> On 01.03.13 at 17:31, Wei Liu <wei.liu2@citrix.com> wrote: >>>>> The `size'' field of Xen network wired format is uint16_t, anything bigger >>>>> than >>>>> 65535 will cause overflow. >>>>> >>>>> The punishment introduced by XSA-39 is quite harsh - DomU is disconnected when >>>>> it''s discovered to be sending corrupted skbs. However, it looks like Linux >>>>> kernel will generate some bad skbs sometimes, so drop those skbs before >>>>> sending to over netback to avoid being disconnected. >>>> While fixing the frontend is certainly desirable, we can''t expect >>>> everyone to deploy fixed netfronts in all their VMs - some OS >>>> versions used in there may even be out of service. So we >>>> ought to find a way to also more gracefully deal with the >>>> situation in netback, without re-opening the security issue >>>> that prompted those changes. >>>> >>> Regarding the punishment bit, I think its worth discussing it a bit. >> Yes, the trick is figuring out what to do without reintroducing the >> softlockup which XSA-39 fixed. >> >> Perhaps we should allow silently consume (and drop) oversize skbs and >> only shutdown the rings if they also consume too many (FSVO too many) >> slots? >> >>> But the bug is always there, it drew no attention until revealed by >>> XSA-39. It ought to be fixed anyway. :-) >> I would have sworn that skb->len was also limited to 64k, but looking at >> the header I see it is actually an int and the only limit of that sort >> is related to MAX_SKB_FRAGS (which doesn''t actually limit the total >> size). > I had the impression that skb->len was limited to 64k, too. But it > turned out I was wrong. > >> OOI how big were the skbs you were seeing? > As Nick (npegg@linode.com) pointed out in his email, he saw size 65538. > I can reproduce this as well by setting vif''s mtu to 100 then run iperf. > 100 was just a random number I came up with when I played with > fragmentation.Did you notice any GSO packets during your iperf test? Thanks Annie
On 3/2/13 8:32 AM, Wei Liu wrote:> > As Nick (npegg@linode.com) pointed out in his email, he saw size 65538. > I can reproduce this as well by setting vif''s mtu to 100 then run iperf. > 100 was just a random number I came up with when I played with > fragmentation. >Wei: How exactly did you set the MTU on the vif and what were your arguments to iperf? I tried this on a test host and was unable to trigger the XSA-39 protection. As for a more graceful punishment, I noticed that xenvif_carrier_off() uses netif_carrier_off() to discard queued packets. Would doing a netif_carrier_off() and then a netif_carrier_on() be sufficient to drop the bad packets and keep netback from spinning? Would blinking the vif''s carrier in this way allow the DomU to gracefully resume network traffic? -Nick
On Wed, 2013-03-06 at 17:20 +0000, Nick Pegg wrote:> On 3/2/13 8:32 AM, Wei Liu wrote: > > > > As Nick (npegg@linode.com) pointed out in his email, he saw size 65538. > > I can reproduce this as well by setting vif''s mtu to 100 then run iperf. > > 100 was just a random number I came up with when I played with > > fragmentation. > > > > Wei: How exactly did you set the MTU on the vif and what were your > arguments to iperf? I tried this on a test host and was unable to > trigger the XSA-39 protection. >Sorry I didn''t state this clearly. I tend to call the interface inside VM as vif as well. I did this inside a VM, ifconfig eth0 mtu 100. Nothing fancy added to iperf command line, just `iperf -c XXX` in VM and `iperf -s` in host (I limited the test time with -t though).> As for a more graceful punishment, I noticed that xenvif_carrier_off() > uses netif_carrier_off() to discard queued packets. Would doing a > netif_carrier_off() and then a netif_carrier_on() be sufficient to drop > the bad packets and keep netback from spinning? Would blinking the vif''s > carrier in this way allow the DomU to gracefully resume network traffic? >I think the spinning has something to do with the ring, not packets -- I''m not very sure though, because I didn''t take part in the discussion / development of XSA-39. Draining host queues is not likely to fix this. Wei.> > -Nick
Jacek Milewicz
2013-Mar-06 17:57 UTC
Re: [PATCH] xen-netfront: drop skb when skb->len > 65535
> On Wed, 2013-03-06 at 17:20 +0000, Nick Pegg wrote: > > On 3/2/13 8:32 AM, Wei Liu wrote: > > > > > > As Nick (npegg@linode.com) pointed out in his email, he saw size65538.> > > I can reproduce this as well by setting vif''s mtu to 100 then runiperf.> > > 100 was just a random number I came up with when I played with > > > fragmentation. > > > > > > > Wei: How exactly did you set the MTU on the vif and what were your > > arguments to iperf? I tried this on a test host and was unable to > > trigger the XSA-39 protection. > > > > Sorry I didn''t state this clearly. I tend to call the interface insideVM as vif as> well. I did this inside a VM, ifconfig eth0 mtu 100. > Nothing fancy added to iperf command line, just `iperf -c XXX` in VM and > `iperf -s` in host (I limited the test time with -t though).We have recently hit the same problem (however in our case, larger NFS transfer was enough to trigger it) and I''ve found that it only happens if when MAX_SKB_FRAGS is different on domU and dom0 (as in, domU using older kernel (without this change https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commi t/drivers/net/xen-netback/netback.c?id=48856286b64e4b66ec62b94e504d0b29c1a de664 )) can you confirm your domU kernel version?
On Wed, 2013-03-06 at 17:57 +0000, Jacek Milewicz wrote:> > On Wed, 2013-03-06 at 17:20 +0000, Nick Pegg wrote: > > > On 3/2/13 8:32 AM, Wei Liu wrote: > > > > > > > > As Nick (npegg@linode.com) pointed out in his email, he saw size > 65538. > > > > I can reproduce this as well by setting vif''s mtu to 100 then run > iperf. > > > > 100 was just a random number I came up with when I played with > > > > fragmentation. > > > > > > > > > > Wei: How exactly did you set the MTU on the vif and what were your > > > arguments to iperf? I tried this on a test host and was unable to > > > trigger the XSA-39 protection. > > > > > > > Sorry I didn''t state this clearly. I tend to call the interface inside > VM as vif as > > well. I did this inside a VM, ifconfig eth0 mtu 100. > > Nothing fancy added to iperf command line, just `iperf -c XXX` in VM and > > `iperf -s` in host (I limited the test time with -t though). > > We have recently hit the same problem (however in our case, larger NFS > transfer was enough to trigger it) and I''ve found that it only happens if > when MAX_SKB_FRAGS is different on domU and dom0 (as in, domU using older > kernel (without this change > https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commi > t/drivers/net/xen-netback/netback.c?id=48856286b64e4b66ec62b94e504d0b29c1a > de664 )) can you confirm your domU kernel version? >Actually I''m replying to your other email now. :-) The root cause of your problem is a bit different than the one I discovered, but they ended up the same error path anyway. My kernel should be 3.9-rc or something. I use it for both Dom0 and DomU. Wei.>