thr3ads.net - Xen devel - [PATCH] xen-netfront: drop skb when skb->len

If this information is useful, please help other people find it:
Share via:

Wei Liu

2013-Mar-01 16:31 UTC

[PATCH] xen-netfront: drop skb when skb->len > 65535

The `size'' field of Xen network wired format is uint16_t, anything
bigger than
65535 will cause overflow.

The punishment introduced by XSA-39 is quite harsh - DomU is disconnected when
it''s discovered to be sending corrupted skbs. However, it looks like
Linux
kernel will generate some bad skbs sometimes, so drop those skbs before
sending to over netback to avoid being disconnected.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netfront.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 5527663..284059b 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -547,6 +547,18 @@ static int xennet_start_xmit(struct sk_buff *skb, struct
net_device *dev)
 	unsigned int len = skb_headlen(skb);
 	unsigned long flags;
 
+	/*
+	 * wired format of xen_netif_tx_request only supports skb->len
+	 * < 64K, because size field in xen_netif_tx_request is
+	 * uint16_t.
+	 */
+	if (unlikely(skb->len > (uint16_t)(~((uint16_t)0)))) {
+		net_alert_ratelimited(
+			"xennet: skb->len = %d, too big for wired format\n",
+			skb->len);
+		goto drop;
+	}
+
 	slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) +
 		xennet_count_skb_frag_slots(skb);
 	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
-- 
1.7.10.4

Wei Liu

2013-Mar-01 16:34 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Oops, I made a mistake in Konrad''s address.

Please remove the wrong address in future CC.


Wei.


On Fri, 2013-03-01 at 16:31 +0000, Wei Liu wrote:> The `size'' field of Xen network wired format is uint16_t, anything
bigger than
> 65535 will cause overflow.
> 
> The punishment introduced by XSA-39 is quite harsh - DomU is disconnected
when
> it''s discovered to be sending corrupted skbs. However, it looks
like Linux
> kernel will generate some bad skbs sometimes, so drop those skbs before
> sending to over netback to avoid being disconnected.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netfront.c |   12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 5527663..284059b 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -547,6 +547,18 @@ static int xennet_start_xmit(struct sk_buff *skb,
struct net_device *dev)
>  	unsigned int len = skb_headlen(skb);
>  	unsigned long flags;
>  
> +	/*
> +	 * wired format of xen_netif_tx_request only supports skb->len
> +	 * < 64K, because size field in xen_netif_tx_request is
> +	 * uint16_t.
> +	 */
> +	if (unlikely(skb->len > (uint16_t)(~((uint16_t)0)))) {
> +		net_alert_ratelimited(
> +			"xennet: skb->len = %d, too big for wired format\n",
> +			skb->len);
> +		goto drop;
> +	}
> +
>  	slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) +
>  		xennet_count_skb_frag_slots(skb);
>  	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {

Jan Beulich

2013-Mar-01 16:48 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

>>> On 01.03.13 at 17:31, Wei Liu <wei.liu2@citrix.com> wrote:
> The `size'' field of Xen network wired format is uint16_t, anything
bigger
> than
> 65535 will cause overflow.
> 
> The punishment introduced by XSA-39 is quite harsh - DomU is disconnected
when
> it''s discovered to be sending corrupted skbs. However, it looks
like Linux
> kernel will generate some bad skbs sometimes, so drop those skbs before
> sending to over netback to avoid being disconnected.
While fixing the frontend is certainly desirable, we can''t expect
everyone to deploy fixed netfronts in all their VMs - some OS
versions used in there may even be out of service. So we
ought to find a way to also more gracefully deal with the
situation in netback, without re-opening the security issue
that prompted those changes.

Jan

Wei Liu

2013-Mar-01 17:00 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote:> >>> On 01.03.13 at 17:31, Wei Liu <wei.liu2@citrix.com>
wrote:
> > The `size'' field of Xen network wired format is uint16_t,
anything bigger
> > than
> > 65535 will cause overflow.
> > 
> > The punishment introduced by XSA-39 is quite harsh - DomU is
disconnected when
> > it''s discovered to be sending corrupted skbs. However, it
looks like Linux
> > kernel will generate some bad skbs sometimes, so drop those skbs
before
> > sending to over netback to avoid being disconnected.
> 
> While fixing the frontend is certainly desirable, we can''t expect
> everyone to deploy fixed netfronts in all their VMs - some OS
> versions used in there may even be out of service. So we
> ought to find a way to also more gracefully deal with the
> situation in netback, without re-opening the security issue
> that prompted those changes.
> 
Regarding the punishment bit, I think its worth discussing it a bit.

But the bug is always there, it drew no attention until revealed by
XSA-39. It ought to be fixed anyway. :-)


Wei.
> Jan
>

Ian Campbell

2013-Mar-02 02:54 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

On Fri, 2013-03-01 at 17:00 +0000, Wei Liu wrote:> On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote:
> > >>> On 01.03.13 at 17:31, Wei Liu <wei.liu2@citrix.com>
wrote:
> > > The `size'' field of Xen network wired format is
uint16_t, anything bigger
> > > than
> > > 65535 will cause overflow.
> > > 
> > > The punishment introduced by XSA-39 is quite harsh - DomU is
disconnected when
> > > it''s discovered to be sending corrupted skbs. However,
it looks like Linux
> > > kernel will generate some bad skbs sometimes, so drop those skbs
before
> > > sending to over netback to avoid being disconnected.
> > 
> > While fixing the frontend is certainly desirable, we can''t
expect
> > everyone to deploy fixed netfronts in all their VMs - some OS
> > versions used in there may even be out of service. So we
> > ought to find a way to also more gracefully deal with the
> > situation in netback, without re-opening the security issue
> > that prompted those changes.
> > 
> 
> Regarding the punishment bit, I think its worth discussing it a bit.
Yes, the trick is figuring out what to do without reintroducing the
softlockup which XSA-39 fixed.

Perhaps we should allow silently consume (and drop) oversize skbs and
only shutdown the rings if they also consume too many (FSVO too many)
slots?
> But the bug is always there, it drew no attention until revealed by
> XSA-39. It ought to be fixed anyway. :-)
I would have sworn that skb->len was also limited to 64k, but looking at
the header I see it is actually an int and the only limit of that sort
is related to MAX_SKB_FRAGS (which doesn''t actually limit the total
size).

OOI how big were the skbs you were seeing?

Not that it really matters but do we have a handle on why the prexisting
bug didn''t already cause connectivity issues? Does the retransmit
(which
I suppose must be happening) somehow end up using a smaller skb size?

BTW you mean "wire protocol" not "wired protocol" in the
comments etc.

Ian.

Wei Liu

2013-Mar-02 13:32 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

On Sat, Mar 02, 2013 at 02:54:17AM +0000, Ian Campbell
wrote:> On Fri, 2013-03-01 at 17:00 +0000, Wei Liu wrote:
> > On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote:
> > > >>> On 01.03.13 at 17:31, Wei Liu
<wei.liu2@citrix.com> wrote:
> > > > The `size'' field of Xen network wired format is
uint16_t, anything bigger
> > > > than
> > > > 65535 will cause overflow.
> > > > 
> > > > The punishment introduced by XSA-39 is quite harsh - DomU is
disconnected when
> > > > it''s discovered to be sending corrupted skbs.
However, it looks like Linux
> > > > kernel will generate some bad skbs sometimes, so drop those
skbs before
> > > > sending to over netback to avoid being disconnected.
> > > 
> > > While fixing the frontend is certainly desirable, we
can''t expect
> > > everyone to deploy fixed netfronts in all their VMs - some OS
> > > versions used in there may even be out of service. So we
> > > ought to find a way to also more gracefully deal with the
> > > situation in netback, without re-opening the security issue
> > > that prompted those changes.
> > > 
> > 
> > Regarding the punishment bit, I think its worth discussing it a bit.
> 
> Yes, the trick is figuring out what to do without reintroducing the
> softlockup which XSA-39 fixed.
> 
> Perhaps we should allow silently consume (and drop) oversize skbs and
> only shutdown the rings if they also consume too many (FSVO too many)
> slots?
> 
> > But the bug is always there, it drew no attention until revealed by
> > XSA-39. It ought to be fixed anyway. :-)
> 
> I would have sworn that skb->len was also limited to 64k, but looking at
> the header I see it is actually an int and the only limit of that sort
> is related to MAX_SKB_FRAGS (which doesn''t actually limit the
total
> size).
I had the impression that skb->len was limited to 64k, too. But it
turned out I was wrong.
> 
> OOI how big were the skbs you were seeing?
As Nick (npegg@linode.com) pointed out in his email, he saw size 65538.
I can reproduce this as well by setting vif''s mtu to 100 then run
iperf.
100 was just a random number I came up with when I played with
fragmentation.
> 
> Not that it really matters but do we have a handle on why the prexisting
> bug didn''t already cause connectivity issues? Does the retransmit
(which
> I suppose must be happening) somehow end up using a smaller skb size?
> 
Not sure. I didn''t have enough time to look into this yesterday. :-(
> BTW you mean "wire protocol" not "wired protocol" in
the comments etc.
> 
Yes.


Wei.
> Ian.
>

annie li

2013-Mar-03 04:13 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

On 2013-3-2 10:54, Ian Campbell wrote:> On Fri, 2013-03-01 at 17:00 +0000, Wei Liu wrote:
>> On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote:
>>>>>> On 01.03.13 at 17:31, Wei Liu
<wei.liu2@citrix.com> wrote:
>>>> The `size'' field of Xen network wired format is
uint16_t, anything bigger
>>>> than
>>>> 65535 will cause overflow.
>>>>
>>>> The punishment introduced by XSA-39 is quite harsh - DomU is
disconnected when
>>>> it''s discovered to be sending corrupted skbs. However,
it looks like Linux
>>>> kernel will generate some bad skbs sometimes, so drop those
skbs before
>>>> sending to over netback to avoid being disconnected.
>>> While fixing the frontend is certainly desirable, we can''t
expect
>>> everyone to deploy fixed netfronts in all their VMs - some OS
>>> versions used in there may even be out of service. So we
>>> ought to find a way to also more gracefully deal with the
>>> situation in netback, without re-opening the security issue
>>> that prompted those changes.
>>>
>> Regarding the punishment bit, I think its worth discussing it a bit.
> Yes, the trick is figuring out what to do without reintroducing the
> softlockup which XSA-39 fixed.
>
> Perhaps we should allow silently consume (and drop) oversize skbs and
> only shutdown the rings if they also consume too many (FSVO too many)
> slots?
>
>> But the bug is always there, it drew no attention until revealed by
>> XSA-39. It ought to be fixed anyway. :-)
> I would have sworn that skb->len was also limited to 64k, but looking at
> the header I see it is actually an int and the only limit of that sort
> is related to MAX_SKB_FRAGS (which doesn''t actually limit the
total
> size).
If compound page enabled, it is very possible that netback runs into 
following code.

                 if (unlikely(frags >= MAX_SKB_FRAGS)) {
                         netdev_err(vif->dev, "Too many frags\n");
                         netbk_fatal_tx_err(vif);
                         return -frags;
                 }

MAX_SKB_FRAGS does not actually limit the total size in theory. But it does
limit the total size in netback since netback only supports one-page-sized
fragments.

Thanks
Annie

annie li

2013-Mar-03 05:02 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

On 2013-3-2 21:32, Wei Liu wrote:> On Sat, Mar 02, 2013 at 02:54:17AM +0000, Ian Campbell wrote:
>> On Fri, 2013-03-01 at 17:00 +0000, Wei Liu wrote:
>>> On Fri, 2013-03-01 at 16:48 +0000, Jan Beulich wrote:
>>>>>>> On 01.03.13 at 17:31, Wei Liu
<wei.liu2@citrix.com> wrote:
>>>>> The `size'' field of Xen network wired format is
uint16_t, anything bigger
>>>>> than
>>>>> 65535 will cause overflow.
>>>>>
>>>>> The punishment introduced by XSA-39 is quite harsh - DomU
is disconnected when
>>>>> it''s discovered to be sending corrupted skbs.
However, it looks like Linux
>>>>> kernel will generate some bad skbs sometimes, so drop those
skbs before
>>>>> sending to over netback to avoid being disconnected.
>>>> While fixing the frontend is certainly desirable, we
can''t expect
>>>> everyone to deploy fixed netfronts in all their VMs - some OS
>>>> versions used in there may even be out of service. So we
>>>> ought to find a way to also more gracefully deal with the
>>>> situation in netback, without re-opening the security issue
>>>> that prompted those changes.
>>>>
>>> Regarding the punishment bit, I think its worth discussing it a
bit.
>> Yes, the trick is figuring out what to do without reintroducing the
>> softlockup which XSA-39 fixed.
>>
>> Perhaps we should allow silently consume (and drop) oversize skbs and
>> only shutdown the rings if they also consume too many (FSVO too many)
>> slots?
>>
>>> But the bug is always there, it drew no attention until revealed by
>>> XSA-39. It ought to be fixed anyway. :-)
>> I would have sworn that skb->len was also limited to 64k, but
looking at
>> the header I see it is actually an int and the only limit of that sort
>> is related to MAX_SKB_FRAGS (which doesn''t actually limit the
total
>> size).
> I had the impression that skb->len was limited to 64k, too. But it
> turned out I was wrong.
>
>> OOI how big were the skbs you were seeing?
> As Nick (npegg@linode.com) pointed out in his email, he saw size 65538.
> I can reproduce this as well by setting vif''s mtu to 100 then run
iperf.
> 100 was just a random number I came up with when I played with
> fragmentation.
Did you notice any GSO packets during your iperf test?

Thanks
Annie

Nick Pegg

2013-Mar-06 17:20 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

On 3/2/13 8:32 AM, Wei Liu wrote:> 
> As Nick (npegg@linode.com) pointed out in his email, he saw size 65538.
> I can reproduce this as well by setting vif''s mtu to 100 then run
iperf.
> 100 was just a random number I came up with when I played with
> fragmentation.
> 
Wei: How exactly did you set the MTU on the vif and what were your
arguments to iperf? I tried this on a test host and was unable to
trigger the XSA-39 protection.

As for a more graceful punishment, I noticed that xenvif_carrier_off()
uses netif_carrier_off() to discard queued packets. Would doing a
netif_carrier_off() and then a netif_carrier_on() be sufficient to drop
the bad packets and keep netback from spinning? Would blinking the
vif''s
carrier in this way allow the DomU to gracefully resume network traffic?

-Nick

Wei Liu

2013-Mar-06 17:31 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

On Wed, 2013-03-06 at 17:20 +0000, Nick Pegg wrote:> On 3/2/13 8:32 AM, Wei Liu wrote:
> > 
> > As Nick (npegg@linode.com) pointed out in his email, he saw size
65538.
> > I can reproduce this as well by setting vif''s mtu to 100 then
run iperf.
> > 100 was just a random number I came up with when I played with
> > fragmentation.
> > 
> 
> Wei: How exactly did you set the MTU on the vif and what were your
> arguments to iperf? I tried this on a test host and was unable to
> trigger the XSA-39 protection.
> 
Sorry I didn''t state this clearly. I tend to call the interface inside
VM as vif as well. I did this inside a VM, ifconfig eth0 mtu 100.
Nothing fancy added to iperf command line, just `iperf -c XXX` in VM and
`iperf -s` in host (I limited the test time with -t though).
> As for a more graceful punishment, I noticed that xenvif_carrier_off()
> uses netif_carrier_off() to discard queued packets. Would doing a
> netif_carrier_off() and then a netif_carrier_on() be sufficient to drop
> the bad packets and keep netback from spinning? Would blinking the
vif''s
> carrier in this way allow the DomU to gracefully resume network traffic?
> 
I think the spinning has something to do with the ring, not packets --
I''m not very sure though, because I didn''t take part in the
discussion /
development of XSA-39. Draining host queues is not likely to fix this.


Wei.
> 
> -Nick

Jacek Milewicz

2013-Mar-06 17:57 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

> On Wed, 2013-03-06 at 17:20 +0000, Nick Pegg wrote:
> > On 3/2/13 8:32 AM, Wei Liu wrote:
> > >
> > > As Nick (npegg@linode.com) pointed out in his email, he saw size
65538.> > > I can reproduce this as well by setting vif''s mtu to 100
then run
iperf.> > > 100 was just a random number I came up with when I played with
> > > fragmentation.
> > >
> >
> > Wei: How exactly did you set the MTU on the vif and what were your
> > arguments to iperf? I tried this on a test host and was unable to
> > trigger the XSA-39 protection.
> >
> 
> Sorry I didn''t state this clearly. I tend to call the interface
inside
VM as vif as> well. I did this inside a VM, ifconfig eth0 mtu 100.
> Nothing fancy added to iperf command line, just `iperf -c XXX` in VM and
> `iperf -s` in host (I limited the test time with -t though).
We have recently hit the same problem (however in our case, larger NFS
transfer was enough to trigger it) and I''ve found that it only happens
if
when MAX_SKB_FRAGS is different on domU and dom0 (as in, domU using older
kernel (without this change
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commi
t/drivers/net/xen-netback/netback.c?id=48856286b64e4b66ec62b94e504d0b29c1a
de664 )) can you confirm your domU kernel version?

Wei Liu

2013-Mar-06 18:04 UTC

head link

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

On Wed, 2013-03-06 at 17:57 +0000, Jacek Milewicz wrote:> > On Wed, 2013-03-06 at 17:20 +0000, Nick Pegg wrote:
> > > On 3/2/13 8:32 AM, Wei Liu wrote:
> > > >
> > > > As Nick (npegg@linode.com) pointed out in his email, he saw
size
> 65538.
> > > > I can reproduce this as well by setting vif''s mtu
to 100 then run
> iperf.
> > > > 100 was just a random number I came up with when I played
with
> > > > fragmentation.
> > > >
> > >
> > > Wei: How exactly did you set the MTU on the vif and what were
your
> > > arguments to iperf? I tried this on a test host and was unable to
> > > trigger the XSA-39 protection.
> > >
> > 
> > Sorry I didn''t state this clearly. I tend to call the
interface inside
> VM as vif as
> > well. I did this inside a VM, ifconfig eth0 mtu 100.
> > Nothing fancy added to iperf command line, just `iperf -c XXX` in VM
and
> > `iperf -s` in host (I limited the test time with -t though).
> 
> We have recently hit the same problem (however in our case, larger NFS
> transfer was enough to trigger it) and I''ve found that it only
happens if
> when MAX_SKB_FRAGS is different on domU and dom0 (as in, domU using older
> kernel (without this change
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commi
> t/drivers/net/xen-netback/netback.c?id=48856286b64e4b66ec62b94e504d0b29c1a
> de664 )) can you confirm your domU kernel version?
> 
Actually I''m replying to your other email now. :-)

The root cause of your problem is a bit different than the one I
discovered, but they ended up the same error path anyway.

My kernel should be 3.9-rc or something. I use it for both Dom0 and
DomU.


Wei.
>

Xen devel - Mar 2013 - [PATCH] xen-netfront: drop skb when skb->len > 65535

[PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535

Re: [PATCH] xen-netfront: drop skb when skb->len > 65535