thr3ads.net - Virtualization - [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set. [Jan 2015]

Hannes Frederic Sowa

2015-Jan-28 10:34 UTC

[PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

Hi,

On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
> On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
> > Hello,
> > 
> > On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
> > > On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
> > > > On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
> > > > > On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
> > > > > > On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
> > > > > >> On Tue, Jan 27, 2015 at 02:47:54AM +0000, Ben Hutchings wrote:
> > > > > >>> On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
> > > > > >>>> If the IPv6 fragment id has not been set and we perform
> > > > > >>>> fragmentation due to UFO, select a new fragment id.
> > > > > >>>> When we store the fragment id into skb_shinfo, set the bit
> > > > > >>>> in the skb so we can re-use the selected id.
> > > > > >>>> This preserves the behavior of UFO packets generated on the
> > > > > >>>> host and solves the issue of id generation for packet sockets
> > > > > >>>> and tap/macvtap devices.
> > > > > >>>>
> > > > > >>>> This patch moves ipv6_select_ident() back in to the header file.
> > > > > >>>> It also provides the helper function that sets skb_shinfo() frag
> > > > > >>>> id and sets the bit.
> > > > > >>>>
> > > > > >>>> It also makes sure that we select the fragment id when doing
> > > > > >>>> just gso validation, since it's possible for the packet to
> > > > > >>>> come from an untrusted source (VM) and be forwarded through
> > > > > >>>> a UFO enabled device which will expect the fragment id.
> > > > > >>>>
> > > > > >>>> CC: Eric Dumazet <edumazet at google.com>
> > > > > >>>> Signed-off-by: Vladislav Yasevich <vyasevic at redhat.com>
> > > > > >>>> ---
> > > > > >>>>  include/linux/skbuff.h |  3 ++-
> > > > > >>>>  include/net/ipv6.h     |  2 ++
> > > > > >>>>  net/ipv6/ip6_output.c  |  4 ++--
> > > > > >>>>  net/ipv6/output_core.c |  9 ++++++++-
> > > > > >>>>  net/ipv6/udp_offload.c | 10 +++++++++-
> > > > > >>>>  5 files changed, 23 insertions(+), 5 deletions(-)
> > > > > >>>>
> > > > > >>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > > > > >>>> index 85ab7d7..3ad5203 100644
> > > > > >>>> --- a/include/linux/skbuff.h
> > > > > >>>> +++ b/include/linux/skbuff.h
> > > > > >>>> @@ -605,7 +605,8 @@ struct sk_buff {
> > > > > >>>>  	__u8			ipvs_property:1;
> > > > > >>>>  	__u8			inner_protocol_type:1;
> > > > > >>>>  	__u8			remcsum_offload:1;
> > > > > >>>> -	/* 3 or 5 bit hole */
> > > > > >>>> +	__u8			ufo_fragid_set:1;
> > > > > >>> [...]
> > > > > >>>
> > > > > >>> Doesn't the flag belong in struct skb_shared_info, rather than struct
> > > > > >>> sk_buff?  Otherwise this looks fine.
> > > > > >>>
> > > > > >>> Ben.
> > > > > >>
> > > > > >> Hmm we seem to be out of tx flags.
> > > > > >> Maybe ip6_frag_id == 0 should mean "not set".
> > > > > > 
> > > > > > Maybe that is the best idea. Definitely the ufo_fragid_set bit should
> > > > > > move into the skb_shared_info area.
> > > > > 
> > > > > That's what I originally wanted to do, but had to move and grow txflags thus
> > > > > skb_shinfo ended up growing.  I wanted to avoid that, so stole an skb flag.
> > > > > 
> > > > > I considered treating fragid == 0 as unset, but a 0 fragid is perfectly valid
> > > > > from the protocol perspective and could actually be generated by the id generator
> > > > > functions.  This may cause us to call the id generation multiple times.
> > > > 
> > > > Are there plans in the long run to let virtio_net transmit auxiliary
> > > > data to the other end so we can clean all of this up one day?
> > > > 
> > > > I don't like the whole situation: looking into the virtio_net headers
> > > > just adding a field for ipv6 fragmentation ids to those small structs
> > > > seems bloated, not doing it feels incorrect. :/
> > > > 
> > > > Thoughts?
> > > > 
> > > > Bye,
> > > > Hannes
> > > 
> > > I'm not sure - what will be achieved by generating the IDs guest side as
> > > opposed to host side?  It's certainly harder to get hold of entropy
> > > guest-side.
> > 
> > It is not only about entropy but about uniqueness.  Also fragmentation
> > ids should not be discoverable,
> 
> I believe "predictable" is the language used by the IETF draft.
> 
> > so there are several aspects:
> > 
> > I see fragmentation id generation still as security critical:
> > When Eric patched the frag id generator in 04ca6973f7c1a0d ("ip: make IP
> > identifiers less predictable") I could patch my kernels and use the
> > patch regardless of the machine being virtualized or not. It was not
> > dependent on the hypervisor.
> 
> And now it's even easier - just patch the hypervisor, and all VMs
> automatically benefit.

Sometimes the hypervisor is not under my control. You would need to
patch both kernels in your case - non gso frames would still get the
fragmentation id generated in the host kernel.

> > I think that is the same reasoning why we
> > don't support TOE.
> > If we use one generator in the hypervisor in an openstack alike setting,
> > the host deals with quite a lot of overlay networks. A lot of default
> > configurations use the same addresses internally, so on the hypervisor
> > the frag id generators would interfere by design.
> > I could come up with an attack scenario for DNS servers (again :) ):
> > 
> > You are sitting next to a DNS server on the same hypervisor and can send
> > packets without source validation (because that is handled later on in
> > case of openvswitch when the packet is put into the corresponding
> > overlay network). You emit a gso packet with the same source and
> > destination addresses as the DNS server would do and would get a
> > fragmentation id which is linearly (+ time delta) incremented depending
> > on the source and destination address. With such a leak you could start
> > trying to attack and spoof DNS responses (fragmentation attacks etc.).
> > See also details on such kind of attacks in the description of commit
> > 04ca6973f7c1a0d.
> > 
> > AFAIK IETF tried with IPv6 to push fragmentation id generation to the
> > end hosts, that's also the reason for the introduction of atomic
> > fragments (which are now being rolled back ;) ).
> > 
> > Still it is better to generate a frag id on the hypervisor than just
> > sending a 0, so I am ok with this change, albeit not happy.
> > 
> > Thanks,
> > Hannes
> > 
> 
> OK so to summarize, identifiers are only re-randomized once per jiffy,
> so you worry that within this window, an external observer can discover
> past fragment ID values and so predict the future ones.
> All that's required is that two paths go through the same box performing
> fragmentation.
> 
> Is that a fair summary?
> 
> If yes, we can make this a bit harder by mixing in some
> data per input and/or output devices.
> 
> For example, just to give you the idea:
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 683d493..4faa7ef 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3625,6 +3625,7 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
>  	trace_netif_receive_skb(skb);
>  
>  	orig_dev = skb->dev;
> +	skb_shinfo(skb)->ip6_frag_id = skb->dev->ifindex;
>  
>  	skb_reset_network_header(skb);
>  	if (!skb_transport_header_was_set(skb))
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index ce69a12..819a821 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1092,7 +1092,8 @@ static inline int ip6_ufo_append_data(struct sock *sk,
>  				     sizeof(struct frag_hdr)) & ~7;
>  	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
>  	ipv6_select_ident(&fhdr, rt);
> -	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
> +	skb_shinfo(skb)->ip6_frag_id = jhash_1word(skb_shinfo(skb)->ip6_frag_id,
> +						   fhdr.identification);
>  
>  append:
>  	return skb_append_datato_frags(sk, skb, getfrag, from,
> 
I thought about mixing in the incoming interface identifier into the
frag id generation, but that could hurt us badly as soon as a VM has
more than one interface to the outside world and uses e.g. ECMP. We need
to make sure that those frag ids are unique and the kernel needs to be
better than just using a random number generator.

Bye,
Hannes
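
One way to read the uniqueness concern above is that a per-interface mixing step keeps ids distinct on each interface but gives no such guarantee across interfaces. The sketch below illustrates only that general point. `toy_mix()` is a hypothetical function invented for this example and is deliberately weak so the collision can be checked by hand; it is not the kernel's `jhash_1word()` and says nothing about how often a real hash would collide.

```c
#include <stdint.h>

/*
 * Hypothetical mixing step (an assumption for illustration, NOT jhash_1word()).
 * For a fixed ifindex it is a bijection on the 32-bit id: adding a constant
 * and multiplying by an odd constant are both invertible modulo 2^32.
 */
static uint32_t toy_mix(uint32_t base_id, uint32_t ifindex)
{
	return (base_id + ifindex) * 2654435761u;
}

/*
 * Returns 1 if two (base_id, ifindex) pairs end up with the same mixed id.
 * On a single interface this can only happen when the base ids are equal;
 * across two interfaces distinct base ids may map to the same value.
 */
static int toy_ids_collide(uint32_t id_a, uint32_t if_a,
			   uint32_t id_b, uint32_t if_b)
{
	return toy_mix(id_a, if_a) == toy_mix(id_b, if_b);
}
```

With this toy function, base ids 4 and 5 stay distinct when both packets used interface 2, but base id 5 on interface 2 and base id 4 on interface 3 come out identical. Whether this matters in practice depends on where the ifindex is sampled relative to routing, which is exactly what is being debated here.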

Hannes Frederic Sowa

2015-Jan-28 10:39 UTC

[PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

On Mi, 2015-01-28 at 11:34 +0100, Hannes Frederic Sowa wrote:
> 
> > And now it's even easier - just patch the hypervisor, and all VMs
> > automatically benefit.
> 
> Sometimes the hypervisor is not under my control. You would need to
> patch both kernels in your case - non gso frames would still get the
> fragmentation id generated in the host kernel.

Actually this now became my biggest concern. :/

Bye,
Hannes

Michael S. Tsirkin

2015-Jan-28 13:43 UTC

[PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
> Hi,
> 
> On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
> > On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
> > > Hello,
> > > 
> > > On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
> > > > On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic
Sowa wrote:
> > > > > On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
> > > > > > On 01/27/2015 08:47 AM, Hannes Frederic Sowa
wrote:
> > > > > > > On Di, 2015-01-27 at 10:42 +0200, Michael S.
Tsirkin wrote:
> > > > > > >> On Tue, Jan 27, 2015 at 02:47:54AM +0000,
Ben Hutchings wrote:
> > > > > > >>> On Mon, 2015-01-26 at 09:37 -0500,
Vladislav Yasevich wrote:
> > > > > > >>>> If the IPv6 fragment id has not
been set and we perform
> > > > > > >>>> fragmentation due to UFO, select
a new fragment id.
> > > > > > >>>> When we store the fragment id
into skb_shinfo, set the bit
> > > > > > >>>> in the skb so we can re-use the
selected id.
> > > > > > >>>> This preserves the behavior of
UFO packets generated on the
> > > > > > >>>> host and solves the issue of id
generation for packet sockets
> > > > > > >>>> and tap/macvtap devices.
> > > > > > >>>>
> > > > > > >>>> This patch moves
ipv6_select_ident() back in to the header file.
> > > > > > >>>> It also provides the helper
function that sets skb_shinfo() frag
> > > > > > >>>> id and sets the bit.
> > > > > > >>>>
> > > > > > >>>> It also makes sure that we select
the fragment id when doing
> > > > > > >>>> just gso validation, since
it's possible for the packet to
> > > > > > >>>> come from an untrusted source
(VM) and be forwarded through
> > > > > > >>>> a UFO enabled device which will
expect the fragment id.
> > > > > > >>>>
> > > > > > >>>> CC: Eric Dumazet <edumazet at
google.com>
> > > > > > >>>> Signed-off-by: Vladislav Yasevich
<vyasevic at redhat.com>
> > > > > > >>>> ---
> > > > > > >>>>  include/linux/skbuff.h |  3 ++-
> > > > > > >>>>  include/net/ipv6.h     |  2 ++
> > > > > > >>>>  net/ipv6/ip6_output.c  |  4 ++--
> > > > > > >>>>  net/ipv6/output_core.c |  9
++++++++-
> > > > > > >>>>  net/ipv6/udp_offload.c | 10
+++++++++-
> > > > > > >>>>  5 files changed, 23
insertions(+), 5 deletions(-)
> > > > > > >>>>
> > > > > > >>>> diff --git
a/include/linux/skbuff.h b/include/linux/skbuff.h
> > > > > > >>>> index 85ab7d7..3ad5203 100644
> > > > > > >>>> --- a/include/linux/skbuff.h
> > > > > > >>>> +++ b/include/linux/skbuff.h
> > > > > > >>>> @@ -605,7 +605,8 @@ struct
sk_buff {
> > > > > > >>>>  	__u8			ipvs_property:1;
> > > > > > >>>>  	__u8			inner_protocol_type:1;
> > > > > > >>>>  	__u8			remcsum_offload:1;
> > > > > > >>>> -	/* 3 or 5 bit hole */
> > > > > > >>>> +	__u8			ufo_fragid_set:1;
> > > > > > >>> [...]
> > > > > > >>>
> > > > > > >>> Doesn't the flag belong in struct
skb_shared_info, rather than struct
> > > > > > >>> sk_buff?  Otherwise this looks fine.
> > > > > > >>>
> > > > > > >>> Ben.
> > > > > > >>
> > > > > > >> Hmm we seem to be out of tx flags.
> > > > > > >> Maybe ip6_frag_id == 0 should mean
"not set".
> > > > > > > 
> > > > > > > Maybe that is the best idea. Definitely the
ufo_fragid_set bit should
> > > > > > > move into the skb_shared_info area.
> > > > > > 
> > > > > > That's what I originally wanted to do, but had
to move and grow txflags thus
> > > > > > skb_shinfo ended up growing.  I wanted to avoid
that, so stole an skb flag.
> > > > > > 
> > > > > > I considered treating fragid == 0 as unset, but a
0 fragid is perfectly valid
> > > > > > from the protocol perspective and could actually
be generated by the id generator
> > > > > > functions.  This may cause us to call the id
generation multiple times.
> > > > > 
> > > > > Are there plans in the long run to let virtio_net
transmit auxiliary
> > > > > data to the other end so we can clean all of this this
up one day?
> > > > > 
> > > > > I don't like the whole situation: looking into the
virtio_net headers
> > > > > just adding a field for ipv6 fragmentation ids to those
small structs
> > > > > seems bloated, not doing it feels incorrect. :/
> > > > > 
> > > > > Thoughts?
> > > > > 
> > > > > Bye,
> > > > > Hannes
> > > > 
> > > > I'm not sure - what will be achieved by generating the
IDs guest side as
> > > > opposed to host side?  It's certainly harder to get hold
of entropy
> > > > guest-side.
> > > 
> > > It is not only about entropy but about uniqueness.  Also
fragmentation
> > > ids should not be discoverable,
> > 
> > I belive "predictable" is the language used by the IETF
draft.
> > 
> > > so there are several aspects:
> > > 
> > > I see fragmentation id generation still as security critical:
> > > When Eric patched the frag id generator in 04ca6973f7c1a0d
("ip: make IP
> > > identifiers less predictable") I could patch my kernels and
use the
> > > patch regardless of the machine being virtualized or not. It was
not
> > > dependent on the hypervisor.
> > 
> > And now it's even easier - just patch the hypervisor, and all VMs
> > automatically benefit.
> 
> Sometimes the hypervisor is not under my control. You would need to
> patch both kernels in your case - non gso frames would still get the
> fragmentation id generated in the host kernel.

Confused. You would have to patch both kernels *in your case*.
If it's all done by host, then it's in a single place, on host.

> > > I think that is the same reasoning why we
> > > don't support TOE.
> > > If we use one generator in the hypervisor in an openstack alike
setting,
> > > the host deals with quite a lot of overlay networks. A lot of
default
> > > configurations use the same addresses internally, so on the
hypervisor
> > > the frag id generators would interfere by design.
> > > I could come up with an attack scenario for DNS servers (again :)
):
> > > 
> > > You are sitting next to a DNS server on the same hypervisor and
can send
> > > packets without source validation (because that is handled later
on in
> > > case of openvswitch when the packet is put into the corresponding
> > > overlay network). You emit a gso packet with the same source and
> > > destination addresses as the DNS server would do and would get an
> > > fragmentation id which is linearly (+ time delta) incremented
depending
> > > on the source and destination address. With such a leak you could
start
> > > trying attack and spoof DNS responses (fragmentation attacks
etc.).
> > > See also details on such kind of attacks in the description of
commit
> > > 04ca6973f7c1a0d.
> > > 
> > > AFAIK IETF tried with IPv6 to push fragmentation id generation to
the
> > > end hosts, that's also the reason for the introduction of
atomic
> > > fragments (which are now being rolled back ;) ).
> > > 
> > > Still it is better to generate a frag id on the hypervisor than
just
> > > sending a 0, so I am ok with this change, albeit not happy.
> > > 
> > > Thanks,
> > > Hannes
> > > 
> > 
> > OK so to summarize, identifiers are only re-randomized once per jiffy,
> > so you worry that within this window, an external observer can
discover
> > past fragment ID values and so predict the future ones.
> > All that's required is that two paths go through the same box
performing
> > fragmentation.
> > 
> > Is that a fair summary?
> > 
> > If yes, we can make this a bit harder by mixing in some
> > data per input and/or output devices.
> > 
> > For example, just to give you the idea:
> > 
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 683d493..4faa7ef 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3625,6 +3625,7 @@ static int __netif_receive_skb_core(struct
sk_buff *skb, bool pfmemalloc)
> >  	trace_netif_receive_skb(skb);
> >  
> >  	orig_dev = skb->dev;
> > +	skb_shinfo(skb)->ip6_frag_id = skb->dev->ifindex;
> >  
> >  	skb_reset_network_header(skb);
> >  	if (!skb_transport_header_was_set(skb))
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index ce69a12..819a821 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -1092,7 +1092,8 @@ static inline int ip6_ufo_append_data(struct
sock *sk,
> >  				     sizeof(struct frag_hdr)) & ~7;
> >  	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
> >  	ipv6_select_ident(&fhdr, rt);
> > -	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
> > +	skb_shinfo(skb)->ip6_frag_id =
jhash_1word(skb_shinfo(skb)->ip6_frag_id,
> > +						   fhdr.identification);
> >  
> >  append:
> >  	return skb_append_datato_frags(sk, skb, getfrag, from,
> > 
> 
> I thought about mixing in the incoming interface identifier into the
> frag id generation, but that could hurt us badly as soon as a VM has
> more than one interface to the outside world and uses e.g. ECMP.

I don't understand. Fragmentation is done after routing,
isn't it? So all fragments always go out on the same device.

> We need
> to make sure that those frag ids are unique and the kernel needs to be
> better than just using a random number generator.
> 
> Bye,
> Hannes

32 bit numbers can't be unique. They just shouldn't be discoverable
by an off-path observer.

-- 
MST
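
To put a rough number on the "can't be unique" remark: if ids were drawn independently and uniformly from the 32-bit space, the birthday bound gives the chance that any two of n ids collide. The sketch below computes that probability with the exact product form. The uniform, independent model is an assumption made for illustration only; it does not describe the kernel's actual generator, and `toy_collision_probability()` is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Probability that at least two of n ids are equal when each id is drawn
 * independently and uniformly from 2^32 values:
 *   1 - prod_{i=1}^{n-1} (1 - i / 2^32)
 * Returns 0 for n = 0 and n = 1, where no pair exists.
 */
static double toy_collision_probability(uint32_t n)
{
	const double space = 4294967296.0; /* 2^32 */
	double p_distinct = 1.0;
	uint32_t i;

	for (i = 1; i < n; i++)
		p_distinct *= 1.0 - (double)i / space;

	return 1.0 - p_distinct;
}
```

Under this model the collision probability reaches roughly one half at about 77,000 ids, which is why a purely random 32-bit id cannot promise uniqueness and why the two requirements, uniqueness and unpredictability, pull in different directions.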

Vlad Yasevich

2015-Jan-28 14:16 UTC

[PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

On 01/28/2015 05:34 AM, Hannes Frederic Sowa wrote:
> Hi,
> 
> On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
>> On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
>>> Hello,
>>>
>>> On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
>>>> On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa
wrote:
>>>>> On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
>>>>>> On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
>>>>>>> On Di, 2015-01-27 at 10:42 +0200, Michael S.
Tsirkin wrote:
>>>>>>>> On Tue, Jan 27, 2015 at 02:47:54AM +0000, Ben
Hutchings wrote:
>>>>>>>>> On Mon, 2015-01-26 at 09:37 -0500,
Vladislav Yasevich wrote:
>>>>>>>>>> If the IPv6 fragment id has not been
set and we perform
>>>>>>>>>> fragmentation due to UFO, select a new
fragment id.
>>>>>>>>>> When we store the fragment id into
skb_shinfo, set the bit
>>>>>>>>>> in the skb so we can re-use the
selected id.
>>>>>>>>>> This preserves the behavior of UFO
packets generated on the
>>>>>>>>>> host and solves the issue of id
generation for packet sockets
>>>>>>>>>> and tap/macvtap devices.
>>>>>>>>>>
>>>>>>>>>> This patch moves ipv6_select_ident()
back in to the header file.
>>>>>>>>>> It also provides the helper function
that sets skb_shinfo() frag
>>>>>>>>>> id and sets the bit.
>>>>>>>>>>
>>>>>>>>>> It also makes sure that we select the
fragment id when doing
>>>>>>>>>> just gso validation, since it's
possible for the packet to
>>>>>>>>>> come from an untrusted source (VM) and
be forwarded through
>>>>>>>>>> a UFO enabled device which will expect
the fragment id.
>>>>>>>>>>
>>>>>>>>>> CC: Eric Dumazet <edumazet at
google.com>
>>>>>>>>>> Signed-off-by: Vladislav Yasevich
<vyasevic at redhat.com>
>>>>>>>>>> ---
>>>>>>>>>>  include/linux/skbuff.h |  3 ++-
>>>>>>>>>>  include/net/ipv6.h     |  2 ++
>>>>>>>>>>  net/ipv6/ip6_output.c  |  4 ++--
>>>>>>>>>>  net/ipv6/output_core.c |  9 ++++++++-
>>>>>>>>>>  net/ipv6/udp_offload.c | 10 +++++++++-
>>>>>>>>>>  5 files changed, 23 insertions(+), 5
deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/include/linux/skbuff.h
b/include/linux/skbuff.h
>>>>>>>>>> index 85ab7d7..3ad5203 100644
>>>>>>>>>> --- a/include/linux/skbuff.h
>>>>>>>>>> +++ b/include/linux/skbuff.h
>>>>>>>>>> @@ -605,7 +605,8 @@ struct sk_buff {
>>>>>>>>>>  	__u8			ipvs_property:1;
>>>>>>>>>>  	__u8			inner_protocol_type:1;
>>>>>>>>>>  	__u8			remcsum_offload:1;
>>>>>>>>>> -	/* 3 or 5 bit hole */
>>>>>>>>>> +	__u8			ufo_fragid_set:1;
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>> Doesn't the flag belong in struct
skb_shared_info, rather than struct
>>>>>>>>> sk_buff?  Otherwise this looks fine.
>>>>>>>>>
>>>>>>>>> Ben.
>>>>>>>>
>>>>>>>> Hmm we seem to be out of tx flags.
>>>>>>>> Maybe ip6_frag_id == 0 should mean "not
set".
>>>>>>>
>>>>>>> Maybe that is the best idea. Definitely the
ufo_fragid_set bit should
>>>>>>> move into the skb_shared_info area.
>>>>>>
>>>>>> That's what I originally wanted to do, but had to
move and grow txflags thus
>>>>>> skb_shinfo ended up growing.  I wanted to avoid that,
so stole an skb flag.
>>>>>>
>>>>>> I considered treating fragid == 0 as unset, but a 0
fragid is perfectly valid
>>>>>> from the protocol perspective and could actually be
generated by the id generator
>>>>>> functions.  This may cause us to call the id generation
multiple times.
>>>>>
>>>>> Are there plans in the long run to let virtio_net transmit
auxiliary
>>>>> data to the other end so we can clean all of this this up
one day?
>>>>>
>>>>> I don't like the whole situation: looking into the
virtio_net headers
>>>>> just adding a field for ipv6 fragmentation ids to those
small structs
>>>>> seems bloated, not doing it feels incorrect. :/
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Bye,
>>>>> Hannes
>>>>
>>>> I'm not sure - what will be achieved by generating the IDs
guest side as
>>>> opposed to host side?  It's certainly harder to get hold of
entropy
>>>> guest-side.
>>>
>>> It is not only about entropy but about uniqueness.  Also
fragmentation
>>> ids should not be discoverable,
>>
>> I belive "predictable" is the language used by the IETF
draft.
>>
>>> so there are several aspects:
>>>
>>> I see fragmentation id generation still as security critical:
>>> When Eric patched the frag id generator in 04ca6973f7c1a0d
("ip: make IP
>>> identifiers less predictable") I could patch my kernels and
use the
>>> patch regardless of the machine being virtualized or not. It was
not
>>> dependent on the hypervisor.
>>
>> And now it's even easier - just patch the hypervisor, and all VMs
>> automatically benefit.
> 
> Sometimes the hypervisor is not under my control. You would need to
> patch both kernels in your case - non gso frames would still get the
> fragmentation id generated in the host kernel.

Why would non-gso frames need a frag id?  We are talking only UDP IPv6
here, so there is no frag id generation if the packet doesn't need to
be fragmented.

> 
>>> I think that is the same reasoning why we
>>> don't support TOE.
>>> If we use one generator in the hypervisor in an openstack alike
setting,
>>> the host deals with quite a lot of overlay networks. A lot of
default
>>> configurations use the same addresses internally, so on the
hypervisor
>>> the frag id generators would interfere by design.
>>> I could come up with an attack scenario for DNS servers (again :)
):
>>>
>>> You are sitting next to a DNS server on the same hypervisor and can
send
>>> packets without source validation (because that is handled later on
in
>>> case of openvswitch when the packet is put into the corresponding
>>> overlay network). You emit a gso packet with the same source and
>>> destination addresses as the DNS server would do and would get an
>>> fragmentation id which is linearly (+ time delta) incremented
depending
>>> on the source and destination address. With such a leak you could
start
>>> trying attack and spoof DNS responses (fragmentation attacks etc.).
>>> See also details on such kind of attacks in the description of
commit
>>> 04ca6973f7c1a0d.
>>>
>>> AFAIK IETF tried with IPv6 to push fragmentation id generation to
the
>>> end hosts, that's also the reason for the introduction of
atomic
>>> fragments (which are now being rolled back ;) ).
>>>
>>> Still it is better to generate a frag id on the hypervisor than
just
>>> sending a 0, so I am ok with this change, albeit not happy.
>>>
>>> Thanks,
>>> Hannes
>>>
>>
>> OK so to summarize, identifiers are only re-randomized once per jiffy,
>> so you worry that within this window, an external observer can discover
>> past fragment ID values and so predict the future ones.
>> All that's required is that two paths go through the same box
performing
>> fragmentation.
>>
>> Is that a fair summary?
>>
>> If yes, we can make this a bit harder by mixing in some
>> data per input and/or output devices.
>>
>> For example, just to give you the idea:
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 683d493..4faa7ef 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -3625,6 +3625,7 @@ static int __netif_receive_skb_core(struct
sk_buff *skb, bool pfmemalloc)
>>  	trace_netif_receive_skb(skb);
>>  
>>  	orig_dev = skb->dev;
>> +	skb_shinfo(skb)->ip6_frag_id = skb->dev->ifindex;
>>  
>>  	skb_reset_network_header(skb);
>>  	if (!skb_transport_header_was_set(skb))
>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>> index ce69a12..819a821 100644
>> --- a/net/ipv6/ip6_output.c
>> +++ b/net/ipv6/ip6_output.c
>> @@ -1092,7 +1092,8 @@ static inline int ip6_ufo_append_data(struct sock
*sk,
>>  				     sizeof(struct frag_hdr)) & ~7;
>>  	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
>>  	ipv6_select_ident(&fhdr, rt);
>> -	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
>> +	skb_shinfo(skb)->ip6_frag_id =
jhash_1word(skb_shinfo(skb)->ip6_frag_id,
>> +						   fhdr.identification);
>>  
>>  append:
>>  	return skb_append_datato_frags(sk, skb, getfrag, from,
>>
> 
> I thought about mixing in the incoming interface identifier into the
> frag id generation, but that could hurt us badly as soon as a VM has
> more than one interface to the outside world and uses e.g. ECMP. We need
> to make sure that those frag ids are unique and the kernel needs to be
> better than just using a random number generator.
>

So the goal behind this series of patches is to restore VM functionality to
pre-916e4cf46d0204 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data").

-vlad

> Bye,
> Hannes
> 
>
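
The flag-versus-sentinel point quoted earlier in this thread (a 0 fragid is valid, so treating 0 as "not set" may invoke the generator more than once) can be sketched in isolation. Everything below is a hypothetical stand-in written for this illustration: `struct toy_pkt`, `toy_select_ident()` and the two `toy_ensure_*` helpers are assumptions, not the helpers from the patch, and the generator is rigged to return 0 to force the corner case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the per-packet state; the real fields live in
 * struct sk_buff and struct skb_shared_info. */
struct toy_pkt {
	uint32_t frag_id;      /* plays the role of ip6_frag_id */
	bool     frag_id_set;  /* plays the role of ufo_fragid_set */
};

/* Counts generator calls so the two strategies can be compared. */
static unsigned int toy_gen_calls;

/* Toy generator that happens to return 0, which is a valid fragment id. */
static uint32_t toy_select_ident(void)
{
	toy_gen_calls++;
	return 0;
}

/* Strategy 1: treat frag_id == 0 as "not set". */
static void toy_ensure_id_sentinel(struct toy_pkt *p)
{
	if (p->frag_id == 0)
		p->frag_id = toy_select_ident();
}

/* Strategy 2: remember that an id was chosen with an explicit bit. */
static void toy_ensure_id_flag(struct toy_pkt *p)
{
	if (!p->frag_id_set) {
		p->frag_id = toy_select_ident();
		p->frag_id_set = true;
	}
}

/* Applies one strategy twice to a fresh packet (for example once during
 * GSO validation and once during segmentation) and returns the number of
 * generator calls that resulted. */
static unsigned int toy_calls_for(void (*ensure)(struct toy_pkt *))
{
	struct toy_pkt p = { 0, false };

	toy_gen_calls = 0;
	ensure(&p);
	ensure(&p);
	return toy_gen_calls;
}
```

When the generator yields 0, the sentinel strategy calls it on both passes while the flag strategy calls it once. The cost of the flag is the extra bit, which is the space trade-off discussed in the quoted text.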

Hannes Frederic Sowa

2015-Jan-28 14:17 UTC

[PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

On Mi, 2015-01-28 at 15:43 +0200, Michael S. Tsirkin wrote:
> On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
> > Hi,
> > 
> > On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
> > > On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa
wrote:
> > > > Hello,
> > > > 
> > > > On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
> > > > > On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes
Frederic Sowa wrote:
> > > > > > On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich
wrote:
> > > > > > > On 01/27/2015 08:47 AM, Hannes Frederic Sowa
wrote:
> > > > > > > > On Di, 2015-01-27 at 10:42 +0200,
Michael S. Tsirkin wrote:
> > > > > > > >> On Tue, Jan 27, 2015 at 02:47:54AM
+0000, Ben Hutchings wrote:
> > > > > > > >>> On Mon, 2015-01-26 at 09:37
-0500, Vladislav Yasevich wrote:
> > > > > > > >>>> If the IPv6 fragment id has
not been set and we perform
> > > > > > > >>>> fragmentation due to UFO,
select a new fragment id.
> > > > > > > >>>> When we store the fragment
id into skb_shinfo, set the bit
> > > > > > > >>>> in the skb so we can re-use
the selected id.
> > > > > > > >>>> This preserves the behavior
of UFO packets generated on the
> > > > > > > >>>> host and solves the issue of
id generation for packet sockets
> > > > > > > >>>> and tap/macvtap devices.
> > > > > > > >>>>
> > > > > > > >>>> This patch moves
ipv6_select_ident() back in to the header file.
> > > > > > > >>>> It also provides the helper function that sets skb_shinfo() frag
> > > > > > > >>>> id and sets the bit.
> > > > > > > >>>>
> > > > > > > >>>> It also makes sure that we
select the fragment id when doing
> > > > > > > >>>> just gso validation, since
it's possible for the packet to
> > > > > > > >>>> come from an untrusted
source (VM) and be forwarded through
> > > > > > > >>>> a UFO enabled device which
will expect the fragment id.
> > > > > > > >>>>
> > > > > > > >>>> CC: Eric Dumazet
<edumazet at google.com>
> > > > > > > >>>> Signed-off-by: Vladislav
Yasevich <vyasevic at redhat.com>
> > > > > > > >>>> ---
> > > > > > > >>>>  include/linux/skbuff.h |  3
++-
> > > > > > > >>>>  include/net/ipv6.h     |  2
++
> > > > > > > >>>>  net/ipv6/ip6_output.c  |  4
++--
> > > > > > > >>>>  net/ipv6/output_core.c |  9
++++++++-
> > > > > > > >>>>  net/ipv6/udp_offload.c | 10
+++++++++-
> > > > > > > >>>>  5 files changed, 23
insertions(+), 5 deletions(-)
> > > > > > > >>>>
> > > > > > > >>>> diff --git
a/include/linux/skbuff.h b/include/linux/skbuff.h
> > > > > > > >>>> index 85ab7d7..3ad5203
100644
> > > > > > > >>>> --- a/include/linux/skbuff.h
> > > > > > > >>>> +++ b/include/linux/skbuff.h
> > > > > > > >>>> @@ -605,7 +605,8 @@ struct
sk_buff {
> > > > > > > >>>>  	__u8			ipvs_property:1;
> > > > > > > >>>>  	__u8			inner_protocol_type:1;
> > > > > > > >>>>  	__u8			remcsum_offload:1;
> > > > > > > >>>> -	/* 3 or 5 bit hole */
> > > > > > > >>>> +	__u8			ufo_fragid_set:1;
> > > > > > > >>> [...]
> > > > > > > >>>
> > > > > > > >>> Doesn't the flag belong in
struct skb_shared_info, rather than struct
> > > > > > > >>> sk_buff?  Otherwise this looks
fine.
> > > > > > > >>>
> > > > > > > >>> Ben.
> > > > > > > >>
> > > > > > > >> Hmm we seem to be out of tx flags.
> > > > > > > >> Maybe ip6_frag_id == 0 should mean
"not set".
> > > > > > > > 
> > > > > > > > Maybe that is the best idea. Definitely
the ufo_fragid_set bit should
> > > > > > > > move into the skb_shared_info area.
> > > > > > > 
> > > > > > > That's what I originally wanted to do,
but had to move and grow txflags thus
> > > > > > > skb_shinfo ended up growing.  I wanted to
avoid that, so stole an skb flag.
> > > > > > > 
> > > > > > > I considered treating fragid == 0 as unset,
but a 0 fragid is perfectly valid
> > > > > > > from the protocol perspective and could
actually be generated by the id generator
> > > > > > > functions.  This may cause us to call the id
generation multiple times.
> > > > > > 
> > > > > > Are there plans in the long run to let virtio_net
transmit auxiliary
> > > > > > data to the other end so we can clean all of this
this up one day?
> > > > > > 
> > > > > > I don't like the whole situation: looking into
the virtio_net headers
> > > > > > just adding a field for ipv6 fragmentation ids to
those small structs
> > > > > > seems bloated, not doing it feels incorrect. :/
> > > > > > 
> > > > > > Thoughts?
> > > > > > 
> > > > > > Bye,
> > > > > > Hannes
> > > > > 
> > > > > I'm not sure - what will be achieved by generating
the IDs guest side as
> > > > > opposed to host side?  It's certainly harder to get
hold of entropy
> > > > > guest-side.
> > > > 
> > > > It is not only about entropy but about uniqueness.  Also
fragmentation
> > > > ids should not be discoverable,
> > > 
> > > I belive "predictable" is the language used by the IETF
draft.
> > > 
> > > > so there are several aspects:
> > > > 
> > > > I see fragmentation id generation still as security
critical:
> > > > When Eric patched the frag id generator in 04ca6973f7c1a0d
("ip: make IP
> > > > identifiers less predictable") I could patch my kernels
and use the
> > > > patch regardless of the machine being virtualized or not. It
was not
> > > > dependent on the hypervisor.
> > > 
> > > And now it's even easier - just patch the hypervisor, and all
VMs
> > > automatically benefit.
> > 
> > Sometimes the hypervisor is not under my control. You would need to
> > patch both kernels in your case - non gso frames would still get the
> > fragmentation id generated in the host kernel.
> 
> Confused. You would have to patch both kernels *in your case*.
> If it's all done by host, then it's in a single place, on host.

The host is the hypervisor?

Anyway, we would have to patch both kernels now. :)

We still have a working ipv6_fragment routine in the virtualized kernel
which can embed fragmentation extension headers in frames, thus
generating the fragmentation id in the virtualized kernel.

> > > > I think that is the same reasoning why we
> > > > don't support TOE.
> > > > If we use one generator in the hypervisor in an openstack
alike setting,
> > > > the host deals with quite a lot of overlay networks. A lot
of default
> > > > configurations use the same addresses internally, so on the
hypervisor
> > > > the frag id generators would interfere by design.
> > > > I could come up with an attack scenario for DNS servers
(again :) ):
> > > > 
> > > > You are sitting next to a DNS server on the same hypervisor
and can send
> > > > packets without source validation (because that is handled
later on in
> > > > case of openvswitch when the packet is put into the
corresponding
> > > > overlay network). You emit a gso packet with the same source
and
> > > > destination addresses as the DNS server would do and would
get an
> > > > fragmentation id which is linearly (+ time delta)
incremented depending
> > > > on the source and destination address. With such a leak you
could start
> > > > trying attack and spoof DNS responses (fragmentation attacks
etc.).
> > > > See also details on such kind of attacks in the description
of commit
> > > > 04ca6973f7c1a0d.
> > > > 
> > > > AFAIK IETF tried with IPv6 to push fragmentation id
generation to the
> > > > end hosts, that's also the reason for the introduction
of atomic
> > > > fragments (which are now being rolled back ;) ).
> > > > 
> > > > Still it is better to generate a frag id on the hypervisor
than just
> > > > sending a 0, so I am ok with this change, albeit not happy.
> > > > 
> > > > Thanks,
> > > > Hannes
> > > > 
> > > 
> > > OK so to summarize, identifiers are only re-randomized once per
jiffy,
> > > so you worry that within this window, an external observer can
discover
> > > past fragment ID values and so predict the future ones.
> > > All that's required is that two paths go through the same box
performing
> > > fragmentation.
> > > 
> > > Is that a fair summary?
> > > 
> > > If yes, we can make this a bit harder by mixing in some
> > > data per input and/or output devices.
> > > 
> > > For example, just to give you the idea:
> > > 
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 683d493..4faa7ef 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -3625,6 +3625,7 @@ static int __netif_receive_skb_core(struct
sk_buff *skb, bool pfmemalloc)
> > >  	trace_netif_receive_skb(skb);
> > >  
> > >  	orig_dev = skb->dev;
> > > +	skb_shinfo(skb)->ip6_frag_id = skb->dev->ifindex;
> > >  
> > >  	skb_reset_network_header(skb);
> > >  	if (!skb_transport_header_was_set(skb))
> > > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > > index ce69a12..819a821 100644
> > > --- a/net/ipv6/ip6_output.c
> > > +++ b/net/ipv6/ip6_output.c
> > > @@ -1092,7 +1092,8 @@ static inline int
ip6_ufo_append_data(struct sock *sk,
> > >  				     sizeof(struct frag_hdr)) & ~7;
> > >  	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
> > >  	ipv6_select_ident(&fhdr, rt);
> > > -	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
> > > +	skb_shinfo(skb)->ip6_frag_id =
jhash_1word(skb_shinfo(skb)->ip6_frag_id,
> > > +						   fhdr.identification);
> > >  
> > >  append:
> > >  	return skb_append_datato_frags(sk, skb, getfrag, from,
> > > 
> > 
> > I thought about mixing in the incoming interface identifier into the
> > frag id generation, but that could hurt us badly as soon as a VM has
> > more than one interface to the outside world and uses e.g. ECMP.
> 
> I don't understand. Fragmentation is done after routing,
> isn't it? So all fragments always go out on the same device.

It is the other way around:

(Source, Dest) is the key to look up the next fragmentation id. Suppose
we send two packets and use (Source, Dest, ifindex) as the key: the first
packet leaves the host on ifindex 1 with fragid x (it was a big gso
packet, so it got segmented), the second packet leaves the host on
ifindex 2 with the same Source and Dest and gets fragid y. Ideally both
packets should use the same bucket to improve uniqueness. Without ifindex
we would have y = x + <number of segments>; with it, y would be (I guess)
random.

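The bucket argument above can be made concrete with a small userspace model.
This is only a sketch under stated assumptions, not the kernel's generator:
struct idgen, idgen_init(), idgen_reserve(), key_flow() and key_flow_dev()
are hypothetical names, the XOR keys stand in for a real hash, and the
per-jiffy time delta is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NBUCKETS 64u	/* toy table size, power of two */

/* Toy stand-in for the per-flow id state (hypothetical, not kernel code). */
struct idgen {
	uint32_t next[NBUCKETS];	/* next unused fragment id per bucket */
};

/* Give every bucket a different start value (stand-in for random seeding). */
static void idgen_init(struct idgen *g)
{
	for (uint32_t i = 0; i < NBUCKETS; i++)
		g->next[i] = 0x9e3779b9u * (i + 1);
}

/* Toy keys: a real implementation would hash these properly. */
static uint32_t key_flow(uint32_t src, uint32_t dst)
{
	return src ^ dst;
}

static uint32_t key_flow_dev(uint32_t src, uint32_t dst, uint32_t ifindex)
{
	return src ^ dst ^ ifindex;
}

/* Reserve `segs` consecutive ids from the bucket selected by `key`
 * and return the first one. */
static uint32_t idgen_reserve(struct idgen *g, uint32_t key, uint32_t segs)
{
	uint32_t *slot = &g->next[key & (NBUCKETS - 1)];
	uint32_t first = *slot;

	assert(segs > 0);
	*slot += segs;
	return first;
}
```

With key_flow(), reserving 3 ids and then 1 id for the same (Source, Dest)
gives y = x + 3. With key_flow_dev() and, for example, Source 10, Dest 20 and
ifindex 1 and 2, the two packets land in different buckets of this model, so
y is unrelated to x.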
> > We need
> > to make sure that those frag ids are unique and the kernel needs to be
> > better than just using a random number generator.
> > 
> > Bye,
> > Hannes
> 
> 32 bit numbers can't be unique. They just shouldn't be discoverable
> by an off-path observer.

No, they can't. But they should be reasonably unique and, as you said,
not discoverable.

Bye,
Hannes

Hannes Frederic Sowa

2015-Jan-28 14:45 UTC

[PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

Hi,

On Wed, 2015-01-28 at 09:16 -0500, Vlad Yasevich wrote:
> On 01/28/2015 05:34 AM, Hannes Frederic Sowa wrote:
> > Hi,
> > 
> > On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
> >> On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa
wrote:
> >>> Hello,
> >>>
> >>> On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
> >>>> On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic
Sowa wrote:
> >>>>> On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
> >>>>>> On 01/27/2015 08:47 AM, Hannes Frederic Sowa
wrote:
> >>>>>>> On Di, 2015-01-27 at 10:42 +0200, Michael S.
Tsirkin wrote:
> >>>>>>>> On Tue, Jan 27, 2015 at 02:47:54AM +0000,
Ben Hutchings wrote:
> >>>>>>>>> On Mon, 2015-01-26 at 09:37 -0500,
Vladislav Yasevich wrote:
> >>>>>>>>>> If the IPv6 fragment id has not
been set and we perform
> >>>>>>>>>> fragmentation due to UFO, select a
new fragment id.
> >>>>>>>>>> When we store the fragment id into
skb_shinfo, set the bit
> >>>>>>>>>> in the skb so we can re-use the
selected id.
> >>>>>>>>>> This preserves the behavior of UFO
packets generated on the
> >>>>>>>>>> host and solves the issue of id
generation for packet sockets
> >>>>>>>>>> and tap/macvtap devices.
> >>>>>>>>>>
> >>>>>>>>>> This patch moves
ipv6_select_ident() back in to the header file.
> >>>>>>>>>> It also provides the helper
function that sets skb_shinfo() frag
> >>>>>>>>>> id and sets the bit.
> >>>>>>>>>>
> >>>>>>>>>> It also makes sure that we select
the fragment id when doing
> >>>>>>>>>> just gso validation, since
it's possible for the packet to
> >>>>>>>>>> come from an untrusted source (VM)
and be forwarded through
> >>>>>>>>>> a UFO enabled device which will
expect the fragment id.
> >>>>>>>>>>
> >>>>>>>>>> CC: Eric Dumazet <edumazet at
google.com>
> >>>>>>>>>> Signed-off-by: Vladislav Yasevich
<vyasevic at redhat.com>
> >>>>>>>>>> ---
> >>>>>>>>>>  include/linux/skbuff.h |  3 ++-
> >>>>>>>>>>  include/net/ipv6.h     |  2 ++
> >>>>>>>>>>  net/ipv6/ip6_output.c  |  4 ++--
> >>>>>>>>>>  net/ipv6/output_core.c |  9
++++++++-
> >>>>>>>>>>  net/ipv6/udp_offload.c | 10
+++++++++-
> >>>>>>>>>>  5 files changed, 23
insertions(+), 5 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git
a/include/linux/skbuff.h b/include/linux/skbuff.h
> >>>>>>>>>> index 85ab7d7..3ad5203 100644
> >>>>>>>>>> --- a/include/linux/skbuff.h
> >>>>>>>>>> +++ b/include/linux/skbuff.h
> >>>>>>>>>> @@ -605,7 +605,8 @@ struct sk_buff
{
> >>>>>>>>>>  	__u8			ipvs_property:1;
> >>>>>>>>>>  	__u8			inner_protocol_type:1;
> >>>>>>>>>>  	__u8			remcsum_offload:1;
> >>>>>>>>>> -	/* 3 or 5 bit hole */
> >>>>>>>>>> +	__u8			ufo_fragid_set:1;
> >>>>>>>>> [...]
> >>>>>>>>>
> >>>>>>>>> Doesn't the flag belong in struct
skb_shared_info, rather than struct
> >>>>>>>>> sk_buff?  Otherwise this looks fine.
> >>>>>>>>>
> >>>>>>>>> Ben.
> >>>>>>>>
> >>>>>>>> Hmm we seem to be out of tx flags.
> >>>>>>>> Maybe ip6_frag_id == 0 should mean
"not set".
> >>>>>>>
> >>>>>>> Maybe that is the best idea. Definitely the
ufo_fragid_set bit should
> >>>>>>> move into the skb_shared_info area.
> >>>>>>
> >>>>>> That's what I originally wanted to do, but had
to move and grow txflags thus
> >>>>>> skb_shinfo ended up growing.  I wanted to avoid
that, so stole an skb flag.
> >>>>>>
> >>>>>> I considered treating fragid == 0 as unset, but a
0 fragid is perfectly valid
> >>>>>> from the protocol perspective and could actually
be generated by the id generator
> >>>>>> functions.  This may cause us to call the id
generation multiple times.
> >>>>>
> >>>>> Are there plans in the long run to let virtio_net
transmit auxiliary
> >>>>> data to the other end so we can clean all of this this
up one day?
> >>>>>
> >>>>> I don't like the whole situation: looking into the
virtio_net headers
> >>>>> just adding a field for ipv6 fragmentation ids to
those small structs
> >>>>> seems bloated, not doing it feels incorrect. :/
> >>>>>
> >>>>> Thoughts?
> >>>>>
> >>>>> Bye,
> >>>>> Hannes
> >>>>
> >>>> I'm not sure - what will be achieved by generating the
IDs guest side as
> >>>> opposed to host side?  It's certainly harder to get
hold of entropy
> >>>> guest-side.
> >>>
> >>> It is not only about entropy but about uniqueness.  Also
fragmentation
> >>> ids should not be discoverable,
> >>
> >> I belive "predictable" is the language used by the IETF
draft.
> >>
> >>> so there are several aspects:
> >>>
> >>> I see fragmentation id generation still as security critical:
> >>> When Eric patched the frag id generator in 04ca6973f7c1a0d
("ip: make IP
> >>> identifiers less predictable") I could patch my kernels
and use the
> >>> patch regardless of the machine being virtualized or not. It
was not
> >>> dependent on the hypervisor.
> >>
> >> And now it's even easier - just patch the hypervisor, and all
VMs
> >> automatically benefit.
> > 
> > Sometimes the hypervisor is not under my control. You would need to
> > patch both kernels in your case - non gso frames would still get the
> > fragmentation id generated in the host kernel.
> 
> Why would non-gso frames need a frag id?  We are talking only UDP IPv6
> here, so there is no frag id generation if the packet does't need to
> be fragmented.

E.g. raw sockets can still generate fragments locally. It is also a
valid setup to have multiple interfaces in one machine, one that is UFO
enabled and one that isn't. In that case, fragmentation id generation
happens on different hosts, which I want to avoid.

I haven't looked closely, but a mismatch of MTUs on interfaces seems like
it could lead to unwanted fragmentation, e.g. see is_skb_forwardable,
which is almost always true for gso frames, so we never stop them on
bridges etc.

> >>> I think that is the same reasoning why we
> >>> don't support TOE.
> >>> If we use one generator in the hypervisor in an openstack
alike setting,
> >>> the host deals with quite a lot of overlay networks. A lot of
default
> >>> configurations use the same addresses internally, so on the
hypervisor
> >>> the frag id generators would interfere by design.
> >>> I could come up with an attack scenario for DNS servers (again
:) ):
> >>>
> >>> You are sitting next to a DNS server on the same hypervisor
and can send
> >>> packets without source validation (because that is handled
later on in
> >>> case of openvswitch when the packet is put into the
corresponding
> >>> overlay network). You emit a gso packet with the same source
and
> >>> destination addresses as the DNS server would do and would get
an
> >>> fragmentation id which is linearly (+ time delta) incremented
depending
> >>> on the source and destination address. With such a leak you
could start
> >>> trying attack and spoof DNS responses (fragmentation attacks
etc.).
> >>> See also details on such kind of attacks in the description of
commit
> >>> 04ca6973f7c1a0d.
> >>>
> >>> AFAIK IETF tried with IPv6 to push fragmentation id generation
to the
> >>> end hosts, that's also the reason for the introduction of
atomic
> >>> fragments (which are now being rolled back ;) ).
> >>>
> >>> Still it is better to generate a frag id on the hypervisor
than just
> >>> sending a 0, so I am ok with this change, albeit not happy.
> >>>
> >>> Thanks,
> >>> Hannes
> >>>
> >>
> >> OK so to summarize, identifiers are only re-randomized once per
jiffy,
> >> so you worry that within this window, an external observer can
discover
> >> past fragment ID values and so predict the future ones.
> >> All that's required is that two paths go through the same box
performing
> >> fragmentation.
> >>
> >> Is that a fair summary?
> >>
> >> If yes, we can make this a bit harder by mixing in some
> >> data per input and/or output devices.
> >>
> >> For example, just to give you the idea:
> >>
> >> diff --git a/net/core/dev.c b/net/core/dev.c
> >> index 683d493..4faa7ef 100644
> >> --- a/net/core/dev.c
> >> +++ b/net/core/dev.c
> >> @@ -3625,6 +3625,7 @@ static int __netif_receive_skb_core(struct
sk_buff *skb, bool pfmemalloc)
> >>  	trace_netif_receive_skb(skb);
> >>  
> >>  	orig_dev = skb->dev;
> >> +	skb_shinfo(skb)->ip6_frag_id = skb->dev->ifindex;
> >>  
> >>  	skb_reset_network_header(skb);
> >>  	if (!skb_transport_header_was_set(skb))
> >> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> >> index ce69a12..819a821 100644
> >> --- a/net/ipv6/ip6_output.c
> >> +++ b/net/ipv6/ip6_output.c
> >> @@ -1092,7 +1092,8 @@ static inline int ip6_ufo_append_data(struct
sock *sk,
> >>  				     sizeof(struct frag_hdr)) & ~7;
> >>  	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
> >>  	ipv6_select_ident(&fhdr, rt);
> >> -	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
> >> +	skb_shinfo(skb)->ip6_frag_id =
jhash_1word(skb_shinfo(skb)->ip6_frag_id,
> >> +						   fhdr.identification);
> >>  
> >>  append:
> >>  	return skb_append_datato_frags(sk, skb, getfrag, from,
> >>
> > 
> > I thought about mixing in the incoming interface identifier into the
> > frag id generation, but that could hurt us badly as soon as a VM has
> > more than one interface to the outside world and uses e.g. ECMP. We
> > need to make sure that those frag ids are unique and the kernel needs to be
> > better than just using a random number generator.
> >
> 
> So the goal behind this series of patches is to restore VM functionality to
> pre-916e4cf46d0204 ("ipv6: reuse ip6_frag_id from
> ip6_ufo_append_data").

I understand (the patch fixed a NULL ptr deref btw.).

As I said, I don't want to stop this series (hopefully the flag can be
moved into skb_shared_info etc.), would look after that IMHO
(skb flags/IPCB and skb_shared_info have different semantics on
__skb_clone).
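
The difference in clone semantics mentioned above can be sketched in
userspace. In the kernel, a clone gets its own struct sk_buff (so flag bits
are copied at clone time) while the data buffer, and with it
skb_shared_info, stays shared. The toy_* types and functions below are
hypothetical stand-ins for illustration, not kernel API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for skb_shared_info: lives with the data, shared by clones. */
struct toy_shinfo {
	uint32_t ip6_frag_id;
	unsigned int dataref;	/* number of toy_skb heads pointing here */
};

/* Stand-in for sk_buff: one head per clone, flag bits are per head. */
struct toy_skb {
	unsigned int ufo_fragid_set:1;
	struct toy_shinfo *shinfo;
};

static struct toy_skb *toy_alloc(void)
{
	struct toy_skb *skb = calloc(1, sizeof(*skb));

	if (!skb)
		return NULL;
	skb->shinfo = calloc(1, sizeof(*skb->shinfo));
	if (!skb->shinfo) {
		free(skb);
		return NULL;
	}
	skb->shinfo->dataref = 1;
	return skb;
}

/* Clone: copy the head (including flag bits), share the shared info. */
static struct toy_skb *toy_clone(const struct toy_skb *skb)
{
	struct toy_skb *clone;

	assert(skb != NULL);
	clone = malloc(sizeof(*clone));
	if (!clone)
		return NULL;
	*clone = *skb;
	clone->shinfo->dataref++;
	return clone;
}

/* Drop one head; free the shared info with the last head. */
static void toy_free(struct toy_skb *skb)
{
	if (!skb)
		return;
	if (--skb->shinfo->dataref == 0)
		free(skb->shinfo);
	free(skb);
}
```

If one clone stores an id and sets its flag, the other head sees the new
ip6_frag_id through the shared area but still has ufo_fragid_set == 0, so
flag and id can disagree. Keeping both in the shared area avoids that in
this model.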

I think it is very much worth trying to move the fragmentation id
generation back to the end host and only use this as a fallback.

Bye,
Hannes

Michael S. Tsirkin

2015-Jan-28 16:00 UTC

[PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
> Hi,
> 
> On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
> > On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
> > > Hello,
> > > 
> > > On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
> > > > On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic
Sowa wrote:
> > > > > On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
> > > > > > On 01/27/2015 08:47 AM, Hannes Frederic Sowa
wrote:
> > > > > > > On Di, 2015-01-27 at 10:42 +0200, Michael S.
Tsirkin wrote:
> > > > > > >> On Tue, Jan 27, 2015 at 02:47:54AM +0000,
Ben Hutchings wrote:
> > > > > > >>> On Mon, 2015-01-26 at 09:37 -0500,
Vladislav Yasevich wrote:
> > > > > > >>>> If the IPv6 fragment id has not
been set and we perform
> > > > > > >>>> fragmentation due to UFO, select
a new fragment id.
> > > > > > >>>> When we store the fragment id
into skb_shinfo, set the bit
> > > > > > >>>> in the skb so we can re-use the
selected id.
> > > > > > >>>> This preserves the behavior of
UFO packets generated on the
> > > > > > >>>> host and solves the issue of id
generation for packet sockets
> > > > > > >>>> and tap/macvtap devices.
> > > > > > >>>>
> > > > > > >>>> This patch moves
ipv6_select_ident() back in to the header file.
> > > > > > >>>> It also provides the helper
function that sets skb_shinfo() frag
> > > > > > >>>> id and sets the bit.
> > > > > > >>>>
> > > > > > >>>> It also makes sure that we select
the fragment id when doing
> > > > > > >>>> just gso validation, since
it's possible for the packet to
> > > > > > >>>> come from an untrusted source
(VM) and be forwarded through
> > > > > > >>>> a UFO enabled device which will
expect the fragment id.
> > > > > > >>>>
> > > > > > >>>> CC: Eric Dumazet <edumazet at
google.com>
> > > > > > >>>> Signed-off-by: Vladislav Yasevich
<vyasevic at redhat.com>
> > > > > > >>>> ---
> > > > > > >>>>  include/linux/skbuff.h |  3 ++-
> > > > > > >>>>  include/net/ipv6.h     |  2 ++
> > > > > > >>>>  net/ipv6/ip6_output.c  |  4 ++--
> > > > > > >>>>  net/ipv6/output_core.c |  9
++++++++-
> > > > > > >>>>  net/ipv6/udp_offload.c | 10
+++++++++-
> > > > > > >>>>  5 files changed, 23
insertions(+), 5 deletions(-)
> > > > > > >>>>
> > > > > > >>>> diff --git
a/include/linux/skbuff.h b/include/linux/skbuff.h
> > > > > > >>>> index 85ab7d7..3ad5203 100644
> > > > > > >>>> --- a/include/linux/skbuff.h
> > > > > > >>>> +++ b/include/linux/skbuff.h
> > > > > > >>>> @@ -605,7 +605,8 @@ struct
sk_buff {
> > > > > > >>>>  	__u8			ipvs_property:1;
> > > > > > >>>>  	__u8			inner_protocol_type:1;
> > > > > > >>>>  	__u8			remcsum_offload:1;
> > > > > > >>>> -	/* 3 or 5 bit hole */
> > > > > > >>>> +	__u8			ufo_fragid_set:1;
> > > > > > >>> [...]
> > > > > > >>>
> > > > > > >>> Doesn't the flag belong in struct
skb_shared_info, rather than struct
> > > > > > >>> sk_buff?  Otherwise this looks fine.
> > > > > > >>>
> > > > > > >>> Ben.
> > > > > > >>
> > > > > > >> Hmm we seem to be out of tx flags.
> > > > > > >> Maybe ip6_frag_id == 0 should mean
"not set".
> > > > > > > 
> > > > > > > Maybe that is the best idea. Definitely the
ufo_fragid_set bit should
> > > > > > > move into the skb_shared_info area.
> > > > > > 
> > > > > > That's what I originally wanted to do, but had
to move and grow txflags thus
> > > > > > skb_shinfo ended up growing.  I wanted to avoid
that, so stole an skb flag.
> > > > > > 
> > > > > > I considered treating fragid == 0 as unset, but a
0 fragid is perfectly valid
> > > > > > from the protocol perspective and could actually
be generated by the id generator
> > > > > > functions.  This may cause us to call the id
generation multiple times.
> > > > > 
> > > > > Are there plans in the long run to let virtio_net
transmit auxiliary
> > > > > data to the other end so we can clean all of this this
up one day?
> > > > > 
> > > > > I don't like the whole situation: looking into the
virtio_net headers
> > > > > just adding a field for ipv6 fragmentation ids to those
small structs
> > > > > seems bloated, not doing it feels incorrect. :/
> > > > > 
> > > > > Thoughts?
> > > > > 
> > > > > Bye,
> > > > > Hannes
> > > > 
> > > > I'm not sure - what will be achieved by generating the
IDs guest side as
> > > > opposed to host side?  It's certainly harder to get hold
of entropy
> > > > guest-side.
> > > 
> > > It is not only about entropy but about uniqueness.  Also
fragmentation
> > > ids should not be discoverable,
> > 
> > I belive "predictable" is the language used by the IETF
draft.
> > 
> > > so there are several aspects:
> > > 
> > > I see fragmentation id generation still as security critical:
> > > When Eric patched the frag id generator in 04ca6973f7c1a0d
("ip: make IP
> > > identifiers less predictable") I could patch my kernels and
use the
> > > patch regardless of the machine being virtualized or not. It was
not
> > > dependent on the hypervisor.
> > 
> > And now it's even easier - just patch the hypervisor, and all VMs
> > automatically benefit.
> 
> Sometimes the hypervisor is not under my control.

In that case doing things like extending virtio
is out of the question too, isn't it?
It needs hypervisor changes.

> You would need to
> patch both kernels in your case - non gso frames would still get the
> fragmentation id generated in the host kernel.
> 
> > > I think that is the same reasoning why we
> > > don't support TOE.
> > > If we use one generator in the hypervisor in an openstack alike
setting,
> > > the host deals with quite a lot of overlay networks. A lot of
default
> > > configurations use the same addresses internally, so on the
hypervisor
> > > the frag id generators would interfere by design.
> > > I could come up with an attack scenario for DNS servers (again :)
):
> > > 
> > > You are sitting next to a DNS server on the same hypervisor and
can send
> > > packets without source validation (because that is handled later
on in
> > > case of openvswitch when the packet is put into the corresponding
> > > overlay network). You emit a gso packet with the same source and
> > > destination addresses as the DNS server would do and would get an
> > > fragmentation id which is linearly (+ time delta) incremented
depending
> > > on the source and destination address. With such a leak you could
start
> > > trying attack and spoof DNS responses (fragmentation attacks
etc.).
> > > See also details on such kind of attacks in the description of
commit
> > > 04ca6973f7c1a0d.
> > > 
> > > AFAIK IETF tried with IPv6 to push fragmentation id generation to
the
> > > end hosts, that's also the reason for the introduction of
atomic
> > > fragments (which are now being rolled back ;) ).
> > > 
> > > Still it is better to generate a frag id on the hypervisor than
just
> > > sending a 0, so I am ok with this change, albeit not happy.
> > > 
> > > Thanks,
> > > Hannes
> > > 
> > 
> > OK so to summarize, identifiers are only re-randomized once per jiffy,
> > so you worry that within this window, an external observer can
discover
> > past fragment ID values and so predict the future ones.
> > All that's required is that two paths go through the same box
performing
> > fragmentation.
> > 
> > Is that a fair summary?

No answer here?

> > If yes, we can make this a bit harder by mixing in some
> > data per input and/or output devices.
> > 
> > For example, just to give you the idea:
> > 
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 683d493..4faa7ef 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3625,6 +3625,7 @@ static int __netif_receive_skb_core(struct
sk_buff *skb, bool pfmemalloc)
> >  	trace_netif_receive_skb(skb);
> >  
> >  	orig_dev = skb->dev;
> > +	skb_shinfo(skb)->ip6_frag_id = skb->dev->ifindex;
> >  
> >  	skb_reset_network_header(skb);
> >  	if (!skb_transport_header_was_set(skb))
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index ce69a12..819a821 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -1092,7 +1092,8 @@ static inline int ip6_ufo_append_data(struct
sock *sk,
> >  				     sizeof(struct frag_hdr)) & ~7;
> >  	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
> >  	ipv6_select_ident(&fhdr, rt);
> > -	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
> > +	skb_shinfo(skb)->ip6_frag_id =
jhash_1word(skb_shinfo(skb)->ip6_frag_id,
> > +						   fhdr.identification);
> >  
> >  append:
> >  	return skb_append_datato_frags(sk, skb, getfrag, from,
> > 
> 
> I thought about mixing in the incoming interface identifier into the
> frag id generation, but that could hurt us badly as soon as a VM has
> more than one interface to the outside world and uses e.g. ECMP.
> We need
> to make sure that those frag ids are unique and the kernel needs to be
> better than just using a random number generator.
> 
> Bye,
> Hannes

OK then. Like this:

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 679e6e9..1ee9a3a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1508,6 +1508,9 @@ struct net_device {
 	 *	part of the usual set specified in Space.c.
 	 */
 
+	/* Extra hash to mix into IPv6 frag ID on packets received from here. */
+	unsigned int		frag_id_hash;
+
 	unsigned long		state;
 
 	struct list_head	dev_list;
diff --git a/net/core/dev.c b/net/core/dev.c
index 683d493..56f1898 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3625,6 +3625,7 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
 	trace_netif_receive_skb(skb);
 
 	orig_dev = skb->dev;
+	skb_shinfo(skb)->ip6_frag_id = skb->dev->frag_id_hash;
 
 	skb_reset_network_header(skb);
 	if (!skb_transport_header_was_set(skb))
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index ce69a12..819a821 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1092,7 +1092,8 @@ static inline int ip6_ufo_append_data(struct sock *sk,
 				     sizeof(struct frag_hdr)) & ~7;
 	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
 	ipv6_select_ident(&fhdr, rt);
-	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
+	skb_shinfo(skb)->ip6_frag_id = jhash_1word(skb_shinfo(skb)->ip6_frag_id,
+						   fhdr.identification);
 
 append:
 	return skb_append_datato_frags(sk, skb, getfrag, from,


Add to this a netlink/sysfs API to set the frag_id_hash for
devices.

Now the user can set an identical frag id hash for all devices
of a given VM.

We can even expose this to guests: each guest would generate
the ID on boot and send it to the host, and the host would set it
in sysfs.
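
A minimal userspace sketch of the mixing idea, assuming hypothetical names
throughout (struct toy_dev, toy_mix(), toy_frag_id()). toy_mix() is a
stand-in built on a MurmurHash3-style finalizer, not the kernel's
jhash_1word(); it only serves to show that devices sharing one frag_id_hash
map the same base identifier to the same result, while a device with a
different hash does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device state: only the fields the sketch needs. */
struct toy_dev {
	int ifindex;
	uint32_t frag_id_hash;	/* set per VM, e.g. via a sysfs knob */
};

/* MurmurHash3-style 32-bit finalizer; invertible, so distinct inputs
 * always give distinct outputs. */
static uint32_t fmix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}

/* Stand-in for jhash_1word(dev_hash, ident); NOT the kernel function. */
static uint32_t toy_mix(uint32_t dev_hash, uint32_t ident)
{
	return fmix32(dev_hash ^ fmix32(ident));
}

/* Fragment id for a packet that arrived on in_dev, given the base
 * identifier picked by the ordinary generator. */
static uint32_t toy_frag_id(const struct toy_dev *in_dev, uint32_t ident)
{
	assert(in_dev != NULL);
	return toy_mix(in_dev->frag_id_hash, ident);
}
```

Two devices of the same VM configured with the same frag_id_hash yield the
same id for a given base identifier regardless of ifindex, which is the
property asked for in the ECMP case; a device belonging to another VM with
a different hash yields a different id in this model.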



-- 
MST

Hannes Frederic Sowa

2015-Jan-28 16:15 UTC

[PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

Hi,

On Wed, 2015-01-28 at 18:00 +0200, Michael S. Tsirkin wrote:
> On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
> > Hi,
> > 
> > On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
> > > On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa
wrote:
> > > > Hello,
> > > > 
> > > > On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
> > > > > On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes
Frederic Sowa wrote:
> > > > > > On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich
wrote:
> > > > > > > On 01/27/2015 08:47 AM, Hannes Frederic Sowa
wrote:
> > > > > > > > On Di, 2015-01-27 at 10:42 +0200,
Michael S. Tsirkin wrote:
> > > > > > > >> On Tue, Jan 27, 2015 at 02:47:54AM
+0000, Ben Hutchings wrote:
> > > > > > > >>> On Mon, 2015-01-26 at 09:37
-0500, Vladislav Yasevich wrote:
> > > > > > > >>>> If the IPv6 fragment id has
not been set and we perform
> > > > > > > >>>> fragmentation due to UFO,
select a new fragment id.
> > > > > > > >>>> When we store the fragment
id into skb_shinfo, set the bit
> > > > > > > >>>> in the skb so we can re-use
the selected id.
> > > > > > > >>>> This preserves the behavior
of UFO packets generated on the
> > > > > > > >>>> host and solves the issue of
id generation for packet sockets
> > > > > > > >>>> and tap/macvtap devices.
> > > > > > > >>>>
> > > > > > > >>>> This patch moves
ipv6_select_ident() back in to the header file.
> > > > > > > >>>> It also provides the helper
function that sets skb_shinfo() frag
> > > > > > > >>>> id and sets the bit.
> > > > > > > >>>>
> > > > > > > >>>> It also makes sure that we
select the fragment id when doing
> > > > > > > >>>> just gso validation, since
it's possible for the packet to
> > > > > > > >>>> come from an untrusted
source (VM) and be forwarded through
> > > > > > > >>>> a UFO enabled device which
will expect the fragment id.
> > > > > > > >>>>
> > > > > > > >>>> CC: Eric Dumazet
<edumazet at google.com>
> > > > > > > >>>> Signed-off-by: Vladislav
Yasevich <vyasevic at redhat.com>
> > > > > > > >>>> ---
> > > > > > > >>>>  include/linux/skbuff.h |  3
++-
> > > > > > > >>>>  include/net/ipv6.h     |  2
++
> > > > > > > >>>>  net/ipv6/ip6_output.c  |  4
++--
> > > > > > > >>>>  net/ipv6/output_core.c |  9
++++++++-
> > > > > > > >>>>  net/ipv6/udp_offload.c | 10
+++++++++-
> > > > > > > >>>>  5 files changed, 23
insertions(+), 5 deletions(-)
> > > > > > > >>>>
> > > > > > > >>>> diff --git
a/include/linux/skbuff.h b/include/linux/skbuff.h
> > > > > > > >>>> index 85ab7d7..3ad5203
100644
> > > > > > > >>>> --- a/include/linux/skbuff.h
> > > > > > > >>>> +++ b/include/linux/skbuff.h
> > > > > > > >>>> @@ -605,7 +605,8 @@ struct
sk_buff {
> > > > > > > >>>>  	__u8			ipvs_property:1;
> > > > > > > >>>>  	__u8		
inner_protocol_type:1;
> > > > > > > >>>>  	__u8			remcsum_offload:1;
> > > > > > > >>>> -	/* 3 or 5 bit hole */
> > > > > > > >>>> +	__u8			ufo_fragid_set:1;
> > > > > > > >>> [...]
> > > > > > > >>>
> > > > > > > >>> Doesn't the flag belong in
struct skb_shared_info, rather than struct
> > > > > > > >>> sk_buff?  Otherwise this looks
fine.
> > > > > > > >>>
> > > > > > > >>> Ben.
> > > > > > > >>
> > > > > > > >> Hmm we seem to be out of tx flags.
> > > > > > > >> Maybe ip6_frag_id == 0 should mean
"not set".
> > > > > > > > 
> > > > > > > > Maybe that is the best idea. Definitely
the ufo_fragid_set bit should
> > > > > > > > move into the skb_shared_info area.
> > > > > > > 
> > > > > > > That's what I originally wanted to do,
but had to move and grow txflags thus
> > > > > > > skb_shinfo ended up growing.  I wanted to
avoid that, so stole an skb flag.
> > > > > > > 
> > > > > > > I considered treating fragid == 0 as unset,
but a 0 fragid is perfectly valid
> > > > > > > from the protocol perspective and could
actually be generated by the id generator
> > > > > > > functions.  This may cause us to call the id
generation multiple times.
> > > > > > 
> > > > > > Are there plans in the long run to let virtio_net
transmit auxiliary
> > > > > > data to the other end so we can clean all of this
this up one day?
> > > > > > 
> > > > > > I don't like the whole situation: looking into
the virtio_net headers
> > > > > > just adding a field for ipv6 fragmentation ids to
those small structs
> > > > > > seems bloated, not doing it feels incorrect. :/
> > > > > > 
> > > > > > Thoughts?
> > > > > > 
> > > > > > Bye,
> > > > > > Hannes
> > > > > 
> > > > > I'm not sure - what will be achieved by generating
the IDs guest side as
> > > > > opposed to host side?  It's certainly harder to get
hold of entropy
> > > > > guest-side.
> > > > 
> > > > It is not only about entropy but about uniqueness.  Also
fragmentation
> > > > ids should not be discoverable,
> > > 
> > > I belive "predictable" is the language used by the IETF
draft.
> > > 
> > > > so there are several aspects:
> > > > 
> > > > I see fragmentation id generation still as security
critical:
> > > > When Eric patched the frag id generator in 04ca6973f7c1a0d
("ip: make IP
> > > > identifiers less predictable") I could patch my kernels
and use the
> > > > patch regardless of the machine being virtualized or not. It
was not
> > > > dependent on the hypervisor.
> > > 
> > > And now it's even easier - just patch the hypervisor, and all
VMs
> > > automatically benefit.
> > 
> > Sometimes the hypervisor is not under my control.
> 
> In that case doing things like extending virtio
> is out of the question too, isn't it?
> It needs hypervisor changes.

Sure, but I would like the fragmentation id generator to reside inside
the end-host kernel. The hypervisor needs to carry the frag id along,
sure, and needs to be changed accordingly.

So in either case we need to change both kernels. ;)

> 
> > You would need to
> > patch both kernels in your case - non gso frames would still get the
> > fragmentation id generated in the host kernel.
> > 
> > > > I think that is the same reasoning why we
> > > > don't support TOE.
> > > > If we use one generator in the hypervisor in an OpenStack-like setting,
> > > > the host deals with quite a lot of overlay networks. A lot of default
> > > > configurations use the same addresses internally, so on the hypervisor
> > > > the frag id generators would interfere by design.
> > > > I could come up with an attack scenario for DNS servers (again :) ):
> > > > 
> > > > You are sitting next to a DNS server on the same hypervisor and can send
> > > > packets without source validation (because that is handled later on in
> > > > case of openvswitch when the packet is put into the corresponding
> > > > overlay network). You emit a gso packet with the same source and
> > > > destination addresses as the DNS server would do and would get a
> > > > fragmentation id which is linearly (+ time delta) incremented depending
> > > > on the source and destination address. With such a leak you could start
> > > > trying to attack and spoof DNS responses (fragmentation attacks etc.).
> > > > See also details on such kind of attacks in the description of commit
> > > > 04ca6973f7c1a0d.
> > > > 
> > > > AFAIK IETF tried with IPv6 to push fragmentation id generation to the
> > > > end hosts, that's also the reason for the introduction of atomic
> > > > fragments (which are now being rolled back ;) ).
> > > > 
> > > > Still it is better to generate a frag id on the hypervisor than just
> > > > sending a 0, so I am ok with this change, albeit not happy.
> > > > 
> > > > Thanks,
> > > > Hannes
> > > > 
> > > 
> > > OK so to summarize, identifiers are only re-randomized once per jiffy,
> > > so you worry that within this window, an external observer can discover
> > > past fragment ID values and so predict the future ones.
> > > All that's required is that two paths go through the same box performing
> > > fragmentation.
> > > 
> > > Is that a fair summary?
> 
> No answer here?

Oops, sorry.

It is not re-randomized but only biased by a time delta (note the
prandom_u32_max). So even after such an increment happens you can still
guess the range of the current fragmentation ids for a longer time.

Otherwise it is a fair summary.
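
A minimal user-space sketch of the behaviour described above, assuming the
generator works roughly like the one from commit 04ca6973f7c1a0d: the counter
is never re-seeded, it only skips ahead by a random amount bounded by the
number of ticks since its last use. The names ident_bucket, ident_reserve and
rand_below (a stand-in for prandom_u32_max) are hypothetical and not kernel
API; locking and atomics are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one bucket of the kernel's identifier table. */
struct ident_bucket {
	uint32_t id;    /* running counter; IDs are handed out from here */
	uint32_t stamp; /* tick count at the last update */
};

/* Stand-in for prandom_u32_max(): a value in [0, bound), 0 if bound is 0. */
static uint32_t rand_below(uint32_t bound)
{
	return bound ? (uint32_t)rand() % bound : 0;
}

/*
 * Reserve `segs` consecutive IDs and return the first one.
 * The counter is only biased by a random delta smaller than the number
 * of ticks elapsed since the last call; it is not re-randomized.
 */
static uint32_t ident_reserve(struct ident_bucket *b, uint32_t now,
			      uint32_t segs)
{
	uint32_t delta = 0;

	assert(b != NULL);
	if (b->stamp != now) {
		delta = rand_below(now - b->stamp);
		b->stamp = now;
	}
	b->id += segs + delta;
	return b->id - segs;
}
```

Within one tick the delta is 0 and the IDs are strictly sequential. After 10
idle ticks the next ID is still less than 10 above the counter an observer saw
last, which is why the current range stays guessable for a while.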
> 
> > > If yes, we can make this a bit harder by mixing in some
> > > data per input and/or output devices.
> > > 
> > > For example, just to give you the idea:
> > > 
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 683d493..4faa7ef 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -3625,6 +3625,7 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
> > >  	trace_netif_receive_skb(skb);
> > >  
> > >  	orig_dev = skb->dev;
> > > +	skb_shinfo(skb)->ip6_frag_id = skb->dev->ifindex;
> > >  
> > >  	skb_reset_network_header(skb);
> > >  	if (!skb_transport_header_was_set(skb))
> > > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > > index ce69a12..819a821 100644
> > > --- a/net/ipv6/ip6_output.c
> > > +++ b/net/ipv6/ip6_output.c
> > > @@ -1092,7 +1092,8 @@ static inline int ip6_ufo_append_data(struct sock *sk,
> > >  				     sizeof(struct frag_hdr)) & ~7;
> > >  	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
> > >  	ipv6_select_ident(&fhdr, rt);
> > > -	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
> > > +	skb_shinfo(skb)->ip6_frag_id = jhash_1word(skb_shinfo(skb)->ip6_frag_id,
> > > +						   fhdr.identification);
> > >  
> > >  append:
> > >  	return skb_append_datato_frags(sk, skb, getfrag, from,
> > > 
> > 
> > I thought about mixing in the incoming interface identifier into the
> > frag id generation, but that could hurt us badly as soon as a VM has
> > more than one interface to the outside world and uses e.g. ECMP.
> > We need to make sure that those frag ids are unique and the kernel needs
> > to be better than just using a random number generator.
> > 
> > Bye,
> > Hannes
> 
> OK then. Like this:
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 679e6e9..1ee9a3a 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1508,6 +1508,9 @@ struct net_device {
>  	 *	part of the usual set specified in Space.c.
>  	 */
>  
> +	/* Extra hash to mix into IPv6 frag ID on packets received from here. */
> +	unsigned int		frag_id_hash;
> +
>  	unsigned long		state;
>  
>  	struct list_head	dev_list;
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 683d493..56f1898 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3625,6 +3625,7 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
>  	trace_netif_receive_skb(skb);
>  
>  	orig_dev = skb->dev;
> +	skb_shinfo(skb)->ip6_frag_id = skb->dev->frag_id_hash;
>  
>  	skb_reset_network_header(skb);
>  	if (!skb_transport_header_was_set(skb))
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index ce69a12..819a821 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1092,7 +1092,8 @@ static inline int ip6_ufo_append_data(struct sock *sk,
>  				     sizeof(struct frag_hdr)) & ~7;
>  	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
>  	ipv6_select_ident(&fhdr, rt);
> -	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
> +	skb_shinfo(skb)->ip6_frag_id = jhash_1word(skb_shinfo(skb)->ip6_frag_id,
> +						   fhdr.identification);
>  
>  append:
>  	return skb_append_datato_frags(sk, skb, getfrag, from,
> 
> 
> Add to this a netlink/sysfs API to set the frag_id_hash for
> devices.
> 
> Now, user can set identical frag id hash for all devices
> for a given VM.
> 
> We can even expose this to guests: each guest would generate
> the ID on boot and send it to host, host would set it
> in sysfs.

jhash_1word is presumably not a bijection, so we would be randomizing here
and increasing the probability of collisions. Instead of jhash_1word you
would need to take a simple block cipher with the hash as key.

Bye,
Hannes
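
To illustrate the point about bijections: a minimal sketch, assuming a 32-bit
identifier and a 32-bit per-device key (the proposed frag_id_hash), of a keyed
permutation built as a small Feistel network. All names here (round_fn,
frag_id_permute, frag_id_unpermute, frag_id_selfcheck) are hypothetical, and
the round function is an arbitrary mixer rather than a vetted cipher. The
Feistel structure alone makes the mapping invertible, so two distinct input
IDs can never map to the same output under one key.

```c
#include <assert.h>
#include <stdint.h>

#define FEISTEL_ROUNDS 4

/* Arbitrary 16-bit mixer; it does not need to be invertible itself. */
static uint16_t round_fn(uint16_t half, uint32_t key, unsigned int round)
{
	uint32_t x = (uint32_t)half + key + round * 0x9e3779b9u;

	x ^= x >> 15;
	x *= 0x85ebca6bu;
	x ^= x >> 13;
	return (uint16_t)x;
}

/* Keyed permutation of the 32-bit ID space: (l, r) -> (r, l ^ F(r)). */
static uint32_t frag_id_permute(uint32_t id, uint32_t key)
{
	uint16_t l = (uint16_t)(id >> 16), r = (uint16_t)id;

	for (unsigned int i = 0; i < FEISTEL_ROUNDS; i++) {
		uint16_t t = (uint16_t)(l ^ round_fn(r, key, i));

		l = r;
		r = t;
	}
	return ((uint32_t)l << 16) | r;
}

/* Inverse of frag_id_permute(): undo the rounds in reverse order. */
static uint32_t frag_id_unpermute(uint32_t id, uint32_t key)
{
	uint16_t l = (uint16_t)(id >> 16), r = (uint16_t)id;

	for (unsigned int i = FEISTEL_ROUNDS; i-- > 0; ) {
		uint16_t t = (uint16_t)(r ^ round_fn(l, key, i));

		r = l;
		l = t;
	}
	return ((uint32_t)l << 16) | r;
}

/* Round-trip check: an invertible mapping cannot merge two distinct IDs. */
static void frag_id_selfcheck(uint32_t key)
{
	for (uint32_t id = 0; id < 1000; id++)
		assert(frag_id_unpermute(frag_id_permute(id, key), key) == id);
}
```

Applied to the output of the id generator, such a mapping keeps whatever
uniqueness the generator already provides, whereas hashing two words down to
one can map different generator outputs to the same fragment id.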
