thr3ads.net - Virtualization - [PATCH RFC (resend) net-next 0/6] virtio-net: Add support for virtio-net header extensions [Apr 2017]


Vlad Yasevich

2017-Apr-20 15:34 UTC

[PATCH RFC (resend) net-next 0/6] virtio-net: Add support for virtio-net header extensions

On 04/17/2017 11:01 PM, Jason Wang wrote:
>
> On 2017-04-16 00:38, Vladislav Yasevich wrote:
>> Currently the virtio net header is a fixed size, and adding things to it is
>> rather difficult.  This series attempts to add the infrastructure, as well
>> as some extensions that try to resolve some deficiencies we currently have.
>>
>> First, the vnet header only has space for 16 flags.  This may not be enough
>> in the future.  The extensions will provide space for 32 possible extension
>> flags and 32 possible extensions.  These flags will be carried in the
>> first pseudo extension header, the presence of which will be determined by
>> a flag in the virtio net header.
>>
>> The extensions themselves will immediately follow the extension header.
>> They will be added to the packet in the same order as they appear in the
>> extension flags.  No padding is placed between the extensions, and any
>> extensions negotiated but not used by a given packet will convert to
>> trailing padding.
>
> Do we need an explicit padding (e.g. an extension) which could be controlled
> by each side?
I don't think so.  The size of the vnet header is set based on the extensions
negotiated.  The one part I am not crazy about is that, when a packet does not
use any extensions, the data is still placed after the entire vnet header, which
essentially adds a lot of padding.  However, that's really no different than if
we simply grew the vnet header.

The other thing I've tried before is putting extensions into their own sg
buffer, but that made it slower.
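
To make the layout described in the cover letter concrete, here is a minimal sketch in C. It is not code from the series: the extension sizes, the bit assignments, and every name (`ext_size`, `ext_area_len`, `ext_offset`, `ext_trailing_pad`) are assumptions chosen for illustration. It only shows the stated rules: the reserved area is sized by the negotiated extensions, the extensions a packet uses are packed in flag-bit order with no gaps, and whatever is negotiated but unused becomes trailing padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VNET_EXT_MAX 32u  /* one bit per extension in a 32-bit flags word */

/* Hypothetical per-extension sizes in bytes, indexed by flag bit.
 * These values are assumptions, not taken from the patches. */
static const uint8_t ext_size[VNET_EXT_MAX] = {
    [0] = 4,  /* assumed: IPv6 fragment id */
    [1] = 4,  /* assumed: VLAN tag data */
    [2] = 8,  /* assumed: UDP tunnel offload data */
};

/* Bytes occupied by all extensions whose bits are set in `flags`. */
static size_t ext_area_len(uint32_t flags)
{
    size_t len = 0;

    for (unsigned bit = 0; bit < VNET_EXT_MAX; bit++)
        if (flags & (UINT32_C(1) << bit))
            len += ext_size[bit];
    return len;
}

/* Offset of extension `ext` inside the extension area of one packet.
 * Extensions present in `used` are packed in bit order, no padding. */
static size_t ext_offset(uint32_t used, unsigned ext)
{
    assert(ext < VNET_EXT_MAX);
    assert(used & (UINT32_C(1) << ext));
    /* Sum the sizes of all lower-numbered extensions that are present. */
    return ext_area_len(used & ((UINT32_C(1) << ext) - 1));
}

/* Negotiated-but-unused extensions turn into trailing padding. */
static size_t ext_trailing_pad(uint32_t negotiated, uint32_t used)
{
    assert((used & ~negotiated) == 0);
    return ext_area_len(negotiated) - ext_area_len(used);
}
```

With these assumed sizes, a device that negotiated bits 0 to 2 reserves 16 bytes per packet; a packet that uses no extension at all still carries all 16 as padding, which is the cost mentioned above.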
>
>>
>> For example:
>>   | vnet mrg hdr | ext hdr | ext 1 | ext 2 | ext 5 | .. pad .. | packet data |
>
> Just some rough thoughts:
>
> - Is it better to use TLV instead of a bitmap here? One advantage of TLV is
> that the length is not limited by the length of the bitmap.
But the disadvantage is that we add at least 4 bytes per extension of just TL
data.  That makes this thing even longer.
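
The trade-off can be put in numbers with a small C sketch. The 4-byte TLV prefix (16-bit type plus 16-bit length) is an assumption that matches the "at least 4 bytes" figure above; treating the extension header as a single 32-bit flags word is also an assumption. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed TLV prefix: 16-bit type + 16-bit length = 4 bytes. */
struct tlv_prefix {
    uint16_t type;
    uint16_t len;
};

/* Framing bytes (non-payload) when each extension carries its own prefix. */
static size_t tlv_overhead(size_t n_ext)
{
    return n_ext * sizeof(struct tlv_prefix);
}

/* Framing bytes for the bitmap scheme: one 32-bit flags word,
 * independent of how many extensions are present. */
static size_t bitmap_overhead(size_t n_ext)
{
    (void)n_ext;
    return sizeof(uint32_t);
}
```

Under these assumptions the two schemes cost the same for a single extension, and TLV costs 4 more bytes for every additional one.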
> - For 1.1, do we really want something like the vnet header? AFAIK, it is not
> used by modern NICs. Is it better to pack all metadata into the descriptor
> itself? This may need some changes in tun/macvtap, but looks more PCIe
> friendly.
That would really be ideal and I've looked at this.  There are small issues of
exposing the 'net metadata' of the descriptor to taps so they can be filled in.
The alternative is to use a different control structure for the tap->qemu|vhost
channel (that can be implementation specific) and have qemu|vhost populate the
'net metadata' of the descriptor.

Thanks
-vlad
> 
> Thanks
> 
>>
>> Extensions proposed in this series are:
>>   - IPv6 fragment id extension
>>     * Currently, the guest-generated fragment id is discarded and the host
>>       generates an IPv6 fragment id if the packet has to be fragmented.  The
>>       code attempts to add time-based perturbation to id generation to make
>>       it harder to guess the next fragment id to be used.  However, doing this
>>       on the host may result in less perturbation (due to different timing)
>>       and might make id guessing easier.  Ideally, the ids generated by the
>>       guest should be used.  One could also argue that we are "violating" the
>>       IPv6 protocol under a _strict_ interpretation of the spec.
>>
>>   - VLAN header acceleration
>>     * Currently virtio does not do vlan header acceleration and instead
>>       uses software tagging.  One of the first things that the host will do is
>>       strip the vlan header out.  When passing the packet to a guest, the
>>       vlan header is re-inserted into the packet.  We can skip all that work
>>       if we can pass the vlan data in accelerated format.  Then the host will
>>       not do any extra work.  However, so far, this yielded a very small
>>       perf bump (only ~1%).  I am still looking into this.
>>
>>   - UDP tunnel offload
>>     * Similar to vlan acceleration, with this extension we can pass additional
>>       data to the host to support GSO with udp tunnels and possibly other
>>       encapsulations.  This yields a significant performance improvement
>>       (still testing remote checksum code).
>>
>> An additional extension that is unfinished (due to still testing for any
>> side-effects) is checksum passthrough to support drivers that set
>> CHECKSUM_COMPLETE.  This would eliminate the need for guests to compute
>> the software checksum.
>>
>> This series only takes care of virtio net.  I have additional patches for the
>> host side (vhost and tap/macvtap as well as qemu), but wanted to get feedback
>> on the general approach first.
>>
>> Vladislav Yasevich (6):
>>    virtio-net: Remove the use the padded vnet_header structure
>>    virtio-net: make header length handling uniform
>>    virtio_net: Add basic skeleton for handling vnet header extensions.
>>    virtio-net: Add support for IPv6 fragment id vnet header extension.
>>    virtio-net: Add support for vlan acceleration vnet header extension.
>>    virtio-net: Add support for UDP tunnel offload and extension.
>>
>>   drivers/net/virtio_net.c        | 132 +++++++++++++++++++++++++++++++++-------
>>   include/linux/skbuff.h          |   5 ++
>>   include/linux/virtio_net.h      |  91 ++++++++++++++++++++++++++-
>>   include/uapi/linux/virtio_net.h |  38 ++++++++++++
>>   4 files changed, 242 insertions(+), 24 deletions(-)
>>
>

Jason Wang

2017-Apr-21 04:05 UTC

On 2017-04-20 23:34, Vlad Yasevich wrote:
> On 04/17/2017 11:01 PM, Jason Wang wrote:
>> Do we need an explicit padding (e.g. an extension) which could be controlled
>> by each side?
> I don't think so.  The size of the vnet header is set based on the extensions
> negotiated.  The one part I am not crazy about is that, when a packet does not
> use any extensions, the data is still placed after the entire vnet header,
> which essentially adds a lot of padding.  However, that's really no different
> than if we simply grew the vnet header.
>
> The other thing I've tried before is putting extensions into their own sg
> buffer, but that made it slower.
Yes.
>
>>> For example:
>>>    | vnet mrg hdr | ext hdr | ext 1 | ext 2 | ext 5 | .. pad .. | packet data |
>> Just some rough thoughts:
>>
>> - Is it better to use TLV instead of a bitmap here? One advantage of TLV is
>> that the length is not limited by the length of the bitmap.
> But the disadvantage is that we add at least 4 bytes per extension of just TL
> data.  That makes this thing even longer.
Yes, and it looks like the length is still limited by e.g. the length of T.
>
>> - For 1.1, do we really want something like the vnet header? AFAIK, it is not
>> used by modern NICs. Is it better to pack all metadata into the descriptor
>> itself? This may need some changes in tun/macvtap, but looks more PCIe
>> friendly.
> That would really be ideal and I've looked at this.  There are small issues of
> exposing the 'net metadata' of the descriptor to taps so they can be filled
> in.  The alternative is to use a different control structure for the
> tap->qemu|vhost channel (that can be implementation specific) and have
> qemu|vhost populate the 'net metadata' of the descriptor.
Yes, this needs some thought. For vhost, things look a little bit easier; we
can probably use msg_control.

Thanks
> Thanks
> -vlad

Vlad Yasevich

2017-Apr-21 13:08 UTC

On 04/21/2017 12:05 AM, Jason Wang wrote:
>
> On 2017-04-20 23:34, Vlad Yasevich wrote:
>> On 04/17/2017 11:01 PM, Jason Wang wrote:
>>> Do we need an explicit padding (e.g. an extension) which could be
>>> controlled by each side?
>> I don't think so.  The size of the vnet header is set based on the
>> extensions negotiated.  The one part I am not crazy about is that, when a
>> packet does not use any extensions, the data is still placed after the
>> entire vnet header, which essentially adds a lot of padding.  However,
>> that's really no different than if we simply grew the vnet header.
>>
>> The other thing I've tried before is putting extensions into their own sg
>> buffer, but that made it slower.
>
> Yes.
>
>>
>>>> For example:
>>>>    | vnet mrg hdr | ext hdr | ext 1 | ext 2 | ext 5 | .. pad .. | packet data |
>>> Just some rough thoughts:
>>>
>>> - Is it better to use TLV instead of a bitmap here? One advantage of TLV is
>>> that the length is not limited by the length of the bitmap.
>> But the disadvantage is that we add at least 4 bytes per extension of just
>> TL data.  That makes this thing even longer.
>
> Yes, and it looks like the length is still limited by e.g. the length of T.
Not only that, but it is also limited by skb->cb as a whole.  So putting
extensions into a TLV style means we have fewer extensions for now, until we get
rid of the skb->cb usage.
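
As a rough illustration of that limit: the control buffer in `struct sk_buff` is 48 bytes in mainline Linux. How the series actually stores extension data in `skb->cb` is not shown in this thread, so the C sketch below simply assumes the whole buffer is available and that all extensions have the same payload size. The framing constants and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKB_CB_SIZE    48u  /* size of sk_buff's cb[] in mainline Linux */
#define BITMAP_HDR_LEN 4u   /* assumed: one 32-bit flags word, paid once */
#define BITMAP_MAX_EXT 32u  /* a 32-bit bitmap cannot name more than 32 */
#define TLV_PREFIX_LEN 4u   /* assumed: type + length, paid per extension */

/* Extensions of `payload` bytes that fit in `budget` bytes, bitmap scheme. */
static size_t max_ext_bitmap(size_t budget, size_t payload)
{
    size_t n;

    assert(payload > 0);
    if (budget < BITMAP_HDR_LEN)
        return 0;
    n = (budget - BITMAP_HDR_LEN) / payload;
    return n > BITMAP_MAX_EXT ? BITMAP_MAX_EXT : n;
}

/* Extensions of `payload` bytes that fit in `budget` bytes, TLV scheme. */
static size_t max_ext_tlv(size_t budget, size_t payload)
{
    assert(payload > 0);
    return budget / (TLV_PREFIX_LEN + payload);
}
```

With 4-byte payloads, the assumed 48-byte budget holds 11 extensions under the bitmap scheme and 6 under TLV, which is the "fewer extensions" effect described above.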
>
>>
>>> - For 1.1, do we really want something like the vnet header? AFAIK, it is
>>> not used by modern NICs. Is it better to pack all metadata into the
>>> descriptor itself? This may need some changes in tun/macvtap, but looks
>>> more PCIe friendly.
>> That would really be ideal and I've looked at this.  There are small issues
>> of exposing the 'net metadata' of the descriptor to taps so they can be
>> filled in.  The alternative is to use a different control structure for the
>> tap->qemu|vhost channel (that can be implementation specific) and have
>> qemu|vhost populate the 'net metadata' of the descriptor.
>
> Yes, this needs some thought. For vhost, things look a little bit easier; we
> can probably use msg_control.
>
We can use msg_control in qemu as well, can't we?  It really is a question of
who is doing the work and the number of copies.

I can take a closer look at how it would look if we extend the descriptor with
type-specific data.  I don't know if other users of virtio would benefit from it?

-vlad
> Thanks

Michael S. Tsirkin

2017-Apr-24 17:04 UTC

On Thu, Apr 20, 2017 at 11:34:57AM -0400, Vlad Yasevich wrote:
> > - For 1.1, do we really want something like the vnet header? AFAIK, it is
> > not used by modern NICs. Is it better to pack all metadata into the
> > descriptor itself? This may need some changes in tun/macvtap, but looks
> > more PCIe friendly.
>
> That would really be ideal and I've looked at this.
We already have at least 16 unused bits in the used ring
(the head is 16 bits, but we are using 32 bits for it).
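
A minimal C sketch of what using those spare bits could look like. The used-ring element in the split virtqueue layout has a 32-bit `id` and a 32-bit `len`, and the head index needs only 16 bits. The pack and unpack helpers, and the idea of a 16-bit `meta` field, are hypothetical; nothing in the thread defines such an encoding. Byte order conversion is left out.

```c
#include <stdint.h>

/* Mirrors the split-ring used element: both fields are 32 bits wide. */
struct used_elem {
    uint32_t id;   /* head index of the used descriptor chain */
    uint32_t len;  /* bytes written into the chain's buffers */
};

/* Hypothetical encoding: head in the low 16 bits, metadata in the high 16. */
static uint32_t used_id_pack(uint16_t head, uint16_t meta)
{
    return ((uint32_t)meta << 16) | head;
}

/* Recover the descriptor head index from a packed id. */
static uint16_t used_id_head(uint32_t id)
{
    return (uint16_t)(id & 0xffffu);
}

/* Recover the hypothetical metadata bits from a packed id. */
static uint16_t used_id_meta(uint32_t id)
{
    return (uint16_t)(id >> 16);
}
```

A side that does not know about such an encoding would read the whole 32-bit `id` as the head, so a scheme like this would presumably need its own feature bit.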

-- 
MST
