"Michael S. Tsirkin" <mst at redhat.com> writes:> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote: >> "Michael S. Tsirkin" <mst at redhat.com> writes: >> >> > On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote: >> >> On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote: >> >> > Hey guys, >> >> > I've updated the kvm networking todo wiki with current projects. >> >> > Will try to keep it up to date more often. >> >> > Original announcement below. >> >> >> >> Thanks a lot. I've added the tasks I'm currently working on to the wiki. >> >> >> >> btw. I notice the virtio-net data plane were missed in the wiki. Is the >> >> project still being considered? >> > >> > It might have been interesting several years ago, but now that linux has >> > vhost-net in kernel, the only point seems to be to >> > speed up networking on non-linux hosts. >> >> Data plane just means having a dedicated thread for virtqueue processing >> that doesn't hold qemu_mutex. >> >> Of course we're going to do this in QEMU. It's a no brainer. But not >> as a separate device, just as an improvement to the existing userspace >> virtio-net. >> >> > Since non-linux does not have kvm, I doubt virtio is a bottleneck. >> >> FWIW, I think what's more interesting is using vhost-net as a networking >> backend with virtio-net in QEMU being what's guest facing. >> >> In theory, this gives you the best of both worlds: QEMU acts as a first >> line of defense against a malicious guest while still getting the >> performance advantages of vhost-net (zero-copy). > > Great idea, that sounds very intresting. > > I'll add it to the wiki. > > In fact a bit of complexity in vhost was put there in the vague hope to > support something like this: virtio rings are not translated through > regular memory tables, instead, vhost gets a pointer to ring address. > > This allows qemu acting as a man in the middle, > verifying the descriptors but not touching the > > Anyone interested in working on such a project?It would be an interesting idea if we didn't already have the vhost model where we don't need the userspace bounce. We already have two sets of host side ring code in the kernel (vhost and vringh, though they're being unified). All an accelerator can offer on the tx side is zero copy and direct update of the used ring. On rx userspace could register the buffers and the accelerator could fill them and update the used ring. It still needs to deal with merged buffers, for example. You avoid the address translation in the kernel, but I'm not convinced that's a key problem. Cheers, Rusty.
Rusty Russell <rusty at rustcorp.com.au> writes:
> "Michael S. Tsirkin" <mst at redhat.com> writes:
>> [...]
>> Anyone interested in working on such a project?
>
> It would be an interesting idea if we didn't already have the vhost
> model where we don't need the userspace bounce.

The model is very interesting for QEMU because then we can use vhost as
a backend for other types of network adapters (like vmxnet3 or even
e1000).

It also helps for things like fault tolerance where we need to be able
to control packet flow within QEMU.

Regards,

Anthony Liguori
On Wed, May 29, 2013 at 08:01:03AM -0500, Anthony Liguori wrote:
> Rusty Russell <rusty at rustcorp.com.au> writes:
> > [...]
> > It would be an interesting idea if we didn't already have the vhost
> > model where we don't need the userspace bounce.
>
> The model is very interesting for QEMU because then we can use vhost as
> a backend for other types of network adapters (like vmxnet3 or even
> e1000).
>
> It also helps for things like fault tolerance where we need to be able
> to control packet flow within QEMU.
>
> Regards,
>
> Anthony Liguori

It was also floated as an alternative way to do live migration.
Anthony Liguori <anthony at codemonkey.ws> writes:
> Rusty Russell <rusty at rustcorp.com.au> writes:
>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>>> FWIW, I think what's more interesting is using vhost-net as a networking
>>> backend with virtio-net in QEMU being what's guest-facing.
>>>
>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>>> line of defense against a malicious guest while still getting the
>>> performance advantages of vhost-net (zero-copy).
>>
>> It would be an interesting idea if we didn't already have the vhost
>> model where we don't need the userspace bounce.
>
> The model is very interesting for QEMU because then we can use vhost as
> a backend for other types of network adapters (like vmxnet3 or even
> e1000).
>
> It also helps for things like fault tolerance where we need to be able
> to control packet flow within QEMU.

(CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts.)

Then I'm really confused as to what this would look like. A zero-copy
sendmsg? We should be able to implement that today.

On the receive side, what can we do better than readv? If we need to
return to userspace to tell the guest that we've got a new packet, we
don't win on latency. We might reduce syscall overhead with a
multi-dimensional readv to read multiple packets at once?

Confused,
Rusty.
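
A side note on that last question: the "multi-dimensional readv" being
asked about already exists on Linux as recvmmsg(2) (merged in 2.6.33),
with sendmmsg(2) as its tx counterpart since 3.0. Both only work on
sockets, so they would fit a packet-socket backend rather than a raw tap
fd. A minimal sketch of the rx batching; the names rx_batch, BATCH, and
BUF_SIZE are arbitrary choices for illustration:

    #define _GNU_SOURCE             /* for recvmmsg() */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    #define BATCH    64     /* assumption: packets drained per syscall */
    #define BUF_SIZE 2048   /* assumption: fits one MTU-sized frame */

    /* Drain up to BATCH packets with a single syscall instead of one
     * readv()/recvmsg() per packet. Returns the number received. */
    static int rx_batch(int sockfd)
    {
        static char buf[BATCH][BUF_SIZE];
        struct iovec iov[BATCH];
        struct mmsghdr msgs[BATCH];

        memset(msgs, 0, sizeof(msgs));
        for (int i = 0; i < BATCH; i++) {
            iov[i].iov_base = buf[i];
            iov[i].iov_len  = BUF_SIZE;
            msgs[i].msg_hdr.msg_iov    = &iov[i];
            msgs[i].msg_hdr.msg_iovlen = 1;
        }

        int n = recvmmsg(sockfd, msgs, BATCH, MSG_DONTWAIT, NULL);
        for (int i = 0; i < n; i++)
            printf("packet %d: %u bytes\n", i, msgs[i].msg_len);
        return n;
    }

This only amortizes syscall overhead; it doesn't address Rusty's latency
point about having to return to userspace per wakeup.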