Jan Kiszka
2015-Apr-27 12:35 UTC
[virtio-dev] Zerocopy VM-to-VM networking using virtio-net
On 2015-04-27 12:17, Stefan Hajnoczi wrote:
> On Sun, Apr 26, 2015 at 2:24 PM, Luke Gorrie <luke at snabb.co> wrote:
>> On 24 April 2015 at 15:22, Stefan Hajnoczi <stefanha at gmail.com> wrote:
>>>
>>> The motivation for making VM-to-VM fast is that while software
>>> switches on the host are efficient today (thanks to vhost-user), there
>>> is no efficient solution if the software switch is a VM.
>>
>> I see. This sounds like a noble goal indeed. I would love to run the
>> software switch as just another VM in the long term. It would make it much
>> easier for the various software switches to coexist in the world.
>>
>> The main technical risk I see in this proposal is that eliminating the
>> memory copies might not have the desired effect. I might be tempted to keep
>> the copies but prevent the kernel from having to inspect the vrings (more
>> like vhost-user). But that is just a hunch and I suppose the first step
>> would be a prototype to check the performance anyway.
>>
>> For what it is worth here is my view of networking performance on x86 in the
>> Haswell+ era:
>> https://groups.google.com/forum/#!topic/snabb-devel/aez4pEnd4ow
>
> Thanks.
>
> I've been thinking about how to eliminate the VM <-> host <-> VM
> switching and instead achieve just VM <-> VM.
>
> The holy grail of VM-to-VM networking is an exitless I/O path. In
> other words, packets can be transferred between VMs without any
> vmexits (this requires a polling driver).
>
> Here is how it works. QEMU gets "-device vhost-user" so that a VM can
> act as the vhost-user server:
>
> VM1 (virtio-net guest driver) <-> VM2 (vhost-user device)
>
> VM1 has a regular virtio-net PCI device. VM2 has a vhost-user device
> and plays the host role instead of the normal virtio-net guest driver
> role.
>
> The ugly thing about this is that VM2 needs to map all of VM1's guest
> RAM so it can access the vrings and packet data. The solution to this
> is something like the Shared Buffers BAR but this time it contains not
> just the packet data but also the vring, let's call it the Shared
> Virtqueues BAR.
>
> The Shared Virtqueues BAR eliminates the need for vhost-net on the
> host because VM1 and VM2 communicate directly using virtqueue notify
> or polling vring memory. Virtqueue notify works by connecting an
> eventfd as ioeventfd in VM1 and irqfd in VM2. And VM2 would also have
> an ioeventfd that is irqfd for VM1 to signal completions.

We had such a discussion before:
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/123014/focus=279658

Would be great to get this ball rolling again.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
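
For illustration, here is a minimal sketch of the notification wiring described
above, assuming both KVM VM file descriptors are available in one process (in
practice the eventfd would be passed between the two QEMU processes, e.g. over
a unix socket as vhost-user does). The notify address and GSI are hypothetical
placeholders. The same eventfd is registered as an ioeventfd for VM1 and an
irqfd for VM2, so a virtqueue kick in VM1 is delivered as an interrupt in VM2
without bouncing through host userspace:

    /* Sketch only: wire one eventfd as ioeventfd in VM1 and irqfd in VM2.
     * vm1_fd/vm2_fd, the notify address and the GSI are illustrative
     * assumptions; error handling is minimal. */
    #include <linux/kvm.h>
    #include <string.h>
    #include <sys/eventfd.h>
    #include <sys/ioctl.h>

    #define VQ_NOTIFY_ADDR 0xc000  /* hypothetical PIO address of VM1's queue notify register */
    #define VM2_GSI        5       /* hypothetical interrupt line in VM2 */

    int wire_notify(int vm1_fd, int vm2_fd)
    {
        int efd = eventfd(0, EFD_CLOEXEC);
        if (efd < 0)
            return -1;

        /* VM1: a write to the notify register signals the eventfd
         * (light-weight exit, handled without returning to userspace). */
        struct kvm_ioeventfd ioev;
        memset(&ioev, 0, sizeof(ioev));
        ioev.addr  = VQ_NOTIFY_ADDR;
        ioev.len   = 2;
        ioev.fd    = efd;
        ioev.flags = KVM_IOEVENTFD_FLAG_PIO;
        if (ioctl(vm1_fd, KVM_IOEVENTFD, &ioev) < 0)
            return -1;

        /* VM2: the same eventfd injects an interrupt on the chosen GSI. */
        struct kvm_irqfd irq;
        memset(&irq, 0, sizeof(irq));
        irq.fd  = efd;
        irq.gsi = VM2_GSI;
        if (ioctl(vm2_fd, KVM_IRQFD, &irq) < 0)
            return -1;

        return efd;
    }

A second eventfd wired the opposite way (ioeventfd in VM2, irqfd in VM1) would
carry the completion signal mentioned above.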
Stefan Hajnoczi
2015-Apr-27 13:01 UTC
[virtio-dev] Zerocopy VM-to-VM networking using virtio-net
On Mon, Apr 27, 2015 at 1:55 PM, Jan Kiszka <jan.kiszka at siemens.com> wrote:
> On 2015-04-27 14:35, Jan Kiszka wrote:
>> On 2015-04-27 12:17, Stefan Hajnoczi wrote:
[...]
>>> The Shared Virtqueues BAR eliminates the need for vhost-net on the
>>> host because VM1 and VM2 communicate directly using virtqueue notify
>>> or polling vring memory. Virtqueue notify works by connecting an
>>> eventfd as ioeventfd in VM1 and irqfd in VM2. And VM2 would also have
>>> an ioeventfd that is irqfd for VM1 to signal completions.
>>
>> We had such a discussion before:
>> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/123014/focus=279658
>>
>> Would be great to get this ball rolling again.
>>
>> Jan
>
> But one challenge would remain even then (unless both sides only poll):
> exit-free inter-VM signaling, no? But that's a hardware issue first of all.

To start with ioeventfd<->irqfd can be used. It incurs a light-weight
exit in VM1 and interrupt injection in VM2.

For networking the cost is mitigated by NAPI drivers which switch
between interrupts and polling. During notification-heavy periods the
guests would use polling anyway.

A hardware solution would be some kind of inter-guest interrupt
injection. I don't know VMX well enough to know whether that is
possible on Intel CPUs.

Stefan
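
To illustrate the interrupt-versus-polling trade-off Stefan mentions (in the
spirit of NAPI, not actual guest driver code), a schematic receive loop could
busy-poll the shared vring while traffic keeps arriving and only sleep on a
notification eventfd once the ring has been idle for a while. The eventfd
stands in for whatever mechanism actually wakes the receiver, and
vring_has_work(), process_packet() and the idle budget are hypothetical
placeholders:

    /* Schematic receive loop alternating between polling and event-driven
     * modes, in the spirit of NAPI. The helpers are placeholders, not a
     * real driver API. */
    #include <poll.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <unistd.h>

    #define IDLE_SPINS_BEFORE_SLEEP 1024  /* arbitrary idle budget */

    extern bool vring_has_work(void);   /* hypothetical: descriptors pending? */
    extern void process_packet(void);   /* hypothetical: consume one descriptor */

    void rx_loop(int notify_efd)
    {
        unsigned idle_spins = 0;

        for (;;) {
            if (vring_has_work()) {
                process_packet();
                idle_spins = 0;         /* stay in polling mode while busy */
                continue;
            }

            if (++idle_spins < IDLE_SPINS_BEFORE_SLEEP)
                continue;               /* keep spinning a little longer */

            /* Ring has been idle: block on the notification eventfd (a real
             * driver would also re-enable ring notifications here). */
            struct pollfd pfd = { .fd = notify_efd, .events = POLLIN };
            if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
                uint64_t val;
                (void)read(notify_efd, &val, sizeof(val));  /* drain counter */
            }
            idle_spins = 0;
        }
    }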