Markus Armbruster
2014-Jun-12  14:40 UTC
Why I advise against using ivshmem (was: [Qemu-devel] Using virtio for inter-VM communication)
Henning Schild <henning.schild at siemens.com> writes:> On Thu, 12 Jun 2014 08:48:04 +0200 > Markus Armbruster <armbru at redhat.com> wrote: > >> Vincent JARDIN <vincent.jardin at 6wind.com> writes: >> >> > On 10/06/2014 18:48, Henning Schild wrote:> Hi, >> >> In a first prototype i implemented a ivshmem[2] device for the >> >> hypervisor. That way we can share memory between virtual machines. >> >> Ivshmem is nice and simple but does not seem to be used anymore. >> >> And it >> >> does not define higher level devices, like a console. >> > >> > FYI, ivhsmem is used here: >> > http://dpdk.org/browse/memnic/tree/ >> > >> > http://dpdk.org/browse/memnic/tree/pmd/pmd_memnic.c#n449 >> > >> > There are some few other references too, if needed. >> >> It may be used, but that doesn't mean it's maintained, or robust >> against abuse. My advice is to steer clear of it. > > Could you elaborate on why you advice against it?Sure! The reasons for my dislike range from practical to philosophical. My practical concerns include: 1. ivshmem code needs work, but has no maintainer - Error handling is generally poor. For instance, "device_add ivshmem" kills your guest instantly. - More subjectively, I don't trust the code to be robust against abuse by our own guest, or the other guests sharing the memory. Convincing me would take a code audit. - MAINTAINERS doesn't cover ivshmem.c. - The last non-trivial commit that isn't obviously part of some tree-wide infrastructure or cleanup work is from September 2012 (commit c08ba66). 2. There is no libvirt support 3. Out-of-tree server program required for full functionality Interrupts require a "shared memory server" running in the host (see docs/specs/ivshmem_device_spec.txt). It doesn't tell where to find one. The initial commit 6cbf4c8 points to <www.gitorious.org/nahanni>. That repository's last commit is from September 2012. He's dead, Jim. ivshmem_device_spec.txt is silent on what the server is supposed to do. If this server requires privileges: I don't trust it without an audit. 4. Out-of-tree kernel uio driver required The device is "intended to be used with the provided UIO driver" (ivshmem_device_spec.txt again). As far as I can tell, the "provided UIO driver" is the one in the dead Nahanni repo. By now, you should be expecting this: I don't trust that one either. These concerns are all fixable, but it'll take serious work, and time. Something like: * Find a maintainer for the device model * Review and fix its code * Get the required kernel module upstream * Get all the required parts outside QEMU packaged in major distros, or absorbed into QEMU In short, create a viable community around ivshmem, either within the QEMU community, or separately but cooperating. On to the more philosophical ones. 5. Out-of-tree interface required Paraphrasing an old quip: Some people, when confronted with a problem, think "I know, I'll use shared memory." Now they have two problems. Shared memory is not an interface. It's at best something you can use to build an interface. I'd rather have us offer something with a little bit more structure. Very fast guest-to-guest networking perhaps. 6. Device models belong into QEMU Say you build an actual interface on top of ivshmem. Then ivshmem in QEMU together with the supporting host code outside QEMU (see 3.) and the lower layer of the code using it in guests (kernel + user space) provide something that to me very much looks like a device model. Device models belong into QEMU. It's what QEMU does. To all currently using ivshmem or contemplating its use: I'd like to invite you to work with the QEMU community to get your use case served better. You could start worse than with explaining it to us. In case you'd rather not work with the QEMU community: I'm not passing judgement on that (heck, I have had days when I'd rather not, too). But if somebody's reasons not to work with us include GPL circumvention, then that somebody is a scoundrel.
Markus, see inline (I am not on all mailing list, please, keep the cc list). > Sure! The reasons for my dislike range from practical to > philosophical.> > My practical concerns include: > > 1. ivshmem code needs work, but has no maintainerSee David's contributions: http://patchwork.ozlabs.org/patch/358750/> 2. There is no libvirt supportOne can use qemu without libvivrt.> 3. Out-of-tree server program required for full functionalityWe have the source code, it provides the documentation to write our own better server program.> 4. Out-of-tree kernel uio driver requiredNo, it is optional.> These concerns are all fixable, but it'll take serious work, and time. > Something like: > > * Find a maintainer for the device modelI guess, we can find it into the DPDK.org community.> * Review and fix its code > > * Get the required kernel module upstreamwhich module? uio, it is not required.> * Get all the required parts outside QEMU packaged in major distros, or > absorbed into QEMURedhat did disable it. why? it is there in QEMU.> In short, create a viable community around ivshmem, either within the > QEMU community, or separately but cooperating.At least, DPDK.org community is a community using it.> On to the more philosophical ones. > > 5. Out-of-tree interface required > > Paraphrasing an old quip: Some people, when confronted with a > problem, think "I know, I'll use shared memory." Now they have two > problems. > > Shared memory is not an interface. It's at best something you can > use to build an interface. > > I'd rather have us offer something with a little bit more structure. > Very fast guest-to-guest networking perhaps.It is not just networking, you have other use cases like HPC, sharing in-memory databases.> > 6. Device models belong into QEMU > > Say you build an actual interface on top of ivshmem. Then ivshmem in > QEMU together with the supporting host code outside QEMU (see 3.) and > the lower layer of the code using it in guests (kernel + user space) > provide something that to me very much looks like a device model. > > Device models belong into QEMU. It's what QEMU does.See my previous statement, it is not just device model. Best regards, Vincent
Il 12/06/2014 18:02, Vincent JARDIN ha scritto:> >> * Get all the required parts outside QEMU packaged in major distros, or >> absorbed into QEMU > > Redhat did disable it. why? it is there in QEMU.We don't ship everything that is part of QEMU, just like we selectively disable many drivers in Linux. Markus especially referred to parts *outside* QEMU: the server, the uio driver, etc. These out-of-tree, non-packaged parts of ivshmem are one of the reasons why Red Hat has disabled ivshmem in RHEL7. He also listed many others. Basically for parts of QEMU that are not of high quality, we either fix them (this is for example what we did for qcow2) or disable them. Not just ivshmem suffered this fate, for example many network cards, sound cards, SCSI storage adapters. Now, vhost-user is in the process of being merged for 2.1. Compared to the DPDK solution: * it doesn't require hugetlbfs (which only enabled shared memory by chance in older QEMU releases, that was never documented) * it doesn't require ivshmem (it does require shared memory, which will also be added to 2.1) * it doesn't require the kernel driver from the DPDK sample * it is not just shared memory, but also defines an interface to use it (another of Markus's points) vhost-user is superior, and it is superior because it has been designed from the get-go through cooperation of all interested parties (namely QEMU and snabbswitch). Paolo
Some dropped quoted text restored. Vincent JARDIN <vincent.jardin at 6wind.com> writes:> Markus, > > see inline (I am not on all mailing list, please, keep the cc list). > >> Sure! The reasons for my dislike range from practical to >> philosophical. >> >> My practical concerns include: >> >> 1. ivshmem code needs work, but has no maintainer > See David's contributions: > http://patchwork.ozlabs.org/patch/358750/We're grateful for David's patch for qemu-char.c, but this isn't ivshmem maintenance, yet.>> - Error handling is generally poor. For instance, "device_add >> ivshmem" kills your guest instantly. >> >> - More subjectively, I don't trust the code to be robust against >> abuse by our own guest, or the other guests sharing the memory. >> Convincing me would take a code audit. >> >> - MAINTAINERS doesn't cover ivshmem.c. >> >> - The last non-trivial commit that isn't obviously part of some >> tree-wide infrastructure or cleanup work is from September 2012 >> (commit c08ba66). >> >> 2. There is no libvirt support > > One can use qemu without libvivrt.You asked me for my reasons for disliking ivshmem. This is one. Sure, I can drink my water through a straw while standing on one foot, but that doesn't mean I have to like it. And me not liking it doesn't mean the next guy shouldn't like it. To each their own.>> 3. Out-of-tree server program required for full functionality >> >> Interrupts require a "shared memory server" running in the host (see >> docs/specs/ivshmem_device_spec.txt). It doesn't tell where to find >> one. The initial commit 6cbf4c8 points to >> <www.gitorious.org/nahanni>. That repository's last commit is from >> September 2012. He's dead, Jim. >> >> ivshmem_device_spec.txt is silent on what the server is supposed to >> do. > > We have the source code, it provides the documentation to write our > own better server program.Good for you. Not good enough for the QEMU community. QEMU features requiring on out-of-tree software to be useful are fine, as long as said out-of-tree software is readily available to QEMU developers and users. Free software with a community around it and packaged in major distros qualifies. If you haven't got that, talk to us to find out whether what you've got qualifies, and if not, what you'd have to do to make it qualify. Back when we accepted ivshmem, the out-of-tree parts it needs were well below the "community & packaged" bar. But folks interested in it talked to us, and the fact that it's in shows that QEMU maintainers decided what they had then was enough. Unfortunately, we now have considerably less: Nahanni appears to be dead. An apparently dead git repository you can study is not enough. The fact that you hold an improved reimplementation privately is immaterial. So is the (plausible) claim that others could also create a reimplementation.>> If this server requires privileges: I don't trust it without an >> audit. >> >> 4. Out-of-tree kernel uio driver required > > No, it is optional.Good to know. Would you be willing to send a patch to ivshmem_device_spec.txt clarifying that?>> The device is "intended to be used with the provided UIO driver" >> (ivshmem_device_spec.txt again). As far as I can tell, the "provided >> UIO driver" is the one in the dead Nahanni repo. >> >> By now, you should be expecting this: I don't trust that one either. >> >> These concerns are all fixable, but it'll take serious work, and time. >> Something like: >> >> * Find a maintainer for the device model > I guess, we can find it into the DPDK.org community. >> * Review and fix its code >> >> * Get the required kernel module upstream > > which module? uio, it is not required. > >> * Get all the required parts outside QEMU packaged in major distros, or >> absorbed into QEMU > > Redhat did disable it. why? it is there in QEMU.Up to now, I've been wearing my QEMU hat. Let me exchange it for my Red one for a bit. We (Red Hat) don't just package & ship metric tons of random free software. We package & ship useful free software we can support for many, many years. Sometimes, we find that we have to focus serious development resources on making something useful supportable (Paolo mentioned qcow2). We obviously can't focus on everything, though. Anyway, ivshmem didn't make the cut for RHEL-7.0. Sorry if that inconveniences you. To get it into RHEL, you need to show it's both useful and supportable. Building a community around it would go a long way towards that. If you want to discuss this in more detail with us, you may want to try communication channels provided by your RHEL subscription in addition to the QEMU development mailing list. Don't be shy, you're paying for it! As always, I'm not speaking for myself, not my employer. Okay, wearing my QEMU hat again.>> In short, create a viable community around ivshmem, either within the >> QEMU community, or separately but cooperating. > > At least, DPDK.org community is a community using it.Using something isn't the same as maintaining something. But it's a necessary first step. [...]