Hi,

While my main interest is mostly in UML right now [1] I've CC'ed the
qemu and virtualization lists because something similar might actually
apply to other types of virtualization.

I'm thinking about adding virt-io support to UML, but the tricky part is
that while I want to use the virt-io basics (because it's a nice
interface from the 'inside'), I don't actually want the stock drivers
that are part of the kernel now (like virtio-net etc.) but rather
something that integrates with wifi (probably building on hwsim).

The 'inside' interfaces aren't really a problem - just have a specific
device ID for this, and then write a normal virtio kernel driver for it.

The 'outside' interfaces are where my thinking breaks down right now.

Looking at lkl, the outside is just all implemented in lkl as code that
gets linked to the library, so in UML terms it'd just be extra 'outside'
code like the timer handling or other netdev stuff we have today.
Looking at qemu, it's of course also implemented there, and then
interfaces with the real network, console abstraction, etc.

However, like I said above, I really need something very custom and not
likely to make it upstream to any project (because what point is that if
you cannot connect to the rest of the environment I'm building), so I'm
thinking that perhaps it should be possible to write an abstract
'outside' that lets you interact with it really from out-of-process?
Perhaps through some kind of shared memory segment? I think that gets
tricky with virt-io doing DMA (I think it does?) though, so that part
would have to be implemented directly and not out-of-process?

But really that's why I'm asking - is there a better way than to just
link the device-side virt-io code into the same binary (be it lkl lib,
uml binary, qemu binary)?

Thanks,
johannes

[1] Actually, I've considered using qemu, but it doesn't have
virtualized time and doesn't seem to support TSC virtualization. I guess
I could remove TSC from the guest CPU and add a virtualized HPET, but
I've yet to convince myself this works - on UML I made virtual time as a
prototype already:
https://patchwork.ozlabs.org/patch/1095814/
(though my real goal isn't to just skip time forward when the host goes
idle, it's to sync with other simulated components)
On 22/05/2019 14:02, Johannes Berg wrote:
> Hi,
>
> While my main interest is mostly in UML right now [1] I've CC'ed the
> qemu and virtualization lists because something similar might actually
> apply to other types of virtualization.
>
> I'm thinking about adding virt-io support to UML, but the tricky part is
> that while I want to use the virt-io basics (because it's a nice
> interface from the 'inside'), I don't actually want the stock drivers
> that are part of the kernel now (like virtio-net etc.) but rather
> something that integrates with wifi (probably building on hwsim).
>
> The 'inside' interfaces aren't really a problem - just have a specific
> device ID for this, and then write a normal virtio kernel driver for it.
>
> The 'outside' interfaces are where my thinking breaks down right now.
>
> Looking at lkl, the outside is just all implemented in lkl as code that
> gets linked to the library, so in UML terms it'd just be extra 'outside'
> code like the timer handling or other netdev stuff we have today.
> Looking at qemu, it's of course also implemented there, and then
> interfaces with the real network, console abstraction, etc.
>
> However, like I said above, I really need something very custom and not
> likely to make it upstream to any project (because what point is that if
> you cannot connect to the rest of the environment I'm building), so I'm
> thinking that perhaps it should be possible to write an abstract
> 'outside' that lets you interact with it really from out-of-process?
> Perhaps through some kind of shared memory segment? I think that gets
> tricky with virt-io doing DMA (I think it does?) though, so that part
> would have to be implemented directly and not out-of-process?
>
> But really that's why I'm asking - is there a better way than to just
> link the device-side virt-io code into the same binary (be it lkl lib,
> uml binary, qemu binary)?
>
> Thanks,
> johannes
>
> [1] Actually, I've considered using qemu, but it doesn't have
> virtualized time and doesn't seem to support TSC virtualization. I guess
> I could remove TSC from the guest CPU and add a virtualized HPET, but
> I've yet to convince myself this works - on UML I made virtual time as a
> prototype already:
> https://patchwork.ozlabs.org/patch/1095814/
> (though my real goal isn't to just skip time forward when the host goes
> idle, it's to sync with other simulated components)

I have looked at using virtio semantics in UML in the past around the
point when I wanted to make the recvmmsg/sendmmsg vector drivers common
in UML and QEMU. It is certainly possible,

I went for the native approach at the end though.

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/
Hi Anton,

> > I'm thinking about adding virt-io support to UML, but the tricky part is
> > that while I want to use the virt-io basics (because it's a nice
> > interface from the 'inside'), I don't actually want the stock drivers
> > that are part of the kernel now (like virtio-net etc.) but rather
> > something that integrates with wifi (probably building on hwsim).
>
> I have looked at using virtio semantics in UML in the past around the
> point when I wanted to make the recvmmsg/sendmmsg vector drivers common
> in UML and QEMU. It is certainly possible,
>
> I went for the native approach at the end though.

Hmm. I'm not sure what you mean by either :-)

Is there any commonality between the vector drivers? I can't see how
that'd work without a bus abstraction (like virtio) in qemu? I mean, the
kernel driver just calls uml_vector_sendmmsg(), which I'd say belongs
more to the 'outside world', but that can't really be done in qemu?

Ok, I guess then I see what you mean by 'native' though.

Similarly, of course, I can implement arbitrary virt-io devices - just
the kernel side doesn't call a function like uml_vector_sendmmsg()
directly, but instead the virt-io model, and the model calls the
function, which essentially is the same just with a (convenient)
abstraction layer.

But this leaves the fundamental fact the model code ("vector_user.c" or
a similar "virtio_user.c") is still part of the build. I guess what I'm
thinking is have something like "virtio_user_rpc.c" that uses some
appropriate RPC to interact with the real model. IOW, rather than having
all the model-specific logic actually be here (like vector_user.c
actually knows how to send network packets over a real socket fd), try
to call out to some RPC that contains the real model.

Now that I thought about it further, I guess my question boils down to
"did anyone ever think about doing RPC for Virt-IO instead of putting
the entire device model into the hypervisor/emulator/...".

johannes
Stefan Hajnoczi
2019-May-23 11:59 UTC
[Qemu-devel] custom virt-io support (in user-mode-linux)
On Wed, May 22, 2019 at 03:02:38PM +0200, Johannes Berg wrote:
> Hi,
>
> While my main interest is mostly in UML right now [1] I've CC'ed the
> qemu and virtualization lists because something similar might actually
> apply to other types of virtualization.
>
> I'm thinking about adding virt-io support to UML, but the tricky part is
> that while I want to use the virt-io basics (because it's a nice
> interface from the 'inside'), I don't actually want the stock drivers
> that are part of the kernel now (like virtio-net etc.) but rather
> something that integrates with wifi (probably building on hwsim).
>
> The 'inside' interfaces aren't really a problem - just have a specific
> device ID for this, and then write a normal virtio kernel driver for it.
>
> The 'outside' interfaces are where my thinking breaks down right now.
>
> Looking at lkl, the outside is just all implemented in lkl as code that
> gets linked to the library, so in UML terms it'd just be extra 'outside'
> code like the timer handling or other netdev stuff we have today.
> Looking at qemu, it's of course also implemented there, and then
> interfaces with the real network, console abstraction, etc.
>
> However, like I said above, I really need something very custom and not
> likely to make it upstream to any project (because what point is that if
> you cannot connect to the rest of the environment I'm building), so I'm
> thinking that perhaps it should be possible to write an abstract
> 'outside' that lets you interact with it really from out-of-process?
> Perhaps through some kind of shared memory segment? I think that gets
> tricky with virt-io doing DMA (I think it does?) though, so that part
> would have to be implemented directly and not out-of-process?
>
> But really that's why I'm asking - is there a better way than to just
> link the device-side virt-io code into the same binary (be it lkl lib,
> uml binary, qemu binary)?

Hi Johannes,

Check out vhost-user. It's a protocol for running a subset of a VIRTIO
device's emulation in a separate process (usually just the data plane
with the PCI emulation and other configuration/setup still handled by
QEMU).

vhost-user uses a UNIX domain socket to pass file descriptors to shared
memory regions. This way the vhost-user device backend process has
access to guest RAM.

This would be quite different for UML since my understanding is you
don't have guest RAM but actual host Linux processes, but vhost-user
might still give you ideas:
https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/vhost-user.rst;hb=HEAD

Stefan
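
The descriptor passing Stefan describes can be sketched roughly as
follows. This is only an illustration in Python (real vhost-user front
ends and back ends are typically written in C), and it is not the
vhost-user wire protocol: the temporary file standing in for guest RAM,
REGION_SIZE, and the helper names send_fd(), recv_fd() and
share_region_demo() are all assumptions made up for this example. The
only mechanism it shows is SCM_RIGHTS ancillary data on a UNIX domain
socket, after which both sides map the same file.

```python
import array
import mmap
import os
import socket
import tempfile

REGION_SIZE = 4096  # hypothetical size; real guest RAM is far larger


def send_fd(sock, fd):
    """Send one descriptor as SCM_RIGHTS ancillary data with a 1-byte message."""
    sock.sendmsg([b"m"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                           array.array("i", [fd]))])


def recv_fd(sock):
    """Receive one descriptor that was sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[:fds.itemsize])
    return fds[0]


def share_region_demo():
    """Both ends of a socketpair map the same file-backed region."""
    front, back = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.TemporaryFile() as mem_file:
        # Stand-in for file-backed guest memory (an assumption of this sketch).
        mem_file.truncate(REGION_SIZE)
        guest_view = mmap.mmap(mem_file.fileno(), REGION_SIZE)  # shared mapping
        guest_view[0:5] = b"hello"

        # "Front end" hands its memory descriptor to the "back end".
        send_fd(front, mem_file.fileno())
        backend_fd = recv_fd(back)
        backend_view = mmap.mmap(backend_fd, REGION_SIZE)

        seen_by_backend = bytes(backend_view[0:5])
        backend_view[0:5] = b"reply"  # a write from the back end ...
        seen_by_guest = bytes(guest_view[0:5])  # ... is visible to the front end

        backend_view.close()
        guest_view.close()
        os.close(backend_fd)
    front.close()
    back.close()
    return seen_by_backend, seen_by_guest
```

In this sketch both ends live in one process for brevity; with two
processes connected through a named UNIX socket the same two helpers
would apply unchanged, since the kernel installs a new descriptor in the
receiver for the same open file.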
Johannes Berg
2019-May-23 14:25 UTC
[Qemu-devel] custom virt-io support (in user-mode-linux)
Hi Stefan,

> Check out vhost-user. It's a protocol for running a subset of a VIRTIO
> device's emulation in a separate process (usually just the data plane
> with the PCI emulation and other configuration/setup still handled by
> QEMU).

Yes, I think that's basically what I'm looking for.

> vhost-user uses a UNIX domain socket to pass file descriptors to shared
> memory regions. This way the vhost-user device backend process has
> access to guest RAM.
>
> This would be quite different for UML since my understanding is you
> don't have guest RAM but actual host Linux processes, but vhost-user
> might still give you ideas:
> https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/vhost-user.rst;hb=HEAD

I guess it could still be implemented. Do you know how qemu actually
creates the shared memory region though? It's normal inside kernel
memory, no?

Ah, no, I see ... you have to give -mem-path and then the entire guest
memory isn't allocated as anonymous memory but from a file, and then you
can pass a descriptor to that file and effectively the client/slave of
vhost-user can access the whole guest's memory. Interesting. Next you're
going to want an IOMMU there, not just a fake one, to protect against a
hostile vhost-user client? Not that I care :-)

UML in fact already maps all of its memory as a file (see arch/um/
create_mem_file()), so this part is easy.

What confused me at first is how all this talks about the ioctl()
interface, but I think I understand now - it's basically replacing
ioctl() with talking to a client.

So ultimately, it would actually seem "pretty simple". Not sure I
understand why there's all this stuff about multiple FDs, once you have
access to the guest's memory, why do you still need a second (or more)
FDs?

Also, not sure I understand how the client is started?

Once we have a connection, I guess as a client I'd at the very least
have to handle

 * VHOST_USER_GET_FEATURES and reply with the features, obviously, which
   is in this case just VHOST_USER_F_PROTOCOL_FEATURES?
 * VHOST_USER_SET_FEATURES - not sure, what would that do? the master
   sends VHOST_USER_GET_PROTOCOL_FEATURES which is with this feature
   bit? Especially since it says:
   "Slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support this
   message even before VHOST_USER_SET_FEATURES was called."
 * VHOST_USER_GET_PROTOCOL_FEATURES - looking at the list, most I don't
   really need here, but OK
 * VHOST_USER_SET_OWNER - ??
 * VHOST_USER_RESET_OWNER - ignore
 * VHOST_USER_SET_MEM_TABLE - store the data/FDs for later use, I guess
 * VHOST_USER_SET_VRING_NUM - store the data for later use
 * VHOST_USER_SET_VRING_ADDR - ditto
 * VHOST_USER_SET_VRING_BASE - ditto
 * VHOST_USER_SET_VRING_KICK - start epoll on the FD (assuming there is
   one, give up if not?) - well, if ring is enabled?
 * VHOST_USER_SET_VRING_CALL - ...

I guess there might be better documentation on the ioctl interfaces?

Do you know if there's a sample client/server somewhere?

I guess we should implement the server in UML like it is in QEMU (unless
we can figure out how to virtualize the time with HPET or something in
QEMU) and then have our client and kernel driver for it...

Thanks a lot!
johannes
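
The message handling listed above could be sketched along these lines.
This is a hedged illustration in Python, not a working back end: the
request numbers, the 12-byte header (request, flags, size, in native
byte order), the flag bits and the feature bit are taken from QEMU's
vhost-user.rst as far as I recall it and should be checked against the
current document, while handle_message() and the layout of the state
dictionary are names invented for this example. Messages that carry
descriptors as ancillary data (SET_MEM_TABLE, SET_VRING_KICK,
SET_VRING_CALL) are deliberately left out and merely recorded as
unhandled.

```python
import struct
from typing import Optional

# Request numbers and feature bit as given in vhost-user.rst (verify there).
VHOST_USER_GET_FEATURES = 1
VHOST_USER_SET_FEATURES = 2
VHOST_USER_SET_OWNER = 3
VHOST_USER_SET_VRING_NUM = 8
VHOST_USER_GET_PROTOCOL_FEATURES = 15
VHOST_USER_F_PROTOCOL_FEATURES = 30

HEADER = struct.Struct("=III")  # request, flags, payload size (native order)
VERSION = 0x1                   # lower two flag bits
REPLY_FLAG = 0x4                # flag bit 2 marks a reply


def handle_message(state: dict, message: bytes) -> Optional[bytes]:
    """Process one complete message; return reply bytes or None."""
    request, flags, size = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + size]
    if flags & 0x3 != VERSION:
        raise ValueError("unsupported vhost-user version")

    def reply_u64(value: int) -> bytes:
        return HEADER.pack(request, VERSION | REPLY_FLAG, 8) + struct.pack("=Q", value)

    if request == VHOST_USER_GET_FEATURES:
        # Advertise only the protocol-features bit, as suggested in the list.
        return reply_u64(1 << VHOST_USER_F_PROTOCOL_FEATURES)
    if request == VHOST_USER_GET_PROTOCOL_FEATURES:
        return reply_u64(0)  # this sketch offers no optional protocol features
    if request == VHOST_USER_SET_FEATURES:
        (state["features"],) = struct.unpack("=Q", payload)
    elif request == VHOST_USER_SET_OWNER:
        state["owned"] = True
    elif request == VHOST_USER_SET_VRING_NUM:
        index, num = struct.unpack("=II", payload)  # vring index, ring size
        state.setdefault("vring_num", {})[index] = num
    else:
        # Everything else (including fd-carrying messages) is only recorded.
        state.setdefault("unhandled", []).append(request)
    return None
```

A caller would read the 12-byte header from the socket first, then read
`size` more bytes, and pass the concatenation to handle_message(); any
returned bytes go back over the same socket.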