thr3ads.net - Linux Virtualization - custom virt-io support (in user-mode-linux) [May 2019]

If this information is useful, please help other people find it:
Share via:

Johannes Berg

2019-May-22 13:02 UTC

custom virt-io support (in user-mode-linux)

Hi,

While my main interest is mostly in UML right now [1] I've CC'ed the
qemu and virtualization lists because something similar might actually
apply to other types of virtualization.

I'm thinking about adding virt-io support to UML, but the tricky part is
that while I want to use the virt-io basics (because it's a nice
interface from the 'inside'), I don't actually want the stock
drivers
that are part of the kernel now (like virtio-net etc.) but rather
something that integrates with wifi (probably building on hwsim).

The 'inside' interfaces aren't really a problem - just have a
specific
device ID for this, and then write a normal virtio kernel driver for it.

The 'outside' interfaces are where my thinking breaks down right now.

Looking at lkl, the outside is just all implemented in lkl as code that
gets linked to the library, so in UML terms it'd just be extra
'outside'
code like the timer handling or other netdev stuff we have today.
Looking at qemu, it's of course also implemented there, and then
interfaces with the real network, console abstraction, etc.

However, like I said above, I really need something very custom and not
likely to make it upstream to any project (because what point is that if
you cannot connect to the rest of the environment I'm building), so I'm
thinking that perhaps it should be possible to write an abstract
'outside' that lets you interact with it really from out-of-process?
Perhaps through some kind of shared memory segment? I think that gets
tricky with virt-io doing DMA (I think it does?) though, so that part
would have to be implemented directly and not out-of-process?

But really that's why I'm asking - is there a better way than to just
link the device-side virt-io code into the same binary (be it lkl lib,
uml binary, qemu binary)?

Thanks,
johannes

[1] Actually, I've considered using qemu, but it doesn't have
virtualized time and doesn't seem to support TSC virtualization. I guess
I could remove TSC from the guest CPU and add a virtualized HPET, but
I've yet to convince myself this works - on UML I made virtual time as a
prototype already:
https://patchwork.ozlabs.org/patch/1095814/
(though my real goal isn't to just skip time forward when the host goes
idle, it's to sync with other simulated components)

Anton Ivanov

2019-May-22 13:28 UTC

head link

custom virt-io support (in user-mode-linux)

On 22/05/2019 14:02, Johannes Berg wrote:> Hi,
> 
> While my main interest is mostly in UML right now [1] I've CC'ed
the
> qemu and virtualization lists because something similar might actually
> apply to other types of virtualization.
> 
> I'm thinking about adding virt-io support to UML, but the tricky part
is
> that while I want to use the virt-io basics (because it's a nice
> interface from the 'inside'), I don't actually want the stock
drivers
> that are part of the kernel now (like virtio-net etc.) but rather
> something that integrates with wifi (probably building on hwsim).
> 
> The 'inside' interfaces aren't really a problem - just have a
specific
> device ID for this, and then write a normal virtio kernel driver for it.
> 
> The 'outside' interfaces are where my thinking breaks down right
now.
> 
> Looking at lkl, the outside is just all implemented in lkl as code that
> gets linked to the library, so in UML terms it'd just be extra
'outside'
> code like the timer handling or other netdev stuff we have today.
> Looking at qemu, it's of course also implemented there, and then
> interfaces with the real network, console abstraction, etc.
> 
> However, like I said above, I really need something very custom and not
> likely to make it upstream to any project (because what point is that if
> you cannot connect to the rest of the environment I'm building), so
I'm
> thinking that perhaps it should be possible to write an abstract
> 'outside' that lets you interact with it really from
out-of-process?
> Perhaps through some kind of shared memory segment? I think that gets
> tricky with virt-io doing DMA (I think it does?) though, so that part
> would have to be implemented directly and not out-of-process?
> 
> But really that's why I'm asking - is there a better way than to
just
> link the device-side virt-io code into the same binary (be it lkl lib,
> uml binary, qemu binary)?
> 
> Thanks,
> johannes
> 
> [1] Actually, I've considered using qemu, but it doesn't have
> virtualized time and doesn't seem to support TSC virtualization. I
guess
> I could remove TSC from the guest CPU and add a virtualized HPET, but
> I've yet to convince myself this works - on UML I made virtual time as
a
> prototype already:
> https://patchwork.ozlabs.org/patch/1095814/
> (though my real goal isn't to just skip time forward when the host goes
> idle, it's to sync with other simulated components)
> 
> 
> _______________________________________________
> linux-um mailing list
> linux-um at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-um
> 
I have looked at using virtio semantics in UML in the past around the 
point when I wanted to make the recvmmsg/sendmmsg vector drivers common 
in UML and QEMU. It is certainly possible,

I went for the native approach at the end though.

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Johannes Berg

2019-May-22 13:46 UTC

head link

custom virt-io support (in user-mode-linux)

Hi Anton,
> > I'm thinking about adding virt-io support to UML, but the tricky
part is
> > that while I want to use the virt-io basics (because it's a nice
> > interface from the 'inside'), I don't actually want the
stock drivers
> > that are part of the kernel now (like virtio-net etc.) but rather
> > something that integrates with wifi (probably building on hwsim).
> I have looked at using virtio semantics in UML in the past around the 
> point when I wanted to make the recvmmsg/sendmmsg vector drivers common 
> in UML and QEMU. It is certainly possible,
> 
> I went for the native approach at the end though.
Hmm. I'm not sure what you mean by either :-)

Is there any commonality between the vector drivers? I can't see how
that'd work without a bus abstraction (like virtio) in qemu? I mean, the
kernel driver just calls uml_vector_sendmmsg(), which I'd say belongs
more to the 'outside world', but that can't really be done in qemu?

Ok, I guess then I see what you mean by 'native' though.

Similarly, of course, I can implement arbitrary virt-io devices - just
the kernel side doesn't call a function like uml_vector_sendmmsg()
directly, but instead the virt-io model, and the model calls the
function, which essentially is the same just with a (convenient)
abstraction layer.

But this leaves the fundamental fact the model code ("vector_user.c"
or
a similar "virtio_user.c") is still part of the build.

I guess what I'm thinking is have something like
"virtio_user_rpc.c"
that uses some appropriate RPC to interact with the real model. IOW,
rather than having all the model-specific logic actually be here (like
vector_user.c actually knows how to send network packets over a real
socket fd), try to call out to some RPC that contains the real model.

Now that I thought about it further, I guess my question boils down to
"did anyone ever think about doing RPC for Virt-IO instead of putting
the entire device model into the hypervisor/emulator/...".

johannes

Stefan Hajnoczi

2019-May-23 11:59 UTC

head link

[Qemu-devel] custom virt-io support (in user-mode-linux)

On Wed, May 22, 2019 at 03:02:38PM +0200, Johannes Berg
wrote:> Hi,
> 
> While my main interest is mostly in UML right now [1] I've CC'ed
the
> qemu and virtualization lists because something similar might actually
> apply to other types of virtualization.
> 
> I'm thinking about adding virt-io support to UML, but the tricky part
is
> that while I want to use the virt-io basics (because it's a nice
> interface from the 'inside'), I don't actually want the stock
drivers
> that are part of the kernel now (like virtio-net etc.) but rather
> something that integrates with wifi (probably building on hwsim).
> 
> The 'inside' interfaces aren't really a problem - just have a
specific
> device ID for this, and then write a normal virtio kernel driver for it.
> 
> The 'outside' interfaces are where my thinking breaks down right
now.
> 
> Looking at lkl, the outside is just all implemented in lkl as code that
> gets linked to the library, so in UML terms it'd just be extra
'outside'
> code like the timer handling or other netdev stuff we have today.
> Looking at qemu, it's of course also implemented there, and then
> interfaces with the real network, console abstraction, etc.
> 
> However, like I said above, I really need something very custom and not
> likely to make it upstream to any project (because what point is that if
> you cannot connect to the rest of the environment I'm building), so
I'm
> thinking that perhaps it should be possible to write an abstract
> 'outside' that lets you interact with it really from
out-of-process?
> Perhaps through some kind of shared memory segment? I think that gets
> tricky with virt-io doing DMA (I think it does?) though, so that part
> would have to be implemented directly and not out-of-process?
> 
> But really that's why I'm asking - is there a better way than to
just
> link the device-side virt-io code into the same binary (be it lkl lib,
> uml binary, qemu binary)?
Hi Johannes,
Check out vhost-user.  It's a protocol for running a subset of a VIRTIO
device's emulation in a separate process (usually just the data plane
with the PCI emulation and other configuration/setup still handled by
QEMU).

vhost-user uses a UNIX domain socket to pass file descriptors to shared
memory regions.  This way the vhost-user device backend process has
access to guest RAM.

This would be quite different for UML since my understanding is you
don't have guest RAM but actual host Linux processes, but vhost-user
might still give you ideas:
https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/vhost-user.rst;hb=HEAD

Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL:
<http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20190523/17d9a747/attachment-0001.sig>

Johannes Berg

2019-May-23 14:25 UTC

head link

[Qemu-devel] custom virt-io support (in user-mode-linux)

Hi Stefan,
> Check out vhost-user.  It's a protocol for running a subset of a VIRTIO
> device's emulation in a separate process (usually just the data plane
> with the PCI emulation and other configuration/setup still handled by
> QEMU).
Yes, I think that's basically what I'm looking for.
> vhost-user uses a UNIX domain socket to pass file descriptors to shared
> memory regions.  This way the vhost-user device backend process has
> access to guest RAM.
> 
> This would be quite different for UML since my understanding is you
> don't have guest RAM but actual host Linux processes, but vhost-user
> might still give you ideas:
>
https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/vhost-user.rst;hb=HEAD
I guess it could still be implemented. Do you know how qemu actually
creates the shared memory region though? It's normal inside kernel
memory, no?

Ah, no, I see ... you have to give -mem-path and then the entire guest
memory isn't allocated as anonymous memory but from a file, and then you
can pass a descriptor to that file and effectively the client/slave of
vhost-user can access the whole guest's memory. Interesting. Next you're
going to want an IOMMU there, not just fake one, to protect against
hostile virt-user client? Not that I care :-)

UML in fact already maps all of its memory as a file (see arch/um/
create_mem_file()), so this part is easy.

What confused me at first is how all this talks about the ioctl()
interface, but I think I understand now - it's basically replacing
ioctl() with talking to a client.

So ultimately, it would actually seem "pretty simple".

Not sure I understand why there's all this stuff about multiple FDs,
once you have access to the guest's memory, why do you still need a
second (or more) FDs?

Also, not sure I understand how the client is started?

Once we have a connection, I guess as a client I'd at the very least
have to handle
 * VHOST_USER_GET_FEATURES and reply with the features, obviously, which
   is in this case just VHOST_USER_F_PROTOCOL_FEATURES?

 * VHOST_USER_SET_FEATURES - not sure, what would that do? the master
   sends VHOST_USER_GET_PROTOCOL_FEATURES which is with this feature
   bit? Especially since it says: "Slave that reported
   VHOST_USER_F_PROTOCOL_FEATURES must support this message even before
   VHOST_USER_SET_FEATURES was called."

 * VHOST_USER_GET_PROTOCOL_FEATURES - looking at the list, most I don't
   really need here, but OK

 * VHOST_USER_SET_OWNER - ??

 * VHOST_USER_RESET_OWNER - ignore

 * VHOST_USER_SET_MEM_TABLE - store the data/FDs for later use, I guess

 * VHOST_USER_SET_VRING_NUM - store the data for later use
 * VHOST_USER_SET_VRING_ADDR - dito
 * VHOST_USER_SET_VRING_BASE - dito
 * VHOST_USER_SET_VRING_KICK - start epoll on the FD (assuming there is
                               one, give up if not?) - well, if ring is
                               enabled?
 * VHOST_USER_SET_VRING_CALL - ...

I guess there might be better documentation on the ioctl interfaces?


Do you know if there's a sample client/server somewhere?

I guess we should implement the server in UML like it is in QEMU (unless
we can figure out how to virtualize the time with HPET or something in
QEMU) and then have our client and kernel driver for it...


Thanks a lot!

johannes

Possibly Parallel Threads

Search for more apparently analagous threads

Linux Virtualization - May 2019 - custom virt-io support (in user-mode-linux)

custom virt-io support (in user-mode-linux)

custom virt-io support (in user-mode-linux)

custom virt-io support (in user-mode-linux)

[Qemu-devel] custom virt-io support (in user-mode-linux)

[Qemu-devel] custom virt-io support (in user-mode-linux)

Possibly Parallel Threads