thr3ads.net - Virtualization - [PATCH v3 1/2] drm/virtio: Add window server support [Feb 2018]

Gerd Hoffmann

2018-Feb-06 14:23 UTC

[PATCH v3 1/2] drm/virtio: Add window server support

Hi,
> > Hmm?  I'm assuming the wayland client (in the guest) talks to the
> > wayland proxy, using the wayland protocol, like it would talk to a
> > wayland display server.  Buffers must be passed from client to
> > server/proxy somehow, probably using fd passing, so where is the
> > problem?
> > 
> > Or did I misunderstand the role of the proxy?
> 
> Hi Gerd,
> 
> it's starting to look to me that we're talking a bit past the other, so I
> have pasted below a few words describing my current plan regarding the 3 key
> scenarios that I'm addressing.
You are describing the details, but I'm missing the big picture ...

So, virtualization aside, how do buffers work in wayland?  As far as I know
it goes like this:

  (a) software rendering: client allocates shared memory buffer, renders
      into it, then passes a file handle for that shmem block together
      with some meta data (size, format, ...) to the wayland server.

  (b) gpu rendering: client opens a render node, allocates a buffer,
      asks the gpu to render into it, exports the buffer as dma-buf
      (DRM_IOCTL_PRIME_HANDLE_TO_FD), passes this to the wayland server
      (again including meta data of course).

Is that correct?
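
For illustration, case (a) can be sketched in a few lines of Python.  This is a minimal sketch under stated assumptions, not the real wayland wire protocol: the `META` layout, the helper names and the grey test frame are all hypothetical.  The only part taken from the description above is the mechanism, a shared memory block whose file handle travels to the server as SCM_RIGHTS ancillary data together with some meta data.

```python
import mmap
import os
import socket
import struct
import tempfile

# Hypothetical metadata layout (width, height, stride); not the wayland format.
META = struct.Struct("<III")


def make_shared_buffer(size):
    """Return an fd for a zero-filled, mmap-able block (stand-in for shmem)."""
    if hasattr(os, "memfd_create"):
        fd = os.memfd_create("client-buffer")
    else:
        # Portable fallback: an already-unlinked temporary file.
        tmp = tempfile.TemporaryFile()
        fd = os.dup(tmp.fileno())
        tmp.close()
    os.ftruncate(fd, size)
    return fd


def client_send(sock, width, height):
    """Client: allocate, render, then pass the fd plus metadata."""
    stride = width * 4  # assume 4 bytes per pixel
    size = stride * height
    fd = make_shared_buffer(size)
    with mmap.mmap(fd, size) as pixels:
        pixels[:] = b"\x7f" * size  # "render" a grey frame
    # The fd itself travels as SCM_RIGHTS ancillary data.
    sock.sendmsg(
        [META.pack(width, height, stride)],
        [(socket.SOL_SOCKET, socket.SCM_RIGHTS, struct.pack("i", fd))],
    )
    os.close(fd)


def server_receive(sock):
    """Server: receive metadata and fd, map the same memory, read a pixel."""
    fd_size = struct.calcsize("i")
    data, ancdata, _flags, _addr = sock.recvmsg(
        META.size, socket.CMSG_SPACE(fd_size)
    )
    width, height, stride = META.unpack(data)
    level, kind, payload = ancdata[0]
    assert (level, kind) == (socket.SOL_SOCKET, socket.SCM_RIGHTS)
    (fd,) = struct.unpack("i", payload[:fd_size])
    with mmap.mmap(fd, stride * height) as pixels:
        first_pixel = bytes(pixels[:4])
    os.close(fd)
    return width, height, first_pixel


client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
client_send(client, 4, 2)
result = server_receive(server)
client.close()
server.close()
```

Only the small metadata record and the file descriptor cross the socket; the pixel data stays in the shared block, which is what makes this path zero-copy.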

Now, with virtualization added to the mix it becomes a bit more
complicated.  Client and server are unmodified.  The client talks to the
guest proxy (wayland protocol).  The guest proxy talks to the host proxy
(protocol to be defined). The host proxy talks to the server (wayland
protocol).

Buffers must be managed along the way, and we want to avoid copying around
the buffers.  The host proxy could be implemented directly in qemu, or
as separate process which cooperates with qemu for buffer management.

Fine so far?
> I really think that whatever we come up with needs to support 3D clients as
> well.
Let's start with 3d clients, I think these are easier.  They simply use
virtio-gpu for 3d rendering as usual.  When they are done the rendered
buffer already lives in a host drm buffer (because virgl runs the actual
rendering on the host gpu).  So the client passes the dma-buf to the
guest proxy, the guest proxy imports it to look up the resource-id,
passes the resource-id to the host proxy, the host proxy looks up the
drm buffer and exports it as dma-buf, then passes it to the server.
Done, without any extra data copies.
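
The handoff described above can be modelled as two lookup tables.  This is a toy model only: `HostGpu`, `GuestDriver`, `present` and the handle names are hypothetical, and the real import/export steps go through DRM PRIME ioctls that are not shown here.  It is meant to show that only the resource-id crosses the guest/host boundary.

```python
from dataclasses import dataclass, field


@dataclass
class HostGpu:
    """Host side: resource-id -> host drm buffer (modelled as a bytearray)."""
    buffers: dict = field(default_factory=dict)

    def export_dmabuf(self, resource_id):
        # Stand-in for exporting the host drm buffer as a dma-buf.
        return self.buffers[resource_id]


@dataclass
class GuestDriver:
    """Guest side: guest dma-buf handle -> virtio-gpu resource-id."""
    resources: dict = field(default_factory=dict)

    def import_dmabuf(self, guest_handle):
        # Stand-in for importing the dma-buf to look up its resource-id.
        return self.resources[guest_handle]


def present(guest_handle, guest, host):
    """Path of one frame from client to server."""
    resource_id = guest.import_dmabuf(guest_handle)  # done by the guest proxy
    return host.export_dmabuf(resource_id)           # done by the host proxy


frame = bytearray(b"rendered by virgl on the host gpu")
host = HostGpu(buffers={7: frame})
guest = GuestDriver(resources={"dmabuf-fd-3": 7})
shown = present("dmabuf-fd-3", guest, host)
```

In this model `shown` is the very same object as `frame`, mirroring the claim that the server ends up with the host buffer itself rather than a copy.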
> Creation of shareable buffer by guest
> -------------------------------------------------
> 
> 1. Client requests virtio driver to create a buffer suitable for sharing
> with host (DRM_VIRTGPU_RESOURCE_CREATE)
client or guest proxy?
> 4. QEMU maps that buffer to the guest's address space
> (KVM_SET_USER_MEMORY_REGION), passes the guest PFN to the virtio driver
That part is problematic.  The host can't simply allocate something in
the physical address space, because most physical address space
management is done by the guest.  All pci bars are mapped by the guest
firmware for example (or by the guest OS in case of hotplug).
> 4. QEMU pops data+buffers from the virtqueue, looks up shmem FD for each
> resource, sends data + FDs to the compositor with SCM_RIGHTS
BTW: Is there a 1:1 relationship between buffers and shmem blocks?  Or
does the wayland protocol allow for offsets in buffer meta data, so you
can place multiple buffers in a single shmem block?

cheers,
  Gerd

Michael S. Tsirkin

2018-Feb-07 01:09 UTC

[PATCH v3 1/2] drm/virtio: Add window server support

On Tue, Feb 06, 2018 at 03:23:02PM +0100, Gerd Hoffmann wrote:
> > Creation of shareable buffer by guest
> > -------------------------------------------------
> > 
> > 1. Client requests virtio driver to create a buffer suitable for sharing
> > with host (DRM_VIRTGPU_RESOURCE_CREATE)
> 
> client or guest proxy?
> 
> > 4. QEMU maps that buffer to the guest's address space
> > (KVM_SET_USER_MEMORY_REGION), passes the guest PFN to the virtio driver
> 
> That part is problematic.  The host can't simply allocate something in
> the physical address space, because most physical address space
> management is done by the guest.  All pci bars are mapped by the guest
> firmware for example (or by the guest OS in case of hotplug).
> 
> > 4. QEMU pops data+buffers from the virtqueue, looks up shmem FD for each
> > resource, sends data + FDs to the compositor with SCM_RIGHTS
If you squint hard, this sounds a bit like a use-case for vhost-user-gpu, does
it not?
> BTW: Is there a 1:1 relationship between buffers and shmem blocks?  Or
> does the wayland protocol allow for offsets in buffer meta data, so you
> can place multiple buffers in a single shmem block?
> 
> cheers,
>   Gerd

Tomeu Vizoso

2018-Feb-07 07:41 UTC

[PATCH v3 1/2] drm/virtio: Add window server support

On 02/07/2018 02:09 AM, Michael S. Tsirkin wrote:
> On Tue, Feb 06, 2018 at 03:23:02PM +0100, Gerd Hoffmann wrote:
>>> Creation of shareable buffer by guest
>>> -------------------------------------------------
>>>
>>> 1. Client requests virtio driver to create a buffer suitable for sharing
>>> with host (DRM_VIRTGPU_RESOURCE_CREATE)
>>
>> client or guest proxy?
>>
>>> 4. QEMU maps that buffer to the guest's address space
>>> (KVM_SET_USER_MEMORY_REGION), passes the guest PFN to the virtio driver
>>
>> That part is problematic.  The host can't simply allocate something in
>> the physical address space, because most physical address space
>> management is done by the guest.  All pci bars are mapped by the guest
>> firmware for example (or by the guest OS in case of hotplug).
>>
>>> 4. QEMU pops data+buffers from the virtqueue, looks up shmem FD for each
>>> resource, sends data + FDs to the compositor with SCM_RIGHTS
> 
> If you squint hard, this sounds a bit like a use-case for vhost-user-gpu,
> does it not?
Can you expand on what makes you think that?

As an aside, crosvm runs the virtio-gpu device in a separate, jailed
process, among other virtual devices.

https://chromium.googlesource.com/chromiumos/platform/crosvm/

Regards,

Tomeu

Tomeu Vizoso

2018-Feb-07 09:49 UTC

[PATCH v3 1/2] drm/virtio: Add window server support

On 02/06/2018 03:23 PM, Gerd Hoffmann wrote:
>    Hi,
> 
>>> Hmm?  I'm assuming the wayland client (in the guest) talks to the
>>> wayland proxy, using the wayland protocol, like it would talk to a
>>> wayland display server.  Buffers must be passed from client to
>>> server/proxy somehow, probably using fd passing, so where is the
>>> problem?
>>>
>>> Or did I misunderstand the role of the proxy?
>>
>> Hi Gerd,
>>
>> it's starting to look to me that we're talking a bit past the other, so I
>> have pasted below a few words describing my current plan regarding the 3 key
>> scenarios that I'm addressing.
> 
> You are describing the details, but I'm missing the big picture ...
> 
> So, virtualization aside, how do buffers work in wayland?  As far as I know
> it goes like this:
> 
>    (a) software rendering: client allocates shared memory buffer, renders
>        into it, then passes a file handle for that shmem block together
>        with some meta data (size, format, ...) to the wayland server.
> 
>    (b) gpu rendering: client opens a render node, allocates a buffer,
>        asks the gpu to render into it, exports the buffer as dma-buf
>        (DRM_IOCTL_PRIME_HANDLE_TO_FD), passes this to the wayland server
>        (again including meta data of course).
> 
> Is that correct?
Both are correct descriptions of typical behaviors. But it isn't spec'ed
anywhere who has to do the buffer allocation.

In practical terms, the buffer allocation happens in either the 2D GUI 
toolkit (gtk+, for example), or the EGL implementation. Someone using 
this in a real product would most probably be interested in avoiding any 
extra copies and make sure that both allocate buffers via virtio-gpu, for 
example.

Depending on the use case, they could be also interested in supporting 
unmodified clients with an extra copy per buffer presentation.

That's to say that if we cannot come up with a zero-copy solution for 
unmodified clients, we should at least support zero-copy for cooperative 
clients.
> Now, with virtualization added to the mix it becomes a bit more
> complicated.  Client and server are unmodified.  The client talks to the
> guest proxy (wayland protocol).  The guest proxy talks to the host proxy
> (protocol to be defined). The host proxy talks to the server (wayland
> protocol).
> 
> Buffers must be managed along the way, and we want to avoid copying around
> the buffers.  The host proxy could be implemented directly in qemu, or
> as separate process which cooperates with qemu for buffer management.
> 
> Fine so far?
Yep.
>> I really think that whatever we come up with needs to support 3D clients as
>> well.
> 
> Let's start with 3d clients, I think these are easier.  They simply use
> virtio-gpu for 3d rendering as usual.  When they are done the rendered
> buffer already lives in a host drm buffer (because virgl runs the actual
> rendering on the host gpu).  So the client passes the dma-buf to the
> guest proxy, the guest proxy imports it to look up the resource-id,
> passes the resource-id to the host proxy, the host proxy looks up the
> drm buffer and exports it as dma-buf, then passes it to the server.
> Done, without any extra data copies.
Yep.
>> Creation of shareable buffer by guest
>> -------------------------------------------------
>>
>> 1. Client requests virtio driver to create a buffer suitable for sharing
>> with host (DRM_VIRTGPU_RESOURCE_CREATE)
> 
> client or guest proxy?
As per the above, the GUI toolkit could have been modified so the client 
directly creates a shareable buffer, and renders directly to it without 
any extra copies.

If clients cannot be modified, then it's the guest proxy that has to 
create the shareable buffer and keep it in sync with the client's 
non-shareable buffer at the right times, by intercepting 
wl_surface.commit messages and copying buffer contents.
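
A rough sketch of that fallback for unmodified clients follows.  The `GuestProxy` class and its methods are hypothetical, and plain bytearrays stand in for both the client's private buffer and the host-visible one; the point is only where the extra copy happens.

```python
class GuestProxy:
    """Keeps a shareable copy of each client buffer, refreshed on commit."""

    def __init__(self):
        self.attached = {}    # surface id -> client's non-shareable buffer
        self.shareable = {}   # surface id -> buffer visible to the host
        self.copies = 0

    def attach(self, surface, client_buffer):
        self.attached[surface] = client_buffer

    def commit(self, surface):
        """Called when a wl_surface.commit for `surface` is intercepted."""
        src = self.attached[surface]
        dst = self.shareable.get(surface)
        if dst is None or len(dst) != len(src):
            dst = self.shareable[surface] = bytearray(len(src))
        dst[:] = src          # the one extra copy per presentation
        self.copies += 1
        return dst


proxy = GuestProxy()
client_buffer = bytearray(b"frame-1")
proxy.attach("surface-1", client_buffer)
first = bytes(proxy.commit("surface-1"))

client_buffer[-1:] = b"2"                      # client draws the next frame
stale = bytes(proxy.shareable["surface-1"])    # host still sees the old frame
second = bytes(proxy.commit("surface-1"))      # synced again on commit
```

Between commits the host-visible buffer deliberately lags behind the client's buffer, which matches the "at the right times" wording above.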
>> 4. QEMU maps that buffer to the guest's address space
>> (KVM_SET_USER_MEMORY_REGION), passes the guest PFN to the virtio driver
> 
> That part is problematic.  The host can't simply allocate something in
> the physical address space, because most physical address space
> management is done by the guest.  All pci bars are mapped by the guest
> firmware for example (or by the guest OS in case of hotplug).
How can KVM_SET_USER_MEMORY_REGION ever be safely used then? I would have 
expected that callers of that ioctl have enough knowledge to be able to 
choose a physical address that won't conflict with the guest's kernel.

I see that the ivshmem device in QEMU registers the memory region in BAR 
2 of a PCI device instead. Would that be better in your opinion?
>> 4. QEMU pops data+buffers from the virtqueue, looks up shmem FD for each
>> resource, sends data + FDs to the compositor with SCM_RIGHTS
> 
> BTW: Is there a 1:1 relationship between buffers and shmem blocks?  Or
> does the wayland protocol allow for offsets in buffer meta data, so you
> can place multiple buffers in a single shmem block?
The latter: 
https://wayland.freedesktop.org/docs/html/apa.html#protocol-spec-wl_shm_pool
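
As a small illustration of what the offsets allow: the sketch below places two buffers in one pool.  The `Buffer` tuple mirrors the offset, width, height and stride arguments of wl_shm_pool.create_buffer; the `buffer_view` helper is hypothetical, and an anonymous mapping stands in for the shmem file descriptor.

```python
import mmap
from collections import namedtuple

# Mirrors wl_shm_pool.create_buffer arguments, minus object id and format.
Buffer = namedtuple("Buffer", "offset width height stride")


def buffer_view(pool, buf):
    """Return a writable view of one buffer inside the shared pool."""
    size = buf.stride * buf.height
    if buf.offset < 0 or buf.offset + size > len(pool):
        raise ValueError("buffer does not fit inside the pool")
    return memoryview(pool)[buf.offset:buf.offset + size]


width, height = 8, 4
stride = width * 4                      # assume 4 bytes per pixel
frame_size = stride * height

# One pool holding two frames (for example front and back buffer).
pool = mmap.mmap(-1, 2 * frame_size)
front = Buffer(0, width, height, stride)
back = Buffer(frame_size, width, height, stride)

# Drawing into the back buffer leaves the front buffer untouched.
with buffer_view(pool, back) as view:
    view[:4] = b"\xff\x00\x00\xff"
```

Because every buffer is just an (offset, size) window into the pool, a proxy only has to share one memory block with the other side and can then refer to individual buffers by offset.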

Regards,

Tomeu

Tomeu Vizoso

2018-Feb-09 11:14 UTC

[PATCH v3 1/2] drm/virtio: Add window server support

Hi Gerd and Stefan,

can we reach agreement on whether vsock should be involved in this?

Thanks,

Tomeu

Gerd Hoffmann

2018-Feb-12 11:45 UTC

[PATCH v3 1/2] drm/virtio: Add window server support

Hi,
> >    (a) software rendering: client allocates shared memory buffer, renders
> >        into it, then passes a file handle for that shmem block together
> >        with some meta data (size, format, ...) to the wayland server.
> > 
> >    (b) gpu rendering: client opens a render node, allocates a buffer,
> >        asks the gpu to render into it, exports the buffer as dma-buf
> >        (DRM_IOCTL_PRIME_HANDLE_TO_FD), passes this to the wayland server
> >        (again including meta data of course).
> > 
> > Is that correct?
> 
> Both are correct descriptions of typical behaviors. But it isn't spec'ed
> anywhere who has to do the buffer allocation.
Well, according to Pekka's reply it is spec'ed that way, for the
existing buffer types.  So for server allocated buffers you need
(a) a wayland protocol extension and (b) support for the extension
in the clients.
> That's to say that if we cannot come up with a zero-copy solution for
> unmodified clients, we should at least support zero-copy for cooperative
> clients.
"cooperative clients" == "client which has support for the wayland
protocol extension", correct?
> > > Creation of shareable buffer by guest
> > > -------------------------------------------------
> > > 
> > > 1. Client requests virtio driver to create a buffer suitable for sharing
> > > with host (DRM_VIRTGPU_RESOURCE_CREATE)
> > 
> > client or guest proxy?
> 
> As per the above, the GUI toolkit could have been modified so the client
> directly creates a shareable buffer, and renders directly to it without any
> extra copies.
> 
> If clients cannot be modified, then it's the guest proxy that has to create
> the shareable buffer and keep it in sync with the client's non-shareable
> buffer at the right times, by intercepting wl_surface.commit messages and
> copying buffer contents.
Ok.
> > > 4. QEMU maps that buffer to the guest's address space
> > > (KVM_SET_USER_MEMORY_REGION), passes the guest PFN to the virtio driver
> > 
> > That part is problematic.  The host can't simply allocate something in
> > the physical address space, because most physical address space
> > management is done by the guest.  All pci bars are mapped by the guest
> > firmware for example (or by the guest OS in case of hotplug).
> 
> How can KVM_SET_USER_MEMORY_REGION ever be safely used then? I would have
> expected that callers of that ioctl have enough knowledge to be able to
> choose a physical address that won't conflict with the guest's kernel.
Depends on the kind of region.  Guest RAM is allocated and mapped by
qemu, guest firmware can query qemu about RAM mappings using a special
interface, then create an e820 memory map for the guest os.  PCI device
bars are mapped according to the pci config space registers, which in
turn are initialized by the guest firmware, so it is basically in the
guest's hands where they show up.
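
To make the conflict concrete, here is a small sketch of an overlap check on guest-physical ranges.  The `Region` type, the `conflicts` helper and all addresses are made up for illustration; real layouts depend on the machine type and on where the guest firmware decides to place the bars.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int
    size: int
    owner: str

    @property
    def end(self):
        return self.start + self.size


def conflicts(candidate, existing):
    """Return the regions in `existing` that overlap `candidate`."""
    return [r for r in existing
            if candidate.start < r.end and r.start < candidate.end]


# Illustrative layout only, not taken from any real machine.
layout = [
    Region(0x0000_0000, 0x8000_0000, "guest RAM (mapped by qemu)"),
    Region(0xFE00_0000, 0x0100_0000, "PCI BAR (placed by guest firmware)"),
]

# A host-chosen address that happens to land inside the BAR window ...
host_pick = Region(0xFE80_0000, 0x0010_0000, "host-chosen shared buffer")
# ... and one that happens to be free in this particular layout.
free_pick = Region(0xC000_0000, 0x0010_0000, "host-chosen shared buffer")

clash = [r.owner for r in conflicts(host_pick, layout)]
no_clash = conflicts(free_pick, layout)
```

The check itself is trivial; the difficulty is that the second entry of `layout` is decided by the guest, so the host cannot know it when it picks an address.  Exposing the memory as a PCI bar sidesteps this by letting the guest choose the location.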
> I see that the ivshmem device in QEMU registers the memory region in BAR 2
> of a PCI device instead. Would that be better in your opinion?
Yes.
> > > 4. QEMU pops data+buffers from the virtqueue, looks up shmem FD for each
> > > resource, sends data + FDs to the compositor with SCM_RIGHTS
> > 
> > BTW: Is there a 1:1 relationship between buffers and shmem blocks?  Or
> > does the wayland protocol allow for offsets in buffer meta data, so you
> > can place multiple buffers in a single shmem block?
> 
> The latter:
> https://wayland.freedesktop.org/docs/html/apa.html#protocol-spec-wl_shm_pool
Ah, good, that makes it a lot easier.

So, yes, using ivshmem would be one option.  Tricky part here is the
buffer management though.  It's just a raw piece of memory.  The guest
proxy could mmap the pci bar and manage it.  But then it is again either
unmodified guest + copying the data, or modified client (which requests
buffers from guest proxy) for zero-copy.
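
Managing a raw piece of memory could look roughly like the first-fit allocator below.  This is only a sketch of the idea: `BarAllocator` is a hypothetical name, the 4096-byte alignment is an assumption, and a real guest proxy would hand out offsets into the mmap'ed pci bar rather than plain integers.

```python
class BarAllocator:
    """First-fit allocator handing out offsets inside one fixed-size region."""

    def __init__(self, size, align=4096):
        self.size = size
        self.align = align
        self.used = {}            # offset -> length of each live allocation

    def alloc(self, length):
        # Round the request up to a multiple of the alignment.
        length = -(-length // self.align) * self.align
        cursor = 0
        for offset in sorted(self.used):
            if offset - cursor >= length:
                break             # the hole before this allocation is big enough
            cursor = offset + self.used[offset]
        if cursor + length > self.size:
            raise MemoryError("region exhausted")
        self.used[cursor] = length
        return cursor

    def free(self, offset):
        del self.used[offset]


bar = BarAllocator(size=4 * 4096)
a = bar.alloc(100)        # rounded up to one 4096-byte unit
b = bar.alloc(5000)       # rounded up to two units, placed after `a`
bar.free(a)
c = bar.alloc(4096)       # fits into the hole left by `a`
```

The offsets returned here are exactly what would be passed to the host side to reference individual buffers inside the shared region.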

Another idea would be extending stdvga.  Basically qemu would have to
use shmem as backing storage for vga memory instead of anonymous memory,
so it would be very similar to ivshmem on the host side.  But on the
guest side we have a drm driver for it (bochs-drm).  So clients can
allocate dumb drm buffers for software rendering, and the buffer would
already be backed by a host shmem segment.  Given that wayland already
supports drm buffers for 3d rendering that could work without extending
the wayland protocol.  The client proxy would have to translate the drm
buffer into a pci bar offset and pass it to the host side.  The host
proxy could register the pci bar as wl_shm_pool, then just pass through
the offset to reference the individual buffers.

Drawback of both approaches would be that software rendering and gpu
rendering would use quite different code paths.

We also need a solution for the keymap shmem block.  I guess the keymap
doesn't change all that often, so maybe it is easiest to just copy it
over (host proxy -> guest proxy) instead of trying to map the host shmem
into the guest?

cheers,
  Gerd
