Hi,

> > (a) software rendering: client allocates shared memory buffer, renders
> >     into it, then passes a file handle for that shmem block together
> >     with some meta data (size, format, ...) to the wayland server.
> >
> > (b) gpu rendering: client opens a render node, allocates a buffer,
> >     asks the gpu to render into it, exports the buffer as dma-buf
> >     (DRM_IOCTL_PRIME_HANDLE_TO_FD), passes this to the wayland server
> >     (again including meta data of course).
> >
> > Is that correct?
>
> Both are correct descriptions of typical behaviors. But it isn't spec'ed
> anywhere who has to do the buffer allocation.

Well, according to Pekka's reply it is spec'ed that way, for the
existing buffer types.  So for server allocated buffers you need
(a) a wayland protocol extension and (b) support for the extension
in the clients.

> That's to say that if we cannot come up with a zero-copy solution for
> unmodified clients, we should at least support zero-copy for cooperative
> clients.

"cooperative clients" == "client which has support for the wayland
protocol extension", correct?

> > > Creation of shareable buffer by guest
> > > -------------------------------------------------
> > >
> > > 1. Client requests virtio driver to create a buffer suitable for
> > >    sharing with host (DRM_VIRTGPU_RESOURCE_CREATE)
> >
> > client or guest proxy?
>
> As per the above, the GUI toolkit could have been modified so the client
> directly creates a shareable buffer, and renders directly to it without
> any extra copies.
>
> If clients cannot be modified, then it's the guest proxy that has to
> create the shareable buffer and keep it in sync with the client's
> non-shareable buffer at the right times, by intercepting
> wl_surface.commit messages and copying buffer contents.

Ok.

> > > 4. QEMU maps that buffer to the guest's address space
> > >    (KVM_SET_USER_MEMORY_REGION), passes the guest PFN to the virtio
> > >    driver
> >
> > That part is problematic.  The host can't simply allocate something in
> > the physical address space, because most physical address space
> > management is done by the guest.  All pci bars are mapped by the guest
> > firmware for example (or by the guest OS in case of hotplug).
>
> How can KVM_SET_USER_MEMORY_REGION ever be safely used then?  I would
> have expected that callers of that ioctl have enough knowledge to be
> able to choose a physical address that won't conflict with the guest's
> kernel.

Depends on the kind of region.  Guest RAM is allocated and mapped by
qemu, guest firmware can query qemu about RAM mappings using a special
interface, then create an e820 memory map for the guest os.  PCI device
bars are mapped according to the pci config space registers, which in
turn are initialized by the guest firmware, so it is basically in the
guest's hands where they show up.

> I see that the ivshmem device in QEMU registers the memory region in
> BAR 2 of a PCI device instead.  Would that be better in your opinion?

Yes.

> > > 4. QEMU pops data+buffers from the virtqueue, looks up shmem FD for
> > >    each resource, sends data + FDs to the compositor with SCM_RIGHTS
> >
> > BTW: Is there a 1:1 relationship between buffers and shmem blocks?  Or
> > does the wayland protocol allow for offsets in buffer meta data, so you
> > can place multiple buffers in a single shmem block?
>
> The latter:
> https://wayland.freedesktop.org/docs/html/apa.html#protocol-spec-wl_shm_pool

Ah, good, that makes it a lot easier.

So, yes, using ivshmem would be one option.  Tricky part here is the
buffer management though.  It's just a raw piece of memory.  The guest
proxy could mmap the pci bar and manage it.  But then it is again either
unmodified guest + copying the data, or modified client (which requests
buffers from guest proxy) for zero-copy.

Another idea would be extending stdvga.  Basically qemu would have to
use shmem as backing storage for vga memory instead of anonymous memory,
so it would be very similar to ivshmem on the host side.  But on the
guest side we have a drm driver for it (bochs-drm).  So clients can
allocate dumb drm buffers for software rendering, and the buffer would
already be backed by a host shmem segment.  Given that wayland already
supports drm buffers for 3d rendering, that could work without extending
the wayland protocol.  The client proxy would have to translate the drm
buffer into a pci bar offset and pass it to the host side.  The host
proxy could register the pci bar as wl_shm_pool, then just pass through
the offset to reference the individual buffers.

Drawback of both approaches would be that software rendering and gpu
rendering would use quite different code paths.

We also need a solution for the keymap shmem block.  I guess the keymap
doesn't change all that often, so maybe it is easiest to just copy it
over (host proxy -> guest proxy) instead of trying to map the host shmem
into the guest?

cheers,
  Gerd
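To make the wl_shm_pool part of that idea concrete, here is a minimal
sketch of what a host proxy could do once it has an fd for the shared
region (for instance the file backing the pci bar) and knows a buffer's
geometry.  The function name and parameters are illustrative only, not
part of any existing proxy:

/* Register one shared region as a wl_shm_pool and reference individual
 * guest buffers inside it by offset (hypothetical helper). */
#include <wayland-client.h>

struct wl_buffer *buffer_from_bar_offset(struct wl_shm *shm,
                                         int bar_fd, int32_t bar_size,
                                         int32_t offset,
                                         int32_t width, int32_t height,
                                         int32_t stride)
{
    /* One pool covering the whole shared region ... */
    struct wl_shm_pool *pool = wl_shm_create_pool(shm, bar_fd, bar_size);

    /* ... and per-buffer wl_buffers that just point into it by offset. */
    struct wl_buffer *buf =
        wl_shm_pool_create_buffer(pool, offset, width, height, stride,
                                  WL_SHM_FORMAT_ARGB8888);

    /* The pool object can be destroyed right away; the created buffers
     * keep the underlying storage alive per the wl_shm protocol. */
    wl_shm_pool_destroy(pool);
    return buf;
}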
On Mon, 12 Feb 2018 12:45:40 +0100
Gerd Hoffmann <kraxel at redhat.com> wrote:

> Hi,
>
> > > (a) software rendering: client allocates shared memory buffer, renders
> > >     into it, then passes a file handle for that shmem block together
> > >     with some meta data (size, format, ...) to the wayland server.
> > >
> > > (b) gpu rendering: client opens a render node, allocates a buffer,
> > >     asks the gpu to render into it, exports the buffer as dma-buf
> > >     (DRM_IOCTL_PRIME_HANDLE_TO_FD), passes this to the wayland server
> > >     (again including meta data of course).
> > >
> > > Is that correct?
> >
> > Both are correct descriptions of typical behaviors. But it isn't spec'ed
> > anywhere who has to do the buffer allocation.
>
> Well, according to Pekka's reply it is spec'ed that way, for the
> existing buffer types.  So for server allocated buffers you need
> (a) a wayland protocol extension and (b) support for the extension
> in the clients.

Correct. Or simply a libEGL that uses such Wayland extension behind
everyone's back. I believe such things did at least exist, but are
probably not relevant for this discussion.

(If there is a standard library, like libEGL, loaded and used by both a
server and a client, that library can advertise custom private Wayland
protocol extensions and the client side can take advantage of them, both
without needing any code changes on either the server or the client.)

> We also need a solution for the keymap shmem block.  I guess the keymap
> doesn't change all that often, so maybe it is easiest to just copy it
> over (host proxy -> guest proxy) instead of trying to map the host shmem
> into the guest?

Yes, I believe that would be a perfectly valid solution for that
particular case.

Thanks,
pq
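For the keymap case they are discussing, the copy approach would look
roughly like the sketch below on the host proxy side (acting as a
wayland client).  forward_keymap_to_guest() is a made-up placeholder for
whatever transport the proxy uses towards the guest:

/* Copy the keymap the compositor hands us instead of trying to map the
 * host shmem into the guest (sketch, minimal error handling). */
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
#include <wayland-client.h>

void forward_keymap_to_guest(uint32_t format, const void *data,
                             uint32_t size);   /* assumed transport hook */

static void keymap_handler(void *data, struct wl_keyboard *kbd,
                           uint32_t format, int32_t fd, uint32_t size)
{
    (void)data; (void)kbd;

    /* The compositor passes an fd to a shmem block holding the keymap. */
    void *map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map != MAP_FAILED) {
        void *copy = malloc(size);
        if (copy) {
            memcpy(copy, map, size);
            forward_keymap_to_guest(format, copy, size);
            free(copy);
        }
        munmap(map, size);
    }
    close(fd);
}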
On 02/12/2018 12:45 PM, Gerd Hoffmann wrote:
> Hi,
>
>>> (a) software rendering: client allocates shared memory buffer, renders
>>>     into it, then passes a file handle for that shmem block together
>>>     with some meta data (size, format, ...) to the wayland server.
>>>
>>> (b) gpu rendering: client opens a render node, allocates a buffer,
>>>     asks the gpu to render into it, exports the buffer as dma-buf
>>>     (DRM_IOCTL_PRIME_HANDLE_TO_FD), passes this to the wayland server
>>>     (again including meta data of course).
>>>
>>> Is that correct?
>>
>> Both are correct descriptions of typical behaviors. But it isn't spec'ed
>> anywhere who has to do the buffer allocation.
>
> Well, according to Pekka's reply it is spec'ed that way, for the
> existing buffer types.  So for server allocated buffers you need
> (a) a wayland protocol extension and (b) support for the extension
> in the clients.
>
>> That's to say that if we cannot come up with a zero-copy solution for
>> unmodified clients, we should at least support zero-copy for cooperative
>> clients.
>
> "cooperative clients" == "client which has support for the wayland
> protocol extension", correct?

Guess it could be that, but I was rather thinking of clients that would
allocate the buffer for wl_shm_pool with DRM_VIRTGPU_RESOURCE_CREATE or
equivalent. Then that buffer would be exported and the fd passed using
the standard wl_shm protocol.

>>>> 4. QEMU maps that buffer to the guest's address space
>>>>    (KVM_SET_USER_MEMORY_REGION), passes the guest PFN to the virtio
>>>>    driver
>>>
>>> That part is problematic.  The host can't simply allocate something in
>>> the physical address space, because most physical address space
>>> management is done by the guest.  All pci bars are mapped by the guest
>>> firmware for example (or by the guest OS in case of hotplug).
>>
>> How can KVM_SET_USER_MEMORY_REGION ever be safely used then?  I would
>> have expected that callers of that ioctl have enough knowledge to be
>> able to choose a physical address that won't conflict with the guest's
>> kernel.
>
> Depends on the kind of region.  Guest RAM is allocated and mapped by
> qemu, guest firmware can query qemu about RAM mappings using a special
> interface, then create an e820 memory map for the guest os.  PCI device
> bars are mapped according to the pci config space registers, which in
> turn are initialized by the guest firmware, so it is basically in the
> guest's hands where they show up.
>
>> I see that the ivshmem device in QEMU registers the memory region in
>> BAR 2 of a PCI device instead.  Would that be better in your opinion?
>
> Yes.

Would it make sense for virtio-gpu to map buffers to the guest via PCI
BARs? So we can use a single drm driver for both 2d and 3d.

>>>> 4. QEMU pops data+buffers from the virtqueue, looks up shmem FD for
>>>>    each resource, sends data + FDs to the compositor with SCM_RIGHTS
>>>
>>> BTW: Is there a 1:1 relationship between buffers and shmem blocks?  Or
>>> does the wayland protocol allow for offsets in buffer meta data, so you
>>> can place multiple buffers in a single shmem block?
>>
>> The latter:
>> https://wayland.freedesktop.org/docs/html/apa.html#protocol-spec-wl_shm_pool
>
> Ah, good, that makes it a lot easier.
>
> So, yes, using ivshmem would be one option.  Tricky part here is the
> buffer management though.  It's just a raw piece of memory.  The guest
> proxy could mmap the pci bar and manage it.  But then it is again either
> unmodified guest + copying the data, or modified client (which requests
> buffers from guest proxy) for zero-copy.
>
> Another idea would be extending stdvga.  Basically qemu would have to
> use shmem as backing storage for vga memory instead of anonymous memory,
> so it would be very similar to ivshmem on the host side.  But on the
> guest side we have a drm driver for it (bochs-drm).  So clients can
> allocate dumb drm buffers for software rendering, and the buffer would
> already be backed by a host shmem segment.  Given that wayland already
> supports drm buffers for 3d rendering, that could work without extending
> the wayland protocol.  The client proxy would have to translate the drm
> buffer into a pci bar offset and pass it to the host side.  The host
> proxy could register the pci bar as wl_shm_pool, then just pass through
> the offset to reference the individual buffers.
>
> Drawback of both approaches would be that software rendering and gpu
> rendering would use quite different code paths.

Yeah, would be great if we could find a way to avoid that.

> We also need a solution for the keymap shmem block.  I guess the keymap
> doesn't change all that often, so maybe it is easiest to just copy it
> over (host proxy -> guest proxy) instead of trying to map the host shmem
> into the guest?

I think that should be fine for now. Something similar will have to
happen for the clipboard, which currently uses pipes to exchange data.

Thanks,

Tomeu
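A "cooperative client" along those lines might do something like the
sketch below: allocate the buffer through the virtio-gpu ioctl and
export it as an fd it can hand to the compositor or guest proxy.  The
target/format constants and the header path are assumptions, and error
handling is mostly omitted:

/* Allocate a shareable virtio-gpu resource and export it via PRIME
 * (sketch of the cooperative-client idea, not an existing API user). */
#include <stdint.h>
#include <string.h>
#include <xf86drm.h>
#include <virtgpu_drm.h>   /* virtio-gpu uapi; install path varies */

int create_shareable_buffer(int drm_fd, uint32_t width, uint32_t height,
                            int *dmabuf_fd)
{
    struct drm_virtgpu_resource_create create;
    memset(&create, 0, sizeof(create));
    create.target     = 2;   /* 2D texture (PIPE_TEXTURE_2D) - assumption */
    create.format     = 1;   /* B8G8R8A8 - assumption */
    create.width      = width;
    create.height     = height;
    create.depth      = 1;
    create.array_size = 1;

    /* Ask the virtio-gpu driver for a resource; bo_handle comes back. */
    if (drmIoctl(drm_fd, DRM_IOCTL_VIRTGPU_RESOURCE_CREATE, &create) < 0)
        return -1;

    /* Export the GEM handle as a dma-buf fd, which can then be passed
     * over the wayland socket like any other buffer fd. */
    return drmPrimeHandleToFD(drm_fd, create.bo_handle,
                              DRM_CLOEXEC | DRM_RDWR, dmabuf_fd);
}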
On 02/12/2018 12:45 PM, Gerd Hoffmann wrote:
>>>> 4. QEMU pops data+buffers from the virtqueue, looks up shmem FD for
>>>>    each resource, sends data + FDs to the compositor with SCM_RIGHTS
>>>
>>> BTW: Is there a 1:1 relationship between buffers and shmem blocks?  Or
>>> does the wayland protocol allow for offsets in buffer meta data, so you
>>> can place multiple buffers in a single shmem block?
>>
>> The latter:
>> https://wayland.freedesktop.org/docs/html/apa.html#protocol-spec-wl_shm_pool
>
> Ah, good, that makes it a lot easier.
>
> So, yes, using ivshmem would be one option.  Tricky part here is the
> buffer management though.  It's just a raw piece of memory.  The guest
> proxy could mmap the pci bar and manage it.  But then it is again either
> unmodified guest + copying the data, or modified client (which requests
> buffers from guest proxy) for zero-copy.

What if at VIRTIO_GPU_CMD_RESOURCE_CREATE_2D time we created a ivshmem
device to back that resource. The ivshmem device would in turn be backed
by a hostmem device that wraps a shmem FD.

The guest client can then export that resource/BO and pass the FD to the
guest proxy. The guest proxy would import it and put the resource_id in
the equivalent message in our protocol extension.

QEMU would get that resource id from vsock, look up which hostmem device
is associated with that resource, and pass its FD to the compositor.

> We also need a solution for the keymap shmem block.  I guess the keymap
> doesn't change all that often, so maybe it is easiest to just copy it
> over (host proxy -> guest proxy) instead of trying to map the host shmem
> into the guest?

Not sure if that would be much simpler than creating a ivshmem+hostmem
combo that wraps the incoming shmem FD and then having virtio-gpu create
a BO that imports it.

Regards,

Tomeu
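For reference, the SCM_RIGHTS step mentioned in point 4 is the standard
unix fd-passing pattern sketched below; in practice libwayland does this
marshalling itself whenever a request carries an fd argument:

/* Send one fd plus a small payload over a unix socket using SCM_RIGHTS
 * ancillary data (generic sketch, minimal error handling). */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int send_fd(int sock, int fd, const void *payload, size_t len)
{
    struct iovec iov = { .iov_base = (void *)payload, .iov_len = len };
    char cmsgbuf[CMSG_SPACE(sizeof(int))];
    memset(cmsgbuf, 0, sizeof(cmsgbuf));

    struct msghdr msg = {
        .msg_iov        = &iov,
        .msg_iovlen     = 1,
        .msg_control    = cmsgbuf,
        .msg_controllen = sizeof(cmsgbuf),
    };

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type  = SCM_RIGHTS;          /* ancillary data carries the fd */
    cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}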
> > Yes.
>
> Would it make sense for virtio-gpu to map buffers to the guest via PCI
> BARs?  So we can use a single drm driver for both 2d and 3d.

Should be doable.  I'm wondering two things though:

(1) Will shmem actually help avoiding a copy?

virtio-gpu with virgl will (even if the guest doesn't use opengl) store
the resources in gpu memory.  So the VIRTIO_GPU_CMD_TRANSFER_TO_HOST_2D
copy goes from guest memory directly to gpu memory, and if we export
that as dma-buf and pass it to the wayland server it should be able to
render it without doing another copy.

How does the wl_shm_pool workflow look like inside the wayland server?
Can it ask the gpu to render directly from the pool?  Or is a copy to
gpu memory needed here?  If the latter we would effectively trade one
copy for another ...

(2) Could we handle the mapping without needing shmem?

Possibly we could extend the vgem driver.  So we pass in a iov (which
qemu gets from guest via VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING), get
back a drm object.  Which effectively creates drm objects on the host
which match the drm object in the guest (both backed by the same set of
physical pages).

cheers,
  Gerd
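Regarding question (1): in a typical GL-based compositor the wl_shm path
does involve an upload, along the lines of this sketch (assuming the
EXT_texture_format_BGRA8888 extension is available); that upload is the
extra copy a dma-buf import would avoid:

/* Upload a wl_shm buffer into a GL texture (compositor side sketch;
 * real compositors add damage tracking, sub-image updates, etc.). */
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>
#include <wayland-server.h>

void upload_shm_buffer(struct wl_shm_buffer *shm_buffer, GLuint tex)
{
    int32_t width  = wl_shm_buffer_get_width(shm_buffer);
    int32_t height = wl_shm_buffer_get_height(shm_buffer);

    wl_shm_buffer_begin_access(shm_buffer);
    glBindTexture(GL_TEXTURE_2D, tex);
    /* glTexImage2D copies the pixels out of the client's pool into
     * GPU-accessible memory -- the copy dma-buf import avoids. */
    glTexImage2D(GL_TEXTURE_2D, 0, GL_BGRA_EXT, width, height, 0,
                 GL_BGRA_EXT, GL_UNSIGNED_BYTE,
                 wl_shm_buffer_get_data(shm_buffer));
    wl_shm_buffer_end_access(shm_buffer);
}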