On 2022/1/31 5:15, Eugenio Perez Martin wrote:
> On Fri, Jan 28, 2022 at 7:02 AM Jason Wang <jasowang at redhat.com>
> wrote:
>>
>> On 2022/1/22 4:27, Eugenio Pérez wrote:
>>> This series enables shadow virtqueue (SVQ) for vhost-vdpa devices. This
>>> is intended as a new method of tracking the memory the devices touch
>>> during a migration process: instead of relying on the vhost device's
>>> dirty logging capability, SVQ intercepts the VQ dataplane, forwarding
>>> the descriptors between the VM and the device. This way qemu is the
>>> effective writer of the guest's memory, just as in qemu's emulated
>>> virtio device operation.
>>>
>>> When SVQ is enabled, qemu offers a new virtual address space for the
>>> device to read from and write to, and maps the new vrings and the
>>> guest memory into it. SVQ also intercepts the kicks and calls between
>>> the device and the guest. Relaying used buffers would cause the dirty
>>> memory to be tracked, but at this RFC stage SVQ is not enabled
>>> automatically on migration.
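>>>
>>> As a rough illustration of the kick-forwarding idea (a sketch, not the
>>> actual qemu code: plain eventfds stand in for qemu's EventNotifier,
>>> and the descriptor processing is elided):
>>>
>>>     #include <stdint.h>
>>>     #include <unistd.h>
>>>
>>>     /* qemu sits between the guest's kick eventfd and the device's
>>>      * host-notifier eventfd. */
>>>     static void svq_relay_guest_kick(int guest_kick_fd, int device_kick_fd)
>>>     {
>>>         uint64_t n;
>>>
>>>         /* Drain the guest's notification... */
>>>         if (read(guest_kick_fd, &n, sizeof(n)) != sizeof(n)) {
>>>             return;
>>>         }
>>>         /* ...here qemu would publish the pending descriptors to the
>>>          * shadow vring... */
>>>         n = 1;
>>>         /* ...and finally kick the device. */
>>>         write(device_kick_fd, &n, sizeof(n));
>>>     }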
>>>
>>> Since SVQ is a buffer relay system, it can also be used to connect
>>> devices and drivers with different capabilities, such as devices that
>>> only support packed vrings (and not split ones) with old guests whose
>>> drivers lack packed support.
>>>
>>> It is based on the ideas of DPDK SW-assisted LM, from the DPDK series
>>> at https://patchwork.dpdk.org/cover/48370/ . However, this series does
>>> not map the shadow vq in the guest's VA, but in qemu's.
>>>
>>> This version of SVQ is limited in the set of features it can negotiate
>>> with the guest and the device, because otherwise this series would be
>>> even bigger. Features like indirect descriptors or event_idx will be
>>> addressed in future series.
>>>
>>> SVQ needs to be enabled with the cmdline parameter x-svq, like:
>>>
>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true
>>>
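>>> For context, a fuller invocation could look like this (the binary,
>>> device path and ids are illustrative):
>>>
>>>     qemu-system-x86_64 ... \
>>>         -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true \
>>>         -device virtio-net-pci,netdev=vhost-vdpa0
>>>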
>>> In this version it cannot be enabled or disabled at runtime. Further
>>> series will remove this limitation and will enable it only for the
>>> duration of the migration.
>>>
>>> Some patches are intentionally very small to ease review, but they can
>>> be squashed if preferred.
>>>
>>> Patches 1-10 prepare the SVQ and QEMU to support both guest-to-device
>>> and device-to-guest notification forwarding, with the extra qemu hop.
>>> That part can be tested in isolation if the cmdline change is
>>> reproduced.
>>>
>>> Patches 11 to 18 implement the actual buffer forwarding, but with no
>>> IOMMU support. This requires a vdpa device capable of addressing all
>>> of qemu's vaddr space.
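>>>
>>> A minimal sketch of that forwarding step, assuming a split vring and a
>>> hypothetical gpa_to_hva() helper standing in for qemu's memory API
>>> (names are illustrative and endianness handling is elided):
>>>
>>>     #include <stdint.h>
>>>     #include <linux/virtio_ring.h>
>>>
>>>     /* Hypothetical: translate a guest-physical address to a qemu
>>>      * virtual address. */
>>>     extern uint64_t gpa_to_hva(uint64_t gpa);
>>>
>>>     /* Publish a guest descriptor in the shadow vring, rewriting the
>>>      * buffer address so the device reads/writes qemu's memory. */
>>>     static void svq_forward_desc(struct vring_desc *shadow,
>>>                                  const struct vring_desc *guest)
>>>     {
>>>         shadow->addr  = gpa_to_hva(guest->addr);
>>>         shadow->len   = guest->len;
>>>         shadow->flags = guest->flags;
>>>         shadow->next  = guest->next;
>>>     }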
>>>
>>> Patches 19 to 23 add the IOMMU support, so devices with address range
>>> limitations can access SVQ through the newly created virtual address
>>> space.
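>>>
>>> A hedged sketch of the translation this enables, with a linear scan
>>> standing in for the IOVA tree and all names being illustrative:
>>>
>>>     #include <stddef.h>
>>>     #include <stdint.h>
>>>
>>>     typedef struct DMAMapEntry {
>>>         uint64_t iova;   /* address the device is given */
>>>         uint64_t hva;    /* qemu virtual address backing it */
>>>         uint64_t size;
>>>     } DMAMapEntry;
>>>
>>>     /* Translate a qemu vaddr into the device-visible IOVA. */
>>>     static uint64_t svq_hva_to_iova(const DMAMapEntry *maps, size_t n,
>>>                                     uint64_t hva)
>>>     {
>>>         for (size_t i = 0; i < n; i++) {
>>>             if (hva >= maps[i].hva && hva - maps[i].hva < maps[i].size) {
>>>                 return maps[i].iova + (hva - maps[i].hva);
>>>             }
>>>         }
>>>         return UINT64_MAX; /* not mapped */
>>>     }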
>>>
>>> The rest of the series adds the last pieces needed for migration.
>>>
>>> Comments are welcome.
>>
>> I wonder about the performance impact, so performance numbers are more
>> than welcome.
>>
> Sure, I'll do it for the next revision. Since this one brings a decent
> amount of changes, I chose to collect the feedback first.
A simple single TCP_STREAM netperf test should be sufficient to give
some basic understanding of the performance impact.
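For instance, something along these lines (guest IP and duration are
illustrative):

    netperf -H <guest-ip> -t TCP_STREAM -l 60
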
Thanks
>
> Thanks!
>
>> Thanks
>>
>>
>>> TODO:
>>> * Event, indirect, packed, and other features of virtio.
>>> * To separate buffer forwarding into its own AIO context, so we can
>>>   throw more threads at that task and don't need to stop the main
>>>   event loop.
>>> * Support virtio-net control vq.
>>> * Proper documentation.
>>>
>>> Changes from v5 RFC:
>>> * Remove dynamic enablement of SVQ, making it less dependent on the
>>>   device.
>>> * Enable live migration if SVQ is enabled.
>>> * Fix SVQ when the driver resets.
>>> * Comments addressed, especially in the iova area.
>>> * Rebase on latest master, adding multiqueue support (but no
>>>   networking control vq processing).
>>> v5 link:
>>> https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg07250.html
>>>
>>> Changes from v4 RFC:
>>> * Support allocating / freeing iova ranges in the IOVA tree,
>>>   extending the already present iova-tree for that.
>>> * Proper validation of guest features. Now SVQ can negotiate a
>>>   different set of features with the device when enabled.
>>> * Support host notifier memory regions.
>>> * Handle a full SVQ in case the guest's descriptors span
>>>   different memory regions (qemu's VA chunks).
>>> * Flush pending used buffers at the end of SVQ operation.
>>> * The QMP command now looks up by NetClientState name. Other devices
>>>   will need to implement their own way to enable vdpa.
>>> * Rename the QMP command to "set", so it looks more like a way of
>>>   working.
>>> * Better use of qemu's error system.
>>> * Turn a few assertions into proper error-handling paths.
>>> * Add more documentation.
>>> * Less coupling of virtio / vhost, which could cause friction on
>>>   changes.
>>> * Addressed many other small comments and small fixes.
>>>
>>> Changes from v3 RFC:
>>> * Move everything to the vhost-vdpa backend. A big change; this
>>>   allowed some cleanup, but more code has been added in other places.
>>> * More use of glib utilities, especially to manage memory.
>>> v3 link:
>>> https://lists.nongnu.org/archive/html/qemu-devel/2021-05/msg06032.html
>>>
>>> Changes from v2 RFC:
>>> * Add vhost-vdpa device support.
>>> * Fix some memory leaks pointed out in different comments.
>>> v2 link:
>>> https://lists.nongnu.org/archive/html/qemu-devel/2021-03/msg05600.html
>>>
>>> Changes from v1 RFC:
>>> * Use QMP instead of migration to start SVQ mode.
>>> * Only accept IOMMU devices, for closer behavior to the target
>>>   devices (vDPA).
>>> * Fix invalid masking/unmasking of the vhost call fd.
>>> * Use of proper methods for synchronization.
>>> * No need to modify VirtIO device code; all of the changes are
>>>   contained in vhost code.
>>> * Delete superfluous code.
>>> * An intermediate RFC was sent with only the notification-forwarding
>>>   changes. It can be seen at
>>> https://patchew.org/QEMU/20210129205415.876290-1-eperezma at redhat.com/
>>> v1 link:
>>> https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg05372.html
>>>
>>> Eugenio Pérez (20):
>>> virtio: Add VIRTIO_F_QUEUE_STATE
>>> virtio-net: Honor VIRTIO_CONFIG_S_DEVICE_STOPPED
>>> virtio: Add virtio_queue_is_host_notifier_enabled
>>> vhost: Make vhost_virtqueue_{start,stop} public
>>> vhost: Add x-vhost-enable-shadow-vq qmp
>>> vhost: Add VhostShadowVirtqueue
>>> vdpa: Register vdpa devices in a list
>>> vhost: Route guest->host notification through shadow virtqueue
>>> Add vhost_svq_get_svq_call_notifier
>>> Add vhost_svq_set_guest_call_notifier
>>> vdpa: Save call_fd in vhost-vdpa
>>> vhost-vdpa: Take into account SVQ in vhost_vdpa_set_vring_call
>>> vhost: Route host->guest notification through shadow virtqueue
>>> virtio: Add vhost_shadow_vq_get_vring_addr
>>> vdpa: Save host and guest features
>>> vhost: Add vhost_svq_valid_device_features to shadow vq
>>> vhost: Shadow virtqueue buffers forwarding
>>> vhost: Add VhostIOVATree
>>> vhost: Use a tree to store memory mappings
>>> vdpa: Add custom IOTLB translations to SVQ
>>>
>>> Eugenio Pérez (31):
>>> vdpa: Reorder virtio/vhost-vdpa.c functions
>>> vhost: Add VhostShadowVirtqueue
>>> vdpa: Add vhost_svq_get_dev_kick_notifier
>>> vdpa: Add vhost_svq_set_svq_kick_fd
>>> vhost: Add Shadow VirtQueue kick forwarding capabilities
>>> vhost: Route guest->host notification through shadow virtqueue
>>> vhost: Add vhost_svq_get_svq_call_notifier
>>> vhost: Add vhost_svq_set_guest_call_notifier
>>> vhost-vdpa: Take into account SVQ in vhost_vdpa_set_vring_call
>>> vhost: Route host->guest notification through shadow virtqueue
>>> vhost: Add vhost_svq_valid_device_features to shadow vq
>>> vhost: Add vhost_svq_valid_guest_features to shadow vq
>>> vhost: Add vhost_svq_ack_guest_features to shadow vq
>>> virtio: Add vhost_shadow_vq_get_vring_addr
>>> vdpa: Add vhost_svq_get_num
>>> vhost: pass queue index to vhost_vq_get_addr
>>> vdpa: adapt vhost_ops callbacks to svq
>>> vhost: Shadow virtqueue buffers forwarding
>>> utils: Add internal DMAMap to iova-tree
>>> util: Store DMA entries in a list
>>> util: Add iova_tree_alloc
>>> vhost: Add VhostIOVATree
>>> vdpa: Add custom IOTLB translations to SVQ
>>> vhost: Add vhost_svq_get_last_used_idx
>>> vdpa: Adapt vhost_vdpa_get_vring_base to SVQ
>>> vdpa: Clear VHOST_VRING_F_LOG at vhost_vdpa_set_vring_addr in SVQ
>>> vdpa: Never set log_base addr if SVQ is enabled
>>> vdpa: Expose VHOST_F_LOG_ALL on SVQ
>>> vdpa: Make ncs autofree
>>> vdpa: Move vhost_vdpa_get_iova_range to net/vhost-vdpa.c
>>> vdpa: Add x-svq to NetdevVhostVDPAOptions
>>>
>>> qapi/net.json                      |   5 +-
>>> hw/virtio/vhost-iova-tree.h        |  27 +
>>> hw/virtio/vhost-shadow-virtqueue.h |  46 ++
>>> include/hw/virtio/vhost-vdpa.h     |   7 +
>>> include/qemu/iova-tree.h           |  17 +
>>> hw/virtio/vhost-iova-tree.c        | 157 ++++++
>>> hw/virtio/vhost-shadow-virtqueue.c | 761 +++++++++++++++++++++++++++++
>>> hw/virtio/vhost-vdpa.c             | 740 ++++++++++++++++++++++++----
>>> hw/virtio/vhost.c                  |   6 +-
>>> net/vhost-vdpa.c                   |  58 ++-
>>> util/iova-tree.c                   | 161 +++++-
>>> hw/virtio/meson.build              |   2 +-
>>> 12 files changed, 1852 insertions(+), 135 deletions(-)
>>> create mode 100644 hw/virtio/vhost-iova-tree.h
>>> create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
>>> create mode 100644 hw/virtio/vhost-iova-tree.c
>>> create mode 100644 hw/virtio/vhost-shadow-virtqueue.c
>>>