Following patchset introduces dedicated group for descriptor table to
reduce live migration downtime when passthrough VQ is being switched
to shadow VQ. As this RFC set is to seek early feedback on the uAPI
and driver API part, for now there's no associated driver patch consuming
the API. As soon as the support is in place on both hardware device and
driver, performance data will be shown using real hardware device. The
target goal of this series is to reduce the SVQ switching overhead
to less than 300ms on a ~100GB guest with 2 non-mq vhost-vdpa devices.

The plan of the intended driver implementation is to use a dedicated
group (specifically, 2 in below table) to host descriptor table for
all data vqs, different from where buffer addresses are contained (in
group 0 as below). cvq does not have to allocate dedicated group for
descriptor table, so its buffers and descriptor table would always
belong to a same group (1).

              | data vq  | ctrl vq
==============+==========+==========
vq_group      |    0     |    1
vq_desc_group |    2     |    1

---

Si-Wei Liu (3):
  vdpa: introduce dedicated descriptor group for virtqueue
  vhost-vdpa: introduce descriptor group backend feature
  vhost-vdpa: uAPI to get dedicated descriptor group id

 drivers/vhost/vdpa.c             | 27 +++++++++++++++++++++++++++
 include/linux/vdpa.h             | 11 +++++++++++
 include/uapi/linux/vhost.h       |  8 ++++++++
 include/uapi/linux/vhost_types.h |  5 +++++
 4 files changed, 51 insertions(+)

-- 
1.8.3.1
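The group layout planned in the cover letter can be read as a pair of lookup functions. The following is only a sketch, not part of the series: every sketch_* name is hypothetical, and it assumes a non-mq virtio-net device where vq 0 is rx, vq 1 is tx and vq 2 is the cvq.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical group ids, copied from the table in the cover letter. */
enum sketch_group {
	SKETCH_GROUP_DATA_BUF  = 0,	/* buffers of all data vqs */
	SKETCH_GROUP_CVQ       = 1,	/* cvq buffers and cvq descriptor table */
	SKETCH_GROUP_DATA_DESC = 2,	/* descriptor tables of all data vqs */
};

/* Assumption: non-mq virtio-net, so vq 0 = rx, vq 1 = tx, vq 2 = cvq. */
#define SKETCH_CVQ_IDX 2

static bool sketch_is_cvq(uint16_t idx)
{
	return idx == SKETCH_CVQ_IDX;
}

/* What a .get_vq_group() implementation would report for this layout. */
static uint32_t sketch_vq_group(uint16_t idx)
{
	return sketch_is_cvq(idx) ? SKETCH_GROUP_CVQ : SKETCH_GROUP_DATA_BUF;
}

/* What a .get_vq_desc_group() implementation would report for this layout. */
static uint32_t sketch_vq_desc_group(uint16_t idx)
{
	return sketch_is_cvq(idx) ? SKETCH_GROUP_CVQ : SKETCH_GROUP_DATA_DESC;
}

/* A vq has a dedicated descriptor group only when the two ids differ. */
static bool sketch_has_dedicated_desc_group(uint16_t idx)
{
	return sketch_vq_desc_group(idx) != sketch_vq_group(idx);
}
```

With this layout both data vqs report a dedicated descriptor group, while the cvq keeps its buffers and descriptor table in group 1.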
Si-Wei Liu
2023-Aug-09 12:54 UTC
[PATCH RFC 1/3] vdpa: introduce dedicated descriptor group for virtqueue
In some cases, the access to the virtqueue's descriptor table
(including the associated available and used ring for split layout)
has to be isolated from guest memory access where the buffer address
contained in vring descriptor points to.

Introduce dedicated descriptor group with driver op
.get_vq_desc_group(), denoting the descriptor table portion of a
virtqueue can optionally reside on a different group than what's
obtained via the .get_vq_group() op. The descriptor group may or may
not share a same group ID as the default group of the virtqueue. If
the descriptor group has a different ID, it means the descriptor table
portion of the virtqueue potentially can be placed onto a separate
address space than where guest memory resides. For this to work,
.set_group_asid() API will accept the dedicated group ID for
descriptor table to get it associated to certain ASID, while the
.reset() semantics of resetting all groups (including descriptor table
group) back to ASID 0 remain same.

QEMU's shadow virtqueue is going to use dedicated descriptor group
not just to isolate the access between descriptors and buffers, but
also speed up the mapping setup process for shadow vring descriptors.

Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com>
---
 include/linux/vdpa.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index db1b0ea..17a4efa 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -204,6 +204,16 @@ struct vdpa_map_file {
  *				@vdev: vdpa device
  *				@idx: virtqueue index
  *				Returns u32: group id for this virtqueue
+ * @get_vq_desc_group:		Get the group id for the descriptor table of
+ *				a specific virtqueue (optional)
+ *				@vdev: vdpa device
+ *				@idx: virtqueue index
+ *				Returns u32: group id for the descriptor table
+ *				portion of this virtqueue. Could be different
+ *				than the one from @get_vq_group, in which case
+ *				the access to the descriptor table can be
+ *				confined to a separate asid, isolating from
+ *				the virtqueue's buffer address access.
  * @get_device_features:	Get virtio features supported by the device
  *				@vdev: vdpa device
  *				Returns the virtio features support by the
@@ -357,6 +367,7 @@ struct vdpa_config_ops {
 	/* Device ops */
 	u32 (*get_vq_align)(struct vdpa_device *vdev);
 	u32 (*get_vq_group)(struct vdpa_device *vdev, u16 idx);
+	u32 (*get_vq_desc_group)(struct vdpa_device *vdev, u16 idx);
 	u64 (*get_device_features)(struct vdpa_device *vdev);
 	int (*set_driver_features)(struct vdpa_device *vdev, u64 features);
 	u64 (*get_driver_features)(struct vdpa_device *vdev);
-- 
1.8.3.1
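The ASID semantics described in the commit message (a descriptor group can be bound to an ASID like any other group, and reset returns every group to ASID 0) can be modelled in a few lines. This is a standalone sketch rather than kernel code: the sketch_* names, the group count of 3 and the address space count of 2 are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_NGROUPS 3	/* groups 0..2, as in the cover letter's table */
#define SKETCH_NAS     2	/* assumed number of address spaces */

/* Hypothetical per-device state: which ASID each group is bound to. */
struct sketch_vdpa {
	uint32_t group_asid[SKETCH_NGROUPS];
};

/*
 * Models .set_group_asid(): a descriptor group id is accepted exactly
 * like a regular vq group id.
 */
static int sketch_set_group_asid(struct sketch_vdpa *d, uint32_t group,
				 uint32_t asid)
{
	if (group >= SKETCH_NGROUPS || asid >= SKETCH_NAS)
		return -EINVAL;
	d->group_asid[group] = asid;
	return 0;
}

/*
 * Models .reset(): all groups, the descriptor group included, go back
 * to ASID 0.
 */
static void sketch_reset(struct sketch_vdpa *d)
{
	for (uint32_t g = 0; g < SKETCH_NGROUPS; g++)
		d->group_asid[g] = 0;
}
```

Binding group 2 to ASID 1 while group 0 stays on ASID 0 is what would let descriptor table accesses be confined to a separate address space than the buffers.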
Si-Wei Liu
2023-Aug-09 12:54 UTC
[PATCH RFC 2/3] vhost-vdpa: introduce descriptor group backend feature
Userspace knows if the device has dedicated descriptor group or not
by checking this feature bit.

It's only exposed if the vdpa driver backend implements the
.get_vq_desc_group() operation callback. Userspace trying to
negotiate this feature when it or the dependent _F_IOTLB_ASID feature
hasn't been exposed will result in an error.

Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com>
---
 drivers/vhost/vdpa.c             | 17 +++++++++++++++++
 include/uapi/linux/vhost_types.h |  5 +++++
 2 files changed, 22 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index b43e868..f2e5dce 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -389,6 +389,14 @@ static bool vhost_vdpa_can_resume(const struct vhost_vdpa *v)
 	return ops->resume;
 }
 
+static bool vhost_vdpa_has_desc_group(const struct vhost_vdpa *v)
+{
+	struct vdpa_device *vdpa = v->vdpa;
+	const struct vdpa_config_ops *ops = vdpa->config;
+
+	return ops->get_vq_desc_group;
+}
+
 static long vhost_vdpa_get_features(struct vhost_vdpa *v, u64 __user *featurep)
 {
 	struct vdpa_device *vdpa = v->vdpa;
@@ -679,6 +687,7 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
 		if (copy_from_user(&features, featurep, sizeof(features)))
 			return -EFAULT;
 		if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
+				 BIT_ULL(VHOST_BACKEND_F_DESC_ASID) |
 				 BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
 				 BIT_ULL(VHOST_BACKEND_F_RESUME)))
 			return -EOPNOTSUPP;
@@ -688,6 +697,12 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
 		if ((features & BIT_ULL(VHOST_BACKEND_F_RESUME)) &&
 		     !vhost_vdpa_can_resume(v))
 			return -EOPNOTSUPP;
+		if ((features & BIT_ULL(VHOST_BACKEND_F_DESC_ASID)) &&
+		    !(features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)))
+			return -EINVAL;
+		if ((features & BIT_ULL(VHOST_BACKEND_F_DESC_ASID)) &&
+		     !vhost_vdpa_has_desc_group(v))
+			return -EOPNOTSUPP;
 		vhost_set_backend_features(&v->vdev, features);
 		return 0;
 	}
@@ -741,6 +756,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
 			features |= BIT_ULL(VHOST_BACKEND_F_SUSPEND);
 		if (vhost_vdpa_can_resume(v))
 			features |= BIT_ULL(VHOST_BACKEND_F_RESUME);
+		if (vhost_vdpa_has_desc_group(v))
+			features |= BIT_ULL(VHOST_BACKEND_F_DESC_ASID);
 		if (copy_to_user(featurep, &features, sizeof(features)))
 			r = -EFAULT;
 		break;
diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
index d3aad12a..0856f84 100644
--- a/include/uapi/linux/vhost_types.h
+++ b/include/uapi/linux/vhost_types.h
@@ -181,5 +181,10 @@ struct vhost_vdpa_iova_range {
 #define VHOST_BACKEND_F_SUSPEND  0x4
 /* Device can be resumed */
 #define VHOST_BACKEND_F_RESUME  0x5
+/* Device may expose the descriptor table, avail and used ring in a
+ * different group for ASID binding than the buffers it contains.
+ * Requires VHOST_BACKEND_F_IOTLB_ASID.
+ */
+#define VHOST_BACKEND_F_DESC_ASID    0x6
 
 #endif
-- 
1.8.3.1
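The two negotiation checks added to VHOST_SET_BACKEND_FEATURES can be isolated into a pure function for reading or unit testing. The sketch below is an illustration only: the sketch_* names are hypothetical, bit 0x6 is the value proposed in this patch, and bit 0x3 is the existing VHOST_BACKEND_F_IOTLB_ASID value from vhost_types.h.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BIT_ULL(n)	(1ULL << (n))
#define SKETCH_F_IOTLB_ASID	0x3	/* existing VHOST_BACKEND_F_IOTLB_ASID */
#define SKETCH_F_DESC_ASID	0x6	/* VHOST_BACKEND_F_DESC_ASID, this patch */

/*
 * Mirrors the order of the checks in the patch:
 *   1. _F_DESC_ASID without _F_IOTLB_ASID is an invalid combination;
 *   2. _F_DESC_ASID without a .get_vq_desc_group() op is unsupported.
 * has_desc_group_op stands in for vhost_vdpa_has_desc_group().
 */
static int sketch_check_desc_asid(uint64_t features, bool has_desc_group_op)
{
	if (!(features & SKETCH_BIT_ULL(SKETCH_F_DESC_ASID)))
		return 0;	/* feature not requested, nothing to check */
	if (!(features & SKETCH_BIT_ULL(SKETCH_F_IOTLB_ASID)))
		return -EINVAL;
	if (!has_desc_group_op)
		return -EOPNOTSUPP;
	return 0;
}
```

Because the dependency check comes first, a request that sets _F_DESC_ASID alone gets -EINVAL even when the driver lacks the op.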
Si-Wei Liu
2023-Aug-09 12:54 UTC
[PATCH RFC 3/3] vhost-vdpa: uAPI to get dedicated descriptor group id
With _F_DESC_ASID backend feature, the device can now support the
VHOST_VDPA_GET_VRING_DESC_GROUP ioctl, and it may expose the descriptor
table (including avail and used ring) in a different group than the
buffers it contains. This new uAPI will fetch the group ID of the
descriptor table.

Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com>
---
 drivers/vhost/vdpa.c       | 10 ++++++++++
 include/uapi/linux/vhost.h |  8 ++++++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index f2e5dce..eabac06 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -602,6 +602,16 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
 		else if (copy_to_user(argp, &s, sizeof(s)))
 			return -EFAULT;
 		return 0;
+	case VHOST_VDPA_GET_VRING_DESC_GROUP:
+		if (!vhost_vdpa_has_desc_group(v))
+			return -EOPNOTSUPP;
+		s.index = idx;
+		s.num = ops->get_vq_desc_group(vdpa, idx);
+		if (s.num >= vdpa->ngroups)
+			return -EIO;
+		else if (copy_to_user(argp, &s, sizeof(s)))
+			return -EFAULT;
+		return 0;
 	case VHOST_VDPA_SET_GROUP_ASID:
 		if (copy_from_user(&s, argp, sizeof(s)))
 			return -EFAULT;
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index f5c48b6..05faad2 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -219,4 +219,12 @@
  */
 #define VHOST_VDPA_RESUME		_IO(VHOST_VIRTIO, 0x7E)
 
+/* Get the dedicated group for the descriptor table of a virtqueue:
+ * read index, write group in num.
+ * The virtqueue index is stored in the index field of vhost_vring_state.
+ * The group id for the descriptor table of this specific virtqueue
+ * is returned via num field of vhost_vring_state.
+ */
+#define VHOST_VDPA_GET_VRING_DESC_GROUP	_IOWR(VHOST_VIRTIO, 0x7F,	\
+					      struct vhost_vring_state)
 #endif
-- 
1.8.3.1
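From userspace, the new ioctl would be used roughly as follows. This is a sketch under assumptions, not code from the series: sketch_get_desc_group() is a hypothetical helper, vdpa_fd is assumed to be an open vhost-vdpa device with _F_DESC_ASID negotiated, and the #ifndef fallback only repeats the ioctl number proposed in this patch for installed headers that do not carry it.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* Fallback for uapi headers without this RFC; value copied from the patch. */
#ifndef VHOST_VDPA_GET_VRING_DESC_GROUP
#define VHOST_VDPA_GET_VRING_DESC_GROUP	_IOWR(VHOST_VIRTIO, 0x7F, \
					      struct vhost_vring_state)
#endif

/*
 * Query the descriptor group of virtqueue vq_idx.
 * Returns 0 and stores the group id in *group, or a negative errno
 * (e.g. -EOPNOTSUPP when the driver has no .get_vq_desc_group() op).
 * *group is left untouched on failure.
 */
static int sketch_get_desc_group(int vdpa_fd, unsigned int vq_idx,
				 unsigned int *group)
{
	/* index is read by the kernel, num is written back with the group */
	struct vhost_vring_state s = { .index = vq_idx, .num = 0 };

	if (ioctl(vdpa_fd, VHOST_VDPA_GET_VRING_DESC_GROUP, &s) < 0)
		return -errno;
	*group = s.num;
	return 0;
}
```

A caller would compare the returned id with the result of VHOST_VDPA_GET_VRING_GROUP for the same vq to decide whether the descriptor table can be bound to its own ASID via VHOST_VDPA_SET_GROUP_ASID.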
On Wed, Aug 9, 2023 at 8:56 PM Si-Wei Liu <si-wei.liu at oracle.com> wrote:
>
> Following patchset introduces dedicated group for descriptor table to
> reduce live migration downtime when passthrough VQ is being switched
> to shadow VQ. As this RFC set is to seek early feedback on the uAPI
> and driver API part, for now there's no associated driver patch consuming
> the API. As soon as the support is in place on both hardware device and
> driver, performance data will be shown using real hardware device. The
> target goal of this series is to reduce the SVQ switching overhead
> to less than 300ms on a ~100GB guest with 2 non-mq vhost-vdpa devices.
>
> The plan of the intended driver implementation is to use a dedicated
> group (specifically, 2 in below table) to host descriptor table for
> all data vqs, different from where buffer addresses are contained (in
> group 0 as below). cvq does not have to allocate dedicated group for
> descriptor table, so its buffers and descriptor table would always
> belong to a same group (1).

I'm fine with this, but I think we need an implementation in the
driver (e.g. the simulator).

Thanks

>
>
>               | data vq  | ctrl vq
> ==============+==========+==========
> vq_group      |    0     |    1
> vq_desc_group |    2     |    1
>
>
> ---
>
> Si-Wei Liu (3):
>   vdpa: introduce dedicated descriptor group for virtqueue
>   vhost-vdpa: introduce descriptor group backend feature
>   vhost-vdpa: uAPI to get dedicated descriptor group id
>
>  drivers/vhost/vdpa.c             | 27 +++++++++++++++++++++++++++
>  include/linux/vdpa.h             | 11 +++++++++++
>  include/uapi/linux/vhost.h       |  8 ++++++++
>  include/uapi/linux/vhost_types.h |  5 +++++
>  4 files changed, 51 insertions(+)
>
> --
> 1.8.3.1
>