Si-Wei Liu
2023-Sep-12 00:28 UTC
[PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit
On 9/10/2023 8:52 PM, Jason Wang wrote:> On Sat, Sep 9, 2023 at 9:46?PM Si-Wei Liu <si-wei.liu at oracle.com> wrote: >> Userspace needs this feature flag to distinguish if vhost-vdpa >> iotlb in the kernel supports persistent IOTLB mapping across >> device reset. > As discussed, the IOTLB persists for devices with platform IOMMU at > least. You've mentioned that this behaviour is covered by Qemu since > it reset IOTLB on reset. I wonder what happens if we simply decouple > the IOTLB reset from vDPA reset in Qemu. Could we meet bugs there?Not sure I understand. Simple decouple meaning to remove memory_listener registration/unregistration calls *unconditionally* from the vDPA dev start/stop path in QEMU, or make it conditional around the existence of PERSIST_IOTLB? If unconditional, it breaks older host kernel, as the older kernel still silently drops all mapping across vDPA reset (VM reboot), ending up with network loss afterwards; if make the QEMU change conditional on PERSIST_IOTLB without the .reset_map API, it can't cover the virtio-vdpa 1:1 identity mapping case.> Btw, is there a Qemu patch for reference for this new feature?There's a WIP version, not ready yet for review: https://github.com/siwliu-kernel/qemu branch: vdpa-svq-asid-poc Will need to clean up code and split to smaller patches before I can post it, if the kernel part can be settled. Thanks, -Siwei> > Thanks > >> There are two cases that backend may claim >> this feature bit on: >> >> - parent device that has to work with platform IOMMU >> - parent device with on-chip IOMMU that has the expected >> .reset_map support in driver >> >> Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com> >> --- >> RFC v2 -> v3: >> - fix missing return due to merge error >> >> --- >> drivers/vhost/vdpa.c | 16 +++++++++++++++- >> include/uapi/linux/vhost_types.h | 2 ++ >> 2 files changed, 17 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c >> index 71fbd559..b404504 100644 >> --- a/drivers/vhost/vdpa.c >> +++ b/drivers/vhost/vdpa.c >> @@ -414,6 +414,14 @@ static bool vhost_vdpa_has_desc_group(const struct vhost_vdpa *v) >> return ops->get_vq_desc_group; >> } >> >> +static bool vhost_vdpa_has_persistent_map(const struct vhost_vdpa *v) >> +{ >> + struct vdpa_device *vdpa = v->vdpa; >> + const struct vdpa_config_ops *ops = vdpa->config; >> + >> + return (!ops->set_map && !ops->dma_map) || ops->reset_map; >> +} >> + >> static long vhost_vdpa_get_features(struct vhost_vdpa *v, u64 __user *featurep) >> { >> struct vdpa_device *vdpa = v->vdpa; >> @@ -716,7 +724,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, >> if (features & ~(VHOST_VDPA_BACKEND_FEATURES | >> BIT_ULL(VHOST_BACKEND_F_DESC_ASID) | >> BIT_ULL(VHOST_BACKEND_F_SUSPEND) | >> - BIT_ULL(VHOST_BACKEND_F_RESUME))) >> + BIT_ULL(VHOST_BACKEND_F_RESUME) | >> + BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST))) >> return -EOPNOTSUPP; >> if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && >> !vhost_vdpa_can_suspend(v)) >> @@ -730,6 +739,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, >> if ((features & BIT_ULL(VHOST_BACKEND_F_DESC_ASID)) && >> !vhost_vdpa_has_desc_group(v)) >> return -EOPNOTSUPP; >> + if ((features & BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST)) && >> + !vhost_vdpa_has_persistent_map(v)) >> + return -EOPNOTSUPP; >> vhost_set_backend_features(&v->vdev, features); >> return 0; >> } >> @@ -785,6 +797,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, >> features |= BIT_ULL(VHOST_BACKEND_F_RESUME); >> if (vhost_vdpa_has_desc_group(v)) >> features |= BIT_ULL(VHOST_BACKEND_F_DESC_ASID); >> + if (vhost_vdpa_has_persistent_map(v)) >> + features |= BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST); >> if (copy_to_user(featurep, &features, sizeof(features))) >> r = -EFAULT; >> break; >> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h >> index 6acc604..0fdb6f0 100644 >> --- a/include/uapi/linux/vhost_types.h >> +++ b/include/uapi/linux/vhost_types.h >> @@ -186,5 +186,7 @@ struct vhost_vdpa_iova_range { >> * buffers may reside. Requires VHOST_BACKEND_F_IOTLB_ASID. >> */ >> #define VHOST_BACKEND_F_DESC_ASID 0x6 >> +/* IOTLB don't flush memory mapping across device reset */ >> +#define VHOST_BACKEND_F_IOTLB_PERSIST 0x7 >> >> #endif >> -- >> 1.8.3.1 >>
Jason Wang
2023-Sep-12 07:01 UTC
[PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit
On Tue, Sep 12, 2023 at 8:28?AM Si-Wei Liu <si-wei.liu at oracle.com> wrote:> > > > On 9/10/2023 8:52 PM, Jason Wang wrote: > > On Sat, Sep 9, 2023 at 9:46?PM Si-Wei Liu <si-wei.liu at oracle.com> wrote: > >> Userspace needs this feature flag to distinguish if vhost-vdpa > >> iotlb in the kernel supports persistent IOTLB mapping across > >> device reset. > > As discussed, the IOTLB persists for devices with platform IOMMU at > > least. You've mentioned that this behaviour is covered by Qemu since > > it reset IOTLB on reset. I wonder what happens if we simply decouple > > the IOTLB reset from vDPA reset in Qemu. Could we meet bugs there? > Not sure I understand. Simple decouple meaning to remove memory_listener > registration/unregistration calls *unconditionally* from the vDPA dev > start/stop path in QEMU, or make it conditional around the existence of > PERSIST_IOTLB?If my memory is correct, currently the listeners were registered/unregistered during start/stop. I mean what if we register/unregister during init/free?> If unconditional, it breaks older host kernel, as the > older kernel still silently drops all mapping across vDPA reset (VM > reboot),Ok, right.> ending up with network loss afterwards; if make the QEMU change > conditional on PERSIST_IOTLB without the .reset_map API, it can't cover > the virtio-vdpa 1:1 identity mapping case.Ok, I see. Let's add the above and explain why it can't cover the 1:1 mapping somewhere (probably the commit log) in the next version. So I think we can probably introduce reset_map() but not say it's for on-chip IOMMU. We can probably say, it's for resetting the vendor specific mapping into initialization state?> > > Btw, is there a Qemu patch for reference for this new feature? > There's a WIP version, not ready yet for review: > > https://github.com/siwliu-kernel/qemu > branch: vdpa-svq-asid-poc > > Will need to clean up code and split to smaller patches before I can > post it, if the kernel part can be settled.Ok. Thanks> > Thanks, > -Siwei > > > > > Thanks > > > >> There are two cases that backend may claim > >> this feature bit on: > >> > >> - parent device that has to work with platform IOMMU > >> - parent device with on-chip IOMMU that has the expected > >> .reset_map support in driver > >> > >> Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com> > >> --- > >> RFC v2 -> v3: > >> - fix missing return due to merge error > >> > >> --- > >> drivers/vhost/vdpa.c | 16 +++++++++++++++- > >> include/uapi/linux/vhost_types.h | 2 ++ > >> 2 files changed, 17 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > >> index 71fbd559..b404504 100644 > >> --- a/drivers/vhost/vdpa.c > >> +++ b/drivers/vhost/vdpa.c > >> @@ -414,6 +414,14 @@ static bool vhost_vdpa_has_desc_group(const struct vhost_vdpa *v) > >> return ops->get_vq_desc_group; > >> } > >> > >> +static bool vhost_vdpa_has_persistent_map(const struct vhost_vdpa *v) > >> +{ > >> + struct vdpa_device *vdpa = v->vdpa; > >> + const struct vdpa_config_ops *ops = vdpa->config; > >> + > >> + return (!ops->set_map && !ops->dma_map) || ops->reset_map; > >> +} > >> + > >> static long vhost_vdpa_get_features(struct vhost_vdpa *v, u64 __user *featurep) > >> { > >> struct vdpa_device *vdpa = v->vdpa; > >> @@ -716,7 +724,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, > >> if (features & ~(VHOST_VDPA_BACKEND_FEATURES | > >> BIT_ULL(VHOST_BACKEND_F_DESC_ASID) | > >> BIT_ULL(VHOST_BACKEND_F_SUSPEND) | > >> - BIT_ULL(VHOST_BACKEND_F_RESUME))) > >> + BIT_ULL(VHOST_BACKEND_F_RESUME) | > >> + BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST))) > >> return -EOPNOTSUPP; > >> if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && > >> !vhost_vdpa_can_suspend(v)) > >> @@ -730,6 +739,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, > >> if ((features & BIT_ULL(VHOST_BACKEND_F_DESC_ASID)) && > >> !vhost_vdpa_has_desc_group(v)) > >> return -EOPNOTSUPP; > >> + if ((features & BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST)) && > >> + !vhost_vdpa_has_persistent_map(v)) > >> + return -EOPNOTSUPP; > >> vhost_set_backend_features(&v->vdev, features); > >> return 0; > >> } > >> @@ -785,6 +797,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, > >> features |= BIT_ULL(VHOST_BACKEND_F_RESUME); > >> if (vhost_vdpa_has_desc_group(v)) > >> features |= BIT_ULL(VHOST_BACKEND_F_DESC_ASID); > >> + if (vhost_vdpa_has_persistent_map(v)) > >> + features |= BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST); > >> if (copy_to_user(featurep, &features, sizeof(features))) > >> r = -EFAULT; > >> break; > >> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h > >> index 6acc604..0fdb6f0 100644 > >> --- a/include/uapi/linux/vhost_types.h > >> +++ b/include/uapi/linux/vhost_types.h > >> @@ -186,5 +186,7 @@ struct vhost_vdpa_iova_range { > >> * buffers may reside. Requires VHOST_BACKEND_F_IOTLB_ASID. > >> */ > >> #define VHOST_BACKEND_F_DESC_ASID 0x6 > >> +/* IOTLB don't flush memory mapping across device reset */ > >> +#define VHOST_BACKEND_F_IOTLB_PERSIST 0x7 > >> > >> #endif > >> -- > >> 1.8.3.1 > >> >