On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote:> On 2019/9/17 ??9:02, Tiwei Bie wrote: > > This RFC is to demonstrate below ideas, > > > > a) Build vhost-mdev on top of the same abstraction defined in > > the virtio-mdev series [1]; > > > > b) Introduce /dev/vhost-mdev to do vhost ioctls and support > > setting mdev device as backend; > > > > Now the userspace API looks like this: > > > > - Userspace generates a compatible mdev device; > > > > - Userspace opens this mdev device with VFIO API (including > > doing IOMMU programming for this mdev device with VFIO's > > container/group based interface); > > > > - Userspace opens /dev/vhost-mdev and gets vhost fd; > > > > - Userspace uses vhost ioctls to setup vhost (userspace should > > do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device > > fd first before doing other vhost ioctls); > > > > Only compile test has been done for this series for now. > > > Have a hard thought on the architecture:Thanks a lot! Do appreciate it!> > 1) Create a vhost char device and pass vfio mdev device fd to it as a > backend and translate vhost-mdev ioctl to virtio mdev transport (e.g > read/write). DMA was done through the VFIO DMA mapping on the container that > is attached.Yeah, that's what we are doing in this series.> > We have two more choices: > > 2) Use vfio-mdev but do not create vhost-mdev device, instead, just > implement vhost ioctl on vfio_device_ops, and translate them into > virtio-mdev transport or just pass ioctl to parent.Yeah. Instead of introducing /dev/vhost-mdev char device, do vhost ioctls on VFIO device fd directly. That's what we did in RFC v3.> > 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still > try to add dev to vfio group and talk to parent with device specific opsIf my understanding is correct, this means we need to introduce a new VFIO device driver to replace the existing vfio-mdev driver in our case. Below is a quick draft just to show my understanding: #include <linux/init.h> #include <linux/module.h> #include <linux/device.h> #include <linux/kernel.h> #include <linux/slab.h> #include <linux/vfio.h> #include <linux/mdev.h> #include "mdev_private.h" /* XXX: we need a proper way to include below vhost header. */ #include "../../vhost/vhost.h" static int vfio_vhost_mdev_open(void *device_data) { if (!try_module_get(THIS_MODULE)) return -ENODEV; /* ... */ vhost_dev_init(...); return 0; } static void vfio_vhost_mdev_release(void *device_data) { /* ... */ module_put(THIS_MODULE); } static long vfio_vhost_mdev_unlocked_ioctl(void *device_data, unsigned int cmd, unsigned long arg) { struct mdev_device *mdev = device_data; struct mdev_parent *parent = mdev->parent; /* * Use vhost ioctls. * * We will have a different parent_ops design. * And potentially, we can share the same parent_ops * with virtio_mdev. */ switch (cmd) { case VHOST_GET_FEATURES: parent->ops->get_features(mdev, ...); break; /* ... */ } return 0; } static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf, size_t count, loff_t *ppos) { /* ... */ return 0; } static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user *buf, size_t count, loff_t *ppos) { /* ... */ return 0; } static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct *vma) { /* ... */ return 0; } static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { .name = "vfio-vhost-mdev", .open = vfio_vhost_mdev_open, .release = vfio_vhost_mdev_release, .ioctl = vfio_vhost_mdev_unlocked_ioctl, .read = vfio_vhost_mdev_read, .write = vfio_vhost_mdev_write, .mmap = vfio_vhost_mdev_mmap, }; static int vfio_vhost_mdev_probe(struct device *dev) { struct mdev_device *mdev = to_mdev_device(dev); /* ... */ return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); } static void vfio_vhost_mdev_remove(struct device *dev) { /* ... */ vfio_del_group_dev(dev); } static struct mdev_driver vfio_vhost_mdev_driver = { .name = "vfio_vhost_mdev", .probe = vfio_vhost_mdev_probe, .remove = vfio_vhost_mdev_remove, }; static int __init vfio_vhost_mdev_init(void) { return mdev_register_driver(&vfio_vhost_mdev_driver, THIS_MODULE); } module_init(vfio_vhost_mdev_init) static void __exit vfio_vhost_mdev_exit(void) { mdev_unregister_driver(&vfio_vhost_mdev_driver); } module_exit(vfio_vhost_mdev_exit)> > So I have some questions: > > 1) Compared to method 2, what's the advantage of creating a new vhost char > device? I guess it's for keep the API compatibility?One benefit is that we can avoid doing vhost ioctls on VFIO device fd.> > 2) For method 2, is there any easy way for user/admin to distinguish e.g > ordinary vfio-mdev for vhost from ordinary vfio-mdev?I think device-api could be a choice.> I saw you introduce > ops matching helper but it's not friendly to management.The ops matching helper is just to check whether a given vfio-device is based on a mdev device.> > 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that > assumes the parameter comes from userspace, it prevents support kernel > virtio drivers. > > 4) So comes the idea of method 3, since it register a new vhost-mdev driver, > we can use device specific ops instead of VFIO ones, then we can have a > common API between vDPA parent and vhost-mdev/virtio-mdev drivers.As the above draft shows, this requires introducing a new VFIO device driver. I think Alex's opinion matters here. Thanks, Tiwei> > What's your thoughts? > > Thanks > > > > > > RFCv3: https://patchwork.kernel.org/patch/11117785/ > > > > [1] https://lkml.org/lkml/2019/9/10/135 > > > > Tiwei Bie (3): > > vfio: support getting vfio device from device fd > > vfio: support checking vfio driver by device ops > > vhost: introduce mdev based hardware backend > > > > drivers/vfio/mdev/vfio_mdev.c | 3 +- > > drivers/vfio/vfio.c | 32 +++ > > drivers/vhost/Kconfig | 9 + > > drivers/vhost/Makefile | 3 + > > drivers/vhost/mdev.c | 462 +++++++++++++++++++++++++++++++ > > drivers/vhost/vhost.c | 39 ++- > > drivers/vhost/vhost.h | 6 + > > include/linux/vfio.h | 11 + > > include/uapi/linux/vhost.h | 10 + > > include/uapi/linux/vhost_types.h | 5 + > > 10 files changed, 573 insertions(+), 7 deletions(-) > > create mode 100644 drivers/vhost/mdev.c > >
Jason Wang
2019-Sep-18 05:51 UTC
[RFC v4 0/3] vhost: introduce mdev based hardware backend
On 2019/9/17 ??6:58, Tiwei Bie wrote:> On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote: >> On 2019/9/17 ??9:02, Tiwei Bie wrote: >>> This RFC is to demonstrate below ideas, >>> >>> a) Build vhost-mdev on top of the same abstraction defined in >>> the virtio-mdev series [1]; >>> >>> b) Introduce /dev/vhost-mdev to do vhost ioctls and support >>> setting mdev device as backend; >>> >>> Now the userspace API looks like this: >>> >>> - Userspace generates a compatible mdev device; >>> >>> - Userspace opens this mdev device with VFIO API (including >>> doing IOMMU programming for this mdev device with VFIO's >>> container/group based interface); >>> >>> - Userspace opens /dev/vhost-mdev and gets vhost fd; >>> >>> - Userspace uses vhost ioctls to setup vhost (userspace should >>> do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device >>> fd first before doing other vhost ioctls); >>> >>> Only compile test has been done for this series for now. >> >> Have a hard thought on the architecture: > Thanks a lot! Do appreciate it! > >> 1) Create a vhost char device and pass vfio mdev device fd to it as a >> backend and translate vhost-mdev ioctl to virtio mdev transport (e.g >> read/write). DMA was done through the VFIO DMA mapping on the container that >> is attached. > Yeah, that's what we are doing in this series. > >> We have two more choices: >> >> 2) Use vfio-mdev but do not create vhost-mdev device, instead, just >> implement vhost ioctl on vfio_device_ops, and translate them into >> virtio-mdev transport or just pass ioctl to parent. > Yeah. Instead of introducing /dev/vhost-mdev char device, do > vhost ioctls on VFIO device fd directly. That's what we did > in RFC v3. > >> 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still >> try to add dev to vfio group and talk to parent with device specific ops > If my understanding is correct, this means we need to introduce > a new VFIO device driver to replace the existing vfio-mdev driver > in our case. Below is a quick draft just to show my understanding: > > #include <linux/init.h> > #include <linux/module.h> > #include <linux/device.h> > #include <linux/kernel.h> > #include <linux/slab.h> > #include <linux/vfio.h> > #include <linux/mdev.h> > > #include "mdev_private.h" > > /* XXX: we need a proper way to include below vhost header. */ > #include "../../vhost/vhost.h" > > static int vfio_vhost_mdev_open(void *device_data) > { > if (!try_module_get(THIS_MODULE)) > return -ENODEV; > > /* ... */ > vhost_dev_init(...); > > return 0; > } > > static void vfio_vhost_mdev_release(void *device_data) > { > /* ... */ > module_put(THIS_MODULE); > } > > static long vfio_vhost_mdev_unlocked_ioctl(void *device_data, > unsigned int cmd, unsigned long arg) > { > struct mdev_device *mdev = device_data; > struct mdev_parent *parent = mdev->parent; > > /* > * Use vhost ioctls. > * > * We will have a different parent_ops design. > * And potentially, we can share the same parent_ops > * with virtio_mdev. > */ > switch (cmd) { > case VHOST_GET_FEATURES: > parent->ops->get_features(mdev, ...); > break; > /* ... */ > } > > return 0; > } > > static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf, > size_t count, loff_t *ppos) > { > /* ... */ > return 0; > } > > static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user *buf, > size_t count, loff_t *ppos) > { > /* ... */ > return 0; > } > > static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct *vma) > { > /* ... */ > return 0; > } > > static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { > .name = "vfio-vhost-mdev", > .open = vfio_vhost_mdev_open, > .release = vfio_vhost_mdev_release, > .ioctl = vfio_vhost_mdev_unlocked_ioctl, > .read = vfio_vhost_mdev_read, > .write = vfio_vhost_mdev_write, > .mmap = vfio_vhost_mdev_mmap, > }; > > static int vfio_vhost_mdev_probe(struct device *dev) > { > struct mdev_device *mdev = to_mdev_device(dev); > > /* ... */ > return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); > } > > static void vfio_vhost_mdev_remove(struct device *dev) > { > /* ... */ > vfio_del_group_dev(dev); > } > > static struct mdev_driver vfio_vhost_mdev_driver = { > .name = "vfio_vhost_mdev", > .probe = vfio_vhost_mdev_probe, > .remove = vfio_vhost_mdev_remove, > }; > > static int __init vfio_vhost_mdev_init(void) > { > return mdev_register_driver(&vfio_vhost_mdev_driver, THIS_MODULE); > } > module_init(vfio_vhost_mdev_init) > > static void __exit vfio_vhost_mdev_exit(void) > { > mdev_unregister_driver(&vfio_vhost_mdev_driver); > } > module_exit(vfio_vhost_mdev_exit)Yes, something like this basically.>> So I have some questions: >> >> 1) Compared to method 2, what's the advantage of creating a new vhost char >> device? I guess it's for keep the API compatibility? > One benefit is that we can avoid doing vhost ioctls on > VFIO device fd.Yes, but any benefit from doing this?> >> 2) For method 2, is there any easy way for user/admin to distinguish e.g >> ordinary vfio-mdev for vhost from ordinary vfio-mdev? > I think device-api could be a choice.Ok.> >> I saw you introduce >> ops matching helper but it's not friendly to management. > The ops matching helper is just to check whether a given > vfio-device is based on a mdev device. > >> 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that >> assumes the parameter comes from userspace, it prevents support kernel >> virtio drivers. >> >> 4) So comes the idea of method 3, since it register a new vhost-mdev driver, >> we can use device specific ops instead of VFIO ones, then we can have a >> common API between vDPA parent and vhost-mdev/virtio-mdev drivers. > As the above draft shows, this requires introducing a new > VFIO device driver. I think Alex's opinion matters here.Yes, it is. Thanks> Thanks, > Tiwei > >> What's your thoughts? >> >> Thanks >> >> >>> RFCv3: https://patchwork.kernel.org/patch/11117785/ >>> >>> [1] https://lkml.org/lkml/2019/9/10/135 >>> >>> Tiwei Bie (3): >>> vfio: support getting vfio device from device fd >>> vfio: support checking vfio driver by device ops >>> vhost: introduce mdev based hardware backend >>> >>> drivers/vfio/mdev/vfio_mdev.c | 3 +- >>> drivers/vfio/vfio.c | 32 +++ >>> drivers/vhost/Kconfig | 9 + >>> drivers/vhost/Makefile | 3 + >>> drivers/vhost/mdev.c | 462 +++++++++++++++++++++++++++++++ >>> drivers/vhost/vhost.c | 39 ++- >>> drivers/vhost/vhost.h | 6 + >>> include/linux/vfio.h | 11 + >>> include/uapi/linux/vhost.h | 10 + >>> include/uapi/linux/vhost_types.h | 5 + >>> 10 files changed, 573 insertions(+), 7 deletions(-) >>> create mode 100644 drivers/vhost/mdev.c >>>
Michael S. Tsirkin
2019-Sep-18 14:32 UTC
[RFC v4 0/3] vhost: introduce mdev based hardware backend
On Wed, Sep 18, 2019 at 01:51:21PM +0800, Jason Wang wrote:> > On 2019/9/17 ??6:58, Tiwei Bie wrote: > > On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote: > > > On 2019/9/17 ??9:02, Tiwei Bie wrote: > > > > This RFC is to demonstrate below ideas, > > > > > > > > a) Build vhost-mdev on top of the same abstraction defined in > > > > the virtio-mdev series [1]; > > > > > > > > b) Introduce /dev/vhost-mdev to do vhost ioctls and support > > > > setting mdev device as backend; > > > > > > > > Now the userspace API looks like this: > > > > > > > > - Userspace generates a compatible mdev device; > > > > > > > > - Userspace opens this mdev device with VFIO API (including > > > > doing IOMMU programming for this mdev device with VFIO's > > > > container/group based interface); > > > > > > > > - Userspace opens /dev/vhost-mdev and gets vhost fd; > > > > > > > > - Userspace uses vhost ioctls to setup vhost (userspace should > > > > do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device > > > > fd first before doing other vhost ioctls); > > > > > > > > Only compile test has been done for this series for now. > > > > > > Have a hard thought on the architecture: > > Thanks a lot! Do appreciate it! > > > > > 1) Create a vhost char device and pass vfio mdev device fd to it as a > > > backend and translate vhost-mdev ioctl to virtio mdev transport (e.g > > > read/write). DMA was done through the VFIO DMA mapping on the container that > > > is attached. > > Yeah, that's what we are doing in this series. > > > > > We have two more choices: > > > > > > 2) Use vfio-mdev but do not create vhost-mdev device, instead, just > > > implement vhost ioctl on vfio_device_ops, and translate them into > > > virtio-mdev transport or just pass ioctl to parent. > > Yeah. Instead of introducing /dev/vhost-mdev char device, do > > vhost ioctls on VFIO device fd directly. That's what we did > > in RFC v3. > > > > > 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still > > > try to add dev to vfio group and talk to parent with device specific ops > > If my understanding is correct, this means we need to introduce > > a new VFIO device driver to replace the existing vfio-mdev driver > > in our case. Below is a quick draft just to show my understanding: > > > > #include <linux/init.h> > > #include <linux/module.h> > > #include <linux/device.h> > > #include <linux/kernel.h> > > #include <linux/slab.h> > > #include <linux/vfio.h> > > #include <linux/mdev.h> > > > > #include "mdev_private.h" > > > > /* XXX: we need a proper way to include below vhost header. */ > > #include "../../vhost/vhost.h" > > > > static int vfio_vhost_mdev_open(void *device_data) > > { > > if (!try_module_get(THIS_MODULE)) > > return -ENODEV; > > > > /* ... */ > > vhost_dev_init(...); > > > > return 0; > > } > > > > static void vfio_vhost_mdev_release(void *device_data) > > { > > /* ... */ > > module_put(THIS_MODULE); > > } > > > > static long vfio_vhost_mdev_unlocked_ioctl(void *device_data, > > unsigned int cmd, unsigned long arg) > > { > > struct mdev_device *mdev = device_data; > > struct mdev_parent *parent = mdev->parent; > > > > /* > > * Use vhost ioctls. > > * > > * We will have a different parent_ops design. > > * And potentially, we can share the same parent_ops > > * with virtio_mdev. > > */ > > switch (cmd) { > > case VHOST_GET_FEATURES: > > parent->ops->get_features(mdev, ...); > > break; > > /* ... */ > > } > > > > return 0; > > } > > > > static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf, > > size_t count, loff_t *ppos) > > { > > /* ... */ > > return 0; > > } > > > > static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user *buf, > > size_t count, loff_t *ppos) > > { > > /* ... */ > > return 0; > > } > > > > static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct *vma) > > { > > /* ... */ > > return 0; > > } > > > > static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { > > .name = "vfio-vhost-mdev", > > .open = vfio_vhost_mdev_open, > > .release = vfio_vhost_mdev_release, > > .ioctl = vfio_vhost_mdev_unlocked_ioctl, > > .read = vfio_vhost_mdev_read, > > .write = vfio_vhost_mdev_write, > > .mmap = vfio_vhost_mdev_mmap, > > }; > > > > static int vfio_vhost_mdev_probe(struct device *dev) > > { > > struct mdev_device *mdev = to_mdev_device(dev); > > > > /* ... */ > > return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); > > } > > > > static void vfio_vhost_mdev_remove(struct device *dev) > > { > > /* ... */ > > vfio_del_group_dev(dev); > > } > > > > static struct mdev_driver vfio_vhost_mdev_driver = { > > .name = "vfio_vhost_mdev", > > .probe = vfio_vhost_mdev_probe, > > .remove = vfio_vhost_mdev_remove, > > }; > > > > static int __init vfio_vhost_mdev_init(void) > > { > > return mdev_register_driver(&vfio_vhost_mdev_driver, THIS_MODULE); > > } > > module_init(vfio_vhost_mdev_init) > > > > static void __exit vfio_vhost_mdev_exit(void) > > { > > mdev_unregister_driver(&vfio_vhost_mdev_driver); > > } > > module_exit(vfio_vhost_mdev_exit) > > > Yes, something like this basically. > > > > > So I have some questions: > > > > > > 1) Compared to method 2, what's the advantage of creating a new vhost char > > > device? I guess it's for keep the API compatibility? > > One benefit is that we can avoid doing vhost ioctls on > > VFIO device fd. > > > Yes, but any benefit from doing this?It does seem a bit more modular, but it's certainly not a big deal.> > > > > 2) For method 2, is there any easy way for user/admin to distinguish e.g > > > ordinary vfio-mdev for vhost from ordinary vfio-mdev? > > I think device-api could be a choice. > > > Ok. > > > > > > > I saw you introduce > > > ops matching helper but it's not friendly to management. > > The ops matching helper is just to check whether a given > > vfio-device is based on a mdev device. > > > > > 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that > > > assumes the parameter comes from userspace, it prevents support kernel > > > virtio drivers. > > > > > > 4) So comes the idea of method 3, since it register a new vhost-mdev driver, > > > we can use device specific ops instead of VFIO ones, then we can have a > > > common API between vDPA parent and vhost-mdev/virtio-mdev drivers. > > As the above draft shows, this requires introducing a new > > VFIO device driver. I think Alex's opinion matters here. > > > Yes, it is. > > Thanks > > > > Thanks, > > Tiwei > > > > > What's your thoughts? > > > > > > Thanks > > > > > > > > > > RFCv3: https://patchwork.kernel.org/patch/11117785/ > > > > > > > > [1] https://lkml.org/lkml/2019/9/10/135 > > > > > > > > Tiwei Bie (3): > > > > vfio: support getting vfio device from device fd > > > > vfio: support checking vfio driver by device ops > > > > vhost: introduce mdev based hardware backend > > > > > > > > drivers/vfio/mdev/vfio_mdev.c | 3 +- > > > > drivers/vfio/vfio.c | 32 +++ > > > > drivers/vhost/Kconfig | 9 + > > > > drivers/vhost/Makefile | 3 + > > > > drivers/vhost/mdev.c | 462 +++++++++++++++++++++++++++++++ > > > > drivers/vhost/vhost.c | 39 ++- > > > > drivers/vhost/vhost.h | 6 + > > > > include/linux/vfio.h | 11 + > > > > include/uapi/linux/vhost.h | 10 + > > > > include/uapi/linux/vhost_types.h | 5 + > > > > 10 files changed, 573 insertions(+), 7 deletions(-) > > > > create mode 100644 drivers/vhost/mdev.c > > > >
Possibly Parallel Threads
- [RFC v4 0/3] vhost: introduce mdev based hardware backend
- [RFC v4 0/3] vhost: introduce mdev based hardware backend
- [RFC v4 0/3] vhost: introduce mdev based hardware backend
- [RFC v4 0/3] vhost: introduce mdev based hardware backend
- [RFC v4 0/3] vhost: introduce mdev based hardware backend