Tiwei Bie
2019-Jul-04 07:02 UTC
[RFC v2] vhost: introduce mdev based hardware vhost backend
On Thu, Jul 04, 2019 at 02:35:20PM +0800, Jason Wang wrote:
> On 2019/7/4 2:21 PM, Tiwei Bie wrote:
> > On Thu, Jul 04, 2019 at 12:31:48PM +0800, Jason Wang wrote:
> > > On 2019/7/3 9:08 PM, Tiwei Bie wrote:
> > > > On Wed, Jul 03, 2019 at 08:16:23PM +0800, Jason Wang wrote:
> > > > > On 2019/7/3 7:52 PM, Tiwei Bie wrote:
> > > > > > On Wed, Jul 03, 2019 at 06:09:51PM +0800, Jason Wang wrote:
> > > > > > > On 2019/7/3 5:13 PM, Tiwei Bie wrote:
> > > > > > > > Details about this can be found here:
> > > > > > > >
> > > > > > > > https://lwn.net/Articles/750770/
> > > > > > > >
> > > > > > > > What's new in this version
> > > > > > > > =========================
> > > > > > > >
> > > > > > > > A new VFIO device type is introduced - vfio-vhost. This addressed
> > > > > > > > some comments from here: https://patchwork.ozlabs.org/cover/984763/
> > > > > > > >
> > > > > > > > Below is the updated device interface:
> > > > > > > >
> > > > > > > > Currently, there are two regions of this device: 1) CONFIG_REGION
> > > > > > > > (VFIO_VHOST_CONFIG_REGION_INDEX), which can be used to setup the
> > > > > > > > device; 2) NOTIFY_REGION (VFIO_VHOST_NOTIFY_REGION_INDEX), which
> > > > > > > > can be used to notify the device.
> > > > > > > >
> > > > > > > > 1. CONFIG_REGION
> > > > > > > >
> > > > > > > > The region described by CONFIG_REGION is the main control interface.
> > > > > > > > Messages will be written to or read from this region.
> > > > > > > >
> > > > > > > > The message type is determined by the `request` field in message
> > > > > > > > header. The message size is encoded in the message header too.
> > > > > > > > The message format looks like this:
> > > > > > > >
> > > > > > > > struct vhost_vfio_op {
> > > > > > > > 	__u64 request;
> > > > > > > > 	__u32 flags;
> > > > > > > > 	/* Flag values: */
> > > > > > > > #define VHOST_VFIO_NEED_REPLY 0x1 /* Whether need reply */
> > > > > > > > 	__u32 size;
> > > > > > > > 	union {
> > > > > > > > 		__u64 u64;
> > > > > > > > 		struct vhost_vring_state state;
> > > > > > > > 		struct vhost_vring_addr addr;
> > > > > > > > 	} payload;
> > > > > > > > };
> > > > > > > >
> > > > > > > > The existing vhost-kernel ioctl cmds are reused as the message
> > > > > > > > requests in above structure.
> > > > > > > Still a comments like V1. What's the advantage of inventing a new protocol?
> > > > > > I'm trying to make it work in VFIO's way..
> > > > > >
> > > > > > > I believe either of the following should be better:
> > > > > > >
> > > > > > > - using vhost ioctl, we can start from SET_VRING_KICK/SET_VRING_CALL and
> > > > > > > extend it with e.g notify region. The advantages is that all exist userspace
> > > > > > > program could be reused without modification (or minimal modification). And
> > > > > > > vhost API hides lots of details that is not necessary to be understood by
> > > > > > > application (e.g in the case of container).
> > > > > > Do you mean reusing vhost's ioctl on VFIO device fd directly,
> > > > > > or introducing another mdev driver (i.e. vhost_mdev instead of
> > > > > > using the existing vfio_mdev) for mdev device?
> > > > > Can we simply add them into ioctl of mdev_parent_ops?
> > > > Right, either way, these ioctls have to be and just need to be
> > > > added in the ioctl of the mdev_parent_ops. But another thing we
> > > > also need to consider is that which file descriptor the userspace
> > > > will do the ioctl() on. So I'm wondering do you mean let the
> > > > userspace do the ioctl() on the VFIO device fd of the mdev
> > > > device?
> > > >
> > > Yes.
> > Got it! I'm not sure what's Alex opinion on this. If we all
> > agree with this, I can do it in this way.
> >
> > > Is there any other way btw?
> > Just a quick thought.. Maybe totally a bad idea.
>
> It's not for sure :)

Thanks!

> > I was thinking
> > whether it would be odd to do non-VFIO's ioctls on VFIO's device
> > fd. So I was wondering whether it's possible to allow binding
> > another mdev driver (e.g. vhost_mdev) to the supported mdev
> > devices. The new mdev driver, vhost_mdev, can provide similar
> > ways to let userspace open the mdev device and do the vhost ioctls
> > on it. To distinguish with the vfio_mdev compatible mdev devices,
> > the device API of the new vhost_mdev compatible mdev devices
> > might be e.g. "vhost-net" for net?
> >
> > So in VFIO case, the device will be for passthru directly. And
> > in VHOST case, the device can be used to accelerate the existing
> > virtualized devices.
> >
> > How do you think?
>
> If my understanding is correct, there will be no VFIO ioctl if we go for
> vhost_mdev?

Yeah, exactly. If we go for vhost_mdev, we may have some vhost nodes
in /dev similar to what /dev/vfio/* does to handle the $UUID and open
the device (e.g. similar to VFIO_GROUP_GET_DEVICE_FD in VFIO). And
to setup the device, we can try to reuse the ioctls of the existing
kernel vhost as much as possible.

Thanks,
Tiwei

> Thanks
>
> > Thanks,
> > Tiwei
> > > Thanks
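
Editorial note: the CONFIG_REGION message format quoted in the message above can be illustrated with a small sketch. The struct and the VHOST_VFIO_NEED_REPLY flag are copied from the RFC text and are not part of any released kernel header. The two helper names are hypothetical, and the sketch assumes that `size` counts payload bytes only, which the RFC text does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/types.h>
#include <linux/vhost.h>

/* Flag value as quoted in the RFC. */
#define VHOST_VFIO_NEED_REPLY 0x1

/* Message layout as quoted in the RFC (not an upstream uapi struct). */
struct vhost_vfio_op {
	__u64 request;
	__u32 flags;
	__u32 size;
	union {
		__u64 u64;
		struct vhost_vring_state state;
		struct vhost_vring_addr addr;
	} payload;
};

/* Hypothetical helper: build a message that reuses the existing
 * VHOST_SET_VRING_NUM ioctl number as the request code. */
static inline void vhost_vfio_fill_set_vring_num(struct vhost_vfio_op *op,
						 unsigned int index,
						 unsigned int num,
						 int need_reply)
{
	memset(op, 0, sizeof(*op));
	op->request = VHOST_SET_VRING_NUM;
	op->flags = need_reply ? VHOST_VFIO_NEED_REPLY : 0;
	/* Assumption: size is the number of payload bytes in use. */
	op->size = sizeof(op->payload.state);
	op->payload.state.index = index;
	op->payload.state.num = num;
}

/* Hypothetical helper: header plus payload bytes that would be
 * written to CONFIG_REGION for this message. */
static inline size_t vhost_vfio_op_len(const struct vhost_vfio_op *op)
{
	return offsetof(struct vhost_vfio_op, payload) + op->size;
}
```

Under the RFC's model, userspace would then pwrite() vhost_vfio_op_len() bytes to the VFIO device fd at the offset that VFIO_DEVICE_GET_REGION_INFO reports for VFIO_VHOST_CONFIG_REGION_INDEX. With the ioctl-based alternative discussed in the thread, the same vhost_vring_state would instead be passed directly to ioctl(fd, VHOST_SET_VRING_NUM, ...).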
Jason Wang
2019-Jul-05 00:30 UTC
[RFC v2] vhost: introduce mdev based hardware vhost backend
On 2019/7/4 3:02 PM, Tiwei Bie wrote:
> On Thu, Jul 04, 2019 at 02:35:20PM +0800, Jason Wang wrote:
>> On 2019/7/4 2:21 PM, Tiwei Bie wrote:
[...]
>>> I was thinking
>>> whether it would be odd to do non-VFIO's ioctls on VFIO's device
>>> fd. So I was wondering whether it's possible to allow binding
>>> another mdev driver (e.g. vhost_mdev) to the supported mdev
>>> devices. The new mdev driver, vhost_mdev, can provide similar
>>> ways to let userspace open the mdev device and do the vhost ioctls
>>> on it. To distinguish with the vfio_mdev compatible mdev devices,
>>> the device API of the new vhost_mdev compatible mdev devices
>>> might be e.g. "vhost-net" for net?
>>>
>>> So in VFIO case, the device will be for passthru directly. And
>>> in VHOST case, the device can be used to accelerate the existing
>>> virtualized devices.
>>>
>>> How do you think?
>>
>> If my understanding is correct, there will be no VFIO ioctl if we go for
>> vhost_mdev?
> Yeah, exactly. If we go for vhost_mdev, we may have some vhost nodes
> in /dev similar to what /dev/vfio/* does to handle the $UUID and open
> the device (e.g. similar to VFIO_GROUP_GET_DEVICE_FD in VFIO). And
> to setup the device, we can try to reuse the ioctls of the existing
> kernel vhost as much as possible.

Interesting, actually, I've considered something similar. I think there
should be no issues other than DMA:

- Need to invent new API for DMA mapping other than SET_MEM_TABLE? (Which is
  too heavyweight).

- Need to consider a way to co-work with both on chip IOMMU (your proposal
  should be fine) and scalable IOV.

Thanks
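
Editorial note: to make the "too heavyweight" remark about SET_MEM_TABLE concrete, here is a minimal sketch of how userspace builds the table passed to the existing VHOST_SET_MEM_TABLE ioctl. `struct vhost_memory` and `struct vhost_memory_region` come from <linux/vhost.h>; `struct mem_range` and `build_mem_table()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <linux/types.h>
#include <linux/vhost.h>

/* Hypothetical description of one guest memory range. */
struct mem_range {
	__u64 gpa;	/* guest physical address */
	__u64 size;	/* length in bytes */
	__u64 hva;	/* userspace virtual address backing the range */
};

/* Build the complete table that VHOST_SET_MEM_TABLE takes.
 * Returns NULL on allocation failure; the caller frees the result. */
static inline struct vhost_memory *build_mem_table(const struct mem_range *r,
						   __u32 n)
{
	struct vhost_memory *mem;
	__u32 i;

	mem = calloc(1, sizeof(*mem) + n * sizeof(struct vhost_memory_region));
	if (!mem)
		return NULL;

	mem->nregions = n;
	for (i = 0; i < n; i++) {
		mem->regions[i].guest_phys_addr = r[i].gpa;
		mem->regions[i].memory_size = r[i].size;
		mem->regions[i].userspace_addr = r[i].hva;
	}
	return mem;
}
```

The table would be handed over with ioctl(vhost_fd, VHOST_SET_MEM_TABLE, mem). Because the call describes the whole memory layout at once, changing a single range means rebuilding and resubmitting every region, which is presumably the cost being objected to here.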
Tiwei Bie
2019-Jul-05 02:23 UTC
[RFC v2] vhost: introduce mdev based hardware vhost backend
On Fri, Jul 05, 2019 at 08:30:00AM +0800, Jason Wang wrote:
> On 2019/7/4 3:02 PM, Tiwei Bie wrote:
[...]
> > Yeah, exactly. If we go for vhost_mdev, we may have some vhost nodes
> > in /dev similar to what /dev/vfio/* does to handle the $UUID and open
> > the device (e.g. similar to VFIO_GROUP_GET_DEVICE_FD in VFIO). And
> > to setup the device, we can try to reuse the ioctls of the existing
> > kernel vhost as much as possible.
>
> Interesting, actually, I've considered something similar. I think there
> should be no issues other than DMA:

Yeah, that's something we need to optimize to make it more lightweight
and efficient. How about allowing userspace to do map/unmap operations
like what VFIO provides?

> - Need to invent new API for DMA mapping other than SET_MEM_TABLE? (Which is
> too heavyweight).
>
> - Need to consider a way to co-work with both on chip IOMMU (your proposal
> should be fine) and scalable IOV.

Maybe we can make it possible to let the parent device know the mappings
(mapping events) if they need (it would be helpful for software-based
device as well).

Thanks,
Tiwei
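
Editorial note: for readers unfamiliar with the "map/unmap operations like what VFIO provides" mentioned above, this is roughly what the existing VFIO type1 IOMMU interface looks like from userspace. The structs and ioctl numbers are from <linux/vfio.h>; the helper names are hypothetical. Whether a vhost_mdev device would expose this exact interface or something merely similar was still open at this point in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hypothetical helper: describe one read/write mapping of a
 * userspace buffer at a given IOVA. */
static inline void fill_dma_map(struct vfio_iommu_type1_dma_map *map,
				void *vaddr, __u64 iova, __u64 size)
{
	memset(map, 0, sizeof(*map));
	map->argsz = sizeof(*map);
	map->flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
	map->vaddr = (__u64)(uintptr_t)vaddr;
	map->iova = iova;
	map->size = size;
}

/* Map a single range; other existing mappings are left untouched. */
static inline int dma_map(int container_fd, void *vaddr, __u64 iova,
			  __u64 size)
{
	struct vfio_iommu_type1_dma_map map;

	fill_dma_map(&map, vaddr, iova, size);
	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}

/* Unmap a single range previously mapped at this IOVA. */
static inline int dma_unmap(int container_fd, __u64 iova, __u64 size)
{
	struct vfio_iommu_type1_dma_unmap unmap;

	memset(&unmap, 0, sizeof(unmap));
	unmap.argsz = sizeof(unmap);
	unmap.iova = iova;
	unmap.size = size;
	return ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
}
```

The contrast with SET_MEM_TABLE is that each call adds or removes one range, so an incremental change does not require resubmitting the full layout.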