On 2019/10/24 4:03, Jason Wang wrote:
>
> On 2019/10/24 12:21, Tiwei Bie wrote:
>> On Wed, Oct 23, 2019 at 06:29:21PM +0800, Jason Wang wrote:
>>> On 2019/10/23 6:11, Tiwei Bie wrote:
>>>> On Wed, Oct 23, 2019 at 03:25:00PM +0800, Jason Wang wrote:
>>>>> On 2019/10/23 3:07, Tiwei Bie wrote:
>>>>>> On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:
>>>>>>> On 2019/10/23 11:02, Tiwei Bie wrote:
>>>>>>>> On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:
>>>>>>>>> On 2019/10/22 5:52, Tiwei Bie wrote:
>>>>>>>>>> This patch introduces a mdev based hardware vhost backend.
>>>>>>>>>> This backend is built on top of the same abstraction used
>>>>>>>>>> in virtio-mdev and provides a generic vhost interface for
>>>>>>>>>> userspace to accelerate the virtio devices in guest.
>>>>>>>>>>
>>>>>>>>>> This backend is implemented as a mdev device driver on top
>>>>>>>>>> of the same mdev device ops used in virtio-mdev but using
>>>>>>>>>> a different mdev class id, and it will register the device
>>>>>>>>>> as a VFIO device for userspace to use. Userspace can setup
>>>>>>>>>> the IOMMU with the existing VFIO container/group APIs and
>>>>>>>>>> then get the device fd with the device name. After getting
>>>>>>>>>> the device fd of this device, userspace can use vhost ioctls
>>>>>>>>>> to setup the backend.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Tiwei Bie <tiwei.bie at intel.com>
>>>>>>>>>> ---
>>>>>>>>>> This patch depends on below series:
>>>>>>>>>> https://lkml.org/lkml/2019/10/17/286
>>>>>>>>>>
>>>>>>>>>> v1 -> v2:
>>>>>>>>>> - Replace _SET_STATE with _SET_STATUS (MST);
>>>>>>>>>> - Check status bits at each step (MST);
>>>>>>>>>> - Report the max ring size and max number of queues (MST);
>>>>>>>>>> - Add missing MODULE_DEVICE_TABLE (Jason);
>>>>>>>>>> - Only support the network backend w/o multiqueue for now;
>>>>>>>>> Any idea on how to extend it to support devices other than
>>>>>>>>> net? I think we
>>>>>>>>> want a generic API or an API that could be made generic in the
>>>>>>>>> future.
>>>>>>>>>
>>>>>>>>> Do we want to e.g having a generic vhost mdev for all kinds of
>>>>>>>>> devices or
>>>>>>>>> introducing e.g vhost-net-mdev and vhost-scsi-mdev?
>>>>>>>> One possible way is to do what vhost-user does. I.e. Apart from
>>>>>>>> the generic ring, features, ... related ioctls, we also introduce
>>>>>>>> device specific ioctls when we need them. As vhost-mdev just needs
>>>>>>>> to forward configs between parent and userspace and even won't
>>>>>>>> cache any info when possible,
>>>>>>> So it looks to me this is only possible if we expose e.g
>>>>>>> set_config and
>>>>>>> get_config to userspace.
>>>>>> The set_config and get_config interface isn't really everything
>>>>>> of device specific settings. We also have ctrlq in virtio-net.
>>>>> Yes, but it could be processed by the exist API. Isn't it? Just
>>>>> set ctrl vq
>>>>> address and let parent to deal with that.
>>>> I mean how to expose ctrlq related settings to userspace?
>>>
>>> I think it works like:
>>>
>>> 1) userspace find ctrl_vq is supported
>>>
>>> 2) then it can allocate memory for ctrl vq and set its address through
>>> vhost-mdev
>>>
>>> 3) userspace can populate ctrl vq itself
>> I see. That is to say, userspace e.g. QEMU will program the
>> ctrl vq with the existing VHOST_*_VRING_* ioctls, and parent
>> drivers should know that the addresses used in ctrl vq are
>> host virtual addresses in vhost-mdev's case.
>
>
> That's really good point. And that means parent needs to differ vhost
> from virtio. It should work.

HVA may only work when we have something similar to VHOST_SET_OWNER which
can reuse MM of its owner.

> But is there any chance to use DMA address? I'm asking since the API
> then tends to be device specific.

I wonder whether we can introduce MAP IOMMU notifier and get DMA mappings
from that.

Thanks
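
The three steps quoted above (detect the control virtqueue, allocate its memory in userspace, hand its address to vhost-mdev) can be sketched in C roughly as follows. This is a minimal sketch and not code from the patch: `has_ctrl_vq`, `alloc_ctrl_vq` and `CTRL_VQ_ALIGN` are hypothetical names, the split-ring layout helpers from `<linux/virtio_ring.h>` are assumed, and it is assumed that the feature bits come from something like `VHOST_GET_FEATURES`. The addresses stored in `struct vhost_vring_addr` are host virtual addresses of the calling process.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <linux/vhost.h>        /* struct vhost_vring_addr */
#include <linux/virtio_net.h>   /* VIRTIO_NET_F_CTRL_VQ */
#include <linux/virtio_ring.h>  /* struct vring, vring_init(), vring_size() */

/* Assumption: a page-aligned split ring, as legacy virtio uses. */
#define CTRL_VQ_ALIGN 4096UL

/* Step 1: check whether the reported features include a control vq. */
int has_ctrl_vq(uint64_t features)
{
    return (int)((features >> VIRTIO_NET_F_CTRL_VQ) & 1);
}

/*
 * Step 2: allocate ring memory in this process and describe it with
 * host virtual addresses. Returns the allocation (to be freed by the
 * caller) or NULL on failure. Step 3, filling in descriptors, is left
 * to the caller because it owns the memory.
 */
void *alloc_ctrl_vq(unsigned int index, unsigned int num,
                    struct vhost_vring_addr *addr)
{
    struct vring vr;
    size_t size = vring_size(num, CTRL_VQ_ALIGN);
    void *mem;

    /* aligned_alloc() expects a size that is a multiple of the alignment. */
    size = (size + CTRL_VQ_ALIGN - 1) & ~(CTRL_VQ_ALIGN - 1);
    mem = aligned_alloc(CTRL_VQ_ALIGN, size);
    if (mem == NULL)
        return NULL;
    memset(mem, 0, size);

    /* Lay out the descriptor table, avail ring and used ring in mem. */
    vring_init(&vr, num, mem, CTRL_VQ_ALIGN);

    memset(addr, 0, sizeof(*addr));
    addr->index = index;
    addr->desc_user_addr = (uint64_t)(uintptr_t)vr.desc;
    addr->avail_user_addr = (uint64_t)(uintptr_t)vr.avail;
    addr->used_user_addr = (uint64_t)(uintptr_t)vr.used;
    return mem;
}
```

With a real device fd, the filled structure would then be passed to the existing `VHOST_SET_VRING_ADDR` ioctl (after `VHOST_SET_OWNER` and `VHOST_SET_VRING_NUM`). Whether vhost-mdev accepts these ioctls for the control queue is exactly what the thread is discussing, so treat that part as an assumption.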
On Thu, Oct 24, 2019 at 04:32:42PM +0800, Jason Wang wrote:
> On 2019/10/24 4:03, Jason Wang wrote:
> > On 2019/10/24 12:21, Tiwei Bie wrote:
> > > On Wed, Oct 23, 2019 at 06:29:21PM +0800, Jason Wang wrote:
> > > > On 2019/10/23 6:11, Tiwei Bie wrote:
> > > > > On Wed, Oct 23, 2019 at 03:25:00PM +0800, Jason Wang wrote:
> > > > > > On 2019/10/23 3:07, Tiwei Bie wrote:
> > > > > > > On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:
> > > > > > > > On 2019/10/23 11:02, Tiwei Bie wrote:
> > > > > > > > > On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:
> > > > > > > > > > On 2019/10/22 5:52, Tiwei Bie wrote:
> > > > > > > > > > > This patch introduces a mdev based hardware vhost backend.
> > > > > > > > > > > This backend is built on top of the same abstraction used
> > > > > > > > > > > in virtio-mdev and provides a generic vhost interface for
> > > > > > > > > > > userspace to accelerate the virtio devices in guest.
> > > > > > > > > > >
> > > > > > > > > > > This backend is implemented as a mdev device driver on top
> > > > > > > > > > > of the same mdev device ops used in virtio-mdev but using
> > > > > > > > > > > a different mdev class id, and it will register the device
> > > > > > > > > > > as a VFIO device for userspace to use. Userspace can setup
> > > > > > > > > > > the IOMMU with the existing VFIO container/group APIs and
> > > > > > > > > > > then get the device fd with the device name. After getting
> > > > > > > > > > > the device fd of this device, userspace can use vhost ioctls
> > > > > > > > > > > to setup the backend.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie at intel.com>
> > > > > > > > > > > ---
> > > > > > > > > > > This patch depends on below series:
> > > > > > > > > > > https://lkml.org/lkml/2019/10/17/286
> > > > > > > > > > >
> > > > > > > > > > > v1 -> v2:
> > > > > > > > > > > - Replace _SET_STATE with _SET_STATUS (MST);
> > > > > > > > > > > - Check status bits at each step (MST);
> > > > > > > > > > > - Report the max ring size and max number of queues (MST);
> > > > > > > > > > > - Add missing MODULE_DEVICE_TABLE (Jason);
> > > > > > > > > > > - Only support the network backend w/o multiqueue for now;
> > > > > > > > > > Any idea on how to extend it to support
> > > > > > > > > > devices other than net? I think we
> > > > > > > > > > want a generic API or an API that could
> > > > > > > > > > be made generic in the future.
> > > > > > > > > >
> > > > > > > > > > Do we want to e.g having a generic vhost
> > > > > > > > > > mdev for all kinds of devices or
> > > > > > > > > > introducing e.g vhost-net-mdev and vhost-scsi-mdev?
> > > > > > > > > One possible way is to do what vhost-user does. I.e. Apart from
> > > > > > > > > the generic ring, features, ... related ioctls, we also introduce
> > > > > > > > > device specific ioctls when we need them. As vhost-mdev just needs
> > > > > > > > > to forward configs between parent and userspace and even won't
> > > > > > > > > cache any info when possible,
> > > > > > > > So it looks to me this is only possible if we
> > > > > > > > expose e.g set_config and
> > > > > > > > get_config to userspace.
> > > > > > > The set_config and get_config interface isn't really everything
> > > > > > > of device specific settings. We also have ctrlq in virtio-net.
> > > > > > Yes, but it could be processed by the exist API. Isn't
> > > > > > it? Just set ctrl vq
> > > > > > address and let parent to deal with that.
> > > > > I mean how to expose ctrlq related settings to userspace?
> > > >
> > > > I think it works like:
> > > >
> > > > 1) userspace find ctrl_vq is supported
> > > >
> > > > 2) then it can allocate memory for ctrl vq and set its address through
> > > > vhost-mdev
> > > >
> > > > 3) userspace can populate ctrl vq itself
> > > I see. That is to say, userspace e.g. QEMU will program the
> > > ctrl vq with the existing VHOST_*_VRING_* ioctls, and parent
> > > drivers should know that the addresses used in ctrl vq are
> > > host virtual addresses in vhost-mdev's case.
> >
> >
> > That's really good point. And that means parent needs to differ vhost
> > from virtio. It should work.
>
> HVA may only work when we have something similar to VHOST_SET_OWNER which
> can reuse MM of its owner.

We already have VHOST_SET_OWNER in vhost now, parent can handle
the commands in its .kick_vq() which is called by vq's .handle_kick
callback. Virtio-user did something similar:

https://github.com/DPDK/dpdk/blob/0da7f445df445630c794897347ee360d6fe6348b/drivers/net/virtio/virtio_user_ethdev.c#L313-L322

>
> > But is there any chance to use DMA address? I'm asking since the API
> > then tends to be device specific.
>
> I wonder whether we can introduce MAP IOMMU notifier and get DMA mappings
> from that.

I think this will complicate things unnecessarily and may
bring pains. Because, in vhost-mdev, mdev's ctrl vq is
supposed to be managed by host. And we should try to avoid
putting ctrl vq and Rx/Tx vqs in the same DMA space to prevent
guests having the chance to bypass the host (e.g. QEMU) to
setup the backend accelerator directly.

>
> Thanks
>
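
The idea above, a parent handling control commands from its kick handler while treating ring addresses as host virtual addresses, can be illustrated with a small userspace-style sketch. It is not taken from the patch or from DPDK: `handle_ctrl_kick`, `accept_rx_only` and the `ctrl_cmd_fn` callback type are hypothetical names, a split ring in native byte order is assumed, and memory barriers and bounds checks that real code would need are left out to keep it short.

```c
#include <stdint.h>
#include <linux/virtio_net.h>   /* struct virtio_net_ctrl_hdr, VIRTIO_NET_OK/ERR */
#include <linux/virtio_ring.h>  /* struct vring, VRING_DESC_F_NEXT */

/* Hypothetical per-command hook: returns 0 if the command is accepted. */
typedef int (*ctrl_cmd_fn)(uint8_t cls, uint8_t cmd);

/* Hypothetical example hook: accept only the RX mode class. */
int accept_rx_only(uint8_t cls, uint8_t cmd)
{
    (void)cmd;
    return cls == VIRTIO_NET_CTRL_RX ? 0 : -1;
}

/*
 * Process every command queued since last_avail and return the new
 * last_avail. Descriptor addresses are dereferenced directly, which is
 * only valid if they are HVAs of the current address space.
 */
uint16_t handle_ctrl_kick(struct vring *vr, uint16_t last_avail,
                          ctrl_cmd_fn fn)
{
    while (last_avail != vr->avail->idx) {
        uint16_t head = vr->avail->ring[last_avail % vr->num];
        struct vring_desc *d = &vr->desc[head];
        struct virtio_net_ctrl_hdr *hdr =
            (struct virtio_net_ctrl_hdr *)(uintptr_t)d->addr;
        struct vring_used_elem *ue;
        uint8_t *ack;

        /* The last descriptor of the chain holds the one-byte ack. */
        while (d->flags & VRING_DESC_F_NEXT)
            d = &vr->desc[d->next];
        ack = (uint8_t *)(uintptr_t)d->addr;
        *ack = fn(hdr->class, hdr->cmd) == 0 ? VIRTIO_NET_OK
                                             : VIRTIO_NET_ERR;

        /* Return the chain to the driver through the used ring. */
        ue = &vr->used->ring[vr->used->idx % vr->num];
        ue->id = head;
        ue->len = sizeof(*ack);
        vr->used->idx++;
        last_avail++;
    }
    return last_avail;
}
```

Because the handler dereferences the addresses directly, it has to run with the owner's address space available, which is the role `VHOST_SET_OWNER` plays in the discussion.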
On 2019/10/24 5:18, Tiwei Bie wrote:
> On Thu, Oct 24, 2019 at 04:32:42PM +0800, Jason Wang wrote:
>> On 2019/10/24 4:03, Jason Wang wrote:
>>> On 2019/10/24 12:21, Tiwei Bie wrote:
>>>> On Wed, Oct 23, 2019 at 06:29:21PM +0800, Jason Wang wrote:
>>>>> On 2019/10/23 6:11, Tiwei Bie wrote:
>>>>>> On Wed, Oct 23, 2019 at 03:25:00PM +0800, Jason Wang wrote:
>>>>>>> On 2019/10/23 3:07, Tiwei Bie wrote:
>>>>>>>> On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:
>>>>>>>>> On 2019/10/23 11:02, Tiwei Bie wrote:
>>>>>>>>>> On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:
>>>>>>>>>>> On 2019/10/22 5:52, Tiwei Bie wrote:
>>>>>>>>>>>> This patch introduces a mdev based hardware vhost backend.
>>>>>>>>>>>> This backend is built on top of the same abstraction used
>>>>>>>>>>>> in virtio-mdev and provides a generic vhost interface for
>>>>>>>>>>>> userspace to accelerate the virtio devices in guest.
>>>>>>>>>>>>
>>>>>>>>>>>> This backend is implemented as a mdev device driver on top
>>>>>>>>>>>> of the same mdev device ops used in virtio-mdev but using
>>>>>>>>>>>> a different mdev class id, and it will register the device
>>>>>>>>>>>> as a VFIO device for userspace to use. Userspace can setup
>>>>>>>>>>>> the IOMMU with the existing VFIO container/group APIs and
>>>>>>>>>>>> then get the device fd with the device name. After getting
>>>>>>>>>>>> the device fd of this device, userspace can use vhost ioctls
>>>>>>>>>>>> to setup the backend.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Tiwei Bie <tiwei.bie at intel.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>> This patch depends on below series:
>>>>>>>>>>>> https://lkml.org/lkml/2019/10/17/286
>>>>>>>>>>>>
>>>>>>>>>>>> v1 -> v2:
>>>>>>>>>>>> - Replace _SET_STATE with _SET_STATUS (MST);
>>>>>>>>>>>> - Check status bits at each step (MST);
>>>>>>>>>>>> - Report the max ring size and max number of queues (MST);
>>>>>>>>>>>> - Add missing MODULE_DEVICE_TABLE (Jason);
>>>>>>>>>>>> - Only support the network backend w/o multiqueue for now;
>>>>>>>>>>> Any idea on how to extend it to support
>>>>>>>>>>> devices other than net? I think we
>>>>>>>>>>> want a generic API or an API that could
>>>>>>>>>>> be made generic in the future.
>>>>>>>>>>>
>>>>>>>>>>> Do we want to e.g having a generic vhost
>>>>>>>>>>> mdev for all kinds of devices or
>>>>>>>>>>> introducing e.g vhost-net-mdev and vhost-scsi-mdev?
>>>>>>>>>> One possible way is to do what vhost-user does. I.e. Apart from
>>>>>>>>>> the generic ring, features, ... related ioctls, we also introduce
>>>>>>>>>> device specific ioctls when we need them. As vhost-mdev just needs
>>>>>>>>>> to forward configs between parent and userspace and even won't
>>>>>>>>>> cache any info when possible,
>>>>>>>>> So it looks to me this is only possible if we
>>>>>>>>> expose e.g set_config and
>>>>>>>>> get_config to userspace.
>>>>>>>> The set_config and get_config interface isn't really everything
>>>>>>>> of device specific settings. We also have ctrlq in virtio-net.
>>>>>>> Yes, but it could be processed by the exist API. Isn't
>>>>>>> it? Just set ctrl vq
>>>>>>> address and let parent to deal with that.
>>>>>> I mean how to expose ctrlq related settings to userspace?
>>>>> I think it works like:
>>>>>
>>>>> 1) userspace find ctrl_vq is supported
>>>>>
>>>>> 2) then it can allocate memory for ctrl vq and set its address through
>>>>> vhost-mdev
>>>>>
>>>>> 3) userspace can populate ctrl vq itself
>>>> I see. That is to say, userspace e.g. QEMU will program the
>>>> ctrl vq with the existing VHOST_*_VRING_* ioctls, and parent
>>>> drivers should know that the addresses used in ctrl vq are
>>>> host virtual addresses in vhost-mdev's case.
>>>
>>> That's really good point. And that means parent needs to differ vhost
>>> from virtio. It should work.
>>
>> HVA may only work when we have something similar to VHOST_SET_OWNER which
>> can reuse MM of its owner.
> We already have VHOST_SET_OWNER in vhost now, parent can handle
> the commands in its .kick_vq() which is called by vq's .handle_kick
> callback. Virtio-user did something similar:
>
> https://github.com/DPDK/dpdk/blob/0da7f445df445630c794897347ee360d6fe6348b/drivers/net/virtio/virtio_user_ethdev.c#L313-L322

This probably means a process context is required, something like
kthread that is used by vhost which seems a burden for parent. Or we can
extend ioctl to processing kick in the system call context.

>
>>
>>> But is there any chance to use DMA address? I'm asking since the API
>>> then tends to be device specific.
>>
>> I wonder whether we can introduce MAP IOMMU notifier and get DMA mappings
>> from that.
> I think this will complicate things unnecessarily and may
> bring pains. Because, in vhost-mdev, mdev's ctrl vq is
> supposed to be managed by host.

Yes.

> And we should try to avoid
> putting ctrl vq and Rx/Tx vqs in the same DMA space to prevent
> guests having the chance to bypass the host (e.g. QEMU) to
> setup the backend accelerator directly.

That's really good point. So when "vhost" type is created, parent should
assume addr of ctrl_vq is hva.

Thanks

>
>> Thanks
>>
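
The conclusion above (a parent created with the "vhost" type assumes the ctrl vq address is an HVA, whereas a virtio-type parent sees DMA addresses) could be expressed as a small dispatch like the one below. Everything in it is hypothetical and only illustrates the distinction: `enum parent_dev_type`, `dma_to_va_fn` and `ctrl_vq_addr_to_va` are invented names, not part of the patch or of any kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the type the parent device was created with. */
enum parent_dev_type {
    PARENT_TYPE_VIRTIO,  /* ring addresses are DMA addresses */
    PARENT_TYPE_VHOST    /* ctrl vq addresses are HVAs of the owner */
};

/* Hypothetical translation hook a parent might use for DMA addresses. */
typedef void *(*dma_to_va_fn)(uint64_t dma_addr);

/*
 * Resolve a ctrl vq address to a pointer the parent can dereference.
 * For the vhost type the address is used as is, which only makes sense
 * while running with the owner's MM. For the virtio type the address
 * goes through the translation hook, if one is provided.
 */
void *ctrl_vq_addr_to_va(enum parent_dev_type type, uint64_t addr,
                         dma_to_va_fn dma_to_va)
{
    if (type == PARENT_TYPE_VHOST)
        return (void *)(uintptr_t)addr;
    return dma_to_va != NULL ? dma_to_va(addr) : NULL;
}
```

Keeping the decision in one helper like this is only one possible design; the thread does not settle how a parent would actually tell the two cases apart.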