Stefano Garzarella
2023-Jul-18 10:32 UTC
vdpa: use io_uring passthrough command for IOCTLs [was Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead]
On Thu, Apr 20, 2023 at 6:20 AM Jason Wang <jasowang at redhat.com> wrote:
>
> On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin
> <eperezma at redhat.com> wrote:
> >
> > On Wed, Apr 19, 2023 at 12:56 AM <peili.dev at gmail.com> wrote:
> > >
> > > From: Pei Li <peili.dev at gmail.com>
> > >
> > > Currently, part of the vdpa initialization / startup process
> > > needs to trigger many ioctls per vq, which is very inefficient
> > > and causes unnecessary context switches between user mode and
> > > kernel mode.
> > >
> > > This patch creates an additional ioctl() command, namely
> > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that batches
> > > commands of VHOST_VDPA_GET_VRING_GROUP into a single
> > > ioctl() call.
>
> I'd expect there's a kernel patch, but I didn't see one.
>
> If we want to go this way, why not have a more generic mechanism, that
> is, introduce something like:
>
> VHOST_CMD_BATCH, which would do something like:
>
> struct vhost_cmd_batch {
>         int ncmds;
>         struct vhost_ioctl cmds[];
> };
>
> Then you can batch ioctls other than GET_VRING_GROUP?
>

Just restarting this discussion, since I recently worked more with
io_uring passthrough commands and I think they can help here.

The NVMe guys had a similar problem (ioctl too slow for their use
case) [1][2], so they developed a new feature in io_uring that
basically allows you to do IOCTLs asynchronously and in batches using
io_uring.

The same feature is also used by ublk [3], and I recently talked about
this at DevConf with German [4].

Basically, there's a new callback in fops (struct file_operations.uring_cmd).
IIUC, NVMe (drivers/nvme/host/ioctl.c) uses exactly the same opcode
values for the new uring_cmd callback as it does for IOCTLs.

We could do the same. The changes in the vhost-vdpa kernel module
should be simple, and we could share the code for handling ioctl and
uring_cmd.
That way any new command can be supported by both for compatibility.

In QEMU we can then start using it to optimize the control path.

What do you think?

If it's interesting, I could throw together an RFC with the changes, or
if anyone is interested in working on it, I can help with the details.

Thanks,
Stefano

[1] https://lpc.events/event/11/contributions/989/
[2] https://lpc.events/event/16/contributions/1382/
[3] https://lwn.net/Articles/903855/
[4] https://www.youtube.com/watch?v=6JqNPirreoY
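To make the proposal concrete, below is a minimal sketch of what the
vhost-vdpa side could look like, modeled on how NVMe and ublk hook into
fops->uring_cmd. This is only a sketch, not an actual patch: the argument
encoding (a pointer-sized ioctl argument stored inline in the SQE command
area) is an assumption of this example, not an established ABI, and it
relies on the io_uring_sqe_cmd() helper available in recent kernels.

/*
 * Hypothetical sketch: expose the existing vhost-vdpa ioctl dispatcher
 * through fops->uring_cmd so that every ioctl opcode is also usable as
 * an io_uring passthrough command.
 */
#include <linux/io_uring.h>	/* struct io_uring_cmd, io_uring_sqe_cmd() */

static int vhost_vdpa_uring_cmd(struct io_uring_cmd *ioucmd,
				unsigned int issue_flags)
{
	/* Same opcode namespace as ioctl(), e.g. VHOST_VDPA_GET_VRING_GROUP */
	unsigned int cmd = ioucmd->cmd_op;
	/* Assumed encoding: the ioctl argument lives inline in the SQE */
	unsigned long arg = *(const u64 *)io_uring_sqe_cmd(ioucmd->sqe);

	/*
	 * Reuse the ioctl path so both entry points stay in sync.
	 * Returning anything other than -EIOCBQUEUED completes the
	 * request synchronously with that value as the CQE result.
	 */
	return (int)vhost_vdpa_unlocked_ioctl(ioucmd->file, cmd, arg);
}

static const struct file_operations vhost_vdpa_fops = {
	.owner		= THIS_MODULE,
	.open		= vhost_vdpa_open,
	.release	= vhost_vdpa_release,
	.write_iter	= vhost_vdpa_chr_write_iter,
	.unlocked_ioctl	= vhost_vdpa_unlocked_ioctl,
	.uring_cmd	= vhost_vdpa_uring_cmd,		/* new hook */
	.compat_ioctl	= compat_ptr_ioctl,
};

Note that even completing synchronously like this already gives the
batching benefit: userspace can queue many commands and submit them all
with a single io_uring_enter() syscall.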
Jason Wang
2023-Jul-26 08:10 UTC
vdpa: use io_uring passthrough command for IOCTLs [was Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead]
On Tue, Jul 18, 2023 at 6:32 PM Stefano Garzarella <sgarzare at redhat.com> wrote:
>
> [...]
>
> Just restarting this discussion, since I recently worked more with
> io_uring passthrough commands and I think they can help here.
>
> The NVMe guys had a similar problem (ioctl too slow for their use
> case) [1][2], so they developed a new feature in io_uring that
> basically allows you to do IOCTLs asynchronously and in batches using
> io_uring.
>
> The same feature is also used by ublk [3], and I recently talked about
> this at DevConf with German [4].
>
> Basically, there's a new callback in fops (struct file_operations.uring_cmd).
> IIUC, NVMe (drivers/nvme/host/ioctl.c) uses exactly the same opcode
> values for the new uring_cmd callback as it does for IOCTLs.
>
> We could do the same. The changes in the vhost-vdpa kernel module
> should be simple, and we could share the code for handling ioctl and
> uring_cmd.
> That way any new command can be supported by both for compatibility.
>
> In QEMU we can then start using it to optimize the control path.
>
> What do you think?

This looks interesting.

> If it's interesting, I could throw together an RFC with the changes, or
> if anyone is interested in working on it, I can help with the details.

Please do that.

Thanks

> Thanks,
> Stefano
>
> [1] https://lpc.events/event/11/contributions/989/
> [2] https://lpc.events/event/16/contributions/1382/
> [3] https://lwn.net/Articles/903855/
> [4] https://www.youtube.com/watch?v=6JqNPirreoY
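For the QEMU side, here is a rough userspace sketch, using liburing, of
how the per-vq VHOST_VDPA_GET_VRING_GROUP queries could be batched into
a single submission. Again, this is hypothetical: vhost-vdpa does not
implement uring_cmd today, and the sketch assumes the same argument
encoding as the kernel-side sketch above (the ioctl opcode in
sqe->cmd_op, the pointer-sized argument inline in the SQE command area)
plus recent liburing/kernel headers that define the sqe->cmd field.

#include <liburing.h>
#include <linux/vhost.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Query the vring group of every virtqueue with one submission. */
static int get_vring_groups(int vdpa_fd, unsigned int nvqs,
                            struct vhost_vring_state *states)
{
        struct io_uring ring;
        struct io_uring_cqe *cqe;
        unsigned int i;
        int ret;

        ret = io_uring_queue_init(nvqs, &ring, 0);
        if (ret < 0)
                return ret;

        for (i = 0; i < nvqs; i++) {
                struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
                uint64_t arg = (uint64_t)(uintptr_t)&states[i];

                states[i].index = i;
                memset(sqe, 0, sizeof(*sqe));
                sqe->opcode = IORING_OP_URING_CMD;
                sqe->fd = vdpa_fd;
                sqe->cmd_op = VHOST_VDPA_GET_VRING_GROUP; /* same value as the ioctl */
                memcpy(sqe->cmd, &arg, sizeof(arg));      /* assumed argument encoding */
                sqe->user_data = i;
        }

        /* One syscall submits all nvqs commands, instead of nvqs ioctl() calls */
        ret = io_uring_submit(&ring);
        if (ret < 0)
                goto out;

        for (i = 0; i < nvqs; i++) {
                ret = io_uring_wait_cqe(&ring, &cqe);
                if (ret < 0)
                        break;
                if (cqe->res < 0)
                        fprintf(stderr, "vq %u: error %d\n",
                                (unsigned int)cqe->user_data, cqe->res);
                /* on success, states[cqe->user_data].num holds the group */
                io_uring_cqe_seen(&ring, cqe);
        }
out:
        io_uring_queue_exit(&ring);
        return ret < 0 ? ret : 0;
}

With something like this, bringing up a device with N virtqueues costs
one io_uring_enter() rather than N ioctl() context switches, which is
exactly the overhead the original VHOST_VDPA_GET_VRING_GROUP_BATCH
patch was targeting.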