On Thu, Mar 23, 2023 at 11:42:07AM +0800, Jason Wang wrote:>On Tue, Mar 21, 2023 at 11:48?PM Stefano Garzarella <sgarzare at redhat.com> wrote: >> >> The new "use_va" module parameter (default: true) is used in >> vdpa_alloc_device() to inform the vDPA framework that the device >> supports VA. >> >> vringh is initialized to use VA only when "use_va" is true and the >> user's mm has been bound. So, only when the bus supports user VA >> (e.g. vhost-vdpa). >> >> vdpasim_mm_work_fn work is used to serialize the binding to a new >> address space when the .bind_mm callback is invoked, and unbinding >> when the .unbind_mm callback is invoked. >> >> Call mmget_not_zero()/kthread_use_mm() inside the worker function >> to pin the address space only as long as needed, following the >> documentation of mmget() in include/linux/sched/mm.h: >> >> * Never use this function to pin this address space for an >> * unbounded/indefinite amount of time. > >I wonder if everything would be simplified if we just allow the parent >to advertise whether or not it requires the address space. > >Then when vhost-vDPA probes the device it can simply advertise >use_work as true so vhost core can use get_task_mm() in this case?IIUC set user_worker to true, it also creates the kthread in the vhost core (but we can add another variable to avoid this). My biggest concern is the comment in include/linux/sched/mm.h. get_task_mm() uses mmget(), but in the documentation they advise against pinning the address space indefinitely, so I preferred in keeping mmgrab() in the vhost core, then call mmget_not_zero() in the worker only when it is running. In the future maybe mm will be used differently from parent if somehow it is supported by iommu, so I would leave it to the parent to handle this. Thanks, Stefano
On Thu, Mar 23, 2023 at 10:50:06AM +0100, Stefano Garzarella wrote:> On Thu, Mar 23, 2023 at 11:42:07AM +0800, Jason Wang wrote: > > On Tue, Mar 21, 2023 at 11:48?PM Stefano Garzarella <sgarzare at redhat.com> wrote: > > > > > > The new "use_va" module parameter (default: true) is used in > > > vdpa_alloc_device() to inform the vDPA framework that the device > > > supports VA. > > > > > > vringh is initialized to use VA only when "use_va" is true and the > > > user's mm has been bound. So, only when the bus supports user VA > > > (e.g. vhost-vdpa). > > > > > > vdpasim_mm_work_fn work is used to serialize the binding to a new > > > address space when the .bind_mm callback is invoked, and unbinding > > > when the .unbind_mm callback is invoked. > > > > > > Call mmget_not_zero()/kthread_use_mm() inside the worker function > > > to pin the address space only as long as needed, following the > > > documentation of mmget() in include/linux/sched/mm.h: > > > > > > * Never use this function to pin this address space for an > > > * unbounded/indefinite amount of time. > > > > I wonder if everything would be simplified if we just allow the parent > > to advertise whether or not it requires the address space. > > > > Then when vhost-vDPA probes the device it can simply advertise > > use_work as true so vhost core can use get_task_mm() in this case? > > IIUC set user_worker to true, it also creates the kthread in the vhost > core (but we can add another variable to avoid this). > > My biggest concern is the comment in include/linux/sched/mm.h. > get_task_mm() uses mmget(), but in the documentation they advise against > pinning the address space indefinitely, so I preferred in keeping > mmgrab() in the vhost core, then call mmget_not_zero() in the worker > only when it is running. > > In the future maybe mm will be used differently from parent if somehow > it is supported by iommu, so I would leave it to the parent to handle > this. > > Thanks, > StefanoI think iommufd is supposed to handle all this detail, yes.
On Thu, Mar 23, 2023 at 5:50?PM Stefano Garzarella <sgarzare at redhat.com> wrote:> > On Thu, Mar 23, 2023 at 11:42:07AM +0800, Jason Wang wrote: > >On Tue, Mar 21, 2023 at 11:48?PM Stefano Garzarella <sgarzare at redhat.com> wrote: > >> > >> The new "use_va" module parameter (default: true) is used in > >> vdpa_alloc_device() to inform the vDPA framework that the device > >> supports VA. > >> > >> vringh is initialized to use VA only when "use_va" is true and the > >> user's mm has been bound. So, only when the bus supports user VA > >> (e.g. vhost-vdpa). > >> > >> vdpasim_mm_work_fn work is used to serialize the binding to a new > >> address space when the .bind_mm callback is invoked, and unbinding > >> when the .unbind_mm callback is invoked. > >> > >> Call mmget_not_zero()/kthread_use_mm() inside the worker function > >> to pin the address space only as long as needed, following the > >> documentation of mmget() in include/linux/sched/mm.h: > >> > >> * Never use this function to pin this address space for an > >> * unbounded/indefinite amount of time. > > > >I wonder if everything would be simplified if we just allow the parent > >to advertise whether or not it requires the address space. > > > >Then when vhost-vDPA probes the device it can simply advertise > >use_work as true so vhost core can use get_task_mm() in this case? > > IIUC set user_worker to true, it also creates the kthread in the vhost > core (but we can add another variable to avoid this). > > My biggest concern is the comment in include/linux/sched/mm.h. > get_task_mm() uses mmget(), but in the documentation they advise against > pinning the address space indefinitely, so I preferred in keeping > mmgrab() in the vhost core, then call mmget_not_zero() in the worker > only when it is running.Ok.> > In the future maybe mm will be used differently from parent if somehow > it is supported by iommu, so I would leave it to the parent to handle > this.This should be possible, I was told by Intel that their IOMMU can access the process page table for shared virtual memory. Thanks> > Thanks, > Stefano >