On Fri, Mar 24, 2023 at 10:54:39AM +0800, Jason Wang
wrote:>On Thu, Mar 23, 2023 at 5:50?PM Stefano Garzarella <sgarzare at
redhat.com> wrote:
>>
>> On Thu, Mar 23, 2023 at 11:42:07AM +0800, Jason Wang wrote:
>> >On Tue, Mar 21, 2023 at 11:48?PM Stefano Garzarella <sgarzare at
redhat.com> wrote:
>> >>
>> >> The new "use_va" module parameter (default: true) is
used in
>> >> vdpa_alloc_device() to inform the vDPA framework that the
device
>> >> supports VA.
>> >>
>> >> vringh is initialized to use VA only when "use_va"
is true and the
>> >> user's mm has been bound. So, only when the bus supports
user VA
>> >> (e.g. vhost-vdpa).
>> >>
>> >> vdpasim_mm_work_fn work is used to serialize the binding to a
new
>> >> address space when the .bind_mm callback is invoked, and
unbinding
>> >> when the .unbind_mm callback is invoked.
>> >>
>> >> Call mmget_not_zero()/kthread_use_mm() inside the worker
function
>> >> to pin the address space only as long as needed, following the
>> >> documentation of mmget() in include/linux/sched/mm.h:
>> >>
>> >> * Never use this function to pin this address space for an
>> >> * unbounded/indefinite amount of time.
>> >
>> >I wonder if everything would be simplified if we just allow the
parent
>> >to advertise whether or not it requires the address space.
>> >
>> >Then when vhost-vDPA probes the device it can simply advertise
>> >use_work as true so vhost core can use get_task_mm() in this case?
>>
>> IIUC set user_worker to true, it also creates the kthread in the vhost
>> core (but we can add another variable to avoid this).
>>
>> My biggest concern is the comment in include/linux/sched/mm.h.
>> get_task_mm() uses mmget(), but in the documentation they advise
against
>> pinning the address space indefinitely, so I preferred in keeping
>> mmgrab() in the vhost core, then call mmget_not_zero() in the worker
>> only when it is running.
>
>Ok.
>
>>
>> In the future maybe mm will be used differently from parent if somehow
>> it is supported by iommu, so I would leave it to the parent to handle
>> this.
>
>This should be possible, I was told by Intel that their IOMMU can
>access the process page table for shared virtual memory.
Cool, we should investigate this. Do you have any pointers to their
documentation?
Thanks,
Stefano