Jason Gunthorpe
2021-Mar-02 00:05 UTC
[Nouveau] [PATCH v3 5/8] mm: Device exclusive memory access
On Fri, Feb 26, 2021 at 06:18:29PM +1100, Alistair Popple wrote:

> +/**
> + * make_device_exclusive_range() - Mark a range for exclusive use by a device
> + * @mm: mm_struct of associated target process
> + * @start: start of the region to mark for exclusive device access
> + * @end: end address of region
> + * @pages: returns the pages which were successfully marked for exclusive access
> + *
> + * Returns: number of pages successfully marked for exclusive access
> + *
> + * This function finds the ptes mapping page(s) to the given address range and
> + * replaces them with special swap entries preventing userspace CPU access. On
> + * fault these entries are replaced with the original mapping after calling MMU
> + * notifiers.
> + */
> +int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> +                                unsigned long end, struct page **pages)
> +{
> +        long npages = (end - start) >> PAGE_SHIFT;
> +        long i;
> +
> +        npages = get_user_pages_remote(mm, start, npages,
> +                                       FOLL_GET | FOLL_WRITE | FOLL_SPLIT_PMD,
> +                                       pages, NULL, NULL);
> +        for (i = 0; i < npages; i++) {
> +                if (!trylock_page(pages[i])) {
> +                        put_page(pages[i]);
> +                        pages[i] = NULL;
> +                        continue;
> +                }
> +
> +                if (!try_to_protect(pages[i])) {

Isn't this racy? get_user_pages() returns the PTEs at an instant in
time; they could have already been changed to something else by the
time this loop runs.

I would think you'd want to switch to the swap entry atomically under
the PTLs?

Jason
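A minimal sketch of the "switch atomically under the PTL" idea being
suggested, for illustration only. This is not the posted patch's code;
the helper make_device_exclusive_swap_entry() is hypothetical, while the
page-table walking and locking primitives are the standard kernel ones:

#include <linux/mm.h>
#include <linux/pgtable.h>
#include <linux/swapops.h>

/*
 * Illustrative only: re-validate the page returned by GUP and install
 * the device-exclusive swap entry while holding the page table lock,
 * closing the window between get_user_pages() and the PTE update.
 */
static int replace_pte_under_ptl(struct vm_area_struct *vma, pmd_t *pmd,
                                 unsigned long addr, struct page *page)
{
        struct mm_struct *mm = vma->vm_mm;
        spinlock_t *ptl;
        pte_t *ptep, pte;
        swp_entry_t entry;

        ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);
        pte = *ptep;

        /* Re-check under the PTL: did the PTE change since GUP? */
        if (!pte_present(pte) || pte_page(pte) != page) {
                pte_unmap_unlock(ptep, ptl);
                return -EAGAIN; /* raced with a PTE change; caller retries */
        }

        /* Clear the old PTE and install the swap entry atomically. */
        pte = ptep_clear_flush(vma, addr, ptep);
        entry = make_device_exclusive_swap_entry(page); /* hypothetical */
        set_pte_at(mm, addr, ptep, swp_entry_to_pte(entry));

        pte_unmap_unlock(ptep, ptl);
        return 0;
}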
Alistair Popple
2021-Mar-02 08:57 UTC
[Nouveau] [PATCH v3 5/8] mm: Device exclusive memory access
On Tuesday, 2 March 2021 11:05:59 AM AEDT Jason Gunthorpe wrote:
> On Fri, Feb 26, 2021 at 06:18:29PM +1100, Alistair Popple wrote:
>
> > +/**
> > + * make_device_exclusive_range() - Mark a range for exclusive use by a device
> > + * @mm: mm_struct of associated target process
> > + * @start: start of the region to mark for exclusive device access
> > + * @end: end address of region
> > + * @pages: returns the pages which were successfully marked for exclusive access
> > + *
> > + * Returns: number of pages successfully marked for exclusive access
> > + *
> > + * This function finds the ptes mapping page(s) to the given address range and
> > + * replaces them with special swap entries preventing userspace CPU access. On
> > + * fault these entries are replaced with the original mapping after calling MMU
> > + * notifiers.
> > + */
> > +int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> > +                                unsigned long end, struct page **pages)
> > +{
> > +        long npages = (end - start) >> PAGE_SHIFT;
> > +        long i;
> > +
> > +        npages = get_user_pages_remote(mm, start, npages,
> > +                                       FOLL_GET | FOLL_WRITE | FOLL_SPLIT_PMD,
> > +                                       pages, NULL, NULL);
> > +        for (i = 0; i < npages; i++) {
> > +                if (!trylock_page(pages[i])) {
> > +                        put_page(pages[i]);
> > +                        pages[i] = NULL;
> > +                        continue;
> > +                }
> > +
> > +                if (!try_to_protect(pages[i])) {
>
> Isn't this racy? get_user_pages() returns the PTEs at an instant in
> time; they could have already been changed to something else by the
> time this loop runs.

Right. On its own this does not guarantee that the page is mapped at the
given location, only that a mapping won't get established without an MMU
notifier callback to clear the swap entry. The intent was that a driver
could use HMM or some other mechanism to keep PTEs synchronised if
required.

However I just looked at patch 8 in the series again and it appears I got
this wrong when converting from the old migration approach:

+        mutex_unlock(&svmm->mutex);
+        ret = nouveau_atomic_range_fault(svmm, drm, args,
+                                         size, hmm_flags, mm);

The mutex needs to be unlocked after the range fault to ensure the PTE
hasn't changed. But that ends up being a problem, because try_to_protect()
calls notifiers which need to take that mutex, and hence it deadlocks.

> I would think you'd want to switch to the swap entry atomically under
> the PTLs?

That is one approach, but the reuse of get_user_pages() to walk the page
tables and fault/gather the pages is a nice simplification, and adding a
new FOLL flag/mode to atomically swap entries doesn't seem right. However
try_to_protect() scans the PTEs again under the PTL, so checking that the
mapping of interest actually gets replaced during the rmap walk seems like
a reasonable solution.

Thanks for the comments.

 - Alistair

> Jason
>
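For illustration, the locking order under discussion can be sketched with
the kernel's mmu_interval_notifier API: the driver lock is dropped across
make_device_exclusive_range() (whose try_to_protect() fires notifiers that
take the same lock), and the notifier sequence count is then re-checked
under the lock. Only mmu_interval_read_begin()/mmu_interval_read_retry()
and make_device_exclusive_range() come from the thread; the surrounding
function, lock, and page handling are assumptions:

#include <linux/mm.h>
#include <linux/mmu_notifier.h>
#include <linux/mutex.h>
#include <linux/pagemap.h>

/*
 * Illustrative only: retry until the exclusive entries are installed
 * without an intervening invalidation. Holding driver_lock across
 * make_device_exclusive_range() would deadlock, as the notifier
 * callback needs the same lock.
 */
static int driver_make_range_exclusive(struct mmu_interval_notifier *mni,
                                       struct mutex *driver_lock,
                                       struct mm_struct *mm,
                                       unsigned long start, unsigned long end,
                                       struct page **pages)
{
        unsigned long seq;
        long npages, i;

        for (;;) {
                seq = mmu_interval_read_begin(mni);

                /* Driver lock must NOT be held here (deadlock otherwise). */
                npages = make_device_exclusive_range(mm, start, end, pages);
                if (npages <= 0)
                        return -EFAULT;

                mutex_lock(driver_lock);
                if (!mmu_interval_read_retry(mni, seq))
                        break; /* entries still valid; lock now held */

                /* Raced with an invalidation: drop everything and retry,
                 * assuming marked pages come back locked as the posted
                 * snippet suggests. */
                mutex_unlock(driver_lock);
                for (i = 0; i < npages; i++) {
                        if (pages[i]) {
                                unlock_page(pages[i]);
                                put_page(pages[i]);
                        }
                }
        }

        /* ... program the device page tables here, then release pages ... */
        mutex_unlock(driver_lock);
        return 0;
}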