Christoph Hellwig
2019-Jul-03  18:44 UTC
[Nouveau] hmm_range_fault related fixes and legacy API removal
Hi Jérôme, Ben and Jason, below is a series against the hmm tree that fixes up the mmap_sem locking in nouveau and, while at it, also removes leftover legacy HMM APIs that are only used by nouveau.
Christoph Hellwig
2019-Jul-03  18:44 UTC
[Nouveau] [PATCH 1/5] mm: return valid info from hmm_range_unregister
Checking range->valid is trivial and has no meaningful cost, but
nicely simplifies the fastpath in typical callers.  Also remove the
hmm_vma_range_done function, which is now a trivial wrapper around
hmm_range_unregister.
Signed-off-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Ralph Campbell <rcampbell at nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_svm.c |  2 +-
 include/linux/hmm.h                   | 11 +----------
 mm/hmm.c                              |  7 ++++++-
 3 files changed, 8 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 8c92374afcf2..9d40114d7949 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -652,7 +652,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
 		ret = hmm_vma_fault(&svmm->mirror, &range, true);
 		if (ret == 0) {
 			mutex_lock(&svmm->mutex);
-			if (!hmm_vma_range_done(&range)) {
+			if (!hmm_range_unregister(&range)) {
 				mutex_unlock(&svmm->mutex);
 				goto again;
 			}
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index b8a08b2a10ca..6b55e59fd8e3 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -462,7 +462,7 @@ int hmm_range_register(struct hmm_range *range,
 		       unsigned long start,
 		       unsigned long end,
 		       unsigned page_shift);
-void hmm_range_unregister(struct hmm_range *range);
+bool hmm_range_unregister(struct hmm_range *range);
 long hmm_range_snapshot(struct hmm_range *range);
 long hmm_range_fault(struct hmm_range *range, bool block);
 long hmm_range_dma_map(struct hmm_range *range,
@@ -484,15 +484,6 @@ long hmm_range_dma_unmap(struct hmm_range *range,
  */
 #define HMM_RANGE_DEFAULT_TIMEOUT 1000
 
-/* This is a temporary helper to avoid merge conflict between trees. */
-static inline bool hmm_vma_range_done(struct hmm_range *range)
-{
-	bool ret = hmm_range_valid(range);
-
-	hmm_range_unregister(range);
-	return ret;
-}
-
 /* This is a temporary helper to avoid merge conflict between trees. */
 static inline int hmm_vma_fault(struct hmm_mirror *mirror,
 				struct hmm_range *range, bool block)
diff --git a/mm/hmm.c b/mm/hmm.c
index d48b9283725a..ac238d3f1f4e 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -917,11 +917,15 @@ EXPORT_SYMBOL(hmm_range_register);
  *
  * Range struct is used to track updates to the CPU page table after a call to
  * hmm_range_register(). See include/linux/hmm.h for how to use it.
+ *
+ * Return:	%true if the range was still valid at the time of unregistering,
+ *		else %false.
  */
-void hmm_range_unregister(struct hmm_range *range)
+bool hmm_range_unregister(struct hmm_range *range)
 {
 	struct hmm *hmm = range->hmm;
 	unsigned long flags;
+	bool ret = range->valid;
 
 	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	list_del_init(&range->list);
@@ -938,6 +942,7 @@ void hmm_range_unregister(struct hmm_range *range)
 	 */
 	range->valid = false;
 	memset(&range->hmm, POISON_INUSE, sizeof(range->hmm));
+	return ret;
 }
 EXPORT_SYMBOL(hmm_range_unregister);
 
-- 
2.20.1
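
[For illustration, a minimal sketch of the caller fastpath this change
enables. The svmm mutex and the again: label follow nouveau's usage in
the hunk above; the surrounding fault-replay code is elided.]

	ret = hmm_vma_fault(&svmm->mirror, &range, true);
	if (ret == 0) {
		mutex_lock(&svmm->mutex);
		/*
		 * hmm_range_unregister() now reports whether the range was
		 * still valid, so on a lost race with a CPU invalidation we
		 * can redo the fault instead of committing stale entries.
		 */
		if (!hmm_range_unregister(&range)) {
			mutex_unlock(&svmm->mutex);
			goto again;
		}
		/* ... commit device page table entries ... */
		mutex_unlock(&svmm->mutex);
	}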
Christoph Hellwig
2019-Jul-03  18:44 UTC
[Nouveau] [PATCH 2/5] mm: always return EBUSY for invalid ranges in hmm_range_{fault, snapshot}
We should not have two different error codes for the same condition.  In
addition, the special handling of EAGAIN, which drops the mmap_sem via
the FAULT_FLAG_ALLOW_RETRY logic in the core VM, really complicates the
code.
Signed-off-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Ralph Campbell <rcampbell at nvidia.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
---
 Documentation/vm/hmm.rst |  2 +-
 mm/hmm.c                 | 10 ++++------
 2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 7d90964abbb0..710ce1c701bf 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -237,7 +237,7 @@ The usage pattern is::
       ret = hmm_range_snapshot(&range);
       if (ret) {
           up_read(&mm->mmap_sem);
-          if (ret == -EAGAIN) {
+          if (ret == -EBUSY) {
             /*
              * No need to check hmm_range_wait_until_valid() return value
              * on retry we will get proper error with hmm_range_snapshot()
diff --git a/mm/hmm.c b/mm/hmm.c
index ac238d3f1f4e..3abc2e3c1e9f 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -951,7 +951,7 @@ EXPORT_SYMBOL(hmm_range_unregister);
  * @range: range
  * Return: -EINVAL if invalid argument, -ENOMEM out of memory, -EPERM invalid
  *          permission (for instance asking for write and range is read only),
- *          -EAGAIN if you need to retry, -EFAULT invalid (ie either no valid
+ *          -EBUSY if you need to retry, -EFAULT invalid (ie either no valid
  *          vma or it is illegal to access that range), number of valid pages
  *          in range->pfns[] (from range start address).
  *
@@ -972,7 +972,7 @@ long hmm_range_snapshot(struct hmm_range *range)
 	do {
 		/* If range is no longer valid force retry. */
 		if (!range->valid)
-			return -EAGAIN;
+			return -EBUSY;
 
 		vma = find_vma(hmm->mm, start);
 		if (vma == NULL || (vma->vm_flags & device_vma))
@@ -1067,10 +1067,8 @@ long hmm_range_fault(struct hmm_range *range, bool block)
 
 	do {
 		/* If range is no longer valid force retry. */
-		if (!range->valid) {
-			up_read(&hmm->mm->mmap_sem);
-			return -EAGAIN;
-		}
+		if (!range->valid)
+			return -EBUSY;
 
 		vma = find_vma(hmm->mm, start);
 		if (vma == NULL || (vma->vm_flags & device_vma))
-- 
2.20.1
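
[A condensed sketch of the documented retry loop from
Documentation/vm/hmm.rst with the new error code in place. TIMEOUT is
the driver-chosen wait in milliseconds, as in the documentation; the
again: label and error handling are abbreviated here.]

again:
	down_read(&mm->mmap_sem);
	ret = hmm_range_snapshot(&range);
	if (ret) {
		up_read(&mm->mmap_sem);
		if (ret == -EBUSY) {
			/*
			 * The range was invalidated; wait for it to become
			 * valid again and retry the snapshot.
			 */
			hmm_range_wait_until_valid(&range, TIMEOUT);
			goto again;
		}
		return ret;
	}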
Christoph Hellwig
2019-Jul-03  18:45 UTC
[Nouveau] [PATCH 3/5] mm: move hmm_vma_fault to nouveau
hmm_vma_fault is marked as a legacy API to get rid of, but seems to suit
the current nouveau flow.  Move it to the only user in preparation for
fixing a locking bug involving caller and callee.
Signed-off-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Ralph Campbell <rcampbell at nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 54 ++++++++++++++++++++++++++-
 include/linux/hmm.h                   | 54 ---------------------------
 2 files changed, 53 insertions(+), 55 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 9d40114d7949..e831f4184a17 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -36,6 +36,13 @@
 #include <linux/sort.h>
 #include <linux/hmm.h>
 
+/*
+ * When waiting for mmu notifiers we need some kind of time out otherwise we
+ * could potentialy wait for ever, 1000ms ie 1s sounds like a long time to
+ * wait already.
+ */
+#define NOUVEAU_RANGE_FAULT_TIMEOUT 1000
+
 struct nouveau_svm {
 	struct nouveau_drm *drm;
 	struct mutex mutex;
@@ -475,6 +482,51 @@ nouveau_svm_fault_cache(struct nouveau_svm *svm,
 		fault->inst, fault->addr, fault->access);
 }
 
+static int
+nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range,
+		    bool block)
+{
+	long ret;
+
+	/*
+	 * With the old API the driver must set each individual entries with
+	 * the requested flags (valid, write, ...). So here we set the mask to
+	 * keep intact the entries provided by the driver and zero out the
+	 * default_flags.
+	 */
+	range->default_flags = 0;
+	range->pfn_flags_mask = -1UL;
+
+	ret = hmm_range_register(range, mirror,
+				 range->start, range->end,
+				 PAGE_SHIFT);
+	if (ret)
+		return (int)ret;
+
+	if (!hmm_range_wait_until_valid(range, NOUVEAU_RANGE_FAULT_TIMEOUT)) {
+		/*
+		 * The mmap_sem was taken by driver we release it here and
+		 * returns -EAGAIN which correspond to mmap_sem have been
+		 * drop in the old API.
+		 */
+		up_read(&range->vma->vm_mm->mmap_sem);
+		return -EAGAIN;
+	}
+
+	ret = hmm_range_fault(range, block);
+	if (ret <= 0) {
+		if (ret == -EBUSY || !ret) {
+			/* Same as above, drop mmap_sem to match old API. */
+			up_read(&range->vma->vm_mm->mmap_sem);
+			ret = -EBUSY;
+		} else if (ret == -EAGAIN)
+			ret = -EBUSY;
+		hmm_range_unregister(range);
+		return ret;
+	}
+	return 0;
+}
+
 static int
 nouveau_svm_fault(struct nvif_notify *notify)
 {
@@ -649,7 +701,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
 		range.values = nouveau_svm_pfn_values;
 		range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
 again:
-		ret = hmm_vma_fault(&svmm->mirror, &range, true);
+		ret = nouveau_range_fault(&svmm->mirror, &range, true);
 		if (ret == 0) {
 			mutex_lock(&svmm->mutex);
 			if (!hmm_range_unregister(&range)) {
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 6b55e59fd8e3..657606f48796 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -475,60 +475,6 @@ long hmm_range_dma_unmap(struct hmm_range *range,
 			 dma_addr_t *daddrs,
 			 bool dirty);
 
-/*
- * HMM_RANGE_DEFAULT_TIMEOUT - default timeout (ms) when waiting for a range
- *
- * When waiting for mmu notifiers we need some kind of time out otherwise we
- * could potentialy wait for ever, 1000ms ie 1s sounds like a long time to
- * wait already.
- */
-#define HMM_RANGE_DEFAULT_TIMEOUT 1000
-
-/* This is a temporary helper to avoid merge conflict between trees. */
-static inline int hmm_vma_fault(struct hmm_mirror *mirror,
-				struct hmm_range *range, bool block)
-{
-	long ret;
-
-	/*
-	 * With the old API the driver must set each individual entries with
-	 * the requested flags (valid, write, ...). So here we set the mask to
-	 * keep intact the entries provided by the driver and zero out the
-	 * default_flags.
-	 */
-	range->default_flags = 0;
-	range->pfn_flags_mask = -1UL;
-
-	ret = hmm_range_register(range, mirror,
-				 range->start, range->end,
-				 PAGE_SHIFT);
-	if (ret)
-		return (int)ret;
-
-	if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
-		/*
-		 * The mmap_sem was taken by driver we release it here and
-		 * returns -EAGAIN which correspond to mmap_sem have been
-		 * drop in the old API.
-		 */
-		up_read(&range->vma->vm_mm->mmap_sem);
-		return -EAGAIN;
-	}
-
-	ret = hmm_range_fault(range, block);
-	if (ret <= 0) {
-		if (ret == -EBUSY || !ret) {
-			/* Same as above, drop mmap_sem to match old API. */
-			up_read(&range->vma->vm_mm->mmap_sem);
-			ret = -EBUSY;
-		} else if (ret == -EAGAIN)
-			ret = -EBUSY;
-		hmm_range_unregister(range);
-		return ret;
-	}
-	return 0;
-}
-
 /* Below are for HMM internal use only! Not to be used by device driver! */
 static inline void hmm_mm_init(struct mm_struct *mm)
 {
-- 
2.20.1
Christoph Hellwig
2019-Jul-03  18:45 UTC
[Nouveau] [PATCH 4/5] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
Currently nouveau_svm_fault expects nouveau_range_fault to never unlock
mmap_sem, but the latter unlocks it for a random selection of error
codes. Fix this up by always unlocking mmap_sem for non-zero return
values in nouveau_range_fault, and only unlocking it in the caller
for successful returns.
Signed-off-by: Christoph Hellwig <hch at lst.de>
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index e831f4184a17..c0cf7aeaefb3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -500,8 +500,10 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range,
 	ret = hmm_range_register(range, mirror,
 				 range->start, range->end,
 				 PAGE_SHIFT);
-	if (ret)
+	if (ret) {
+		up_read(&range->vma->vm_mm->mmap_sem);
 		return (int)ret;
+	}
 
 	if (!hmm_range_wait_until_valid(range, NOUVEAU_RANGE_FAULT_TIMEOUT)) {
 		/*
@@ -515,15 +517,14 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range,
 
 	ret = hmm_range_fault(range, block);
 	if (ret <= 0) {
-		if (ret == -EBUSY || !ret) {
-			/* Same as above, drop mmap_sem to match old API. */
-			up_read(&range->vma->vm_mm->mmap_sem);
-			ret = -EBUSY;
-		} else if (ret == -EAGAIN)
+		if (ret == 0)
 			ret = -EBUSY;
+		if (ret != -EAGAIN)
+			up_read(&range->vma->vm_mm->mmap_sem);
 		hmm_range_unregister(range);
 		return ret;
 	}
+
 	return 0;
 }
 
@@ -718,8 +719,8 @@ nouveau_svm_fault(struct nvif_notify *notify)
 						NULL);
 			svmm->vmm->vmm.object.client->super = false;
 			mutex_unlock(&svmm->mutex);
+			up_read(&svmm->mm->mmap_sem);
 		}
-		up_read(&svmm->mm->mmap_sem);
 
 		/* Cancel any faults in the window whose pages didn't manage
 		 * to keep their valid bit, or stay writeable when required.
-- 
2.20.1
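
[From the caller's point of view the locking contract is now uniform.
A sketch only, with the fault-replay details elided: on any non-zero
return the mmap_sem has already been dropped, either by
nouveau_range_fault() itself or, for -EAGAIN, by the core VM's
FAULT_FLAG_ALLOW_RETRY path.]

	down_read(&svmm->mm->mmap_sem);
	ret = nouveau_range_fault(&svmm->mirror, &range, true);
	if (ret) {
		/* mmap_sem was already dropped on every error path */
		return ret;
	}
	/* ... program the device under svmm->mutex ... */
	up_read(&svmm->mm->mmap_sem);	/* success: the caller unlocks */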
Christoph Hellwig
2019-Jul-03  18:45 UTC
[Nouveau] [PATCH 5/5] mm: remove the legacy hmm_pfn_* APIs
Switch the one remaining user in nouveau over to its replacement,
and remove all the wrappers.
Signed-off-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Jason Gunthorpe <jgg at mellanox.com>
---
 drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
 include/linux/hmm.h                    | 34 --------------------------
 2 files changed, 1 insertion(+), 35 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 42c026010938..b9ced2e61667 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -844,7 +844,7 @@ nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
 		struct page *page;
 		uint64_t addr;
 
-		page = hmm_pfn_to_page(range, range->pfns[i]);
+		page = hmm_device_entry_to_page(range, range->pfns[i]);
 		if (page == NULL)
 			continue;
 
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 657606f48796..cdcd78627393 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -290,40 +290,6 @@ static inline uint64_t hmm_device_entry_from_pfn(const struct hmm_range *range,
 		range->flags[HMM_PFN_VALID];
 }
 
-/*
- * Old API:
- * hmm_pfn_to_page()
- * hmm_pfn_to_pfn()
- * hmm_pfn_from_page()
- * hmm_pfn_from_pfn()
- *
- * This are the OLD API please use new API, it is here to avoid cross-tree
- * merge painfullness ie we convert things to new API in stages.
- */
-static inline struct page *hmm_pfn_to_page(const struct hmm_range *range,
-					   uint64_t pfn)
-{
-	return hmm_device_entry_to_page(range, pfn);
-}
-
-static inline unsigned long hmm_pfn_to_pfn(const struct hmm_range *range,
-					   uint64_t pfn)
-{
-	return hmm_device_entry_to_pfn(range, pfn);
-}
-
-static inline uint64_t hmm_pfn_from_page(const struct hmm_range *range,
-					 struct page *page)
-{
-	return hmm_device_entry_from_page(range, page);
-}
-
-static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
-					unsigned long pfn)
-{
-	return hmm_device_entry_from_pfn(range, pfn);
-}
-
 /*
  * Mirroring: how to synchronize device page table with CPU page table.
  *
-- 
2.20.1
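
[For reference, the one-to-one replacements, taken directly from the
deleted wrappers; the variable names are only illustrative.]

	page  = hmm_device_entry_to_page(range, entry);		/* was hmm_pfn_to_page() */
	pfn   = hmm_device_entry_to_pfn(range, entry);		/* was hmm_pfn_to_pfn() */
	entry = hmm_device_entry_from_page(range, page);	/* was hmm_pfn_from_page() */
	entry = hmm_device_entry_from_pfn(range, pfn);		/* was hmm_pfn_from_pfn() */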
Jason Gunthorpe
2019-Jul-03  19:00 UTC
[Nouveau] [PATCH 1/5] mm: return valid info from hmm_range_unregister
On Wed, Jul 03, 2019 at 11:44:58AM -0700, Christoph Hellwig wrote:
> Checking range->valid is trivial and has no meaningful cost, but
> nicely simplifies the fastpath in typical callers.

It should not be the typical caller..

> hmm_vma_range_done function, which now is a trivial wrapper around
> hmm_range_unregister.
>
> Signed-off-by: Christoph Hellwig <hch at lst.de>
> Reviewed-by: Ralph Campbell <rcampbell at nvidia.com>
>
>  drivers/gpu/drm/nouveau/nouveau_svm.c |  2 +-
>  include/linux/hmm.h                   | 11 +----------
>  mm/hmm.c                              |  7 ++++++-
>  3 files changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index 8c92374afcf2..9d40114d7949 100644
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -652,7 +652,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
>  		ret = hmm_vma_fault(&svmm->mirror, &range, true);
>  		if (ret == 0) {
>  			mutex_lock(&svmm->mutex);
> -			if (!hmm_vma_range_done(&range)) {
> +			if (!hmm_range_unregister(&range)) {
>  				mutex_unlock(&svmm->mutex);
>  				goto again;
>  			}

In this case if we take the 'goto again' then we are pointlessly
removing and re-adding the range.

The pattern is supposed to be:

  hmm_range_register()

again:
  .. read page tables ..
  lock
  if (!hmm_range_valid())
       unlock
       goto again
  .. setup device ..
  unlock
  hmm_range_unregister()

I don't think the API should be encouraging some shortcut here..

We can't do the above pattern because the old hmm_vma API didn't allow
it, which is presumably a reason why it is obsolete.

I'd rather see drivers move to a consistent pattern so we can then
easily hoist the seqcount lock scheme into some common mmu notifier
code, as discussed.

Jason
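
[Fleshing the pseudocode above out into C, as a sketch only: drv and
its mutex are stand-ins for whatever lock the driver uses to serialize
against its mmu notifier callbacks, the caller is assumed to hold
mmap_sem for read throughout, and error handling is condensed.]

	ret = hmm_range_register(&range, &drv->mirror,
				 range.start, range.end, PAGE_SHIFT);
	if (ret)
		return ret;
again:
	ret = hmm_range_fault(&range, true);	/* .. read page tables .. */
	if (ret <= 0)
		goto out;
	mutex_lock(&drv->mutex);		/* lock */
	if (!hmm_range_valid(&range)) {
		/* invalidated while faulting: redo the walk */
		mutex_unlock(&drv->mutex);	/* unlock */
		goto again;
	}
	/* .. setup device .. */
	mutex_unlock(&drv->mutex);		/* unlock */
	ret = 0;
out:
	hmm_range_unregister(&range);
	return ret;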
Ralph Campbell
2019-Jul-03  20:26 UTC
[Nouveau] [PATCH 5/5] mm: remove the legacy hmm_pfn_* APIs
On 7/3/19 11:45 AM, Christoph Hellwig wrote:
> Switch the one remaining user in nouveau over to its replacement,
> and remove all the wrappers.
>
> Signed-off-by: Christoph Hellwig <hch at lst.de>
> Reviewed-by: Jason Gunthorpe <jgg at mellanox.com>

Reviewed-by: Ralph Campbell <rcampbell at nvidia.com>

> ---
>  drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
>  include/linux/hmm.h                    | 34 --------------------------
>  2 files changed, 1 insertion(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 42c026010938..b9ced2e61667 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -844,7 +844,7 @@ nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
>  		struct page *page;
>  		uint64_t addr;
>
> -		page = hmm_pfn_to_page(range, range->pfns[i]);
> +		page = hmm_device_entry_to_page(range, range->pfns[i]);
>  		if (page == NULL)
>  			continue;
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 657606f48796..cdcd78627393 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -290,40 +290,6 @@ static inline uint64_t hmm_device_entry_from_pfn(const struct hmm_range *range,
>  		range->flags[HMM_PFN_VALID];
>  }
>
> -/*
> - * Old API:
> - * hmm_pfn_to_page()
> - * hmm_pfn_to_pfn()
> - * hmm_pfn_from_page()
> - * hmm_pfn_from_pfn()
> - *
> - * This are the OLD API please use new API, it is here to avoid cross-tree
> - * merge painfullness ie we convert things to new API in stages.
> - */
> -static inline struct page *hmm_pfn_to_page(const struct hmm_range *range,
> -					   uint64_t pfn)
> -{
> -	return hmm_device_entry_to_page(range, pfn);
> -}
> -
> -static inline unsigned long hmm_pfn_to_pfn(const struct hmm_range *range,
> -					   uint64_t pfn)
> -{
> -	return hmm_device_entry_to_pfn(range, pfn);
> -}
> -
> -static inline uint64_t hmm_pfn_from_page(const struct hmm_range *range,
> -					 struct page *page)
> -{
> -	return hmm_device_entry_from_page(range, page);
> -}
> -
> -static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
> -					unsigned long pfn)
> -{
> -	return hmm_device_entry_from_pfn(range, pfn);
> -}
> -
>  /*
>   * Mirroring: how to synchronize device page table with CPU page table.
>   *
>
Ralph Campbell
2019-Jul-03  20:46 UTC
[Nouveau] [PATCH 4/5] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
On 7/3/19 11:45 AM, Christoph Hellwig wrote:
> Currently nouveau_svm_fault expects nouveau_range_fault to never unlock
> mmap_sem, but the latter unlocks it for a random selection of error
> codes. Fix this up by always unlocking mmap_sem for non-zero return
> values in nouveau_range_fault, and only unlocking it in the caller
> for successful returns.
>
> Signed-off-by: Christoph Hellwig <hch at lst.de>

Reviewed-by: Ralph Campbell <rcampbell at nvidia.com>

> ---
>  drivers/gpu/drm/nouveau/nouveau_svm.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index e831f4184a17..c0cf7aeaefb3 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -500,8 +500,10 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range,

You can delete the comment "With the old API the driver must ..."
(not visible in the patch here).

I suggest moving the two assignments:
	range->default_flags = 0;
	range->pfn_flags_mask = -1UL;
to just above the "again:" where the other range.xxx fields are
initialized in nouveau_svm_fault().

>  	ret = hmm_range_register(range, mirror,
>  				 range->start, range->end,
>  				 PAGE_SHIFT);
> -	if (ret)
> +	if (ret) {
> +		up_read(&range->vma->vm_mm->mmap_sem);
>  		return (int)ret;
> +	}
>
>  	if (!hmm_range_wait_until_valid(range, NOUVEAU_RANGE_FAULT_TIMEOUT)) {
>  		/*

You can delete this comment (only the first line is visible here) since
it is about the "old API".
Also, it should return -EBUSY not -EAGAIN since it means there was a
range invalidation collision (similar to hmm_range_fault() if
!range->valid).

> @@ -515,15 +517,14 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range,
>
>  	ret = hmm_range_fault(range, block);

nouveau_range_fault() is only called with "block = true" so could
eliminate the block parameter and pass true here.

>  	if (ret <= 0) {
> -		if (ret == -EBUSY || !ret) {
> -			/* Same as above, drop mmap_sem to match old API. */
> -			up_read(&range->vma->vm_mm->mmap_sem);
> -			ret = -EBUSY;
> -		} else if (ret == -EAGAIN)
> +		if (ret == 0)
>  			ret = -EBUSY;
> +		if (ret != -EAGAIN)
> +			up_read(&range->vma->vm_mm->mmap_sem);

Can ret == -EAGAIN happen if "block = true"?
Generally, I prefer the read_down()/read_up() in the same function
(i.e., nouveau_svm_fault()) but I can see why it should be here if
hmm_range_fault() can return with mmap_sem unlocked.

>  		hmm_range_unregister(range);
>  		return ret;
>  	}
> +
>  	return 0;
>  }
>
> @@ -718,8 +719,8 @@ nouveau_svm_fault(struct nvif_notify *notify)
>  						NULL);
>  			svmm->vmm->vmm.object.client->super = false;
>  			mutex_unlock(&svmm->mutex);
> +			up_read(&svmm->mm->mmap_sem);
>  		}
> -		up_read(&svmm->mm->mmap_sem);
>

The "else" case should check for -EBUSY and goto again.

>  		/* Cancel any faults in the window whose pages didn't manage
>  		 * to keep their valid bit, or stay writeable when required.
>
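
[One possible shape of that last suggestion, as a hypothetical sketch
against nouveau_svm_fault(); this is not part of the posted series, and
the replay details are elided.]

		ret = nouveau_range_fault(&svmm->mirror, &range, true);
		if (ret == 0) {
			mutex_lock(&svmm->mutex);
			/* ... replay the fault to the device ... */
			mutex_unlock(&svmm->mutex);
			up_read(&svmm->mm->mmap_sem);
		} else if (ret == -EBUSY) {
			/* range invalidation collided: retry the fault */
			goto again;
		}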