Christoph Hellwig
2019-Jul-24 06:52 UTC
[Nouveau] hmm_range_fault related fixes and legacy API removal v3
Hi Jérôme, Ben and Jason,

below is a series against the hmm tree which fixes up the mmap_sem
locking in nouveau and while at it also removes leftover legacy HMM APIs
only used by nouveau.

The first 4 patches are a bug fix for nouveau, which I suspect should
go into this merge window even if the code is marked as staging, just
to avoid people copying the breakage.

Changes since v2:
 - new patch from Jason to document FAULT_FLAG_ALLOW_RETRY semantics
   better
 - remove -EAGAIN handling in nouveau earlier

Changes since v1:
 - don't return the valid state from hmm_range_unregister
 - additional nouveau cleanups
Christoph Hellwig
2019-Jul-24 06:52 UTC
[Nouveau] [PATCH 1/7] mm: always return EBUSY for invalid ranges in hmm_range_{fault, snapshot}
We should not have two different error codes for the same condition.  In
addition this really complicates the code due to the special handling of
EAGAIN that drops the mmap_sem due to the FAULT_FLAG_ALLOW_RETRY logic
in the core vm.

Signed-off-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Ralph Campbell <rcampbell at nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg at mellanox.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
---
 Documentation/vm/hmm.rst |  2 +-
 mm/hmm.c                 | 10 ++++------
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 7d90964abbb0..710ce1c701bf 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -237,7 +237,7 @@ The usage pattern is::
       ret = hmm_range_snapshot(&range);
       if (ret) {
           up_read(&mm->mmap_sem);
-          if (ret == -EAGAIN) {
+          if (ret == -EBUSY) {
             /*
              * No need to check hmm_range_wait_until_valid() return value
              * on retry we will get proper error with hmm_range_snapshot()
diff --git a/mm/hmm.c b/mm/hmm.c
index e1eedef129cf..16b6731a34db 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -946,7 +946,7 @@ EXPORT_SYMBOL(hmm_range_unregister);
  * @range: range
  * Return: -EINVAL if invalid argument, -ENOMEM out of memory, -EPERM invalid
  *          permission (for instance asking for write and range is read only),
- *          -EAGAIN if you need to retry, -EFAULT invalid (ie either no valid
+ *          -EBUSY if you need to retry, -EFAULT invalid (ie either no valid
  *          vma or it is illegal to access that range), number of valid pages
  *          in range->pfns[] (from range start address).
  *
@@ -967,7 +967,7 @@ long hmm_range_snapshot(struct hmm_range *range)
 	do {
 		/* If range is no longer valid force retry. */
 		if (!range->valid)
-			return -EAGAIN;
+			return -EBUSY;
 
 		vma = find_vma(hmm->mm, start);
 		if (vma == NULL || (vma->vm_flags & device_vma))
@@ -1062,10 +1062,8 @@ long hmm_range_fault(struct hmm_range *range, bool block)
 
 	do {
 		/* If range is no longer valid force retry. */
-		if (!range->valid) {
-			up_read(&hmm->mm->mmap_sem);
-			return -EAGAIN;
-		}
+		if (!range->valid)
+			return -EBUSY;
 
 		vma = find_vma(hmm->mm, start);
 		if (vma == NULL || (vma->vm_flags & device_vma))
-- 
2.20.1
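For readers following along, the retry pattern that a single -EBUSY code allows can be sketched in plain userspace C. This is only an illustration, not kernel code: struct mock_range, mock_range_snapshot(), mock_wait_until_valid() and snapshot_with_retry() are hypothetical stand-ins for the HMM range, hmm_range_snapshot() and hmm_range_wait_until_valid(), and the locking is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct hmm_range; only what the sketch needs. */
struct mock_range {
	bool valid;		/* cleared while an invalidation is in flight */
	int invalidations;	/* pending invalidations still to drain */
};

/* Mock of hmm_range_snapshot(): -EBUSY for an invalid range, else a page count. */
static long mock_range_snapshot(struct mock_range *range)
{
	if (!range->valid)
		return -EBUSY;
	return 1;
}

/* Mock of hmm_range_wait_until_valid(): drains one pending invalidation. */
static void mock_wait_until_valid(struct mock_range *range)
{
	if (range->invalidations > 0)
		range->invalidations--;
	range->valid = (range->invalidations == 0);
}

/*
 * Retry only on -EBUSY, the one "range was invalidated" code.
 * Returns the number of attempts used on success, or a negative errno.
 */
static int snapshot_with_retry(struct mock_range *range, int max_tries)
{
	for (int tries = 1; tries <= max_tries; tries++) {
		long ret = mock_range_snapshot(range);

		if (ret == -EBUSY) {
			mock_wait_until_valid(range);
			continue;
		}
		return ret < 0 ? (int)ret : tries;
	}
	return -EBUSY;
}
```

With one code for "range invalid", the caller has a single retry branch and no case in which the callee has silently dropped a lock before returning.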
Christoph Hellwig
2019-Jul-24 06:52 UTC
[Nouveau] [PATCH 2/7] mm: move hmm_vma_range_done and hmm_vma_fault to nouveau
These two functions are marked as legacy APIs to get rid of, but seem to
suit the current nouveau flow.  Move them to the only user in preparation
for fixing a locking bug involving caller and callee.
All comments referring to the old API have been removed as this now
is a driver private helper.

Signed-off-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Jason Gunthorpe <jgg at mellanox.com>
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 46 ++++++++++++++++++++++-
 include/linux/hmm.h                   | 54 ---------------------------
 2 files changed, 44 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 8c92374afcf2..6c1b04de0db8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -475,6 +475,48 @@ nouveau_svm_fault_cache(struct nouveau_svm *svm,
 		fault->inst, fault->addr, fault->access);
 }
 
+static inline bool
+nouveau_range_done(struct hmm_range *range)
+{
+	bool ret = hmm_range_valid(range);
+
+	hmm_range_unregister(range);
+	return ret;
+}
+
+static int
+nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range,
+		    bool block)
+{
+	long ret;
+
+	range->default_flags = 0;
+	range->pfn_flags_mask = -1UL;
+
+	ret = hmm_range_register(range, mirror,
+				 range->start, range->end,
+				 PAGE_SHIFT);
+	if (ret)
+		return (int)ret;
+
+	if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
+		up_read(&range->vma->vm_mm->mmap_sem);
+		return -EAGAIN;
+	}
+
+	ret = hmm_range_fault(range, block);
+	if (ret <= 0) {
+		if (ret == -EBUSY || !ret) {
+			up_read(&range->vma->vm_mm->mmap_sem);
+			ret = -EBUSY;
+		} else if (ret == -EAGAIN)
+			ret = -EBUSY;
+		hmm_range_unregister(range);
+		return ret;
+	}
+	return 0;
+}
+
 static int
 nouveau_svm_fault(struct nvif_notify *notify)
 {
@@ -649,10 +691,10 @@ nouveau_svm_fault(struct nvif_notify *notify)
 		range.values = nouveau_svm_pfn_values;
 		range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
 again:
-		ret = hmm_vma_fault(&svmm->mirror, &range, true);
+		ret = nouveau_range_fault(&svmm->mirror, &range, true);
 		if (ret == 0) {
 			mutex_lock(&svmm->mutex);
-			if (!hmm_vma_range_done(&range)) {
+			if (!nouveau_range_done(&range)) {
 				mutex_unlock(&svmm->mutex);
 				goto again;
 			}
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index b8a08b2a10ca..7ef56dc18050 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -484,60 +484,6 @@ long hmm_range_dma_unmap(struct hmm_range *range,
  */
 #define HMM_RANGE_DEFAULT_TIMEOUT 1000
 
-/* This is a temporary helper to avoid merge conflict between trees. */
-static inline bool hmm_vma_range_done(struct hmm_range *range)
-{
-	bool ret = hmm_range_valid(range);
-
-	hmm_range_unregister(range);
-	return ret;
-}
-
-/* This is a temporary helper to avoid merge conflict between trees. */
-static inline int hmm_vma_fault(struct hmm_mirror *mirror,
-				struct hmm_range *range, bool block)
-{
-	long ret;
-
-	/*
-	 * With the old API the driver must set each individual entries with
-	 * the requested flags (valid, write, ...). So here we set the mask to
-	 * keep intact the entries provided by the driver and zero out the
-	 * default_flags.
-	 */
-	range->default_flags = 0;
-	range->pfn_flags_mask = -1UL;
-
-	ret = hmm_range_register(range, mirror,
-				 range->start, range->end,
-				 PAGE_SHIFT);
-	if (ret)
-		return (int)ret;
-
-	if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
-		/*
-		 * The mmap_sem was taken by driver we release it here and
-		 * returns -EAGAIN which correspond to mmap_sem have been
-		 * drop in the old API.
-		 */
-		up_read(&range->vma->vm_mm->mmap_sem);
-		return -EAGAIN;
-	}
-
-	ret = hmm_range_fault(range, block);
-	if (ret <= 0) {
-		if (ret == -EBUSY || !ret) {
-			/* Same as above, drop mmap_sem to match old API. */
-			up_read(&range->vma->vm_mm->mmap_sem);
-			ret = -EBUSY;
-		} else if (ret == -EAGAIN)
-			ret = -EBUSY;
-		hmm_range_unregister(range);
-		return ret;
-	}
-	return 0;
-}
-
 /* Below are for HMM internal use only! Not to be used by device driver! */
 static inline void hmm_mm_init(struct mm_struct *mm)
 {
-- 
2.20.1
Christoph Hellwig
2019-Jul-24 06:52 UTC
[Nouveau] [PATCH 3/7] nouveau: remove the block parameter to nouveau_range_fault
The parameter is always true, so remove it as well as the -EAGAIN
handling that can only happen for the non-blocking case.

Signed-off-by: Christoph Hellwig <hch at lst.de>
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 6c1b04de0db8..e3097492b4ad 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -485,8 +485,7 @@ nouveau_range_done(struct hmm_range *range)
 }
 
 static int
-nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range,
-		    bool block)
+nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)
 {
 	long ret;
 
@@ -504,13 +503,12 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range,
 		return -EAGAIN;
 	}
 
-	ret = hmm_range_fault(range, block);
+	ret = hmm_range_fault(range, true);
 	if (ret <= 0) {
 		if (ret == -EBUSY || !ret) {
 			up_read(&range->vma->vm_mm->mmap_sem);
 			ret = -EBUSY;
-		} else if (ret == -EAGAIN)
-			ret = -EBUSY;
+		}
 		hmm_range_unregister(range);
 		return ret;
 	}
@@ -691,7 +689,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
 		range.values = nouveau_svm_pfn_values;
 		range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
 again:
-		ret = nouveau_range_fault(&svmm->mirror, &range, true);
+		ret = nouveau_range_fault(&svmm->mirror, &range);
 		if (ret == 0) {
 			mutex_lock(&svmm->mutex);
 			if (!nouveau_range_done(&range)) {
-- 
2.20.1
Christoph Hellwig
2019-Jul-24 06:52 UTC
[Nouveau] [PATCH 4/7] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
Currently nouveau_svm_fault expects nouveau_range_fault to never unlock
mmap_sem, but the latter unlocks it for a random selection of error
codes.  Fix this up by always unlocking mmap_sem for non-zero return
values in nouveau_range_fault, and only unlocking it in the caller
for successful returns.

Signed-off-by: Christoph Hellwig <hch at lst.de>
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index e3097492b4ad..a835cebb6d90 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -495,8 +495,10 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)
 	ret = hmm_range_register(range, mirror,
 				 range->start, range->end,
 				 PAGE_SHIFT);
-	if (ret)
+	if (ret) {
+		up_read(&range->vma->vm_mm->mmap_sem);
 		return (int)ret;
+	}
 
 	if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
 		up_read(&range->vma->vm_mm->mmap_sem);
@@ -505,10 +507,9 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)
 
 	ret = hmm_range_fault(range, true);
 	if (ret <= 0) {
-		if (ret == -EBUSY || !ret) {
-			up_read(&range->vma->vm_mm->mmap_sem);
+		if (ret == 0)
 			ret = -EBUSY;
-		}
+		up_read(&range->vma->vm_mm->mmap_sem);
 		hmm_range_unregister(range);
 		return ret;
 	}
@@ -706,8 +707,8 @@ nouveau_svm_fault(struct nvif_notify *notify)
 						NULL);
 			svmm->vmm->vmm.object.client->super = false;
 			mutex_unlock(&svmm->mutex);
+			up_read(&svmm->mm->mmap_sem);
 		}
-		up_read(&svmm->mm->mmap_sem);
 
 		/* Cancel any faults in the window whose pages didn't manage
 		 * to keep their valid bit, or stay writeable when required.
-- 
2.20.1
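The locking contract this patch establishes (the callee drops mmap_sem on every non-zero return, the caller drops it only after success) can be modelled in a few lines of userspace C. This is a rough sketch under stated assumptions: struct mock_sem is a hypothetical reader counter standing in for mmap_sem, mock_range_fault() and mock_svm_fault() are simplified stand-ins for nouveau_range_fault() and nouveau_svm_fault(), and the result of the fault is passed in as simulated_ret rather than computed.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for mmap_sem: counts outstanding read locks. */
struct mock_sem {
	int readers;
};

static void mock_down_read(struct mock_sem *sem)
{
	sem->readers++;
}

static void mock_up_read(struct mock_sem *sem)
{
	sem->readers--;
}

/*
 * Models the callee after this patch: the lock is released on every
 * non-zero return and stays held only on success.
 */
static int mock_range_fault(struct mock_sem *sem, int simulated_ret)
{
	if (simulated_ret != 0) {
		mock_up_read(sem);
		return simulated_ret;
	}
	return 0;
}

/* Models the caller: it unlocks only when the callee reported success. */
static int mock_svm_fault(struct mock_sem *sem, int simulated_ret)
{
	int ret;

	mock_down_read(sem);
	ret = mock_range_fault(sem, simulated_ret);
	if (ret == 0) {
		/* ... the device page tables would be updated here ... */
		mock_up_read(sem);
	}
	return ret;
}
```

Whatever the return value, the reader count is back at zero when mock_svm_fault() returns, which is the invariant that the previous mix of unlocking and non-unlocking error paths broke.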
Christoph Hellwig
2019-Jul-24 06:52 UTC
[Nouveau] [PATCH 5/7] nouveau: return -EBUSY when hmm_range_wait_until_valid fails
-EAGAIN has a magic meaning for non-blocking faults, so don't overload
it.  Given that the caller doesn't check for specific error codes this
change is purely cosmetic.

Signed-off-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Jason Gunthorpe <jgg at mellanox.com>
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index a835cebb6d90..545100f7c594 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -502,7 +502,7 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)
 
 	if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
 		up_read(&range->vma->vm_mm->mmap_sem);
-		return -EAGAIN;
+		return -EBUSY;
 	}
 
 	ret = hmm_range_fault(range, true);
-- 
2.20.1
Christoph Hellwig
2019-Jul-24 06:52 UTC
[Nouveau] [PATCH 6/7] mm: remove the legacy hmm_pfn_* APIs
Switch the one remaining user in nouveau over to its replacement,
and remove all the wrappers.

Signed-off-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Ralph Campbell <rcampbell at nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg at mellanox.com>
---
 drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
 include/linux/hmm.h                    | 34 --------------------------
 2 files changed, 1 insertion(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 1333220787a1..345c63cb752a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -845,7 +845,7 @@ nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
 		struct page *page;
 		uint64_t addr;
 
-		page = hmm_pfn_to_page(range, range->pfns[i]);
+		page = hmm_device_entry_to_page(range, range->pfns[i]);
 		if (page == NULL)
 			continue;
 
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 7ef56dc18050..9f32586684c9 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -290,40 +290,6 @@ static inline uint64_t hmm_device_entry_from_pfn(const struct hmm_range *range,
 		range->flags[HMM_PFN_VALID];
 }
 
-/*
- * Old API:
- * hmm_pfn_to_page()
- * hmm_pfn_to_pfn()
- * hmm_pfn_from_page()
- * hmm_pfn_from_pfn()
- *
- * This are the OLD API please use new API, it is here to avoid cross-tree
- * merge painfullness ie we convert things to new API in stages.
- */
-static inline struct page *hmm_pfn_to_page(const struct hmm_range *range,
-					   uint64_t pfn)
-{
-	return hmm_device_entry_to_page(range, pfn);
-}
-
-static inline unsigned long hmm_pfn_to_pfn(const struct hmm_range *range,
-					   uint64_t pfn)
-{
-	return hmm_device_entry_to_pfn(range, pfn);
-}
-
-static inline uint64_t hmm_pfn_from_page(const struct hmm_range *range,
-					 struct page *page)
-{
-	return hmm_device_entry_from_page(range, page);
-}
-
-static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
-					unsigned long pfn)
-{
-	return hmm_device_entry_from_pfn(range, pfn);
-}
-
 /*
  * Mirroring: how to synchronize device page table with CPU page table.
  *
-- 
2.20.1
Christoph Hellwig
2019-Jul-24 06:52 UTC
[Nouveau] [PATCH 7/7] mm: comment on VM_FAULT_RETRY semantics in handle_mm_fault
From: Jason Gunthorpe <jgg at mellanox.com>

The magic dropping of mmap_sem when handle_mm_fault returns
VM_FAULT_RETRY is rather subtle.  Add a comment explaining it.

Signed-off-by: Jason Gunthorpe <jgg at mellanox.com>
[hch: wrote a changelog]
Signed-off-by: Christoph Hellwig <hch at lst.de>
---
 mm/hmm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 16b6731a34db..54b3a4162ae9 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -301,8 +301,10 @@ static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
 	flags |= hmm_vma_walk->block ? 0 : FAULT_FLAG_ALLOW_RETRY;
 	flags |= write_fault ? FAULT_FLAG_WRITE : 0;
 	ret = handle_mm_fault(vma, addr, flags);
-	if (ret & VM_FAULT_RETRY)
+	if (ret & VM_FAULT_RETRY) {
+		/* Note, handle_mm_fault did up_read(&mm->mmap_sem)) */
 		return -EAGAIN;
+	}
 	if (ret & VM_FAULT_ERROR) {
 		*pfn = range->values[HMM_PFN_ERROR];
 		return -EFAULT;
-- 
2.20.1
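The semantics the new comment documents can be illustrated with a small userspace model. Everything here is a mock: the MOCK_* constants, struct mock_mm, mock_handle_mm_fault() and mock_do_fault() are hypothetical stand-ins with made-up values, and they only imitate the one property that matters, namely that a retry result implies the lock has already been dropped by the fault handler.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Made-up flag values; the real kernel constants differ. */
#define MOCK_FAULT_FLAG_ALLOW_RETRY	0x1u
#define MOCK_VM_FAULT_RETRY		0x1u
#define MOCK_VM_FAULT_ERROR		0x2u

/* Hypothetical stand-in for mm_struct plus its mmap_sem reader count. */
struct mock_mm {
	int readers;		/* outstanding read locks on the mock mmap_sem */
	bool page_ready;	/* false: the fault would have to wait */
	bool bad_address;	/* true: the fault fails outright */
};

/*
 * Mock of handle_mm_fault(): if retry is allowed and the fault would have
 * to wait, it drops the read lock itself before reporting a retry.
 */
static unsigned int mock_handle_mm_fault(struct mock_mm *mm, unsigned int flags)
{
	if (mm->bad_address)
		return MOCK_VM_FAULT_ERROR;
	if (!mm->page_ready && (flags & MOCK_FAULT_FLAG_ALLOW_RETRY)) {
		mm->readers--;	/* the "magic" unlock */
		return MOCK_VM_FAULT_RETRY;
	}
	return 0;
}

/* Mirrors the shape of hmm_vma_do_fault(): -EAGAIN means "lock already gone". */
static int mock_do_fault(struct mock_mm *mm, bool block)
{
	unsigned int flags = block ? 0 : MOCK_FAULT_FLAG_ALLOW_RETRY;
	unsigned int ret = mock_handle_mm_fault(mm, flags);

	if (ret & MOCK_VM_FAULT_RETRY)
		return -EAGAIN;	/* caller must not unlock again */
	if (ret & MOCK_VM_FAULT_ERROR)
		return -EFAULT;	/* lock still held */
	return 0;
}
```

In this model only the non-blocking path can return -EAGAIN, and it is the only return on which the lock count has changed. That matches the reasoning in patch 3, where nouveau always faults in blocking mode and so never needs to handle -EAGAIN.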
Jason Gunthorpe
2019-Jul-26 00:16 UTC
[Nouveau] hmm_range_fault related fixes and legacy API removal v3
On Wed, Jul 24, 2019 at 08:52:51AM +0200, Christoph Hellwig wrote:
> Hi Jérôme, Ben and Jason,
>
> below is a series against the hmm tree which fixes up the mmap_sem
> locking in nouveau and while at it also removes leftover legacy HMM APIs
> only used by nouveau.
>
> The first 4 patches are a bug fix for nouveau, which I suspect should
> go into this merge window even if the code is marked as staging, just
> to avoid people copying the breakage.
>
> Changes since v2:
>  - new patch from Jason to document FAULT_FLAG_ALLOW_RETRY semantics
>    better
>  - remove -EAGAIN handling in nouveau earlier

I don't see Ralph's tested by, do you think it changed enough to
require testing again? If so, Ralph would you be so kind?

In any event, I'm sending this into linux-next and intend to forward
the first four next week.

Thanks,
Jason
Ralph Campbell
2019-Jul-26 00:55 UTC
[Nouveau] hmm_range_fault related fixes and legacy API removal v3
On 7/25/19 5:16 PM, Jason Gunthorpe wrote:
> On Wed, Jul 24, 2019 at 08:52:51AM +0200, Christoph Hellwig wrote:
>> Hi Jérôme, Ben and Jason,
>>
>> below is a series against the hmm tree which fixes up the mmap_sem
>> locking in nouveau and while at it also removes leftover legacy HMM APIs
>> only used by nouveau.
>>
>> The first 4 patches are a bug fix for nouveau, which I suspect should
>> go into this merge window even if the code is marked as staging, just
>> to avoid people copying the breakage.
>>
>> Changes since v2:
>>  - new patch from Jason to document FAULT_FLAG_ALLOW_RETRY semantics
>>    better
>>  - remove -EAGAIN handling in nouveau earlier
>
> I don't see Ralph's tested by, do you think it changed enough to
> require testing again? If so, Ralph would you be so kind?
>
> In any event, I'm sending this into linux-next and intend to forward
> the first four next week.
>
> Thanks,
> Jason
>

I have been testing Christoph's v3 with my set of v2 changes so
feel free to add my tested-by.
Christoph Hellwig
2019-Jul-26 04:57 UTC
[Nouveau] hmm_range_fault related fixes and legacy API removal v3
On Fri, Jul 26, 2019 at 12:16:30AM +0000, Jason Gunthorpe wrote:
> I don't see Ralph's tested by, do you think it changed enough to
> require testing again? If so, Ralph would you be so kind?

The changes were fairly small, but I didn't feel comfortable carrying
it over given that there were changes after all.