Jason Gunthorpe
2019-Jul-23 15:18 UTC
[Nouveau] [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
On Mon, Jul 22, 2019 at 11:44:24AM +0200, Christoph Hellwig wrote:> Currently nouveau_svm_fault expects nouveau_range_fault to never unlock > mmap_sem, but the latter unlocks it for a random selection of error > codes. Fix this up by always unlocking mmap_sem for non-zero return > values in nouveau_range_fault, and only unlocking it in the caller > for successful returns. > > Signed-off-by: Christoph Hellwig <hch at lst.de> > drivers/gpu/drm/nouveau/nouveau_svm.c | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c > index 5dd83a46578f..5de2d54b9782 100644 > +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c > @@ -494,8 +494,10 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range) > ret = hmm_range_register(range, mirror, > range->start, range->end, > PAGE_SHIFT); > - if (ret) > + if (ret) { > + up_read(&range->vma->vm_mm->mmap_sem); > return (int)ret; > + } > > if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) { > up_read(&range->vma->vm_mm->mmap_sem); > @@ -504,11 +506,9 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range) > > ret = hmm_range_fault(range, true); > if (ret <= 0) { > - if (ret == -EBUSY || !ret) { > - up_read(&range->vma->vm_mm->mmap_sem); > - ret = -EBUSY; > - } else if (ret == -EAGAIN) > + if (ret == 0) > ret = -EBUSY; > + up_read(&range->vma->vm_mm->mmap_sem); > hmm_range_unregister(range); > return ret;Hum.. The caller does this: again: ret = nouveau_range_fault(&svmm->mirror, &range); if (ret == 0) { mutex_lock(&svmm->mutex); if (!nouveau_range_done(&range)) { mutex_unlock(&svmm->mutex); goto again; And we can't call nouveau_range_fault() -> hmm_range_fault() without holding the mmap_sem, so we can't allow nouveau_range_fault to unlock it. Maybe this instead? diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index a9c5c58d425b3d..92cf760a9bcc5d 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -494,21 +494,16 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range) ret = hmm_range_register(range, mirror, range->start, range->end, PAGE_SHIFT); - if (ret) { - up_read(&range->vma->vm_mm->mmap_sem); + if (ret) return (int)ret; - } - if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) { - up_read(&range->vma->vm_mm->mmap_sem); + if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) return -EBUSY; - } ret = hmm_range_fault(range, true); if (ret <= 0) { if (ret == 0) ret = -EBUSY; - up_read(&range->vma->vm_mm->mmap_sem); hmm_range_unregister(range); return ret; } @@ -706,8 +701,8 @@ nouveau_svm_fault(struct nvif_notify *notify) NULL); svmm->vmm->vmm.object.client->super = false; mutex_unlock(&svmm->mutex); - up_read(&svmm->mm->mmap_sem); } + up_read(&svmm->mm->mmap_sem); /* Cancel any faults in the window whose pages didn't manage * to keep their valid bit, or stay writeable when required.
Christoph Hellwig
2019-Jul-23 16:30 UTC
[Nouveau] [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
On Tue, Jul 23, 2019 at 03:18:28PM +0000, Jason Gunthorpe wrote:> Hum.. > > The caller does this: > > again: > ret = nouveau_range_fault(&svmm->mirror, &range); > if (ret == 0) { > mutex_lock(&svmm->mutex); > if (!nouveau_range_done(&range)) { > mutex_unlock(&svmm->mutex); > goto again; > > And we can't call nouveau_range_fault() -> hmm_range_fault() without > holding the mmap_sem, so we can't allow nouveau_range_fault to unlock > it.Goto again can only happen if nouveau_range_fault was successful, in which case we did not drop mmap_sem. Also:> ret = hmm_range_fault(range, true); > if (ret <= 0) { > if (ret == 0) > ret = -EBUSY; > - up_read(&range->vma->vm_mm->mmap_sem); > hmm_range_unregister(range);This would hold mmap_sem over hmm_range_unregister, which can lead to deadlock if we call exit_mmap and then acquire mmap_sem again.
Jason Gunthorpe
2019-Jul-23 17:17 UTC
[Nouveau] [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
On Tue, Jul 23, 2019 at 06:30:48PM +0200, Christoph Hellwig wrote:> On Tue, Jul 23, 2019 at 03:18:28PM +0000, Jason Gunthorpe wrote: > > Hum.. > > > > The caller does this: > > > > again: > > ret = nouveau_range_fault(&svmm->mirror, &range); > > if (ret == 0) { > > mutex_lock(&svmm->mutex); > > if (!nouveau_range_done(&range)) { > > mutex_unlock(&svmm->mutex); > > goto again; > > > > And we can't call nouveau_range_fault() -> hmm_range_fault() without > > holding the mmap_sem, so we can't allow nouveau_range_fault to unlock > > it. > > Goto again can only happen if nouveau_range_fault was successful, > in which case we did not drop mmap_sem.Oh, right we switch from success = number of pages to success =0.. However the reason this looks so weird to me is that the locking pattern isn't being followed, any result of hmm_range_fault should be ignored until we enter the svmm->mutex and check if there was a colliding invalidation. So the 'goto again' *should* be possible even if range_fault failed. But that is not for this patch..> > ret = hmm_range_fault(range, true); > > if (ret <= 0) { > > if (ret == 0) > > ret = -EBUSY; > > - up_read(&range->vma->vm_mm->mmap_sem); > > hmm_range_unregister(range); > > This would hold mmap_sem over hmm_range_unregister, which can lead > to deadlock if we call exit_mmap and then acquire mmap_sem again.That reminds me, this code is also leaking hmm_range_unregister() in the success path, right? I think the right way to structure this is to move the goto again and related into the nouveau_range_fault() so the whole retry algorithm is sensibly self contained. Jason
Possibly Parallel Threads
- [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
- [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
- [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
- [PATCH 4/5] nouveau: unlock mmap_sem on all errors from nouveau_range_fault
- [PATCH 4/5] nouveau: unlock mmap_sem on all errors from nouveau_range_fault