Thierry Reding
2019-Sep-16 14:19 UTC
[Nouveau] [PATCH 0/4] drm/nouveau: Miscellaneous fixes
From: Thierry Reding <treding at nvidia.com> Hi Ben, these are fixes for a couple of issues that I've been running into when testing on various Tegra boards. The first two patches fix up issues in the fix that I had sent out earlier to fix the regression introduced in drm-misc-next. The first one is critical because it avoids a BUG_ON as reported by Ilia, while the second is less critical, but restores the locking correctness (at least to the best of my knowledge). Patch 3 is something that I think was also caused by the reservation object rework and is kind of a continuation of my earlier attempt to fix the VMA node sharing breakage. The current ordering between TTM and GEM teardown is causing a DEBUG_LOCKS_WARN_ON() because GEM cleanup already freed a mutex that TTM teardown will still want to use. Lastly, patch 4 is quite uncritical, but it's a one-line change that is causing an ugly (but harmless) external memory address decode error on Tegra210 and later. It seems that for some reason clearing this register will cause a DMA operation to be started by the GPU. I've verified that it's tied to exactly that register write by modifying the value written to the register, and stalling for a couple of seconds after the register write. The address decode error reflects the value written into this register exactly and it always happens a couple of milliseconds after this write. Thierry Thierry Reding (4): drm/nouveau: Fix fallout from reservation object rework drm/nouveau: prime: Extend DMA reservation object lock drm/nouveau: Fix ordering between TTM and GEM release drm/nouveau: gm20b: Avoid BAR1 teardown during init drivers/gpu/drm/nouveau/nouveau_bo.c | 26 +++++++++++------- drivers/gpu/drm/nouveau/nouveau_bo.h | 4 +-- drivers/gpu/drm/nouveau/nouveau_gem.c | 7 ++--- drivers/gpu/drm/nouveau/nouveau_prime.c | 27 ++++++++++++------- .../gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c | 1 - 5 files changed, 39 insertions(+), 26 deletions(-) -- 2.23.0
Thierry Reding
2019-Sep-16 14:19 UTC
[Nouveau] [PATCH 1/4] drm/nouveau: Fix fallout from reservation object rework
From: Thierry Reding <treding at nvidia.com> Commit 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object") introduced a subtle change in how the buffer allocation size is handled. Prior to that change, the size would get aligned to at least a page, whereas after that change a non-page-aligned size would get passed through unmodified. This ultimately causes a BUG_ON() to trigger in drm_gem_private_object_init() and crashes the system. Fix this by restoring the code that align the allocation size. Fixes: 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object") Reported-by: Ilia Mirkin <imirkin at alum.mit.edu> Signed-off-by: Thierry Reding <treding at nvidia.com> --- drivers/gpu/drm/nouveau/nouveau_bo.c | 16 +++++++++------- drivers/gpu/drm/nouveau/nouveau_bo.h | 4 ++-- drivers/gpu/drm/nouveau/nouveau_gem.c | 3 ++- drivers/gpu/drm/nouveau/nouveau_prime.c | 7 ++++--- 4 files changed, 17 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index e918b437af17..e7803dca32c5 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -186,8 +186,8 @@ nouveau_bo_fixup_align(struct nouveau_bo *nvbo, u32 flags, } struct nouveau_bo * -nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode, - u32 tile_flags) +nouveau_bo_alloc(struct nouveau_cli *cli, u64 *size, int *align, u32 flags, + u32 tile_mode, u32 tile_flags) { struct nouveau_drm *drm = cli->drm; struct nouveau_bo *nvbo; @@ -195,8 +195,8 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode, struct nvif_vmm *vmm = cli->svm.cli ? &cli->svm.vmm : &cli->vmm.vmm; int i, pi = -1; - if (!size) { - NV_WARN(drm, "skipped size %016llx\n", size); + if (!*size) { + NV_WARN(drm, "skipped size %016llx\n", *size); return ERR_PTR(-EINVAL); } @@ -266,7 +266,7 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode, pi = i; /* Stop once the buffer is larger than the current page size. */ - if (size >= 1ULL << vmm->page[i].shift) + if (*size >= 1ULL << vmm->page[i].shift) break; } @@ -281,6 +281,8 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode, } nvbo->page = vmm->page[pi].shift; + nouveau_bo_fixup_align(nvbo, flags, align, size); + return nvbo; } @@ -294,7 +296,6 @@ nouveau_bo_init(struct nouveau_bo *nvbo, u64 size, int align, u32 flags, acc_size = ttm_bo_dma_acc_size(nvbo->bo.bdev, size, sizeof(*nvbo)); - nouveau_bo_fixup_align(nvbo, flags, &align, &size); nvbo->bo.mem.num_pages = size >> PAGE_SHIFT; nouveau_bo_placement_set(nvbo, flags, 0); @@ -318,7 +319,8 @@ nouveau_bo_new(struct nouveau_cli *cli, u64 size, int align, struct nouveau_bo *nvbo; int ret; - nvbo = nouveau_bo_alloc(cli, size, flags, tile_mode, tile_flags); + nvbo = nouveau_bo_alloc(cli, &size, &align, flags, tile_mode, + tile_flags); if (IS_ERR(nvbo)) return PTR_ERR(nvbo); diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h b/drivers/gpu/drm/nouveau/nouveau_bo.h index 62930d834fba..38f9d8350963 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.h +++ b/drivers/gpu/drm/nouveau/nouveau_bo.h @@ -71,8 +71,8 @@ nouveau_bo_ref(struct nouveau_bo *ref, struct nouveau_bo **pnvbo) extern struct ttm_bo_driver nouveau_bo_driver; void nouveau_bo_move_init(struct nouveau_drm *); -struct nouveau_bo *nouveau_bo_alloc(struct nouveau_cli *, u64 size, u32 flags, - u32 tile_mode, u32 tile_flags); +struct nouveau_bo *nouveau_bo_alloc(struct nouveau_cli *, u64 *size, int *align, + u32 flags, u32 tile_mode, u32 tile_flags); int nouveau_bo_init(struct nouveau_bo *, u64 size, int align, u32 flags, struct sg_table *sg, struct dma_resv *robj); int nouveau_bo_new(struct nouveau_cli *, u64 size, int align, u32 flags, diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index c2bfc0591909..1bdffd714456 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -188,7 +188,8 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain, if (domain & NOUVEAU_GEM_DOMAIN_COHERENT) flags |= TTM_PL_FLAG_UNCACHED; - nvbo = nouveau_bo_alloc(cli, size, flags, tile_mode, tile_flags); + nvbo = nouveau_bo_alloc(cli, &size, &align, flags, tile_mode, + tile_flags); if (IS_ERR(nvbo)) return PTR_ERR(nvbo); diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c b/drivers/gpu/drm/nouveau/nouveau_prime.c index 84658d434225..656c334ee7d9 100644 --- a/drivers/gpu/drm/nouveau/nouveau_prime.c +++ b/drivers/gpu/drm/nouveau/nouveau_prime.c @@ -62,14 +62,15 @@ struct drm_gem_object *nouveau_gem_prime_import_sg_table(struct drm_device *dev, struct nouveau_drm *drm = nouveau_drm(dev); struct nouveau_bo *nvbo; struct dma_resv *robj = attach->dmabuf->resv; - size_t size = attach->dmabuf->size; + u64 size = attach->dmabuf->size; u32 flags = 0; + int align = 0; int ret; flags = TTM_PL_FLAG_TT; dma_resv_lock(robj, NULL); - nvbo = nouveau_bo_alloc(&drm->client, size, flags, 0, 0); + nvbo = nouveau_bo_alloc(&drm->client, &size, &align, flags, 0, 0); dma_resv_unlock(robj); if (IS_ERR(nvbo)) return ERR_CAST(nvbo); @@ -84,7 +85,7 @@ struct drm_gem_object *nouveau_gem_prime_import_sg_table(struct drm_device *dev, return ERR_PTR(-ENOMEM); } - ret = nouveau_bo_init(nvbo, size, 0, flags, sg, robj); + ret = nouveau_bo_init(nvbo, size, align, flags, sg, robj); if (ret) { nouveau_bo_ref(NULL, &nvbo); return ERR_PTR(ret); -- 2.23.0
Thierry Reding
2019-Sep-16 14:19 UTC
[Nouveau] [PATCH 2/4] drm/nouveau: prime: Extend DMA reservation object lock
From: Thierry Reding <treding at nvidia.com> Prior to commit 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object"), the reservation object was locked across all of the buffer object creation. After splitting nouveau_bo_new() into separate nouveau_bo_alloc() and nouveau_bo_init() functions, the reservation object is passed to the latter, so the lock needs to be held across that function as well. Fixes: 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object") Signed-off-by: Thierry Reding <treding at nvidia.com> --- drivers/gpu/drm/nouveau/nouveau_prime.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c b/drivers/gpu/drm/nouveau/nouveau_prime.c index 656c334ee7d9..bae6a3eccee0 100644 --- a/drivers/gpu/drm/nouveau/nouveau_prime.c +++ b/drivers/gpu/drm/nouveau/nouveau_prime.c @@ -60,6 +60,7 @@ struct drm_gem_object *nouveau_gem_prime_import_sg_table(struct drm_device *dev, struct sg_table *sg) { struct nouveau_drm *drm = nouveau_drm(dev); + struct drm_gem_object *obj; struct nouveau_bo *nvbo; struct dma_resv *robj = attach->dmabuf->resv; u64 size = attach->dmabuf->size; @@ -71,9 +72,10 @@ struct drm_gem_object *nouveau_gem_prime_import_sg_table(struct drm_device *dev, dma_resv_lock(robj, NULL); nvbo = nouveau_bo_alloc(&drm->client, &size, &align, flags, 0, 0); - dma_resv_unlock(robj); - if (IS_ERR(nvbo)) - return ERR_CAST(nvbo); + if (IS_ERR(nvbo)) { + obj = ERR_CAST(nvbo); + goto unlock; + } nvbo->valid_domains = NOUVEAU_GEM_DOMAIN_GART; @@ -82,16 +84,22 @@ struct drm_gem_object *nouveau_gem_prime_import_sg_table(struct drm_device *dev, ret = drm_gem_object_init(dev, &nvbo->bo.base, size); if (ret) { nouveau_bo_ref(NULL, &nvbo); - return ERR_PTR(-ENOMEM); + obj = ERR_PTR(-ENOMEM); + goto unlock; } ret = nouveau_bo_init(nvbo, size, align, flags, sg, robj); if (ret) { nouveau_bo_ref(NULL, &nvbo); - return ERR_PTR(ret); + obj = ERR_PTR(ret); + goto unlock; } - return &nvbo->bo.base; + obj = &nvbo->bo.base; + +unlock: + dma_resv_unlock(robj); + return obj; } int nouveau_gem_prime_pin(struct drm_gem_object *obj) -- 2.23.0
Thierry Reding
2019-Sep-16 14:19 UTC
[Nouveau] [PATCH 3/4] drm/nouveau: Fix ordering between TTM and GEM release
From: Thierry Reding <treding at nvidia.com> When the last reference to a TTM BO is dropped, ttm_bo_release() will acquire the DMA reservation object's wound/wait mutex while trying to clean up (ttm_bo_cleanup_refs_or_queue() via ttm_bo_release()). It is therefore essential that drm_gem_object_release() be called after the TTM BO has been uninitialized, otherwise drm_gem_object_release() has already destroyed the wound/wait mutex (via dma_resv_fini()). Signed-off-by: Thierry Reding <treding at nvidia.com> --- drivers/gpu/drm/nouveau/nouveau_bo.c | 10 ++++++++-- drivers/gpu/drm/nouveau/nouveau_gem.c | 4 ---- 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index e7803dca32c5..f8015e0318d7 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -136,10 +136,16 @@ nouveau_bo_del_ttm(struct ttm_buffer_object *bo) struct drm_device *dev = drm->dev; struct nouveau_bo *nvbo = nouveau_bo(bo); - if (unlikely(nvbo->bo.base.filp)) - DRM_ERROR("bo %p still attached to GEM object\n", bo); WARN_ON(nvbo->pin_refcnt > 0); nv10_bo_put_tile_region(dev, nvbo->tile, NULL); + + /* + * If nouveau_bo_new() allocated this buffer, the GEM object was never + * initialized, so don't attempt to release it. + */ + if (bo->base.dev) + drm_gem_object_release(&bo->base); + kfree(nvbo); } diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index 1bdffd714456..1324c19f4e5c 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -51,10 +51,6 @@ nouveau_gem_object_del(struct drm_gem_object *gem) if (gem->import_attach) drm_prime_gem_destroy(gem, nvbo->bo.sg); - drm_gem_object_release(gem); - - /* reset filp so nouveau_bo_del_ttm() can test for it */ - gem->filp = NULL; ttm_bo_put(&nvbo->bo); pm_runtime_mark_last_busy(dev); -- 2.23.0
Thierry Reding
2019-Sep-16 14:19 UTC
[Nouveau] [PATCH 4/4] drm/nouveau: gm20b: Avoid BAR1 teardown during init
From: Thierry Reding <treding at nvidia.com> Writing the 0x1704 (BUS_BAR1_BLOCK) register causes the GPU to probe the memory region at the programmed address. The result is an address decode error in the external memory controller because address 0, which is what is written to the register, is not designated as accessible to devices. Avoid triggering DMA from the GPU by removing teardown of the BAR1. Signed-off-by: Thierry Reding <treding at nvidia.com> --- drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c index 950bff1955ad..1ed6170891c4 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c @@ -26,7 +26,6 @@ gm20b_bar_func = { .dtor = gf100_bar_dtor, .oneinit = gf100_bar_oneinit, .bar1.init = gf100_bar_bar1_init, - .bar1.fini = gf100_bar_bar1_fini, .bar1.wait = gm107_bar_bar1_wait, .bar1.vmm = gf100_bar_bar1_vmm, .flush = g84_bar_flush, -- 2.23.0