Jon Derrick
2019-Mar-16 00:05 UTC
[Nouveau] [PATCH 0/4] NV50/GF100 behind constrained hierarchies
Hi Ben, I've been working with an mmio-constrained pci hierarchy intended almost solely for nvme devices and switches. Binding nouveau to an NV50-based gpu results in a kernel panic as the device cannot be fully mapped. I've made modifications in nv50 and vmm to unbind the driver from this hierarchy, and modified gf100 assuming it will have the same issue. 1/4 also includes a fix where nv50 doesn't look like it checks the return value for nvkm_vmm_new for bar1. I don't know if this was intentional (eg, it could work without it). I'm wholly unfamiliar with linux graphics drivers so let me know if there's a better place for this. Jon Derrick (4): drm/nouveau/bar/nv50: check bar1 vmm return value drm/nouveau/bar/nv50: ensure BAR is mapped drm/nouveau/bar/gf100: ensure BAR is mapped drm/nouveau/mmu: qualify vmm during dtor drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c | 2 ++ drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c | 14 +++++++++++--- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +- 3 files changed, 14 insertions(+), 4 deletions(-) -- 2.14.4
Jon Derrick
2019-Mar-16 00:05 UTC
[Nouveau] [PATCH 1/4] drm/nouveau/bar/nv50: check bar1 vmm return value
Check bar1's new vmm creation return value for errors. Signed-off-by: Jon Derrick <jonathan.derrick at intel.com> --- drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c index 157b076a1272..8e64b19f3f8a 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c @@ -168,6 +168,8 @@ nv50_bar_oneinit(struct nvkm_bar *base) ret = nvkm_vmm_new(device, start, limit-- - start, NULL, 0, &bar1_lock, "bar1", &bar->bar1_vmm); + if (ret) + return ret; atomic_inc(&bar->bar1_vmm->engref[NVKM_SUBDEV_BAR]); bar->bar1_vmm->debug = bar->base.subdev.debug; -- 2.14.4
Jon Derrick
2019-Mar-16 00:05 UTC
[Nouveau] [PATCH 2/4] drm/nouveau/bar/nv50: ensure BAR is mapped
If the BAR is zero size, it indicates it was never successfully mapped. Ensure that the BAR is valid during initialization before attempting to use it. Signed-off-by: Jon Derrick <jonathan.derrick at intel.com> --- drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c index 8e64b19f3f8a..f23a0ccc2bec 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c @@ -109,7 +109,7 @@ nv50_bar_oneinit(struct nvkm_bar *base) struct nvkm_device *device = bar->base.subdev.device; static struct lock_class_key bar1_lock; static struct lock_class_key bar2_lock; - u64 start, limit; + u64 start, limit, size; int ret; ret = nvkm_gpuobj_new(device, 0x20000, 0, false, NULL, &bar->mem); @@ -127,7 +127,10 @@ nv50_bar_oneinit(struct nvkm_bar *base) /* BAR2 */ start = 0x0100000000ULL; - limit = start + device->func->resource_size(device, 3); + size = device->func->resource_size(device, 3); + if (!size) + return -ENOMEM; + limit = start + size; ret = nvkm_vmm_new(device, start, limit-- - start, NULL, 0, &bar2_lock, "bar2", &bar->bar2_vmm); @@ -164,7 +167,10 @@ nv50_bar_oneinit(struct nvkm_bar *base) /* BAR1 */ start = 0x0000000000ULL; - limit = start + device->func->resource_size(device, 1); + size = device->func->resource_size(device, 1); + if (!size) + return -ENOMEM; + limit = start + size; ret = nvkm_vmm_new(device, start, limit-- - start, NULL, 0, &bar1_lock, "bar1", &bar->bar1_vmm); -- 2.14.4
Jon Derrick
2019-Mar-16 00:05 UTC
[Nouveau] [PATCH 3/4] drm/nouveau/bar/gf100: ensure BAR is mapped
If the BAR is zero size, it indicates it was never successfully mapped. Ensure that the BAR is valid during initialization before attempting to use it. Signed-off-by: Jon Derrick <jonathan.derrick at intel.com> --- drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c index a3ba7f50198b..a3dcb09a40ee 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c @@ -94,6 +94,8 @@ gf100_bar_oneinit_bar(struct gf100_bar *bar, struct gf100_barN *bar_vm, return ret; bar_len = device->func->resource_size(device, bar_nr); + if (!bar_len) + return -ENOMEM; if (bar_nr == 3 && bar->bar2_halve) bar_len >>= 1; -- 2.14.4
Jon Derrick
2019-Mar-16 00:05 UTC
[Nouveau] [PATCH 4/4] drm/nouveau/mmu: qualify vmm during dtor
If the BAR initialization failed it may leave the vmm structure in an unitialized state, leading to a null-pointer-dereference when the vmm is dereferenced during teardown. Signed-off-by: Jon Derrick <jonathan.derrick at intel.com> --- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c index fa93f964e6a4..41640e0584ac 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c @@ -1783,7 +1783,7 @@ nvkm_vmm_get(struct nvkm_vmm *vmm, u8 page, u64 size, struct nvkm_vma **pvma) void nvkm_vmm_part(struct nvkm_vmm *vmm, struct nvkm_memory *inst) { - if (inst && vmm->func->part) { + if (inst && vmm && vmm->func->part) { mutex_lock(&vmm->mutex); vmm->func->part(vmm, inst); mutex_unlock(&vmm->mutex); -- 2.14.4
Ben Skeggs
2019-Mar-28 00:28 UTC
[Nouveau] [PATCH 0/4] NV50/GF100 behind constrained hierarchies
On Mon, 25 Mar 2019 at 13:33, Jon Derrick <jonathan.derrick at intel.com> wrote:> > Hi Ben,Hey John,> > I've been working with an mmio-constrained pci hierarchy intended almost > solely for nvme devices and switches. Binding nouveau to an NV50-based > gpu results in a kernel panic as the device cannot be fully mapped. > > I've made modifications in nv50 and vmm to unbind the driver from this > hierarchy, and modified gf100 assuming it will have the same issue.It would have the same issue indeed.> > 1/4 also includes a fix where nv50 doesn't look like it checks the > return value for nvkm_vmm_new for bar1. I don't know if this was > intentional (eg, it could work without it).No, that's a good catch. It would have worked at one point in time, but would be a NULL-ptr deref since a rework that was done a while ago.> > I'm wholly unfamiliar with linux graphics drivers so let me know if > there's a better place for this.This is the correct place, and I've merged all the patches into my tree. Just as a note/reminder for the future (mostly to myself), we should already be able to deal with BAR2 not being available, there's a fallback in place already. BAR1 is perhaps a bit trickier, but there are other (slow) HW mechanisms available that we could use if we need to care about making the driver work behind these hierarchies in the future. Thank you, Ben.> > Jon Derrick (4): > drm/nouveau/bar/nv50: check bar1 vmm return value > drm/nouveau/bar/nv50: ensure BAR is mapped > drm/nouveau/bar/gf100: ensure BAR is mapped > drm/nouveau/mmu: qualify vmm during dtor > > drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c | 2 ++ > drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c | 14 +++++++++++--- > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +- > 3 files changed, 14 insertions(+), 4 deletions(-) > > -- > 2.14.4 > > _______________________________________________ > Nouveau mailing list > Nouveau at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Derrick, Jonathan
2019-Mar-28 01:33 UTC
[Nouveau] [PATCH 0/4] NV50/GF100 behind constrained hierarchies
On Thu, 2019-03-28 at 10:28 +1000, Ben Skeggs wrote:> On Mon, 25 Mar 2019 at 13:33, Jon Derrick <jonathan.derrick at intel.com> wrote: > > Hi Ben, > Hey John, > > > I've been working with an mmio-constrained pci hierarchy intended almost > > solely for nvme devices and switches. Binding nouveau to an NV50-based > > gpu results in a kernel panic as the device cannot be fully mapped. > > > > I've made modifications in nv50 and vmm to unbind the driver from this > > hierarchy, and modified gf100 assuming it will have the same issue. > It would have the same issue indeed. >Cheers. I was able to find a gf100 based card and confirmed the fix. Thanks -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 3278 bytes Desc: not available URL: <https://lists.freedesktop.org/archives/nouveau/attachments/20190328/0c79d25b/attachment-0001.bin>
Possibly Parallel Threads
- [PATCH AUTOSEL 5.1 043/375] drm/nouveau/bar/nv50: ensure BAR is mapped
- [PATCH AUTOSEL 5.0 037/317] drm/nouveau/bar/nv50: ensure BAR is mapped
- [PATCH AUTOSEL 4.19 029/244] drm/nouveau/bar/nv50: ensure BAR is mapped
- [PATCH 0/4] NV50/GF100 behind constrained hierarchies
- [PATCH][next] drm/nouveau/gr/gf100: Remove second semicolon