Ard Biesheuvel
2016-Oct-06 15:49 UTC
[Nouveau] [PATCH v5 0/3] drm/nouveau: set DMA mask before mapping scratch page
This v4 is now a 3 piece series (since v4), after Alexandre pointed out that both GF 100 and NV50 are affected by the same issue, and that a related issue has been solved already for Tegra in commit 9d0394c6bed5 ("drm/nouveau/instmem/gk20a: set DMA mask early"). The issue that this series addresses is the fact that the Nouveau driver invokes the DMA API before setting the DMA mask. In both cases addressed here, these are simply static bidirectional mappings of scratch pages whose purpose is not well understood, and in most cases, it does not matter that these pages are always allocated below 4 GB even if the hardware can access memory much higher up. However, on platforms without any RAM below 4 GB, the preliminary DMA mask of 32 is preventing the nouveau driver from loading on GF100 and NV50 hardware with an error like the following one: nouveau 0000:02:00.0: enabling device (0000 -> 0003) nouveau 0000:02:00.0: NVIDIA GT218 (0a8280b1) nouveau 0000:02:00.0: bios: version 70.18.a6.00.00 nouveau 0000:02:00.0: fb ctor failed, -14 nouveau: probe of 0000:02:00.0 failed with error -14 So fix this by setting a preliminary DMA mask based on the MMU device 'dma_bits' property (patch #1), and postpone mapping the scratch pages to the respective FB .init() hooks. (#2 and #3) v5: move setting of preliminary DMA mask to nvkm_device_pci_new() (#1) move allocation and DMA mapping of scratch pages to .oneinit hooks (#2, #3) v4: split and move dma_set_mask to probe hook (Alexander) v3: rework code to get rid of DMA_ERROR_CODE references, which is not defined on all architectures v2: replace incorrect comparison of dma_addr_t type var against NULL Ard Biesheuvel (3): drm/nouveau: set streaming DMA mask early drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit() hook drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit() hook drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c | 37 ++++++++++++++------ drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c | 31 +++++++++------- drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c | 33 +++++++++++------ 3 files changed, 69 insertions(+), 32 deletions(-) -- 2.7.4
Ard Biesheuvel
2016-Oct-06 15:49 UTC
[Nouveau] [PATCH v5 1/3] drm/nouveau: set streaming DMA mask early
Some subdevices (i.e., fb/nv50.c and fb/gf100.c) map a scratch page using dma_map_page() way before the TTM layer has had a chance to set the DMA mask. This may prevent the driver from loading at all on platforms whose system memory is not covered by the default DMA mask of 32-bit (i.e., when all RAM is above 4 GB). So set a preliminary DMA mask right after constructing the PCI device, and base it on the .dma_bits member of the MMU subdevice, which is what the TTM layer will base the DMA mask on as well. Signed-off-by: Ard Biesheuvel <ard.biesheuvel at linaro.org> --- drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c | 37 ++++++++++++++------ 1 file changed, 27 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c index 62ad0300cfa5..0030cd9543b2 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c @@ -1665,14 +1665,31 @@ nvkm_device_pci_new(struct pci_dev *pci_dev, const char *cfg, const char *dbg, *pdevice = &pdev->device; pdev->pdev = pci_dev; - return nvkm_device_ctor(&nvkm_device_pci_func, quirk, &pci_dev->dev, - pci_is_pcie(pci_dev) ? NVKM_DEVICE_PCIE : - pci_find_capability(pci_dev, PCI_CAP_ID_AGP) ? - NVKM_DEVICE_AGP : NVKM_DEVICE_PCI, - (u64)pci_domain_nr(pci_dev->bus) << 32 | - pci_dev->bus->number << 16 | - PCI_SLOT(pci_dev->devfn) << 8 | - PCI_FUNC(pci_dev->devfn), name, - cfg, dbg, detect, mmio, subdev_mask, - &pdev->device); + ret = nvkm_device_ctor(&nvkm_device_pci_func, quirk, &pci_dev->dev, + pci_is_pcie(pci_dev) ? NVKM_DEVICE_PCIE : + pci_find_capability(pci_dev, PCI_CAP_ID_AGP) ? + NVKM_DEVICE_AGP : NVKM_DEVICE_PCI, + (u64)pci_domain_nr(pci_dev->bus) << 32 | + pci_dev->bus->number << 16 | + PCI_SLOT(pci_dev->devfn) << 8 | + PCI_FUNC(pci_dev->devfn), name, + cfg, dbg, detect, mmio, subdev_mask, + &pdev->device); + + if (ret) + return ret; + + /* + * Set a preliminary DMA mask based on the .dma_bits member of the + * MMU subdevice. This allows other subdevices to create DMA mappings + * in their init() or oneinit() methods, which may be called before the + * TTM layer sets the DMA mask definitively. + * This is necessary for platforms where the default DMA mask of 32 + * does not cover any system memory, i.e., when all RAM is > 4 GB. + */ + if (subdev_mask & BIT(NVKM_SUBDEV_MMU)) + dma_set_mask_and_coherent(&pci_dev->dev, + DMA_BIT_MASK(pdev->device.mmu->dma_bits)); + + return 0; } -- 2.7.4
Ard Biesheuvel
2016-Oct-06 15:49 UTC
[Nouveau] [PATCH v5 2/3] drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit() hook
The 100c10 scratch page is mapped using dma_map_page() before the TTM layer has had a chance to set the DMA mask. This means we are still running with the default of 32 when this code executes, and this causes problems for platforms with no memory below 4 GB (such as AMD Seattle) So move the dma_map_page() to the .oneinit hook, which executes after the DMA mask has been set. Signed-off-by: Ard Biesheuvel <ard.biesheuvel at linaro.org> --- drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c | 31 ++++++++++++-------- 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c index 76433cc66fff..c1995c0024ef 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c @@ -50,24 +50,39 @@ gf100_fb_intr(struct nvkm_fb *base) } int -gf100_fb_oneinit(struct nvkm_fb *fb) +gf100_fb_oneinit(struct nvkm_fb *base) { - struct nvkm_device *device = fb->subdev.device; + struct gf100_fb *fb = gf100_fb(base); + struct nvkm_device *device = fb->base.subdev.device; int ret, size = 0x1000; size = nvkm_longopt(device->cfgopt, "MmuDebugBufferSize", size); size = min(size, 0x1000); ret = nvkm_memory_new(device, NVKM_MEM_TARGET_INST, size, 0x1000, - false, &fb->mmu_rd); + false, &base->mmu_rd); if (ret) return ret; ret = nvkm_memory_new(device, NVKM_MEM_TARGET_INST, size, 0x1000, - false, &fb->mmu_wr); + false, &base->mmu_wr); if (ret) return ret; + fb->r100c10_page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!fb->r100c10_page) { + nvkm_error(&fb->base.subdev, "failed 100c10 page alloc\n"); + return -ENOMEM; + } + + fb->r100c10 = dma_map_page(device->dev, fb->r100c10_page, 0, PAGE_SIZE, + DMA_BIDIRECTIONAL); + if (dma_mapping_error(device->dev, fb->r100c10)) { + nvkm_error(&fb->base.subdev, "failed to map 100c10 page\n"); + __free_page(fb->r100c10_page); + return -EFAULT; + } + return 0; } @@ -123,14 +138,6 @@ gf100_fb_new_(const struct nvkm_fb_func *func, struct nvkm_device *device, nvkm_fb_ctor(func, device, index, &fb->base); *pfb = &fb->base; - fb->r100c10_page = alloc_page(GFP_KERNEL | __GFP_ZERO); - if (fb->r100c10_page) { - fb->r100c10 = dma_map_page(device->dev, fb->r100c10_page, 0, - PAGE_SIZE, DMA_BIDIRECTIONAL); - if (dma_mapping_error(device->dev, fb->r100c10)) - return -EFAULT; - } - return 0; } -- 2.7.4
Ard Biesheuvel
2016-Oct-06 15:49 UTC
[Nouveau] [PATCH v5 3/3] drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit() hook
The 100c08 scratch page is mapped using dma_map_page() before the TTM layer has had a chance to set the DMA mask. This means we are still running with the default of 32 when this code executes, and this causes problems for platforms with no memory below 4 GB (such as AMD Seattle) So move the dma_map_page() to the .oneinit hook, which executes after the DMA mask has been set. Signed-off-by: Ard Biesheuvel <ard.biesheuvel at linaro.org> --- drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c | 33 ++++++++++++++------ 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c index 1b5fb02eab2a..d9bc4d11f145 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c @@ -210,6 +210,28 @@ nv50_fb_intr(struct nvkm_fb *base) nvkm_fifo_chan_put(fifo, flags, &chan); } +static int +nv50_fb_oneinit(struct nvkm_fb *base) +{ + struct nv50_fb *fb = nv50_fb(base); + + fb->r100c08_page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!fb->r100c08_page) { + nvkm_error(&fb->base.subdev, "failed 100c08 page alloc\n"); + return -ENOMEM; + } + + fb->r100c08 = dma_map_page(device->dev, fb->r100c08_page, 0, PAGE_SIZE, + DMA_BIDIRECTIONAL); + if (dma_mapping_error(device->dev, fb->r100c08)) { + nvkm_error(&fb->base.subdev, "failed to map 100c08 page\n"); + __free_page(fb->r100c08_page); + return -EFAULT; + } + + return 0; +} + static void nv50_fb_init(struct nvkm_fb *base) { @@ -245,6 +267,7 @@ nv50_fb_dtor(struct nvkm_fb *base) static const struct nvkm_fb_func nv50_fb_ = { .dtor = nv50_fb_dtor, + .oneinit = nv50_fb_oneinit, .init = nv50_fb_init, .intr = nv50_fb_intr, .ram_new = nv50_fb_ram_new, @@ -263,16 +286,6 @@ nv50_fb_new_(const struct nv50_fb_func *func, struct nvkm_device *device, fb->func = func; *pfb = &fb->base; - fb->r100c08_page = alloc_page(GFP_KERNEL | __GFP_ZERO); - if (fb->r100c08_page) { - fb->r100c08 = dma_map_page(device->dev, fb->r100c08_page, 0, - PAGE_SIZE, DMA_BIDIRECTIONAL); - if (dma_mapping_error(device->dev, fb->r100c08)) - return -EFAULT; - } else { - nvkm_warn(&fb->base.subdev, "failed 100c08 page alloc\n"); - } - return 0; } -- 2.7.4
Alexandre Courbot
2016-Oct-07 08:12 UTC
[Nouveau] [PATCH v5 0/3] drm/nouveau: set DMA mask before mapping scratch page
On Fri, Oct 7, 2016 at 12:49 AM, Ard Biesheuvel <ard.biesheuvel at linaro.org> wrote:> This v4 is now a 3 piece series (since v4), after Alexandre pointed out that > both GF 100 and NV50 are affected by the same issue, and that a related issue > has been solved already for Tegra in commit 9d0394c6bed5 > ("drm/nouveau/instmem/gk20a: set DMA mask early"). > > The issue that this series addresses is the fact that the Nouveau driver > invokes the DMA API before setting the DMA mask. In both cases addressed > here, these are simply static bidirectional mappings of scratch pages whose > purpose is not well understood, and in most cases, it does not matter that > these pages are always allocated below 4 GB even if the hardware can access > memory much higher up. > > However, on platforms without any RAM below 4 GB, the preliminary DMA mask > of 32 is preventing the nouveau driver from loading on GF100 and NV50 > hardware with an error like the following one: > > nouveau 0000:02:00.0: enabling device (0000 -> 0003) > nouveau 0000:02:00.0: NVIDIA GT218 (0a8280b1) > nouveau 0000:02:00.0: bios: version 70.18.a6.00.00 > nouveau 0000:02:00.0: fb ctor failed, -14 > nouveau: probe of 0000:02:00.0 failed with error -14 > > So fix this by setting a preliminary DMA mask based on the MMU device 'dma_bits' > property (patch #1), and postpone mapping the scratch pages to the respective > FB .init() hooks. (#2 and #3) > > v5: move setting of preliminary DMA mask to nvkm_device_pci_new() (#1) > move allocation and DMA mapping of scratch pages to .oneinit hooks (#2, #3) > v4: split and move dma_set_mask to probe hook (Alexander) > v3: rework code to get rid of DMA_ERROR_CODE references, which is not > defined on all architectures > v2: replace incorrect comparison of dma_addr_t type var against NULL > > Ard Biesheuvel (3): > drm/nouveau: set streaming DMA mask early > drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit() > hook > drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit() > hookThe series, Reviewed-by: Alexandre Courbot <acourbot at nvidia.com> Thanks!
Ard Biesheuvel
2016-Oct-16 20:12 UTC
[Nouveau] [PATCH v5 0/3] drm/nouveau: set DMA mask before mapping scratch page
On 7 October 2016 at 09:12, Alexandre Courbot <gnurou at gmail.com> wrote:> On Fri, Oct 7, 2016 at 12:49 AM, Ard Biesheuvel > <ard.biesheuvel at linaro.org> wrote: >> This v4 is now a 3 piece series (since v4), after Alexandre pointed out that >> both GF 100 and NV50 are affected by the same issue, and that a related issue >> has been solved already for Tegra in commit 9d0394c6bed5 >> ("drm/nouveau/instmem/gk20a: set DMA mask early"). >> >> The issue that this series addresses is the fact that the Nouveau driver >> invokes the DMA API before setting the DMA mask. In both cases addressed >> here, these are simply static bidirectional mappings of scratch pages whose >> purpose is not well understood, and in most cases, it does not matter that >> these pages are always allocated below 4 GB even if the hardware can access >> memory much higher up. >> >> However, on platforms without any RAM below 4 GB, the preliminary DMA mask >> of 32 is preventing the nouveau driver from loading on GF100 and NV50 >> hardware with an error like the following one: >> >> nouveau 0000:02:00.0: enabling device (0000 -> 0003) >> nouveau 0000:02:00.0: NVIDIA GT218 (0a8280b1) >> nouveau 0000:02:00.0: bios: version 70.18.a6.00.00 >> nouveau 0000:02:00.0: fb ctor failed, -14 >> nouveau: probe of 0000:02:00.0 failed with error -14 >> >> So fix this by setting a preliminary DMA mask based on the MMU device 'dma_bits' >> property (patch #1), and postpone mapping the scratch pages to the respective >> FB .init() hooks. (#2 and #3) >> >> v5: move setting of preliminary DMA mask to nvkm_device_pci_new() (#1) >> move allocation and DMA mapping of scratch pages to .oneinit hooks (#2, #3) >> v4: split and move dma_set_mask to probe hook (Alexander) >> v3: rework code to get rid of DMA_ERROR_CODE references, which is not >> defined on all architectures >> v2: replace incorrect comparison of dma_addr_t type var against NULL >> >> Ard Biesheuvel (3): >> drm/nouveau: set streaming DMA mask early >> drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit() >> hook >> drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit() >> hook > > The series, > > Reviewed-by: Alexandre Courbot <acourbot at nvidia.com> >Thank you Alexandre. So is there a nouveau subtree this should go into? Or is it up to Dave to pull it into drm? Regards, Ard.
Karol Herbst
2016-Oct-17 06:02 UTC
[Nouveau] [PATCH v5 0/3] drm/nouveau: set DMA mask before mapping scratch page
Hi Ard, the official nouveau tree is here: https://github.com/skeggsb/nouveau But ben usually takes care of everything if you post the patches on the ML. No idea what you discussed internally though, but alex also posts his suff here. On 16 Oct 2016 22:12, "Ard Biesheuvel" <ard.biesheuvel at linaro.org> wrote: On 7 October 2016 at 09:12, Alexandre Courbot <gnurou at gmail.com> wrote:> On Fri, Oct 7, 2016 at 12:49 AM, Ard Biesheuvel > <ard.biesheuvel at linaro.org> wrote: >> This v4 is now a 3 piece series (since v4), after Alexandre pointed outthat>> both GF 100 and NV50 are affected by the same issue, and that a relatedissue>> has been solved already for Tegra in commit 9d0394c6bed5 >> ("drm/nouveau/instmem/gk20a: set DMA mask early"). >> >> The issue that this series addresses is the fact that the Nouveau driver >> invokes the DMA API before setting the DMA mask. In both cases addressed >> here, these are simply static bidirectional mappings of scratch pageswhose>> purpose is not well understood, and in most cases, it does not matterthat>> these pages are always allocated below 4 GB even if the hardware canaccess>> memory much higher up. >> >> However, on platforms without any RAM below 4 GB, the preliminary DMAmask>> of 32 is preventing the nouveau driver from loading on GF100 and NV50 >> hardware with an error like the following one: >> >> nouveau 0000:02:00.0: enabling device (0000 -> 0003) >> nouveau 0000:02:00.0: NVIDIA GT218 (0a8280b1) >> nouveau 0000:02:00.0: bios: version 70.18.a6.00.00 >> nouveau 0000:02:00.0: fb ctor failed, -14 >> nouveau: probe of 0000:02:00.0 failed with error -14 >> >> So fix this by setting a preliminary DMA mask based on the MMU device'dma_bits'>> property (patch #1), and postpone mapping the scratch pages to therespective>> FB .init() hooks. (#2 and #3) >> >> v5: move setting of preliminary DMA mask to nvkm_device_pci_new() (#1) >> move allocation and DMA mapping of scratch pages to .oneinit hooks(#2, #3)>> v4: split and move dma_set_mask to probe hook (Alexander) >> v3: rework code to get rid of DMA_ERROR_CODE references, which is not >> defined on all architectures >> v2: replace incorrect comparison of dma_addr_t type var against NULL >> >> Ard Biesheuvel (3): >> drm/nouveau: set streaming DMA mask early >> drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit() >> hook >> drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit() >> hook > > The series, > > Reviewed-by: Alexandre Courbot <acourbot at nvidia.com> >Thank you Alexandre. So is there a nouveau subtree this should go into? Or is it up to Dave to pull it into drm? Regards, Ard. _______________________________________________ Nouveau mailing list Nouveau at lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/nouveau/attachments/20161017/dd13f85a/attachment.html>
Possibly Parallel Threads
- [PATCH v5 0/3] drm/nouveau: set DMA mask before mapping scratch page
- [PATCH v5 0/3] drm/nouveau: set DMA mask before mapping scratch page
- [PATCH v5 0/3] drm/nouveau: set DMA mask before mapping scratch page
- [PATCH v3] drm/nouveau/fb/nv50: set DMA mask before mapping scratch page
- [PATCH v4 0/3] drm/nouveau: set DMA mask before mapping scratch page