Guillaume Tucker
2018-Dec-07 23:31 UTC
[Nouveau] next/master boot bisection: Oops in nouveau driver on jetson-tk1
Please find below an automated bisection report for a kernel Oops
seen during the initialisation of the nouveau GPU driver on
jetson-tk1.

All the LAVA test jobs for this bisection can be found here:

  http://lava.baylibre.com:10080/scheduler/alljobs?length=25&search=lava-bisect-staging-7366#table

Here's the beginning of the Oops stack trace:

[    7.485361] [00000064] *pgd=f9e7b835
[    7.485372] Internal error: Oops: 17 [#1] SMP ARM
[    7.485376] Modules linked in: snd_soc_tegra_rt5640(+) snd_soc_tegra_utils snd_soc_rt5640(+) snd_soc_rl6231 snd_soc_tegra30_ahub snd_hda_tegra snd_soc_core snd_hda_codec snd_hda_core ac97_bus snd_pcm_dmaengine snd_pcm xhci_tegra(+) snd_timer snd soundcore nouveau(+) ttm tegra_devfreq tegra_wdt
[    7.542227] CPU: 1 PID: 128 Comm: udevd Not tainted 4.20.0-rc5-next-20181206 #44
[    7.549603] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[    7.555859] PC is at drm_plane_register_all+0x18/0x50
[    7.560899] LR is at drm_modeset_register_all+0xc/0x6c

Full log:

  http://lava.baylibre.com:10080/scheduler/job/68628#L816

The bisection was run from next-20181206 as this is where the
issue was discovered on kernelci.org but the patch it found has
already been merged in mainline.

Hope this helps!

Guillaume

-----------8<------------------------8<-----------

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* This automated bisection report was sent to you on the basis *
* that you may be involved with the breaking commit it has      *
* found. No manual investigation has been done to verify it,    *
* and the root cause of the problem may be somewhere else.      *
* Hope this helps!                                              *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Bisection result for next/master (next-20181206) on jetson-tk1

  Good:       84df9525b0c2 Linux 4.19
  Bad:        4c92b7b3080d Add linux-next specific files for 20181206
  Found:      cfea88a4d866 drm/nouveau: Start using new drm_dev initialization helpers

Checks:
  revert:     PASS
  verify:     PASS

Parameters:
  Tree:       next
  URL:        None
  Branch:     master
  Target:     jetson-tk1
  Lab:        lab-baylibre
  Config:     multi_v7_defconfig
  Plan:       dmesg-nouveau

Breaking commit found:

-------------------------------------------------------------------------------
commit cfea88a4d86632f28cf80be97079f131645b7869
Author: Lyude Paul <lyude at redhat.com>
Date:   Wed Aug 22 21:40:07 2018 -0400

    drm/nouveau: Start using new drm_dev initialization helpers

    Per the documentation in drm_get_pci_dev(), this function is deprecated
    and shouldn't be used anymore. As it turns out, we're going to need to
    stop using drm_get_pci_dev() anyway in order to allow us to turn off the
    card before full system shutdowns, otherwise we'll hit race conditions
    with userspace while trying to tear down the card on shutdown.

    So, start using drm_dev_get() and drm_dev_put(), and just turn our
    load/unload callbacks into open coded init/fini() functions.

    Signed-off-by: Lyude Paul <lyude at redhat.com>
    Cc: Karol Herbst <kherbst at redhat.com>
    Signed-off-by: Ben Skeggs <bskeggs at redhat.com>

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 905956809d21..2b2baf6e0e0d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -458,75 +458,8 @@ nouveau_accel_init(struct nouveau_drm *drm)
 	nouveau_bo_move_init(drm);
 }
 
-static int nouveau_drm_probe(struct pci_dev *pdev,
-			     const struct pci_device_id *pent)
-{
-	struct nvkm_device *device;
-	struct apertures_struct *aper;
-	bool boot = false;
-	int ret;
-
-	if (vga_switcheroo_client_probe_defer(pdev))
-		return -EPROBE_DEFER;
-
-	/* We need to check that the chipset is supported before booting
-	 * fbdev off the hardware, as there's no way to put it back.
-	 */
-	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0, &device);
-	if (ret)
-		return ret;
-
-	nvkm_device_del(&device);
-
-	/* Remove conflicting drivers (vesafb, efifb etc). */
-	aper = alloc_apertures(3);
-	if (!aper)
-		return -ENOMEM;
-
-	aper->ranges[0].base = pci_resource_start(pdev, 1);
-	aper->ranges[0].size = pci_resource_len(pdev, 1);
-	aper->count = 1;
-
-	if (pci_resource_len(pdev, 2)) {
-		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
-		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
-		aper->count++;
-	}
-
-	if (pci_resource_len(pdev, 3)) {
-		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
-		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
-		aper->count++;
-	}
-
-#ifdef CONFIG_X86
-	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
-#endif
-	if (nouveau_modeset != 2)
-		drm_fb_helper_remove_conflicting_framebuffers(aper, "nouveaufb", boot);
-	kfree(aper);
-
-	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
-				  true, true, ~0ULL, &device);
-	if (ret)
-		return ret;
-
-	pci_set_master(pdev);
-
-	if (nouveau_atomic)
-		driver_pci.driver_features |= DRIVER_ATOMIC;
-
-	ret = drm_get_pci_dev(pdev, pent, &driver_pci);
-	if (ret) {
-		nvkm_device_del(&device);
-		return ret;
-	}
-
-	return 0;
-}
-
 static int
-nouveau_drm_load(struct drm_device *dev, unsigned long flags)
+nouveau_drm_device_init(struct drm_device *dev)
 {
 	struct nouveau_drm *drm;
 	int ret;
@@ -613,7 +546,7 @@ nouveau_drm_load(struct drm_device *dev, unsigned long flags)
 }
 
 static void
-nouveau_drm_unload(struct drm_device *dev)
+nouveau_drm_device_fini(struct drm_device *dev)
 {
 	struct nouveau_drm *drm = nouveau_drm(dev);
 
@@ -642,18 +575,116 @@ nouveau_drm_unload(struct drm_device *dev)
 	kfree(drm);
 }
 
+static int nouveau_drm_probe(struct pci_dev *pdev,
+			     const struct pci_device_id *pent)
+{
+	struct nvkm_device *device;
+	struct drm_device *drm_dev;
+	struct apertures_struct *aper;
+	bool boot = false;
+	int ret;
+
+	if (vga_switcheroo_client_probe_defer(pdev))
+		return -EPROBE_DEFER;
+
+	/* We need to check that the chipset is supported before booting
+	 * fbdev off the hardware, as there's no way to put it back.
+	 */
+	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0, &device);
+	if (ret)
+		return ret;
+
+	nvkm_device_del(&device);
+
+	/* Remove conflicting drivers (vesafb, efifb etc). */
+	aper = alloc_apertures(3);
+	if (!aper)
+		return -ENOMEM;
+
+	aper->ranges[0].base = pci_resource_start(pdev, 1);
+	aper->ranges[0].size = pci_resource_len(pdev, 1);
+	aper->count = 1;
+
+	if (pci_resource_len(pdev, 2)) {
+		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
+		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
+		aper->count++;
+	}
+
+	if (pci_resource_len(pdev, 3)) {
+		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
+		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
+		aper->count++;
+	}
+
+#ifdef CONFIG_X86
+	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
+#endif
+	if (nouveau_modeset != 2)
+		drm_fb_helper_remove_conflicting_framebuffers(aper, "nouveaufb", boot);
+	kfree(aper);
+
+	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
+				  true, true, ~0ULL, &device);
+	if (ret)
+		return ret;
+
+	pci_set_master(pdev);
+
+	if (nouveau_atomic)
+		driver_pci.driver_features |= DRIVER_ATOMIC;
+
+	drm_dev = drm_dev_alloc(&driver_pci, &pdev->dev);
+	if (IS_ERR(drm_dev)) {
+		ret = PTR_ERR(drm_dev);
+		goto fail_nvkm;
+	}
+
+	ret = pci_enable_device(pdev);
+	if (ret)
+		goto fail_drm;
+
+	drm_dev->pdev = pdev;
+	pci_set_drvdata(pdev, drm_dev);
+
+	ret = nouveau_drm_device_init(drm_dev);
+	if (ret)
+		goto fail_pci;
+
+	ret = drm_dev_register(drm_dev, pent->driver_data);
+	if (ret)
+		goto fail_drm_dev_init;
+
+	return 0;
+
+fail_drm_dev_init:
+	nouveau_drm_device_fini(drm_dev);
+fail_pci:
+	pci_disable_device(pdev);
+fail_drm:
+	drm_dev_put(drm_dev);
+fail_nvkm:
+	nvkm_device_del(&device);
+	return ret;
+}
+
 void
 nouveau_drm_device_remove(struct drm_device *dev)
 {
+	struct pci_dev *pdev = dev->pdev;
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct nvkm_client *client;
 	struct nvkm_device *device;
 
+	drm_dev_unregister(dev);
+
 	dev->irq_enabled = false;
 	client = nvxx_client(&drm->client.base);
 	device = nvkm_device_find(client->device);
-	drm_put_dev(dev);
 
+	nouveau_drm_device_fini(dev);
+	pci_disable_device(pdev);
+	drm_dev_put(dev);
 	nvkm_device_del(&device);
 }
 
@@ -1020,8 +1051,6 @@ driver_stub = {
 		DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME | DRIVER_RENDER |
 		DRIVER_KMS_LEGACY_CONTEXT,
 
-	.load = nouveau_drm_load,
-	.unload = nouveau_drm_unload,
 	.open = nouveau_drm_open,
 	.postclose = nouveau_drm_postclose,
 	.lastclose = nouveau_vga_lastclose,
-------------------------------------------------------------------------------


Git bisection log:

-------------------------------------------------------------------------------
git bisect start
# good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
git bisect good 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
# bad: [4c92b7b3080d8281941ae81c51cd62bb49bdc3d4] Add linux-next specific files for 20181206
git bisect bad 4c92b7b3080d8281941ae81c51cd62bb49bdc3d4
# bad: [c38239b4be1ac7e4bcf5bbd971353bae51525b8f] Merge branch 'parisc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
git bisect bad c38239b4be1ac7e4bcf5bbd971353bae51525b8f
# good: [d49f8a52b15bf35db778035340d8a673149f9f93] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
git bisect good d49f8a52b15bf35db778035340d8a673149f9f93
# good: [ac747c0715f29c2be3848b719a1b7e65b07f7b21] Merge tag 'kbuild-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
git bisect good ac747c0715f29c2be3848b719a1b7e65b07f7b21
# bad: [46972c03ab667dc298cad0c9db517fb9b1521b5f] Merge tag 'drm-misc-next-fixes-2018-10-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
git bisect bad 46972c03ab667dc298cad0c9db517fb9b1521b5f
# good: [6ac99a328ee16d3f8cc253f1df62623cee3e9ea5] drm/exynos: mixer: Make plane alpha configurable
git bisect good 6ac99a328ee16d3f8cc253f1df62623cee3e9ea5
# good: [0957dc7097a3f462f6cedb45cf9b9785cc29e5bb] drm/amdgpu: revert "stop using gart_start as offset for the GTT domain"
git bisect good 0957dc7097a3f462f6cedb45cf9b9785cc29e5bb
# good: [2de0b0a158bf423208c3898522c8fa1c1078df48] Merge tag 'drm/tegra/for-4.20-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
git bisect good 2de0b0a158bf423208c3898522c8fa1c1078df48
# good: [04b96b63c5640a305e30611def7a9c5fcd7a72cf] drm/msm/dpu: Remove unneeded checks in dpu_crtc.c
git bisect good 04b96b63c5640a305e30611def7a9c5fcd7a72cf
# good: [6952e3a1dffcb931cf8625aa01642b9afac2af61] Merge branch 'for-upstream/mali-dp' of git://linux-arm.org/linux-ld into drm-next
git bisect good 6952e3a1dffcb931cf8625aa01642b9afac2af61
# good: [62e681f7dcab746412dce22d4b75b32c5ea38cdb] Merge tag 'drm-msm-fixes-2018-10-09' of git://people.freedesktop.org/~robclark/linux into drm-next
git bisect good 62e681f7dcab746412dce22d4b75b32c5ea38cdb
# bad: [7e6191d4360a2df6cf2a2613dcb79680cb943df8] Merge branch 'linux-4.20' of git://github.com/skeggsb/linux into drm-next
git bisect bad 7e6191d4360a2df6cf2a2613dcb79680cb943df8
# good: [c4cee69a4497d9c6ad8868d63568b30e50cac9e9] drm/nouveau: Fix potential memory leak in nouveau_drm_load()
git bisect good c4cee69a4497d9c6ad8868d63568b30e50cac9e9
# bad: [a971558c298755d2c07bc5508c65d689471763c8] drm/nouveau/disp: keep track of high-speed state, program into clock
git bisect bad a971558c298755d2c07bc5508c65d689471763c8
# bad: [4126b99e744b7a29746e201e2be6644d2edf3c56] drm/nouveau/disp: add a way to configure scrambling/tmds for hdmi 2.0
git bisect bad 4126b99e744b7a29746e201e2be6644d2edf3c56
# bad: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
git bisect bad cfea88a4d86632f28cf80be97079f131645b7869
# first bad commit: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using new drm_dev initialization helpers
-------------------------------------------------------------------------------
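
For readers unfamiliar with how a bisection narrows a Good/Bad range down
to a single commit, here is a minimal sketch in C. It assumes a purely
linear history in which every commit after the first bad one is also bad;
real git bisect walks a commit graph with merges, so this is only an
illustration of the halving idea. The names bisect_first_bad, boots_before
and commit_is_good_fn are made up for this example and are not part of git
or of the kernelci tooling.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical test hook: returns true if the commit at `index` is good. */
typedef bool (*commit_is_good_fn)(size_t index, void *ctx);

/*
 * Find the first bad commit in a linear history of n commits (n >= 2),
 * assuming index 0 is known good, index n - 1 is known bad, and every
 * commit after the first bad one is bad too.
 */
size_t bisect_first_bad(size_t n, commit_is_good_fn is_good, void *ctx)
{
	size_t good = 0;	/* highest index known to be good */
	size_t bad = n - 1;	/* lowest index known to be bad */

	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_good(mid, ctx))
			good = mid;	/* regression is after mid */
		else
			bad = mid;	/* regression is at or before mid */
	}

	return bad;
}

/* Toy predicate: commits before the index stored in ctx are good. */
bool boots_before(size_t index, void *ctx)
{
	const size_t *first_bad = ctx;

	return index < *first_bad;
}
```

Each iteration corresponds to one "git bisect good" or "git bisect bad"
line in the log above: the candidate range is halved until only one commit
is left.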
Lyude Paul
2018-Dec-08 00:08 UTC
[Nouveau] next/master boot bisection: Oops in nouveau driver on jetson-tk1
uhhhhhhhhhhhhh
didn't we fix this weeks ago? with "drm/nouveau: tegra: Call
nouveau_drm_device_init()"

Cheers,
	Lyude Paul
Guillaume Tucker
2018-Dec-10 10:00 UTC
[Nouveau] next/master boot bisection: Oops in nouveau driver on jetson-tk1
On 08/12/2018 00:08, Lyude Paul wrote:
> uhhhhhhhhhhhhh
> didn't we fix this weeks ago? with "drm/nouveau: tegra: Call
> nouveau_drm_device_init()"

Yes, here's the fix from Thierry:

  https://patchwork.freedesktop.org/patch/263587/

and I can confirm that it does fix the Oops when applied on top
of next-20181206 (what I used for the bisection last week):

  http://lava.baylibre.com:10080/scheduler/job/71109

However, the fix doesn't appear to have been applied in any
upstream tree yet.

Guillaume
nouveau_drm_load() >> git bisect good c4cee69a4497d9c6ad8868d63568b30e50cac9e9 >> # bad: [a971558c298755d2c07bc5508c65d689471763c8] drm/nouveau/disp: keep >> track of high-speed state, program into clock >> git bisect bad a971558c298755d2c07bc5508c65d689471763c8 >> # bad: [4126b99e744b7a29746e201e2be6644d2edf3c56] drm/nouveau/disp: add a >> way to configure scrambling/tmds for hdmi 2.0 >> git bisect bad 4126b99e744b7a29746e201e2be6644d2edf3c56 >> # bad: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using >> new drm_dev initialization helpers >> git bisect bad cfea88a4d86632f28cf80be97079f131645b7869 >> # first bad commit: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: >> Start using new drm_dev initialization helpers >> ---------------------------------------------------------------------------- >> ---
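
For anyone reading the quoted patch without much kernel background: the error path in the new nouveau_drm_probe() (the fail_drm_dev_init, fail_pci, fail_drm and fail_nvkm labels) follows the usual goto-unwind idiom, where each label undoes one earlier step and falls through to the labels for the steps before it. Below is a minimal user-space sketch of that idiom only. The step names, the `state` bitmask, and the `acquire()`/`release()`/`sketch_probe()` helpers are hypothetical stand-ins for illustration, not DRM or PCI calls, and this makes no claim about where the reported Oops comes from.

```c
/* Hypothetical stand-ins for the acquisition steps in the quoted probe
 * function. These are NOT real DRM or PCI calls. Each bit set in `state`
 * records that the matching resource is currently held. */
enum { HELD_ALLOC = 1, HELD_ENABLE = 2, HELD_INIT = 4, HELD_REGISTER = 8 };

static unsigned int state;

/* Take one resource, or fail if its bit is set in fail_mask. */
static int acquire(unsigned int bit, unsigned int fail_mask)
{
	if (fail_mask & bit)
		return -1;	/* simulated failure of this step */
	state |= bit;
	return 0;
}

static void release(unsigned int bit)
{
	state &= ~bit;
}

/* Returns 0 on success. On failure returns -1 and leaves nothing held. */
static int sketch_probe(unsigned int fail_mask)
{
	int ret;

	state = 0;

	ret = acquire(HELD_ALLOC, fail_mask);
	if (ret)
		return ret;	/* nothing to undo yet */

	ret = acquire(HELD_ENABLE, fail_mask);
	if (ret)
		goto fail_alloc;

	ret = acquire(HELD_INIT, fail_mask);
	if (ret)
		goto fail_enable;

	ret = acquire(HELD_REGISTER, fail_mask);
	if (ret)
		goto fail_init;

	return 0;

	/* Labels run in reverse order of acquisition and fall through. */
fail_init:
	release(HELD_INIT);
fail_enable:
	release(HELD_ENABLE);
fail_alloc:
	release(HELD_ALLOC);
	return ret;
}
```

Calling `sketch_probe(HELD_REGISTER)` simulates a failure in the last step: the three earlier resources are released in reverse order and `state` ends up zero, which is the property the labels in the quoted probe function are arranged to provide.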