Karol Herbst
2019-May-07 20:12 UTC
[Nouveau] [PATCH v2 0/4] Potential fix for runpm issues on various laptops
CCing linux-pci and Bjorn Helgaas. Maybe we could get better insights on how a reasonable fix would look like. Anyway, to me this entire issue looks like something which has to be fixed on a PCI level instead of inside a driver, so it makes sense to ask the pci folks if they have any better suggestions. Original cover letter: While investigating the runpm issues on my GP107 I noticed that something inside devinit makes runpm break. If Nouveau loads up to the point right before doing devinit, runpm works without any issues, if devinit is ran, not anymore. Out of curiousity I even tried to "bisect" devinit by not running it on vbios provided signed PMU image, but on the devinit parser we have inside Nouveau. Allthough this one isn't as feature complete as the vbios one, I was able to reproduce the runpm issues as well. From that point I was able to only run a certain amount of commands until I got to some PCIe initialization code inside devinit which trigger those runpm issues. Devinit on my GPU was changing the PCIe link from 8.0 to 2.5, reversing that on the fini path makes runpm work again. There are a few other things going on, but with my limited knowledge about PCIe in general, the change in the link speed sounded like it could cause issues on resume if the controller and the device disagree on the actual link. Maybe this is just a bug within the PCI subsystem inside linux instead and the controller has to be forced to do _something_? Anyway, with this runpm seems to work nicely on my machine. Secure booting the gr (even with my workaround applied I need anyway) might fail after the GPU got runtime resumed though... Karol Herbst (4): drm: don't set the pci power state if the pci subsystem handles the ACPI bits pci: enable pcie link changes for pascal pci: add nvkm_pcie_get_speed pci: save the boot pcie link speed and restore it on fini drm/nouveau/include/nvkm/subdev/pci.h | 6 +++-- drm/nouveau/nouveau_acpi.c | 7 +++++- drm/nouveau/nouveau_acpi.h | 2 ++ drm/nouveau/nouveau_drm.c | 14 +++++++++--- drm/nouveau/nouveau_drv.h | 2 ++ drm/nouveau/nvkm/subdev/pci/base.c | 9 ++++++-- drm/nouveau/nvkm/subdev/pci/gk104.c | 8 +++---- drm/nouveau/nvkm/subdev/pci/gp100.c | 10 +++++++++ drm/nouveau/nvkm/subdev/pci/pcie.c | 32 +++++++++++++++++++++++---- drm/nouveau/nvkm/subdev/pci/priv.h | 7 ++++++ 10 files changed, 81 insertions(+), 16 deletions(-) -- 2.21.0
Karol Herbst
2019-May-07 20:12 UTC
[Nouveau] [PATCH v2 1/4] drm: don't set the pci power state if the pci subsystem handles the ACPI bits
v2: rework detection of if Nouveau calling a DSM method or not Signed-off-by: Karol Herbst <kherbst at redhat.com> --- drm/nouveau/nouveau_acpi.c | 7 ++++++- drm/nouveau/nouveau_acpi.h | 2 ++ drm/nouveau/nouveau_drm.c | 14 +++++++++++--- drm/nouveau/nouveau_drv.h | 2 ++ 4 files changed, 21 insertions(+), 4 deletions(-) diff --git a/drm/nouveau/nouveau_acpi.c b/drm/nouveau/nouveau_acpi.c index ffb19585..92dfc900 100644 --- a/drm/nouveau/nouveau_acpi.c +++ b/drm/nouveau/nouveau_acpi.c @@ -358,6 +358,12 @@ void nouveau_register_dsm_handler(void) vga_switcheroo_register_handler(&nouveau_dsm_handler, 0); } +bool nouveau_runpm_calls_dsm(void) +{ + return nouveau_dsm_priv.optimus_detected && + !nouveau_dsm_priv.optimus_skip_dsm; +} + /* Must be called for Optimus models before the card can be turned off */ void nouveau_switcheroo_optimus_dsm(void) { @@ -371,7 +377,6 @@ void nouveau_switcheroo_optimus_dsm(void) nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, NOUVEAU_DSM_OPTIMUS_CAPS, NOUVEAU_DSM_OPTIMUS_SET_POWERDOWN, &result); - } void nouveau_unregister_dsm_handler(void) diff --git a/drm/nouveau/nouveau_acpi.h b/drm/nouveau/nouveau_acpi.h index b86294fc..0f5d7aa0 100644 --- a/drm/nouveau/nouveau_acpi.h +++ b/drm/nouveau/nouveau_acpi.h @@ -13,6 +13,7 @@ void nouveau_switcheroo_optimus_dsm(void); int nouveau_acpi_get_bios_chunk(uint8_t *bios, int offset, int len); bool nouveau_acpi_rom_supported(struct device *); void *nouveau_acpi_edid(struct drm_device *, struct drm_connector *); +bool nouveau_runpm_calls_dsm(void); #else static inline bool nouveau_is_optimus(void) { return false; }; static inline bool nouveau_is_v1_dsm(void) { return false; }; @@ -22,6 +23,7 @@ static inline void nouveau_switcheroo_optimus_dsm(void) {} static inline bool nouveau_acpi_rom_supported(struct device *dev) { return false; } static inline int nouveau_acpi_get_bios_chunk(uint8_t *bios, int offset, int len) { return -EINVAL; } static inline void *nouveau_acpi_edid(struct drm_device *dev, struct drm_connector *connector) { return NULL; } +static inline bool nouveau_runpm_calls_dsm(void) { return false; } #endif #endif diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c index 5020265b..09e68e61 100644 --- a/drm/nouveau/nouveau_drm.c +++ b/drm/nouveau/nouveau_drm.c @@ -556,6 +556,7 @@ nouveau_drm_device_init(struct drm_device *dev) nouveau_fbcon_init(dev); nouveau_led_init(dev); + drm->runpm_dsm = nouveau_runpm_calls_dsm(); if (nouveau_pmops_runtime()) { pm_runtime_use_autosuspend(dev->dev); pm_runtime_set_autosuspend_delay(dev->dev, 5000); @@ -903,6 +904,7 @@ nouveau_pmops_runtime_suspend(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); struct drm_device *drm_dev = pci_get_drvdata(pdev); + struct nouveau_drm *drm = nouveau_drm(drm_dev); int ret; if (!nouveau_pmops_runtime()) { @@ -910,12 +912,15 @@ nouveau_pmops_runtime_suspend(struct device *dev) return -EBUSY; } + drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING; nouveau_switcheroo_optimus_dsm(); ret = nouveau_do_suspend(drm_dev, true); pci_save_state(pdev); pci_disable_device(pdev); pci_ignore_hotplug(pdev); - pci_set_power_state(pdev, PCI_D3cold); + if (drm->runpm_dsm) + pci_set_power_state(pdev, PCI_D3cold); + drm_dev->switch_power_state = DRM_SWITCH_POWER_DYNAMIC_OFF; return ret; } @@ -925,7 +930,8 @@ nouveau_pmops_runtime_resume(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); struct drm_device *drm_dev = pci_get_drvdata(pdev); - struct nvif_device *device = &nouveau_drm(drm_dev)->client.device; + struct nouveau_drm *drm = nouveau_drm(drm_dev); + struct nvif_device *device = &drm->client.device; int ret; if (!nouveau_pmops_runtime()) { @@ -933,7 +939,9 @@ nouveau_pmops_runtime_resume(struct device *dev) return -EBUSY; } - pci_set_power_state(pdev, PCI_D0); + drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING; + if (drm->runpm_dsm) + pci_set_power_state(pdev, PCI_D0); pci_restore_state(pdev); ret = pci_enable_device(pdev); if (ret) diff --git a/drm/nouveau/nouveau_drv.h b/drm/nouveau/nouveau_drv.h index da847244..941600e9 100644 --- a/drm/nouveau/nouveau_drv.h +++ b/drm/nouveau/nouveau_drv.h @@ -214,6 +214,8 @@ struct nouveau_drm { struct nouveau_svm *svm; struct nouveau_dmem *dmem; + + bool runpm_dsm; }; static inline struct nouveau_drm * -- 2.21.0
Karol Herbst
2019-May-07 20:12 UTC
[Nouveau] [PATCH v2 2/4] pci: enable pcie link changes for pascal
Signed-off-by: Karol Herbst <kherbst at redhat.com> Reviewed-by: Lyude Paul <lyude at redhat.com> --- drm/nouveau/nvkm/subdev/pci/gk104.c | 8 ++++---- drm/nouveau/nvkm/subdev/pci/gp100.c | 10 ++++++++++ drm/nouveau/nvkm/subdev/pci/priv.h | 5 +++++ 3 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drm/nouveau/nvkm/subdev/pci/gk104.c b/drm/nouveau/nvkm/subdev/pci/gk104.c index e6803050..66489018 100644 --- a/drm/nouveau/nvkm/subdev/pci/gk104.c +++ b/drm/nouveau/nvkm/subdev/pci/gk104.c @@ -23,7 +23,7 @@ */ #include "priv.h" -static int +int gk104_pcie_version_supported(struct nvkm_pci *pci) { return (nvkm_rd32(pci->subdev.device, 0x8c1c0) & 0x4) == 0x4 ? 2 : 1; @@ -108,7 +108,7 @@ gk104_pcie_lnkctl_speed(struct nvkm_pci *pci) return -1; } -static enum nvkm_pcie_speed +enum nvkm_pcie_speed gk104_pcie_max_speed(struct nvkm_pci *pci) { u32 max_speed = nvkm_rd32(pci->subdev.device, 0x8c1c0) & 0x300000; @@ -146,7 +146,7 @@ gk104_pcie_set_link_speed(struct nvkm_pci *pci, enum nvkm_pcie_speed speed) nvkm_mask(device, 0x8c040, 0x1, 0x1); } -static int +int gk104_pcie_init(struct nvkm_pci * pci) { enum nvkm_pcie_speed lnkctl_speed, max_speed, cap_speed; @@ -178,7 +178,7 @@ gk104_pcie_init(struct nvkm_pci * pci) return 0; } -static int +int gk104_pcie_set_link(struct nvkm_pci *pci, enum nvkm_pcie_speed speed, u8 width) { struct nvkm_subdev *subdev = &pci->subdev; diff --git a/drm/nouveau/nvkm/subdev/pci/gp100.c b/drm/nouveau/nvkm/subdev/pci/gp100.c index 82c5234a..eb19c7a4 100644 --- a/drm/nouveau/nvkm/subdev/pci/gp100.c +++ b/drm/nouveau/nvkm/subdev/pci/gp100.c @@ -35,6 +35,16 @@ gp100_pci_func = { .wr08 = nv40_pci_wr08, .wr32 = nv40_pci_wr32, .msi_rearm = gp100_pci_msi_rearm, + + .pcie.init = gk104_pcie_init, + .pcie.set_link = gk104_pcie_set_link, + + .pcie.max_speed = gk104_pcie_max_speed, + .pcie.cur_speed = g84_pcie_cur_speed, + + .pcie.set_version = gf100_pcie_set_version, + .pcie.version = gf100_pcie_version, + .pcie.version_supported = gk104_pcie_version_supported, }; int diff --git a/drm/nouveau/nvkm/subdev/pci/priv.h b/drm/nouveau/nvkm/subdev/pci/priv.h index c17f6063..a0d4c007 100644 --- a/drm/nouveau/nvkm/subdev/pci/priv.h +++ b/drm/nouveau/nvkm/subdev/pci/priv.h @@ -54,6 +54,11 @@ int gf100_pcie_cap_speed(struct nvkm_pci *); int gf100_pcie_init(struct nvkm_pci *); int gf100_pcie_set_link(struct nvkm_pci *, enum nvkm_pcie_speed, u8); +int gk104_pcie_init(struct nvkm_pci *); +int gk104_pcie_set_link(struct nvkm_pci *, enum nvkm_pcie_speed, u8 width); +enum nvkm_pcie_speed gk104_pcie_max_speed(struct nvkm_pci *); +int gk104_pcie_version_supported(struct nvkm_pci *); + int nvkm_pcie_oneinit(struct nvkm_pci *); int nvkm_pcie_init(struct nvkm_pci *); #endif -- 2.21.0
Signed-off-by: Karol Herbst <kherbst at redhat.com> Reviewed-by: Lyude Paul <lyude at redhat.com> --- drm/nouveau/include/nvkm/subdev/pci.h | 1 + drm/nouveau/nvkm/subdev/pci/pcie.c | 8 ++++++++ 2 files changed, 9 insertions(+) diff --git a/drm/nouveau/include/nvkm/subdev/pci.h b/drm/nouveau/include/nvkm/subdev/pci.h index 23803cc8..1fdf3098 100644 --- a/drm/nouveau/include/nvkm/subdev/pci.h +++ b/drm/nouveau/include/nvkm/subdev/pci.h @@ -53,4 +53,5 @@ int gp100_pci_new(struct nvkm_device *, int, struct nvkm_pci **); /* pcie functions */ int nvkm_pcie_set_link(struct nvkm_pci *, enum nvkm_pcie_speed, u8 width); +enum nvkm_pcie_speed nvkm_pcie_get_speed(struct nvkm_pci *); #endif diff --git a/drm/nouveau/nvkm/subdev/pci/pcie.c b/drm/nouveau/nvkm/subdev/pci/pcie.c index d71e5db5..70ccbe0d 100644 --- a/drm/nouveau/nvkm/subdev/pci/pcie.c +++ b/drm/nouveau/nvkm/subdev/pci/pcie.c @@ -163,3 +163,11 @@ nvkm_pcie_set_link(struct nvkm_pci *pci, enum nvkm_pcie_speed speed, u8 width) return ret; } + +enum nvkm_pcie_speed +nvkm_pcie_get_speed(struct nvkm_pci *pci) +{ + if (!pci || !pci_is_pcie(pci->pdev) || !pci->pcie.cur_speed) + return -ENODEV; + return pci->func->pcie.cur_speed(pci); +} -- 2.21.0
Karol Herbst
2019-May-07 20:12 UTC
[Nouveau] [PATCH v2 4/4] pci: save the boot pcie link speed and restore it on fini
Apperantly things go south if we suspend the device with a different PCIE link speed set than it got booted with. Fixes runtime suspend on my gp107. This all looks like some bug inside the pci subsystem and I would prefer a fix there instead of nouveau, but maybe there is no real nice way of doing that outside of drivers? v2: squashed together patch 4 and 5 Signed-off-by: Karol Herbst <kherbst at redhat.com> Reviewed-by: Lyude Paul <lyude at redhat.com> --- drm/nouveau/include/nvkm/subdev/pci.h | 5 +++-- drm/nouveau/nvkm/subdev/pci/base.c | 9 +++++++-- drm/nouveau/nvkm/subdev/pci/pcie.c | 24 ++++++++++++++++++++---- drm/nouveau/nvkm/subdev/pci/priv.h | 2 ++ 4 files changed, 32 insertions(+), 8 deletions(-) diff --git a/drm/nouveau/include/nvkm/subdev/pci.h b/drm/nouveau/include/nvkm/subdev/pci.h index 1fdf3098..b23793a2 100644 --- a/drm/nouveau/include/nvkm/subdev/pci.h +++ b/drm/nouveau/include/nvkm/subdev/pci.h @@ -26,8 +26,9 @@ struct nvkm_pci { } agp; struct { - enum nvkm_pcie_speed speed; - u8 width; + enum nvkm_pcie_speed cur_speed; + enum nvkm_pcie_speed def_speed; + u8 cur_width; } pcie; bool msi; diff --git a/drm/nouveau/nvkm/subdev/pci/base.c b/drm/nouveau/nvkm/subdev/pci/base.c index ee2431a7..d9fb5a83 100644 --- a/drm/nouveau/nvkm/subdev/pci/base.c +++ b/drm/nouveau/nvkm/subdev/pci/base.c @@ -90,6 +90,8 @@ nvkm_pci_fini(struct nvkm_subdev *subdev, bool suspend) if (pci->agp.bridge) nvkm_agp_fini(pci); + else if (pci_is_pcie(pci->pdev)) + nvkm_pcie_fini(pci); return 0; } @@ -100,6 +102,8 @@ nvkm_pci_preinit(struct nvkm_subdev *subdev) struct nvkm_pci *pci = nvkm_pci(subdev); if (pci->agp.bridge) nvkm_agp_preinit(pci); + else if (pci_is_pcie(pci->pdev)) + nvkm_pcie_preinit(pci); return 0; } @@ -193,8 +197,9 @@ nvkm_pci_new_(const struct nvkm_pci_func *func, struct nvkm_device *device, pci->func = func; pci->pdev = device->func->pci(device)->pdev; pci->irq = -1; - pci->pcie.speed = -1; - pci->pcie.width = -1; + pci->pcie.cur_speed = -1; + pci->pcie.def_speed = -1; + pci->pcie.cur_width = -1; if (device->type == NVKM_DEVICE_AGP) nvkm_agp_ctor(pci); diff --git a/drm/nouveau/nvkm/subdev/pci/pcie.c b/drm/nouveau/nvkm/subdev/pci/pcie.c index 70ccbe0d..731dd30e 100644 --- a/drm/nouveau/nvkm/subdev/pci/pcie.c +++ b/drm/nouveau/nvkm/subdev/pci/pcie.c @@ -85,6 +85,13 @@ nvkm_pcie_oneinit(struct nvkm_pci *pci) return 0; } +int +nvkm_pcie_preinit(struct nvkm_pci *pci) +{ + pci->pcie.def_speed = nvkm_pcie_get_speed(pci); + return 0; +} + int nvkm_pcie_init(struct nvkm_pci *pci) { @@ -105,12 +112,21 @@ nvkm_pcie_init(struct nvkm_pci *pci) if (pci->func->pcie.init) pci->func->pcie.init(pci); - if (pci->pcie.speed != -1) - nvkm_pcie_set_link(pci, pci->pcie.speed, pci->pcie.width); + if (pci->pcie.cur_speed != -1) + nvkm_pcie_set_link(pci, pci->pcie.cur_speed, + pci->pcie.cur_width); return 0; } +int +nvkm_pcie_fini(struct nvkm_pci *pci) +{ + if (!IS_ERR_VALUE(pci->pcie.def_speed)) + return nvkm_pcie_set_link(pci, pci->pcie.def_speed, 16); + return 0; +} + int nvkm_pcie_set_link(struct nvkm_pci *pci, enum nvkm_pcie_speed speed, u8 width) { @@ -146,8 +162,8 @@ nvkm_pcie_set_link(struct nvkm_pci *pci, enum nvkm_pcie_speed speed, u8 width) speed = max_speed; } - pci->pcie.speed = speed; - pci->pcie.width = width; + pci->pcie.cur_speed = speed; + pci->pcie.cur_width = width; if (speed == cur_speed) { nvkm_debug(subdev, "requested matches current speed\n"); diff --git a/drm/nouveau/nvkm/subdev/pci/priv.h b/drm/nouveau/nvkm/subdev/pci/priv.h index a0d4c007..e7744671 100644 --- a/drm/nouveau/nvkm/subdev/pci/priv.h +++ b/drm/nouveau/nvkm/subdev/pci/priv.h @@ -60,5 +60,7 @@ enum nvkm_pcie_speed gk104_pcie_max_speed(struct nvkm_pci *); int gk104_pcie_version_supported(struct nvkm_pci *); int nvkm_pcie_oneinit(struct nvkm_pci *); +int nvkm_pcie_preinit(struct nvkm_pci *); int nvkm_pcie_init(struct nvkm_pci *); +int nvkm_pcie_fini(struct nvkm_pci *); #endif -- 2.21.0
Lyude Paul
2019-May-08 19:10 UTC
[Nouveau] [PATCH v2 1/4] drm: don't set the pci power state if the pci subsystem handles the ACPI bits
Reviewed-by: Lyude Paul <lyude at redhat.com> On Tue, 2019-05-07 at 22:12 +0200, Karol Herbst wrote:> v2: rework detection of if Nouveau calling a DSM method or not > > Signed-off-by: Karol Herbst <kherbst at redhat.com> > --- > drm/nouveau/nouveau_acpi.c | 7 ++++++- > drm/nouveau/nouveau_acpi.h | 2 ++ > drm/nouveau/nouveau_drm.c | 14 +++++++++++--- > drm/nouveau/nouveau_drv.h | 2 ++ > 4 files changed, 21 insertions(+), 4 deletions(-) > > diff --git a/drm/nouveau/nouveau_acpi.c b/drm/nouveau/nouveau_acpi.c > index ffb19585..92dfc900 100644 > --- a/drm/nouveau/nouveau_acpi.c > +++ b/drm/nouveau/nouveau_acpi.c > @@ -358,6 +358,12 @@ void nouveau_register_dsm_handler(void) > vga_switcheroo_register_handler(&nouveau_dsm_handler, 0); > } > > +bool nouveau_runpm_calls_dsm(void) > +{ > + return nouveau_dsm_priv.optimus_detected && > + !nouveau_dsm_priv.optimus_skip_dsm; > +} > + > /* Must be called for Optimus models before the card can be turned off */ > void nouveau_switcheroo_optimus_dsm(void) > { > @@ -371,7 +377,6 @@ void nouveau_switcheroo_optimus_dsm(void) > > nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, > NOUVEAU_DSM_OPTIMUS_CAPS, > NOUVEAU_DSM_OPTIMUS_SET_POWERDOWN, &result); > - > } > > void nouveau_unregister_dsm_handler(void) > diff --git a/drm/nouveau/nouveau_acpi.h b/drm/nouveau/nouveau_acpi.h > index b86294fc..0f5d7aa0 100644 > --- a/drm/nouveau/nouveau_acpi.h > +++ b/drm/nouveau/nouveau_acpi.h > @@ -13,6 +13,7 @@ void nouveau_switcheroo_optimus_dsm(void); > int nouveau_acpi_get_bios_chunk(uint8_t *bios, int offset, int len); > bool nouveau_acpi_rom_supported(struct device *); > void *nouveau_acpi_edid(struct drm_device *, struct drm_connector *); > +bool nouveau_runpm_calls_dsm(void); > #else > static inline bool nouveau_is_optimus(void) { return false; }; > static inline bool nouveau_is_v1_dsm(void) { return false; }; > @@ -22,6 +23,7 @@ static inline void nouveau_switcheroo_optimus_dsm(void) {} > static inline bool nouveau_acpi_rom_supported(struct device *dev) { return > false; } > static inline int nouveau_acpi_get_bios_chunk(uint8_t *bios, int offset, > int len) { return -EINVAL; } > static inline void *nouveau_acpi_edid(struct drm_device *dev, struct > drm_connector *connector) { return NULL; } > +static inline bool nouveau_runpm_calls_dsm(void) { return false; } > #endif > > #endif > diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c > index 5020265b..09e68e61 100644 > --- a/drm/nouveau/nouveau_drm.c > +++ b/drm/nouveau/nouveau_drm.c > @@ -556,6 +556,7 @@ nouveau_drm_device_init(struct drm_device *dev) > nouveau_fbcon_init(dev); > nouveau_led_init(dev); > > + drm->runpm_dsm = nouveau_runpm_calls_dsm(); > if (nouveau_pmops_runtime()) { > pm_runtime_use_autosuspend(dev->dev); > pm_runtime_set_autosuspend_delay(dev->dev, 5000); > @@ -903,6 +904,7 @@ nouveau_pmops_runtime_suspend(struct device *dev) > { > struct pci_dev *pdev = to_pci_dev(dev); > struct drm_device *drm_dev = pci_get_drvdata(pdev); > + struct nouveau_drm *drm = nouveau_drm(drm_dev); > int ret; > > if (!nouveau_pmops_runtime()) { > @@ -910,12 +912,15 @@ nouveau_pmops_runtime_suspend(struct device *dev) > return -EBUSY; > } > > + drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING; > nouveau_switcheroo_optimus_dsm(); > ret = nouveau_do_suspend(drm_dev, true); > pci_save_state(pdev); > pci_disable_device(pdev); > pci_ignore_hotplug(pdev); > - pci_set_power_state(pdev, PCI_D3cold); > + if (drm->runpm_dsm) > + pci_set_power_state(pdev, PCI_D3cold); > + > drm_dev->switch_power_state = DRM_SWITCH_POWER_DYNAMIC_OFF; > return ret; > } > @@ -925,7 +930,8 @@ nouveau_pmops_runtime_resume(struct device *dev) > { > struct pci_dev *pdev = to_pci_dev(dev); > struct drm_device *drm_dev = pci_get_drvdata(pdev); > - struct nvif_device *device = &nouveau_drm(drm_dev)->client.device; > + struct nouveau_drm *drm = nouveau_drm(drm_dev); > + struct nvif_device *device = &drm->client.device; > int ret; > > if (!nouveau_pmops_runtime()) { > @@ -933,7 +939,9 @@ nouveau_pmops_runtime_resume(struct device *dev) > return -EBUSY; > } > > - pci_set_power_state(pdev, PCI_D0); > + drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING; > + if (drm->runpm_dsm) > + pci_set_power_state(pdev, PCI_D0); > pci_restore_state(pdev); > ret = pci_enable_device(pdev); > if (ret) > diff --git a/drm/nouveau/nouveau_drv.h b/drm/nouveau/nouveau_drv.h > index da847244..941600e9 100644 > --- a/drm/nouveau/nouveau_drv.h > +++ b/drm/nouveau/nouveau_drv.h > @@ -214,6 +214,8 @@ struct nouveau_drm { > struct nouveau_svm *svm; > > struct nouveau_dmem *dmem; > + > + bool runpm_dsm; > }; > > static inline struct nouveau_drm *-- Cheers, Lyude Paul
Karol Herbst
2019-May-20 13:23 UTC
[Nouveau] [PATCH v2 0/4] Potential fix for runpm issues on various laptops
ping to the pci folks? I really would like to know what you make out of it. In fact, this kind of looks like a pcie issue, but I just don't know enough to really be able to tell. I am mainly wondering why putting the device with a 2.5 vs a 8.0 link into d3cold makes the resume path break? Any ideas? broken pcie controller? broken implementation on the gpu? On Tue, May 7, 2019 at 10:12 PM Karol Herbst <kherbst at redhat.com> wrote:> > CCing linux-pci and Bjorn Helgaas. Maybe we could get better insights on > how a reasonable fix would look like. > > Anyway, to me this entire issue looks like something which has to be fixed > on a PCI level instead of inside a driver, so it makes sense to ask the > pci folks if they have any better suggestions. > > Original cover letter: > While investigating the runpm issues on my GP107 I noticed that something > inside devinit makes runpm break. If Nouveau loads up to the point right > before doing devinit, runpm works without any issues, if devinit is ran, > not anymore. > > Out of curiousity I even tried to "bisect" devinit by not running it on > vbios provided signed PMU image, but on the devinit parser we have inside > Nouveau. > Allthough this one isn't as feature complete as the vbios one, I was able > to reproduce the runpm issues as well. From that point I was able to only > run a certain amount of commands until I got to some PCIe initialization > code inside devinit which trigger those runpm issues. > > Devinit on my GPU was changing the PCIe link from 8.0 to 2.5, reversing > that on the fini path makes runpm work again. > > There are a few other things going on, but with my limited knowledge about > PCIe in general, the change in the link speed sounded like it could cause > issues on resume if the controller and the device disagree on the actual > link. > > Maybe this is just a bug within the PCI subsystem inside linux instead and > the controller has to be forced to do _something_? > > Anyway, with this runpm seems to work nicely on my machine. Secure booting > the gr (even with my workaround applied I need anyway) might fail after > the GPU got runtime resumed though... > > Karol Herbst (4): > drm: don't set the pci power state if the pci subsystem handles the > ACPI bits > pci: enable pcie link changes for pascal > pci: add nvkm_pcie_get_speed > pci: save the boot pcie link speed and restore it on fini > > drm/nouveau/include/nvkm/subdev/pci.h | 6 +++-- > drm/nouveau/nouveau_acpi.c | 7 +++++- > drm/nouveau/nouveau_acpi.h | 2 ++ > drm/nouveau/nouveau_drm.c | 14 +++++++++--- > drm/nouveau/nouveau_drv.h | 2 ++ > drm/nouveau/nvkm/subdev/pci/base.c | 9 ++++++-- > drm/nouveau/nvkm/subdev/pci/gk104.c | 8 +++---- > drm/nouveau/nvkm/subdev/pci/gp100.c | 10 +++++++++ > drm/nouveau/nvkm/subdev/pci/pcie.c | 32 +++++++++++++++++++++++---- > drm/nouveau/nvkm/subdev/pci/priv.h | 7 ++++++ > 10 files changed, 81 insertions(+), 16 deletions(-) > > -- > 2.21.0 >
Bjorn Helgaas
2019-May-20 21:19 UTC
[Nouveau] [PATCH v2 4/4] pci: save the boot pcie link speed and restore it on fini
On Tue, May 07, 2019 at 10:12:45PM +0200, Karol Herbst wrote:> Apperantly things go south if we suspend the device with a different PCIE > link speed set than it got booted with. Fixes runtime suspend on my gp107. > > This all looks like some bug inside the pci subsystem and I would prefer a > fix there instead of nouveau, but maybe there is no real nice way of doing > that outside of drivers?I agree it would be nice to fix this in the PCI core if that's feasible. It looks like this driver changes the PCIe link speed using some device-specific mechanism. When we suspend, we put the device in D3cold, so it loses all its state. When we resume, the link probably comes up at the boot speed because nothing did that device-specific magic to change it, so you probably end up with the link being slow but the driver thinking it's configured to be fast, and maybe that combination doesn't work. If it requires something device-specific to change that link speed, I don't know how to put that in the PCI core. But maybe I'm missing something? Per the PCIe spec (r4.0, sec 1.2): Initialization – During hardware initialization, each PCI Express Link is set up following a negotiation of Lane widths and frequency of operation by the two agents at each end of the Link. No firmware or operating system software is involved. I have been assuming that this means device-specific link speed management is out of spec, but it seems pretty common that devices don't come up by themselves at the fastest possible link speed. So maybe the spec just intends that devices can operate at *some* valid speed.> v2: squashed together patch 4 and 5 > > Signed-off-by: Karol Herbst <kherbst at redhat.com> > Reviewed-by: Lyude Paul <lyude at redhat.com> > --- > drm/nouveau/include/nvkm/subdev/pci.h | 5 +++-- > drm/nouveau/nvkm/subdev/pci/base.c | 9 +++++++-- > drm/nouveau/nvkm/subdev/pci/pcie.c | 24 ++++++++++++++++++++---- > drm/nouveau/nvkm/subdev/pci/priv.h | 2 ++ > 4 files changed, 32 insertions(+), 8 deletions(-) > > diff --git a/drm/nouveau/include/nvkm/subdev/pci.h b/drm/nouveau/include/nvkm/subdev/pci.h > index 1fdf3098..b23793a2 100644 > --- a/drm/nouveau/include/nvkm/subdev/pci.h > +++ b/drm/nouveau/include/nvkm/subdev/pci.h > @@ -26,8 +26,9 @@ struct nvkm_pci { > } agp; > > struct { > - enum nvkm_pcie_speed speed; > - u8 width; > + enum nvkm_pcie_speed cur_speed; > + enum nvkm_pcie_speed def_speed; > + u8 cur_width; > } pcie; > > bool msi; > diff --git a/drm/nouveau/nvkm/subdev/pci/base.c b/drm/nouveau/nvkm/subdev/pci/base.c > index ee2431a7..d9fb5a83 100644 > --- a/drm/nouveau/nvkm/subdev/pci/base.c > +++ b/drm/nouveau/nvkm/subdev/pci/base.c > @@ -90,6 +90,8 @@ nvkm_pci_fini(struct nvkm_subdev *subdev, bool suspend) > > if (pci->agp.bridge) > nvkm_agp_fini(pci); > + else if (pci_is_pcie(pci->pdev)) > + nvkm_pcie_fini(pci); > > return 0; > } > @@ -100,6 +102,8 @@ nvkm_pci_preinit(struct nvkm_subdev *subdev) > struct nvkm_pci *pci = nvkm_pci(subdev); > if (pci->agp.bridge) > nvkm_agp_preinit(pci); > + else if (pci_is_pcie(pci->pdev)) > + nvkm_pcie_preinit(pci); > return 0; > } > > @@ -193,8 +197,9 @@ nvkm_pci_new_(const struct nvkm_pci_func *func, struct nvkm_device *device, > pci->func = func; > pci->pdev = device->func->pci(device)->pdev; > pci->irq = -1; > - pci->pcie.speed = -1; > - pci->pcie.width = -1; > + pci->pcie.cur_speed = -1; > + pci->pcie.def_speed = -1; > + pci->pcie.cur_width = -1; > > if (device->type == NVKM_DEVICE_AGP) > nvkm_agp_ctor(pci); > diff --git a/drm/nouveau/nvkm/subdev/pci/pcie.c b/drm/nouveau/nvkm/subdev/pci/pcie.c > index 70ccbe0d..731dd30e 100644 > --- a/drm/nouveau/nvkm/subdev/pci/pcie.c > +++ b/drm/nouveau/nvkm/subdev/pci/pcie.c > @@ -85,6 +85,13 @@ nvkm_pcie_oneinit(struct nvkm_pci *pci) > return 0; > } > > +int > +nvkm_pcie_preinit(struct nvkm_pci *pci) > +{ > + pci->pcie.def_speed = nvkm_pcie_get_speed(pci); > + return 0; > +} > + > int > nvkm_pcie_init(struct nvkm_pci *pci) > { > @@ -105,12 +112,21 @@ nvkm_pcie_init(struct nvkm_pci *pci) > if (pci->func->pcie.init) > pci->func->pcie.init(pci); > > - if (pci->pcie.speed != -1) > - nvkm_pcie_set_link(pci, pci->pcie.speed, pci->pcie.width); > + if (pci->pcie.cur_speed != -1) > + nvkm_pcie_set_link(pci, pci->pcie.cur_speed, > + pci->pcie.cur_width); > > return 0; > } > > +int > +nvkm_pcie_fini(struct nvkm_pci *pci) > +{ > + if (!IS_ERR_VALUE(pci->pcie.def_speed)) > + return nvkm_pcie_set_link(pci, pci->pcie.def_speed, 16); > + return 0; > +} > + > int > nvkm_pcie_set_link(struct nvkm_pci *pci, enum nvkm_pcie_speed speed, u8 width) > { > @@ -146,8 +162,8 @@ nvkm_pcie_set_link(struct nvkm_pci *pci, enum nvkm_pcie_speed speed, u8 width) > speed = max_speed; > } > > - pci->pcie.speed = speed; > - pci->pcie.width = width; > + pci->pcie.cur_speed = speed; > + pci->pcie.cur_width = width; > > if (speed == cur_speed) { > nvkm_debug(subdev, "requested matches current speed\n"); > diff --git a/drm/nouveau/nvkm/subdev/pci/priv.h b/drm/nouveau/nvkm/subdev/pci/priv.h > index a0d4c007..e7744671 100644 > --- a/drm/nouveau/nvkm/subdev/pci/priv.h > +++ b/drm/nouveau/nvkm/subdev/pci/priv.h > @@ -60,5 +60,7 @@ enum nvkm_pcie_speed gk104_pcie_max_speed(struct nvkm_pci *); > int gk104_pcie_version_supported(struct nvkm_pci *); > > int nvkm_pcie_oneinit(struct nvkm_pci *); > +int nvkm_pcie_preinit(struct nvkm_pci *); > int nvkm_pcie_init(struct nvkm_pci *); > +int nvkm_pcie_fini(struct nvkm_pci *); > #endif > -- > 2.21.0 >
Bjorn Helgaas
2019-May-20 21:25 UTC
[Nouveau] [PATCH v2 2/4] pci: enable pcie link changes for pascal
On Tue, May 07, 2019 at 10:12:43PM +0200, Karol Herbst wrote:> Signed-off-by: Karol Herbst <kherbst at redhat.com> > Reviewed-by: Lyude Paul <lyude at redhat.com> > --- > drm/nouveau/nvkm/subdev/pci/gk104.c | 8 ++++---- > drm/nouveau/nvkm/subdev/pci/gp100.c | 10 ++++++++++ > drm/nouveau/nvkm/subdev/pci/priv.h | 5 +++++ > 3 files changed, 19 insertions(+), 4 deletions(-) > > diff --git a/drm/nouveau/nvkm/subdev/pci/gk104.c b/drm/nouveau/nvkm/subdev/pci/gk104.c > index e6803050..66489018 100644 > --- a/drm/nouveau/nvkm/subdev/pci/gk104.c > +++ b/drm/nouveau/nvkm/subdev/pci/gk104.c > @@ -23,7 +23,7 @@ > */ > #include "priv.h" > > -static int > +int > gk104_pcie_version_supported(struct nvkm_pci *pci) > { > return (nvkm_rd32(pci->subdev.device, 0x8c1c0) & 0x4) == 0x4 ? 2 : 1; > @@ -108,7 +108,7 @@ gk104_pcie_lnkctl_speed(struct nvkm_pci *pci) > return -1; > } > > -static enum nvkm_pcie_speed > +enum nvkm_pcie_speed > gk104_pcie_max_speed(struct nvkm_pci *pci) > { > u32 max_speed = nvkm_rd32(pci->subdev.device, 0x8c1c0) & 0x300000;Some of this looks pretty similar to the PCI core code that reads PCI_EXP_LNKSTA to find the link speed (but admittedly there's not really a good interface to do that unless bus->cur_bus_speed already has what you need). It doesn't look like this is directly reading the PCI_EXP_LNKSTA from PCI config space; maybe it's reading a mirror of that via a device-specific MMIO address, or maybe it's reading something else completely. But it makes me wonder if there's a way to make generic PCI core interfaces for some of this stuff.> @@ -146,7 +146,7 @@ gk104_pcie_set_link_speed(struct nvkm_pci *pci, enum nvkm_pcie_speed speed) > nvkm_mask(device, 0x8c040, 0x1, 0x1); > } > > -static int > +int > gk104_pcie_init(struct nvkm_pci * pci) > { > enum nvkm_pcie_speed lnkctl_speed, max_speed, cap_speed; > @@ -178,7 +178,7 @@ gk104_pcie_init(struct nvkm_pci * pci) > return 0; > } > > -static int > +int > gk104_pcie_set_link(struct nvkm_pci *pci, enum nvkm_pcie_speed speed, u8 width) > { > struct nvkm_subdev *subdev = &pci->subdev; > diff --git a/drm/nouveau/nvkm/subdev/pci/gp100.c b/drm/nouveau/nvkm/subdev/pci/gp100.c > index 82c5234a..eb19c7a4 100644 > --- a/drm/nouveau/nvkm/subdev/pci/gp100.c > +++ b/drm/nouveau/nvkm/subdev/pci/gp100.c > @@ -35,6 +35,16 @@ gp100_pci_func = { > .wr08 = nv40_pci_wr08, > .wr32 = nv40_pci_wr32, > .msi_rearm = gp100_pci_msi_rearm, > + > + .pcie.init = gk104_pcie_init, > + .pcie.set_link = gk104_pcie_set_link, > + > + .pcie.max_speed = gk104_pcie_max_speed, > + .pcie.cur_speed = g84_pcie_cur_speed, > + > + .pcie.set_version = gf100_pcie_set_version, > + .pcie.version = gf100_pcie_version, > + .pcie.version_supported = gk104_pcie_version_supported, > }; > > int > diff --git a/drm/nouveau/nvkm/subdev/pci/priv.h b/drm/nouveau/nvkm/subdev/pci/priv.h > index c17f6063..a0d4c007 100644 > --- a/drm/nouveau/nvkm/subdev/pci/priv.h > +++ b/drm/nouveau/nvkm/subdev/pci/priv.h > @@ -54,6 +54,11 @@ int gf100_pcie_cap_speed(struct nvkm_pci *); > int gf100_pcie_init(struct nvkm_pci *); > int gf100_pcie_set_link(struct nvkm_pci *, enum nvkm_pcie_speed, u8); > > +int gk104_pcie_init(struct nvkm_pci *); > +int gk104_pcie_set_link(struct nvkm_pci *, enum nvkm_pcie_speed, u8 width); > +enum nvkm_pcie_speed gk104_pcie_max_speed(struct nvkm_pci *); > +int gk104_pcie_version_supported(struct nvkm_pci *); > + > int nvkm_pcie_oneinit(struct nvkm_pci *); > int nvkm_pcie_init(struct nvkm_pci *); > #endif > -- > 2.21.0 >
Possibly Parallel Threads
- [PATCH 1/4] pci: enable pcie link changes for pascal
- [PATCH 0/5] Potential fix for runpm issues on various laptops
- [PATCH v2 0/4] Potential fix for runpm issues on various laptops
- [PATCH v2 4/4] pci: save the boot pcie link speed and restore it on fini
- [PATCH v2 4/4] pci: save the boot pcie link speed and restore it on fini