Karol Herbst
2017-Nov-21 15:01 UTC
[Nouveau] [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
This should make systems more stable where resuming the GPU fails. This can happen due to bad firmware or due to a bug within the kernel. The last thing which should happen in either case is an unusable system. v2: do the same in nouveau_pmops_resume Tested-by: Karl Hastings <kazen at redhat.com> Signed-off-by: Karol Herbst <kherbst at redhat.com> --- drm/nouveau/nouveau_drm.c | 31 +++++++++++++++++++++++-------- 1 file changed, 23 insertions(+), 8 deletions(-) diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c index 8d4a5be3..6e4cb4f7 100644 --- a/drm/nouveau/nouveau_drm.c +++ b/drm/nouveau/nouveau_drm.c @@ -792,6 +792,27 @@ nouveau_pmops_suspend(struct device *dev) return 0; } +static int +nouveau_set_power_state_D0(struct pci_dev *pdev) +{ + struct nouveau_drm *drm = nouveau_drm(pci_get_drvdata(pdev)); + int ret; + + pci_set_power_state(pdev, PCI_D0); + /* abort if anything went wrong */ + if (pdev->current_state != PCI_D0) { + NV_ERROR(drm, "couldn't wake up GPU!\n"); + return -EBUSY; + } + pci_restore_state(pdev); + ret = pci_enable_device(pdev); + if (ret) + return ret; + + pci_set_master(pdev); + return 0; +} + int nouveau_pmops_resume(struct device *dev) { @@ -803,12 +824,9 @@ nouveau_pmops_resume(struct device *dev) drm_dev->switch_power_state == DRM_SWITCH_POWER_DYNAMIC_OFF) return 0; - pci_set_power_state(pdev, PCI_D0); - pci_restore_state(pdev); - ret = pci_enable_device(pdev); + ret = nouveau_set_power_state_D0(pdev); if (ret) return ret; - pci_set_master(pdev); ret = nouveau_do_resume(drm_dev, false); @@ -879,12 +897,9 @@ nouveau_pmops_runtime_resume(struct device *dev) return -EBUSY; } - pci_set_power_state(pdev, PCI_D0); - pci_restore_state(pdev); - ret = pci_enable_device(pdev); + ret = nouveau_set_power_state_D0(pdev); if (ret) return ret; - pci_set_master(pdev); ret = nouveau_do_resume(drm_dev, true); -- 2.14.3
Thierry Reding
2017-Nov-21 17:46 UTC
[Nouveau] [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
On Tue, Nov 21, 2017 at 04:01:16PM +0100, Karol Herbst wrote:> This should make systems more stable where resuming the GPU fails. This > can happen due to bad firmware or due to a bug within the kernel. The > last thing which should happen in either case is an unusable system. > > v2: do the same in nouveau_pmops_resume > > Tested-by: Karl Hastings <kazen at redhat.com> > Signed-off-by: Karol Herbst <kherbst at redhat.com> > --- > drm/nouveau/nouveau_drm.c | 31 +++++++++++++++++++++++-------- > 1 file changed, 23 insertions(+), 8 deletions(-) > > diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c > index 8d4a5be3..6e4cb4f7 100644 > --- a/drm/nouveau/nouveau_drm.c > +++ b/drm/nouveau/nouveau_drm.c > @@ -792,6 +792,27 @@ nouveau_pmops_suspend(struct device *dev) > return 0; > } > > +static int > +nouveau_set_power_state_D0(struct pci_dev *pdev) > +{ > + struct nouveau_drm *drm = nouveau_drm(pci_get_drvdata(pdev)); > + int ret; > + > + pci_set_power_state(pdev, PCI_D0); > + /* abort if anything went wrong */ > + if (pdev->current_state != PCI_D0) { > + NV_ERROR(drm, "couldn't wake up GPU!\n"); > + return -EBUSY; > + }Looks to me like the more idiomatic way to do this is: ret = pci_set_power_state(pdev, PCI_D0); if (ret < 0 && ret != -EIO) return ret;> + pci_restore_state(pdev); > + ret = pci_enable_device(pdev); > + if (ret) > + return ret; > + > + pci_set_master(pdev);Looking closer it also seems like pci_enable_device() will already set the power state to D0 (via do_pci_enable_device()). Is the sequence above really necessary because the hardware is quirky, or was it cargo-culted? Thierry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: <https://lists.freedesktop.org/archives/nouveau/attachments/20171121/9f5f069c/attachment.sig>
Karol Herbst
2017-Nov-21 19:03 UTC
[Nouveau] [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
On Tue, Nov 21, 2017 at 6:46 PM, Thierry Reding <thierry.reding at gmail.com> wrote:> On Tue, Nov 21, 2017 at 04:01:16PM +0100, Karol Herbst wrote: >> This should make systems more stable where resuming the GPU fails. This >> can happen due to bad firmware or due to a bug within the kernel. The >> last thing which should happen in either case is an unusable system. >> >> v2: do the same in nouveau_pmops_resume >> >> Tested-by: Karl Hastings <kazen at redhat.com> >> Signed-off-by: Karol Herbst <kherbst at redhat.com> >> --- >> drm/nouveau/nouveau_drm.c | 31 +++++++++++++++++++++++-------- >> 1 file changed, 23 insertions(+), 8 deletions(-) >> >> diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c >> index 8d4a5be3..6e4cb4f7 100644 >> --- a/drm/nouveau/nouveau_drm.c >> +++ b/drm/nouveau/nouveau_drm.c >> @@ -792,6 +792,27 @@ nouveau_pmops_suspend(struct device *dev) >> return 0; >> } >> >> +static int >> +nouveau_set_power_state_D0(struct pci_dev *pdev) >> +{ >> + struct nouveau_drm *drm = nouveau_drm(pci_get_drvdata(pdev)); >> + int ret; >> + >> + pci_set_power_state(pdev, PCI_D0); >> + /* abort if anything went wrong */ >> + if (pdev->current_state != PCI_D0) { >> + NV_ERROR(drm, "couldn't wake up GPU!\n"); >> + return -EBUSY; >> + } > > Looks to me like the more idiomatic way to do this is: > > ret = pci_set_power_state(pdev, PCI_D0); > if (ret < 0 && ret != -EIO) > return ret; >I thought so too, but it ends up returning 0 even if setting the power state fails. Or maybe I did something wrong when installing the kernel. I could take another shot at it, but what I came up with seems to work. Adding airlied in CC, because he saw my patch and didn't complain about it. Hopefully he knows more.>> + pci_restore_state(pdev); >> + ret = pci_enable_device(pdev); >> + if (ret) >> + return ret; >> + >> + pci_set_master(pdev); > > Looking closer it also seems like pci_enable_device() will already set > the power state to D0 (via do_pci_enable_device()). Is the sequence > above really necessary because the hardware is quirky, or was it > cargo-culted? > > ThierryNo clue. And because it was already there in the original code I didn't really felt like doing anything with it.
Apparently Analagous Threads
- [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
- [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
- [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
- [RFC PATCH] set current_state to D0 in register_slot
- [PATCH 0/2] drm/nouveau: Fix DP AUX RPM issues