Karol Herbst
2019-Oct-16 14:44 UTC
[Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher device states. v2: convert to pci_dev quirk put a proper technical explanation of the issue as a in-code comment v3: disable it only for certain combinations of intel and nvidia hardware Signed-off-by: Karol Herbst <kherbst at redhat.com> Cc: Bjorn Helgaas <bhelgaas at google.com> Cc: Lyude Paul <lyude at redhat.com> Cc: Rafael J. Wysocki <rjw at rjwysocki.net> Cc: Mika Westerberg <mika.westerberg at intel.com> Cc: linux-pci at vger.kernel.org Cc: linux-pm at vger.kernel.org Cc: dri-devel at lists.freedesktop.org Cc: nouveau at lists.freedesktop.org --- drivers/pci/pci.c | 11 ++++++++++ drivers/pci/quirks.c | 52 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/pci.h | 1 + 3 files changed, 64 insertions(+) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index b97d9e10c9cc..8e056eb7e6ff 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev) return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false; } +static inline bool parent_broken_child_pm(struct pci_dev *dev) +{ + if (!dev->bus || !dev->bus->self) + return false; + return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm; +} + /** * pci_raw_set_power_state - Use PCI PM registers to set the power state of * given PCI device @@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state) || (state == PCI_D2 && !dev->d2_support)) return -EIO; + /* check if the bus controller causes issues */ + if (state != PCI_D0 && parent_broken_child_pm(dev)) + return 0; + pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr); /* diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 44c4ae1abd00..c2f20b745dd4 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5268,3 +5268,55 @@ static void quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev) DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1, PCI_CLASS_DISPLAY_VGA, 8, quirk_reset_lenovo_thinkpad_p50_nvgpu); + +/* + * Some Intel PCIe bridges cause devices to disappear from the PCIe bus after + * those were put into D3cold state if they were put into a non D0 PCI PM + * device state before doing so. + * + * This leads to various issue different issues which all manifest differently, + * but have the same root cause: + * - AIML code execution hits an infinite loop (as the coe waits on device + * memory to change). + * - kernel crashes, as all pci reads return -1, which most code isn't able + * to handle well enough. + * - sudden shutdowns, as the kernel identified an unrecoverable error after + * userspace tries to access the GPU. + * + * In all cases dmesg will contain at least one line like this: + * 'nouveau 0000:01:00.0: Refused to change power state, currently in D3' + * followed by a lot of nouveau timeouts. + * + * ACPI code writes bit 0x80 to the not documented PCI register 0x248 of the + * PCIe bridge controller in order to power down the GPU. + * Nonetheless, there are other code paths inside the ACPI firmware which use + * other registers, which seem to work fine: + * - 0xbc bit 0x20 (publicly available documentation claims 'reserved') + * - 0xb0 bit 0x10 (link disable) + * Changing the conditions inside the firmware by poking into the relevant + * addresses does resolve the issue, but it seemed to be ACPI private memory + * and not any device accessible memory at all, so there is no portable way of + * changing the conditions. + * + * The only systems where this behavior can be seen are hybrid graphics laptops + * with a secondary Nvidia Pascal GPU. It cannot be ruled out that this issue + * only occurs in combination with listed Intel PCIe bridge controllers and + * the mentioned GPUs or if it's only a hw bug in the bridge controller. + * + * But because this issue was NOT seen on laptops with an Nvidia Pascal GPU + * and an Intel Coffee Lake SoC, there is a higher chance of there being a bug + * in the bridge controller rather than in the GPU. + * + * This issue was not able to be reproduced on non laptop systems. + */ + +static void quirk_broken_nv_runpm(struct pci_dev *dev) +{ + dev->broken_nv_runpm = 1; +} +DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, + PCI_BASE_CLASS_DISPLAY, 16, + quirk_broken_nv_runpm); +/* kaby lake */ +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1901, + quirk_broken_nv_runpm); diff --git a/include/linux/pci.h b/include/linux/pci.h index ac8a6c4e1792..903a0b3a39ec 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -416,6 +416,7 @@ struct pci_dev { unsigned int __aer_firmware_first_valid:1; unsigned int __aer_firmware_first:1; unsigned int broken_intx_masking:1; /* INTx masking can't be used */ + unsigned int broken_nv_runpm:1; /* some combinations of intel bridge controller and nvidia GPUs break rtd3 */ unsigned int io_window_1k:1; /* Intel bridge 1K I/O windows */ unsigned int irq_managed:1; unsigned int has_secondary_link:1; -- 2.21.0
Bjorn Helgaas
2019-Oct-16 19:14 UTC
[Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote:> Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher device > states. > > v2: convert to pci_dev quirk > put a proper technical explanation of the issue as a in-code comment > v3: disable it only for certain combinations of intel and nvidia hardware > > Signed-off-by: Karol Herbst <kherbst at redhat.com> > Cc: Bjorn Helgaas <bhelgaas at google.com> > Cc: Lyude Paul <lyude at redhat.com> > Cc: Rafael J. Wysocki <rjw at rjwysocki.net> > Cc: Mika Westerberg <mika.westerberg at intel.com> > Cc: linux-pci at vger.kernel.org > Cc: linux-pm at vger.kernel.org > Cc: dri-devel at lists.freedesktop.org > Cc: nouveau at lists.freedesktop.org > --- > drivers/pci/pci.c | 11 ++++++++++ > drivers/pci/quirks.c | 52 ++++++++++++++++++++++++++++++++++++++++++++ > include/linux/pci.h | 1 + > 3 files changed, 64 insertions(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index b97d9e10c9cc..8e056eb7e6ff 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev) > return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false; > } > > +static inline bool parent_broken_child_pm(struct pci_dev *dev) > +{ > + if (!dev->bus || !dev->bus->self) > + return false; > + return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm; > +} > + > /** > * pci_raw_set_power_state - Use PCI PM registers to set the power state of > * given PCI device > @@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state) > || (state == PCI_D2 && !dev->d2_support)) > return -EIO; > > + /* check if the bus controller causes issues */ > + if (state != PCI_D0 && parent_broken_child_pm(dev)) > + return 0; > + > pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr); > > /* > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index 44c4ae1abd00..c2f20b745dd4 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -5268,3 +5268,55 @@ static void quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev) > DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1, > PCI_CLASS_DISPLAY_VGA, 8, > quirk_reset_lenovo_thinkpad_p50_nvgpu); > + > +/* > + * Some Intel PCIe bridges cause devices to disappear from the PCIe bus after > + * those were put into D3cold state if they were put into a non D0 PCI PM > + * device state before doing so. > + * > + * This leads to various issue different issues which all manifest differently, > + * but have the same root cause: > + * - AIML code execution hits an infinite loop (as the coe waits on device > + * memory to change). > + * - kernel crashes, as all pci reads return -1, which most code isn't able > + * to handle well enough. > + * - sudden shutdowns, as the kernel identified an unrecoverable error after > + * userspace tries to access the GPU. > + * > + * In all cases dmesg will contain at least one line like this: > + * 'nouveau 0000:01:00.0: Refused to change power state, currently in D3' > + * followed by a lot of nouveau timeouts. > + * > + * ACPI code writes bit 0x80 to the not documented PCI register 0x248 of the > + * PCIe bridge controller in order to power down the GPU. > + * Nonetheless, there are other code paths inside the ACPI firmware which use > + * other registers, which seem to work fine: > + * - 0xbc bit 0x20 (publicly available documentation claims 'reserved') > + * - 0xb0 bit 0x10 (link disable) > + * Changing the conditions inside the firmware by poking into the relevant > + * addresses does resolve the issue, but it seemed to be ACPI private memory > + * and not any device accessible memory at all, so there is no portable way of > + * changing the conditions. > + * > + * The only systems where this behavior can be seen are hybrid graphics laptops > + * with a secondary Nvidia Pascal GPU. It cannot be ruled out that this issue > + * only occurs in combination with listed Intel PCIe bridge controllers and > + * the mentioned GPUs or if it's only a hw bug in the bridge controller. > + * > + * But because this issue was NOT seen on laptops with an Nvidia Pascal GPU > + * and an Intel Coffee Lake SoC, there is a higher chance of there being a bug > + * in the bridge controller rather than in the GPU. > + * > + * This issue was not able to be reproduced on non laptop systems. > + */ > + > +static void quirk_broken_nv_runpm(struct pci_dev *dev) > +{ > + dev->broken_nv_runpm = 1;Can you use the existing PCI_DEV_FLAGS_NO_D3 flag for this instead of adding a new flag? I would put the parent_broken_child_pm() logic here, if possible, e.g., something like: struct pci_dev *bridge = pci_upstream_bridge(dev); if (bridge && bridge->vendor == PCI_VENDOR_ID_INTEL && bridge->device == 0x1901) dev->dev_flags |= PCI_DEV_FLAGS_NO_D3;> +} > +DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, > + PCI_BASE_CLASS_DISPLAY, 16, > + quirk_broken_nv_runpm); > +/* kaby lake */ > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1901, > + quirk_broken_nv_runpm); > diff --git a/include/linux/pci.h b/include/linux/pci.h > index ac8a6c4e1792..903a0b3a39ec 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -416,6 +416,7 @@ struct pci_dev { > unsigned int __aer_firmware_first_valid:1; > unsigned int __aer_firmware_first:1; > unsigned int broken_intx_masking:1; /* INTx masking can't be used */ > + unsigned int broken_nv_runpm:1; /* some combinations of intel bridge controller and nvidia GPUs break rtd3 */ > unsigned int io_window_1k:1; /* Intel bridge 1K I/O windows */ > unsigned int irq_managed:1; > unsigned int has_secondary_link:1; > -- > 2.21.0 >
Karol Herbst
2019-Oct-16 19:18 UTC
[Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
but setting the PCI_DEV_FLAGS_NO_D3 flag does prevent using the platform means of putting the device into D3cold, right? That's actually what should still happen, just the D3hot step should be skipped. On Wed, Oct 16, 2019 at 9:14 PM Bjorn Helgaas <helgaas at kernel.org> wrote:> > On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote: > > Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher device > > states. > > > > v2: convert to pci_dev quirk > > put a proper technical explanation of the issue as a in-code comment > > v3: disable it only for certain combinations of intel and nvidia hardware > > > > Signed-off-by: Karol Herbst <kherbst at redhat.com> > > Cc: Bjorn Helgaas <bhelgaas at google.com> > > Cc: Lyude Paul <lyude at redhat.com> > > Cc: Rafael J. Wysocki <rjw at rjwysocki.net> > > Cc: Mika Westerberg <mika.westerberg at intel.com> > > Cc: linux-pci at vger.kernel.org > > Cc: linux-pm at vger.kernel.org > > Cc: dri-devel at lists.freedesktop.org > > Cc: nouveau at lists.freedesktop.org > > --- > > drivers/pci/pci.c | 11 ++++++++++ > > drivers/pci/quirks.c | 52 ++++++++++++++++++++++++++++++++++++++++++++ > > include/linux/pci.h | 1 + > > 3 files changed, 64 insertions(+) > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > index b97d9e10c9cc..8e056eb7e6ff 100644 > > --- a/drivers/pci/pci.c > > +++ b/drivers/pci/pci.c > > @@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev) > > return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false; > > } > > > > +static inline bool parent_broken_child_pm(struct pci_dev *dev) > > +{ > > + if (!dev->bus || !dev->bus->self) > > + return false; > > + return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm; > > +} > > + > > /** > > * pci_raw_set_power_state - Use PCI PM registers to set the power state of > > * given PCI device > > @@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state) > > || (state == PCI_D2 && !dev->d2_support)) > > return -EIO; > > > > + /* check if the bus controller causes issues */ > > + if (state != PCI_D0 && parent_broken_child_pm(dev)) > > + return 0; > > + > > pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr); > > > > /* > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > > index 44c4ae1abd00..c2f20b745dd4 100644 > > --- a/drivers/pci/quirks.c > > +++ b/drivers/pci/quirks.c > > @@ -5268,3 +5268,55 @@ static void quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev) > > DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1, > > PCI_CLASS_DISPLAY_VGA, 8, > > quirk_reset_lenovo_thinkpad_p50_nvgpu); > > + > > +/* > > + * Some Intel PCIe bridges cause devices to disappear from the PCIe bus after > > + * those were put into D3cold state if they were put into a non D0 PCI PM > > + * device state before doing so. > > + * > > + * This leads to various issue different issues which all manifest differently, > > + * but have the same root cause: > > + * - AIML code execution hits an infinite loop (as the coe waits on device > > + * memory to change). > > + * - kernel crashes, as all pci reads return -1, which most code isn't able > > + * to handle well enough. > > + * - sudden shutdowns, as the kernel identified an unrecoverable error after > > + * userspace tries to access the GPU. > > + * > > + * In all cases dmesg will contain at least one line like this: > > + * 'nouveau 0000:01:00.0: Refused to change power state, currently in D3' > > + * followed by a lot of nouveau timeouts. > > + * > > + * ACPI code writes bit 0x80 to the not documented PCI register 0x248 of the > > + * PCIe bridge controller in order to power down the GPU. > > + * Nonetheless, there are other code paths inside the ACPI firmware which use > > + * other registers, which seem to work fine: > > + * - 0xbc bit 0x20 (publicly available documentation claims 'reserved') > > + * - 0xb0 bit 0x10 (link disable) > > + * Changing the conditions inside the firmware by poking into the relevant > > + * addresses does resolve the issue, but it seemed to be ACPI private memory > > + * and not any device accessible memory at all, so there is no portable way of > > + * changing the conditions. > > + * > > + * The only systems where this behavior can be seen are hybrid graphics laptops > > + * with a secondary Nvidia Pascal GPU. It cannot be ruled out that this issue > > + * only occurs in combination with listed Intel PCIe bridge controllers and > > + * the mentioned GPUs or if it's only a hw bug in the bridge controller. > > + * > > + * But because this issue was NOT seen on laptops with an Nvidia Pascal GPU > > + * and an Intel Coffee Lake SoC, there is a higher chance of there being a bug > > + * in the bridge controller rather than in the GPU. > > + * > > + * This issue was not able to be reproduced on non laptop systems. > > + */ > > + > > +static void quirk_broken_nv_runpm(struct pci_dev *dev) > > +{ > > + dev->broken_nv_runpm = 1; > > Can you use the existing PCI_DEV_FLAGS_NO_D3 flag for this instead of > adding a new flag? > > I would put the parent_broken_child_pm() logic here, if possible, > e.g., something like: > > struct pci_dev *bridge = pci_upstream_bridge(dev); > > if (bridge && > bridge->vendor == PCI_VENDOR_ID_INTEL && bridge->device == 0x1901) > dev->dev_flags |= PCI_DEV_FLAGS_NO_D3; > > > +} > > +DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, > > + PCI_BASE_CLASS_DISPLAY, 16, > > + quirk_broken_nv_runpm); > > +/* kaby lake */ > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1901, > > + quirk_broken_nv_runpm); > > diff --git a/include/linux/pci.h b/include/linux/pci.h > > index ac8a6c4e1792..903a0b3a39ec 100644 > > --- a/include/linux/pci.h > > +++ b/include/linux/pci.h > > @@ -416,6 +416,7 @@ struct pci_dev { > > unsigned int __aer_firmware_first_valid:1; > > unsigned int __aer_firmware_first:1; > > unsigned int broken_intx_masking:1; /* INTx masking can't be used */ > > + unsigned int broken_nv_runpm:1; /* some combinations of intel bridge controller and nvidia GPUs break rtd3 */ > > unsigned int io_window_1k:1; /* Intel bridge 1K I/O windows */ > > unsigned int irq_managed:1; > > unsigned int has_secondary_link:1; > > -- > > 2.21.0 > >
Mika Westerberg
2019-Oct-21 11:40 UTC
[Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
Hi Karol, Sorry for commenting late, I just came back from vacation. On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote:> Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher device > states. > > v2: convert to pci_dev quirk > put a proper technical explanation of the issue as a in-code comment > v3: disable it only for certain combinations of intel and nvidia hardware > > Signed-off-by: Karol Herbst <kherbst at redhat.com> > Cc: Bjorn Helgaas <bhelgaas at google.com> > Cc: Lyude Paul <lyude at redhat.com> > Cc: Rafael J. Wysocki <rjw at rjwysocki.net> > Cc: Mika Westerberg <mika.westerberg at intel.com> > Cc: linux-pci at vger.kernel.org > Cc: linux-pm at vger.kernel.org > Cc: dri-devel at lists.freedesktop.org > Cc: nouveau at lists.freedesktop.org > --- > drivers/pci/pci.c | 11 ++++++++++ > drivers/pci/quirks.c | 52 ++++++++++++++++++++++++++++++++++++++++++++I may be missing something but why you can't do this in the nouveau driver itself?
Karol Herbst
2019-Oct-21 12:00 UTC
[Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
On Mon, Oct 21, 2019 at 1:40 PM Mika Westerberg <mika.westerberg at intel.com> wrote:> > Hi Karol, > > Sorry for commenting late, I just came back from vacation. > > On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote: > > Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher device > > states. > > > > v2: convert to pci_dev quirk > > put a proper technical explanation of the issue as a in-code comment > > v3: disable it only for certain combinations of intel and nvidia hardware > > > > Signed-off-by: Karol Herbst <kherbst at redhat.com> > > Cc: Bjorn Helgaas <bhelgaas at google.com> > > Cc: Lyude Paul <lyude at redhat.com> > > Cc: Rafael J. Wysocki <rjw at rjwysocki.net> > > Cc: Mika Westerberg <mika.westerberg at intel.com> > > Cc: linux-pci at vger.kernel.org > > Cc: linux-pm at vger.kernel.org > > Cc: dri-devel at lists.freedesktop.org > > Cc: nouveau at lists.freedesktop.org > > --- > > drivers/pci/pci.c | 11 ++++++++++ > > drivers/pci/quirks.c | 52 ++++++++++++++++++++++++++++++++++++++++++++ > > I may be missing something but why you can't do this in the nouveau > driver itself?What do you mean precisely? Move the quirk into nouveau, but keep the changes to pci core?
Apparently Analagous Threads
- [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
- [PATCH v4] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
- [PATCH v5] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
- [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
- [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges