Karol Herbst
2022-Aug-19 16:37 UTC
[Nouveau] [RFC 0/2] Stop the abuse of Linux-* _OSI strings
On Fri, Aug 19, 2022 at 6:00 PM Limonciello, Mario <mario.limonciello at amd.com> wrote:
>
> On 8/19/2022 10:44, Karol Herbst wrote:
> > On Fri, Aug 19, 2022 at 4:25 PM Mario Limonciello
> > <mario.limonciello at amd.com> wrote:
> >>
> >> 3 _OSI strings were introduced in recent years that were intended
> >> to work around very specific problems found on specific systems.
> >>
> >> The idea was supposed to be that these quirks were only used on
> >> those systems, but this proved to be a bad assumption. I've found
> >> at least one system in the wild where the vendor using the _OSI
> >> string doesn't match the _OSI string and neither does the use.
> >>
> >> So this is a good time to review keeping those strings in the kernel.
> >> There are 3 strings that were introduced:
> >>
> >> Linux-Dell-Video
> >>   -> Intended for systems with NVIDIA cards that didn't support RTD3
> >> Linux-Lenovo-NV-HDMI-Audio
> >>   -> Intended for powering on NVIDIA HDMI device
> >> Linux-HPI-Hybrid-Graphics
> >>   -> Intended for changing dGPU output
> >>
> >> AFAIK the first string is no longer relevant as nouveau now supports
> >> RTD3. If that's wrong, this can be changed for the series.
> >>
> >
> > Nouveau always supported RTD3, because that's mainly a kernel feature.
> > When those were introduced we simply had a bug only hit on a few
> > systems. And instead of helping us to debug this, this workaround was
> > added :( We were not even asked about this.
>
> My apologies, I was certainly part of the impetus for this W/A in the
> first place while I was at my previous employer. Your comment
> re-affirms to me that at least the first patch is correct.
>

Yeah, no worries. I just hope that people in the future will
communicate such things.

Anyway, there are a few issues with the runpm stuff left, and looking
at what nvidia does in their open driver makes me wonder if we might
need a bigger overhaul of runpm. They do apply bridge/host controller
specific workarounds and I suspect some of them are related here as
the workaround I came up with in nouveau can be seen in 434fdb51513bf.

But also having access to documentation/specification from what Nvidia
is doing would be quite helpful. We know that on some really new AMD
systems we run into new issues and this needs some investigation. I
simply don't have access to any laptops where this problem can be seen.

> >
> > I am a bit curious about the other two though as I am not even sure
> > they are needed at all as we put other workarounds in place. @Lyude
> > Paul might know more about these.
> >
>
> If the other two really aren't needed anymore, then yeah we should just
> tear all 3 out. If that's the direction we go, I would appreciate some
> commit IDs to reference in the commit message for tearing them out so
> that if they end up backporting to stable we know how far they should go.
>
Limonciello, Mario
2022-Aug-19 16:43 UTC
[Nouveau] [RFC 0/2] Stop the abuse of Linux-* _OSI strings
On 8/19/2022 11:37, Karol Herbst wrote:
> On Fri, Aug 19, 2022 at 6:00 PM Limonciello, Mario
> <mario.limonciello at amd.com> wrote:
>>
>> On 8/19/2022 10:44, Karol Herbst wrote:
>>> On Fri, Aug 19, 2022 at 4:25 PM Mario Limonciello
>>> <mario.limonciello at amd.com> wrote:
>>>>
>>>> 3 _OSI strings were introduced in recent years that were intended
>>>> to work around very specific problems found on specific systems.
>>>>
>>>> The idea was supposed to be that these quirks were only used on
>>>> those systems, but this proved to be a bad assumption. I've found
>>>> at least one system in the wild where the vendor using the _OSI
>>>> string doesn't match the _OSI string and neither does the use.
>>>>
>>>> So this is a good time to review keeping those strings in the kernel.
>>>> There are 3 strings that were introduced:
>>>>
>>>> Linux-Dell-Video
>>>>   -> Intended for systems with NVIDIA cards that didn't support RTD3
>>>> Linux-Lenovo-NV-HDMI-Audio
>>>>   -> Intended for powering on NVIDIA HDMI device
>>>> Linux-HPI-Hybrid-Graphics
>>>>   -> Intended for changing dGPU output
>>>>
>>>> AFAIK the first string is no longer relevant as nouveau now supports
>>>> RTD3. If that's wrong, this can be changed for the series.
>>>>
>>>
>>> Nouveau always supported RTD3, because that's mainly a kernel feature.
>>> When those were introduced we simply had a bug only hit on a few
>>> systems. And instead of helping us to debug this, this workaround was
>>> added :( We were not even asked about this.
>>
>> My apologies, I was certainly part of the impetus for this W/A in the
>> first place while I was at my previous employer. Your comment
>> re-affirms to me that at least the first patch is correct.
>>
>
> Yeah, no worries. I just hope that people in the future will
> communicate such things.
>
> Anyway, there are a few issues with the runpm stuff left, and looking
> at what nvidia does in their open driver makes me wonder if we might
> need a bigger overhaul of runpm. They do apply bridge/host controller
> specific workarounds and I suspect some of them are related here as
> the workaround I came up with in nouveau can be seen in 434fdb51513bf.

But this overhaul shouldn't gate removing this _OSI string, or do you
think it should?

>
> But also having access to documentation/specification from what Nvidia
> is doing would be quite helpful. We know that on some really new AMD
> systems we run into new issues and this needs some investigation. I
> simply don't have access to any laptops where this problem can be seen.
>

Do you mean there are specifically remaining issues on AMD APU + NVIDIA
dGPU systems? Any public bugs by chance? Depending on what these are,
I'm happy to try to help with at least access. If we have them, maybe we
can try to make the right connections to get some hardware to you, or at
least let you access it remotely.

>>>
>>> I am a bit curious about the other two though as I am not even sure
>>> they are needed at all as we put other workarounds in place. @Lyude
>>> Paul might know more about these.
>>>
>>
>> If the other two really aren't needed anymore, then yeah we should just
>> tear all 3 out. If that's the direction we go, I would appreciate some
>> commit IDs to reference in the commit message for tearing them out so
>> that if they end up backporting to stable we know how far they should go.
>>
>