Linux regression tracking (Thorsten Leemhuis)
2023-Oct-31 09:18 UTC
[Nouveau] [REGRESSION]: nouveau: Asynchronous wait on fence
On 28.10.23 04:46, Owen T. Heisler wrote:
> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>
> ## Problem
>
> 1. Connect external display to DVI port on dock and run X with both
>    displays in use.
> 2. Wait hours or days.
> 3. Suddenly the secondary Nvidia-connected display turns off and X stops
>    responding to keyboard/mouse input. In *some* cases it is possible to
>    switch to a virtual TTY with Ctrl+Alt+Fn and log in there. In any
>    case, shutdown/reboot after this happens is *usually* not successful
>    (forced power-off is required).
>
> This started happening after the upgrade to Debian bullseye, and the
> problem remains with Debian bookworm.
> [...]

Thanks for your report. With a bit of luck someone will look into this,
but I doubt it, as some aspects of this report make it likely to be
ignored. Mainly: (a) the report was about a stable/longterm kernel and
(b) it's afaics unclear if the problem even happens with the latest
mainline kernel. For details about these aspects, see:

https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/

You thus might want to check if the problem occurs with 6.6 -- and
ideally also check if reverting the culprit there fixes things for you.
That might help get things rolling, but it's a pretty old regression,
which complicates things.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
> Thanks for your report. With a bit of luck someone will look into this,
> but I doubt it, as this report has some aspects why it might be ignored.
> Mainly: (a) the report was about a stable/longterm kernel and (b) it's
> afaics unclear if the problem even happens with the latest mainline
> kernel.

> You thus might want to check if the problem occurs with 6.6 -- and
> ideally also check if reverting the culprit there fixes things for you.

Thorsten,

Thank you for your reply and suggestions. I will try (a) testing on
mainline (when I tried before, I was interrupted by another, unrelated
regression) and (b) reverting the culprit commit there if I am able to
reproduce the problem.

Thanks,
Owen

--
Owen T. Heisler
https://owenh.net
On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 28.10.23 04:46, Owen T. Heisler wrote:
>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>
>> ## Problem
>>
>> 1. Connect external display to DVI port on dock and run X with both
>>    displays in use.
>> 2. Wait hours or days.
>> 3. Suddenly the secondary Nvidia-connected display turns off and X stops
>>    responding to keyboard/mouse input. In *some* cases it is possible to
>>    switch to a virtual TTY with Ctrl+Alt+Fn and log in there.

> You thus might want to check if the problem occurs with 6.6 -- and
> ideally also check if reverting the culprit there fixes things for you.

Hi Thorsten and others,

The problem also occurs with v6.6. Here is a decoded kernel log from an
untainted kernel:

https://gitlab.freedesktop.org/drm/nouveau/uploads/c120faf09da46f9c74006df9f1d14442/async-wait-on-fence-180.log

The culprit commit does not revert cleanly on v6.6. I have not yet
attempted to resolve the conflicts.

I have also updated the bug description at
<https://gitlab.freedesktop.org/drm/nouveau/-/issues/180>.

Thanks,
Owen
Linux regression tracking (Thorsten Leemhuis)
2023-Nov-21 15:16 UTC
[Nouveau] [REGRESSION]: nouveau: Asynchronous wait on fence
On 15.11.23 07:19, Owen T. Heisler wrote:
> On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 28.10.23 04:46, Owen T. Heisler wrote:
>>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>>
>>> ## Problem
>>>
>>> 1. Connect external display to DVI port on dock and run X with both
>>>    displays in use.
>>> 2. Wait hours or days.
>>> 3. Suddenly the secondary Nvidia-connected display turns off and X stops
>>>    responding to keyboard/mouse input. In *some* cases it is possible to
>>>    switch to a virtual TTY with Ctrl+Alt+Fn and log in there.
>
>> You thus might want to check if the problem occurs with 6.6 -- and
>> ideally also check if reverting the culprit there fixes things for you.
>
> The problem also occurs with v6.6.

In the meantime you might want to give 6.7-rc a try as well on the off
chance that it improves things, even if that is unlikely.

> Here is a decoded kernel log from an
> untainted kernel:
>
> https://gitlab.freedesktop.org/drm/nouveau/uploads/c120faf09da46f9c74006df9f1d14442/async-wait-on-fence-180.log
>
> The culprit commit does not revert cleanly on v6.6. I have not yet
> attempted to resolve the conflicts.
>
> I have also updated the bug description at
> <https://gitlab.freedesktop.org/drm/nouveau/-/issues/180>.

Maybe one of the nouveau developers can take a quick look at
d386a4b54607cf and suggest a simple way to revert it in the latest
mainline. Maybe just removing the main chunk of code that it adds is
all that it takes.

Ciao, Thorsten