Chris Clayton
2023-Jan-30 23:09 UTC
[Nouveau] linux-6.2-rc4+ hangs on poweroff/reboot: Bisected
Hi again.

On 30/01/2023 20:19, Chris Clayton wrote:
> Thanks, Ben.
>
> <snip>
>
>> Hey,
>>
>> This is a complete shot-in-the-dark, as I don't see this behaviour on
>> *any* of my boards. Could you try the attached patch please?
>
> Unfortunately, the patch made no difference.
>
> I've been looking at how the graphics on my laptop is set up, and have a bit of a worry about whether the firmware might
> be playing a part in this problem. In order to offload video decoding to the NVidia TU117 GPU, it seems the scrubber
> firmware must be available, but as far as I know, that has not been released by NVidia. To get it to work, I followed
> what Ubuntu have done and the scrubber in /lib/firmware/nvidia/tu117/nvdec/ is a symlink to
> ../../tu116/nvdev/scrubber.bin. That, of course, means that some of the firmware being loaded is for a different
> card. I note that processing related to firmware is being changed in the patch. Might my set up be at the root of my
> problem?
>
> I'll have a fiddle and see what I can work out.
>
> Chris
>
>> Thanks,
>> Ben.

Well, my fiddling has got my system rebooting and shutting down successfully again. I found that if I delete the symlink to the scrubber firmware, reboot and shutdown work again. There are, however, a number of other files in the tu117 firmware directory tree that are symlinks to actual files in its tu116 counterpart, so I deleted all of those too. Unfortunately, the absence of one or more of those symlinks causes Xorg to fail to start. I've reinstated all the links except scrubber, and I now have a system that works as it did until I tried to run a kernel that includes the bad commit I identified in my bisection. That includes offloading video decoding to the NVidia card, so whatever I read that said the scrubber firmware was needed seems to have been wrong. I get a new message (nouveau 0000:01:00.0: fb: VPR locked, but no scrubber binary!), but, hey, we can't have everything.
If you still want to get to the bottom of this, let me know what you need me to provide and I'll do my best. I suspect you might want to, because there will be an awful lot of Ubuntu-based systems out there with that scrubber.bin symlink in place. On the other hand, it could be quite a while before Ubuntu are deploying 6.2 or later kernels.

Thanks,

Chris

<snip>
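For anyone wanting to check their own setup, the symlinks described above can be listed with a few lines of Python. This is only a sketch: the helper name find_symlinks is made up for illustration, and the directory is the one quoted in the message, which may differ on other distributions or chipsets.

```python
import os


def find_symlinks(root):
    """Return sorted (link_path, target) pairs for every symlink below root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk does not follow symlinks by default; links to directories
        # show up in dirnames and links to files in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # readlink returns the target exactly as stored in the link.
                found.append((path, os.readlink(path)))
    return sorted(found)


if __name__ == "__main__":
    # Directory taken from the message; adjust it for other GPUs.
    for link, target in find_symlinks("/lib/firmware/nvidia/tu117"):
        print(f"{link} -> {target}")
```

Each output line shows one link and the target stored in it, which makes it easy to see which tu117 entries point into the tu116 tree.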
Ben Skeggs
2023-Jan-30 23:27 UTC
[Nouveau] linux-6.2-rc4+ hangs on poweroff/reboot: Bisected
On Tue, 31 Jan 2023 at 09:09, Chris Clayton <chris2553 at googlemail.com> wrote:
>
> Hi again.
>
> <snip>
>
> Well, my fiddling has got my system rebooting and shutting down successfully again. I found that if I delete the symlink
> to the scrubber firmware, reboot and shutdown work again. There are, however, a number of other files in the tu117
> firmware directory tree that are symlinks to actual files in its tu116 counterpart. So I deleted all of those too.
> Unfortunately, the absence of one or more of those symlinks causes Xorg to fail to start. I've reinstated all the links
> except scrubber and I now have a system that works as it did until I tried to run a kernel that includes the bad commit
> I identified in my bisection. That includes offloading video decoding to the NVidia card, so whatever I read that said
> the scrubber firmware was needed seems to have been wrong. I get a new message that (nouveau 0000:01:00.0: fb: VPR
> locked, but no scrubber binary!), but, hey, we can't have everything.
>
> If you still want to get to the bottom of this, let me know what you need me to provide and I'll do my best. I suspect
> you might want to because there will be an awful lot of Ubuntu-based systems out there with that scrubber.bin symlink in
> place. On the other hand, it could be quite a while before Ubuntu are deploying 6.2 or later kernels.

The symlinks are correct - whole groups of GPUs share the same FW, and we use symlinks in linux-firmware to represent this.

I don't really have any ideas how/why this patch causes issues with shutdown - it's a path that only gets executed during initialisation.

Can you try and capture the kernel log during shutdown ("dmesg -w" over ssh? netconsole?), and see if there are any relevant messages providing a hint at what's going on? Alternatively, you could try unloading the module (you will have to stop X/wayland/gdm/etc/etc first) and seeing if that hangs too.

Ben.

>
> Thanks,
>
> Chris
>
> <snip>
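One possible way to do the "dmesg -w over ssh" capture suggested above is to run a small script on a second machine, so the log survives when the laptop hangs. This is a rough sketch under assumptions, not a recommended procedure: the script name, the stream_to_file helper and the HOST and LOGFILE arguments are all illustrative, and reading the kernel log on the remote side may need root depending on how the system is configured.

```python
import os
import subprocess
import sys


def stream_to_file(cmd, log_path):
    """Run cmd, append each output line to log_path, and sync it to disk.

    Returns the number of lines written.
    """
    count = 0
    with open(log_path, "a", encoding="utf-8") as log:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                text=True, errors="replace")
        for line in proc.stdout:
            log.write(line)
            # Flush and fsync per line so the last messages before a hang
            # are already on disk.
            log.flush()
            os.fsync(log.fileno())
            count += 1
        proc.wait()
    return count


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 capture.py HOST LOGFILE
    host, log_path = sys.argv[1], sys.argv[2]
    # "dmesg -w" keeps printing new kernel messages until the link drops.
    stream_to_file(["ssh", host, "dmesg", "-w"], log_path)
```

Start it before issuing the poweroff or reboot on the laptop; whatever the kernel printed before the ssh session died ends up at the tail of the log file.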