Ben Skeggs
2025-Jul-09 01:14 UTC
[REGRESSION] NVIDIA ION graphics broken with Linux 6.16-rc*
On 7/9/25 09:16, Jamie Heilman wrote:
> Rui Salvaterra wrote:
>> Hi,
>>
>> The machine (Atom 330 CPU, ION chipset, GeForce 9400M graphics) works,
>> but graphics are dead. Dmesg shows the following (Linux 6.16-rc5):
>>
>> [ 34.408331] BUG: kernel NULL pointer dereference, address: 0000000000000000
>> [ 34.408351] #PF: supervisor instruction fetch in kernel mode
>> [ 34.408358] #PF: error_code(0x0010) - not-present page
>> [ 34.408364] PGD 0 P4D 0
>> [ 34.408373] Oops: Oops: 0010 [#1] SMP
>> [ 34.408383] CPU: 2 UID: 0 PID: 583 Comm: Xorg Not tainted
>> 6.16.0-rc5-dbg+ #187 PREEMPTLAZY
>> [ 34.408393] Hardware name: To Be Filled By O.E.M. To Be Filled By
>> O.E.M./To be filled by O.E.M., BIOS 080015 08/13/2009
>> [ 34.408399] RIP: 0010:0x0
>> [ 34.408414] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
>> [ 34.408420] RSP: 0018:ffff88800378bc08 EFLAGS: 00010202
>> [ 34.408428] RAX: ffffffff82071c60 RBX: ffff888008e6f000 RCX: 0000000000000978
>> [ 34.408434] RDX: 0000000000000020 RSI: 0000000000000002 RDI: ffff888008e6f000
>> [ 34.408440] RBP: ffff88800378bd18 R08: 0000000000000000 R09: 00000000000003ff
>> [ 34.408445] R10: 0000000000000000 R11: ffff88800378bcc0 R12: ffff88800378bdb8
>> [ 34.408451] R13: ffff888007dad9c0 R14: ffff888004285680 R15: ffff888007e671c0
>> [ 34.408457] FS: 00007f2cc7b2eb00(0000) GS:ffff888149ecf000(0000)
>> knlGS:0000000000000000
>> [ 34.408464] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 34.408469] CR2: ffffffffffffffd6 CR3: 0000000008a08000 CR4: 00000000000006f0
>> [ 34.408475] Call Trace:
>> [ 34.408482] <TASK>
>> [ 34.408486] nouveau_gem_ioctl_pushbuf+0x10d8/0x1240
>> [ 34.408504] ? nouveau_gem_ioctl_new+0x160/0x160
>> [ 34.408513] drm_ioctl_kernel+0x7a/0xe0
>> [ 34.408524] drm_ioctl+0x1ef/0x490
>> [ 34.408532] ? nouveau_gem_ioctl_new+0x160/0x160
>> [ 34.408541] ? __handle_mm_fault+0xff2/0x1510
>> [ 34.408552] nouveau_drm_ioctl+0x50/0xa0
>> [ 34.408560] __x64_sys_ioctl+0x4be/0xa90
>> [ 34.408570] ? handle_mm_fault+0xb5/0x1a0
>> [ 34.408578] ? lock_mm_and_find_vma+0x34/0x170
>> [ 34.408587] do_syscall_64+0x51/0x1d0
>> [ 34.408596] entry_SYSCALL_64_after_hwframe+0x4b/0x53
>> [ 34.408605] RIP: 0033:0x7f2cc7d2f9dd
>> [ 34.408612] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10
>> c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00
>> 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00
>> 00 00
>> [ 34.408620] RSP: 002b:00007fff6a501ee0 EFLAGS: 00000246 ORIG_RAX:
>> 0000000000000010
>> [ 34.408628] RAX: ffffffffffffffda RBX: 000055c7792b3f78 RCX: 00007f2cc7d2f9dd
>> [ 34.408634] RDX: 00007fff6a501fa0 RSI: 00000000c0406481 RDI: 0000000000000011
>> [ 34.408640] RBP: 00007fff6a501f30 R08: 0000000000000978 R09: 000055c7792af740
>> [ 34.408645] R10: 0000000000000002 R11: 0000000000000246 R12: 00007fff6a501fa0
>> [ 34.408651] R13: 00000000c0406481 R14: 0000000000000011 R15: 000055c7792ac700
>> [ 34.408660] </TASK>
>> [ 34.408664] Modules linked in:
>> [ 34.408671] CR2: 0000000000000000
>> [ 34.408678] ---[ end trace 0000000000000000 ]---
>> [ 34.408682] RIP: 0010:0x0
>> [ 34.408691] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
>> [ 34.408696] RSP: 0018:ffff88800378bc08 EFLAGS: 00010202
>> [ 34.408703] RAX: ffffffff82071c60 RBX: ffff888008e6f000 RCX: 0000000000000978
>> [ 34.408709] RDX: 0000000000000020 RSI: 0000000000000002 RDI: ffff888008e6f000
>> [ 34.408715] RBP: ffff88800378bd18 R08: 0000000000000000 R09: 00000000000003ff
>> [ 34.408720] R10: 0000000000000000 R11: ffff88800378bcc0 R12: ffff88800378bdb8
>> [ 34.408726] R13: ffff888007dad9c0 R14: ffff888004285680 R15: ffff888007e671c0
>> [ 34.408731] FS: 00007f2cc7b2eb00(0000) GS:ffff888149ecf000(0000)
>> knlGS:0000000000000000
>> [ 34.408738] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 34.408743] CR2: ffffffffffffffd6 CR3: 0000000008a08000 CR4: 00000000000006f0
>> [ 34.408750] note: Xorg[583] exited with irqs disabled
>>
>> Unfortunately, bisecting is not feasible for me.
>
> That looks pretty similar to the problem I posted
> (https://lore.kernel.org/lkml/aElJIo9_Se6tAR1a at audible.transient.net/)
> that I bisected to 862450a85b85 ("drm/nouveau/gf100-: track chan
> progress with non-WFI semaphore release"). It still reverts cleanly
> as of v6.16-rc5 so you might want to give that a shot.

Hi,

Thank you for bisecting! Are you able to try the attached patch?

Thanks,
Ben.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-nouveau-nvif-fix-null-ptr-deref-on-pre-fermi-boa.patch
Type: text/x-patch
Size: 1038 bytes
Desc: 0001-drm-nouveau-nvif-fix-null-ptr-deref-on-pre-fermi-boa.patch
URL: <https://lists.freedesktop.org/archives/nouveau/attachments/20250709/74c68025/attachment-0001.bin>
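A note for readers of the trace in this message: the "#PF" lines together with "RIP: 0010:0x0" are the usual signature of a call through a NULL function pointer. The sketch below decodes the architectural x86 page-fault error code bits to show how 0x0010 reads as "instruction fetch, page not present, supervisor access". The macro names, struct, and the looks_like_null_call() heuristic are illustrative assumptions for this note, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural definition).
 * The macro names here are illustrative, not the kernel's own. */
#define PF_PROT  (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1) /* 0: read access, 1: write access */
#define PF_USER  (1u << 2) /* 0: supervisor-mode access, 1: user-mode access */
#define PF_RSVD  (1u << 3) /* reserved bit was set in a paging entry */
#define PF_INSTR (1u << 4) /* fault was caused by an instruction fetch */

struct pf_info {
    bool not_present;
    bool write;
    bool user;
    bool instr_fetch;
};

/* Split a raw error code into the fields the oops message spells out. */
static struct pf_info decode_pf(uint32_t error_code)
{
    struct pf_info info = {
        .not_present = !(error_code & PF_PROT),
        .write       = (error_code & PF_WRITE) != 0,
        .user        = (error_code & PF_USER) != 0,
        .instr_fetch = (error_code & PF_INSTR) != 0,
    };
    return info;
}

/* Hypothetical heuristic: a supervisor instruction fetch from a
 * not-present page with RIP == 0 suggests a NULL function pointer call. */
static bool looks_like_null_call(uint32_t error_code, uint64_t rip)
{
    struct pf_info info = decode_pf(error_code);

    return info.instr_fetch && info.not_present && !info.user && rip == 0;
}
```

With the values from the trace, looks_like_null_call(0x0010, 0x0) evaluates to true, which is consistent with the caller frame nouveau_gem_ioctl_pushbuf() having jumped through an unset hook.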
Rui Salvaterra
2025-Jul-09 11:45 UTC
[REGRESSION] NVIDIA ION graphics broken with Linux 6.16-rc*
Hi, Ben,

On Wed, 9 Jul 2025 at 02:15, Ben Skeggs <bskeggs at nvidia.com> wrote:
>
[snipped]
>
> Thank you for bisecting! Are you able to try the attached patch?

Thanks a lot, your patch fixes the issue for me! Feel free to add my...

Tested-by: Rui Salvaterra <rsalvaterra at gmail.com>

Kind regards,
Rui
Jamie Heilman
2025-Jul-10 00:16 UTC
[REGRESSION] NVIDIA ION graphics broken with Linux 6.16-rc*
Ben Skeggs wrote:
> On 7/9/25 09:16, Jamie Heilman wrote:
> > Rui Salvaterra wrote:
> > > Unfortunately, bisecting is not feasible for me.
> > That looks pretty similar to the problem I posted
> > (https://lore.kernel.org/lkml/aElJIo9_Se6tAR1a at audible.transient.net/)
> > that I bisected to 862450a85b85 ("drm/nouveau/gf100-: track chan
> > progress with non-WFI semaphore release"). It still reverts cleanly
> > as of v6.16-rc5 so you might want to give that a shot.
>
> Hi,
>
> Thank you for bisecting! Are you able to try the attached patch?

Yeah, that got graphics visible again for me, though there's something
else horrible going on now (still? I'm not sure if it's new behavior or
not) that blows out my dmesg ring buffer with errors or warnings of some
kind. I was just about to start trying to debug that when some power
event seems to have fried my PSU. Combined with a bunch of filesystem
corruption, it's going to be a while before I can get that system back
up to the point where I can troubleshoot it again; the root volume is
fried and I'm going to have to rebuild. Anyway, I think whatever it is
was probably an entirely separate issue.

> From 6987c1c254285305fdc20270e21709a313632e0d Mon Sep 17 00:00:00 2001
> From: Ben Skeggs <bskeggs at nvidia.com>
> Date: Wed, 9 Jul 2025 10:54:15 +1000
> Subject: [PATCH] drm/nouveau/nvif: fix null ptr deref on pre-fermi boards
>
> Check that gpfifo.post() exists before trying to call it.
>
> Fixes: 862450a85b85 ("drm/nouveau/gf100-: track chan progress with non-WFI semaphore release")
> Signed-off-by: Ben Skeggs <bskeggs at nvidia.com>
> ---
>  drivers/gpu/drm/nouveau/nvif/chan.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nvif/chan.c b/drivers/gpu/drm/nouveau/nvif/chan.c
> index baa10227d51a..80c01017d642 100644
> --- a/drivers/gpu/drm/nouveau/nvif/chan.c
> +++ b/drivers/gpu/drm/nouveau/nvif/chan.c
> @@ -39,6 +39,9 @@ nvif_chan_gpfifo_post(struct nvif_chan *chan)
>  	const u32 pbptr = (chan->push.cur - map) + chan->func->gpfifo.post_size;
>  	const u32 gpptr = (chan->gpfifo.cur + 1) & chan->gpfifo.max;
>
> +	if (!chan->func->gpfifo.post)
> +		return 0;
> +
>  	return chan->func->gpfifo.post(chan, gpptr, pbptr);
>  }
>
> --
> 2.49.0
>

--
Jamie Heilman                     http://audible.transient.net/~jamie/
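For readers following the fix quoted in this message, here is a minimal standalone sketch of the same pattern: a per-generation table of function pointers in which a hook may be left NULL, and a caller that checks the hook before invoking it. Every name below (struct chan, chan_func, fermi_post, chan_post, and the two tables) is hypothetical and only loosely modelled on the diff; it is not the nouveau code, and treating a missing hook as success (return 0) simply mirrors what the patch does.

```c
#include <stddef.h>
#include <stdint.h>

struct chan;

/* Per-hardware-generation hooks; a NULL entry means "not implemented". */
struct chan_func {
    int (*post)(struct chan *chan, uint32_t gpptr, uint32_t pbptr);
};

struct chan {
    const struct chan_func *func;
    uint32_t posted; /* how many times the hook ran (illustration only) */
};

/* Hypothetical hook for generations that support the post step. */
static int fermi_post(struct chan *chan, uint32_t gpptr, uint32_t pbptr)
{
    (void)gpptr;
    (void)pbptr;
    chan->posted++;
    return 0;
}

static const struct chan_func fermi_func     = { .post = fermi_post };
static const struct chan_func pre_fermi_func = { .post = NULL };

/* Guarded call: without the check, a NULL hook would make the CPU jump
 * to address 0, which is the "RIP: 0010:0x0" failure seen in the oops. */
static int chan_post(struct chan *chan, uint32_t gpptr, uint32_t pbptr)
{
    if (!chan->func->post)
        return 0; /* nothing to do on this generation */

    return chan->func->post(chan, gpptr, pbptr);
}
```

A channel whose func points at pre_fermi_func returns 0 from chan_post() without touching the hook, while one using fermi_func runs fermi_post() and increments posted.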