Karol Herbst
2023-Aug-08 10:39 UTC
[Nouveau] 2b5d1c29f6c4 ("drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts")
On Mon, Aug 7, 2023 at 5:05 PM Borislav Petkov <bp at alien8.de> wrote: > > On Mon, Aug 07, 2023 at 01:49:42PM +0200, Karol Herbst wrote: > > in what way does it stop? Just not progressing? That would be kinda > > concerning. Mind tracing with what arguments `nvkm_uevent_add` is > > called with and without that patch? > > Well, me dumping those args I guess made the box not freeze before > catching a #PF over serial. Does that help? > > .... > [ 3.410135] Unpacking initramfs... > [ 3.416319] software IO TLB: mapped [mem 0x00000000a877d000-0x00000000ac77d000] (64MB) > [ 3.418227] Initialise system trusted keyrings > [ 3.432273] workingset: timestamp_bits=56 max_order=22 bucket_order=0 > [ 3.439006] ntfs: driver 2.1.32 [Flags: R/W]. > [ 3.443368] fuse: init (API version 7.38) > [ 3.447601] 9p: Installing v9fs 9p2000 file system support > [ 3.453223] Key type asymmetric registered > [ 3.457332] Asymmetric key parser 'x509' registered > [ 3.462236] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250) > [ 3.475865] efifb: probing for efifb > [ 3.479458] efifb: framebuffer at 0xf9000000, using 1920k, total 1920k > [ 3.485969] efifb: mode is 800x600x32, linelength=3200, pages=1 > [ 3.491872] efifb: scrolling: redraw > [ 3.495438] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0 > [ 3.502349] Console: switching to colour frame buffer device 100x37 > [ 3.509564] fb0: EFI VGA frame buffer device > [ 3.514013] ACPI: \_PR_.CP00: Found 4 idle states > [ 3.518850] ACPI: \_PR_.CP01: Found 4 idle states > [ 3.523687] ACPI: \_PR_.CP02: Found 4 idle states > [ 3.528515] ACPI: \_PR_.CP03: Found 4 idle states > [ 3.533346] ACPI: \_PR_.CP04: Found 4 idle states > [ 3.538173] ACPI: \_PR_.CP05: Found 4 idle states > [ 3.543003] ACPI: \_PR_.CP06: Found 4 idle states > [ 3.544219] Freeing initrd memory: 8196K > [ 3.547844] ACPI: \_PR_.CP07: Found 4 idle states > [ 3.609542] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled > [ 3.616224] 00:05: ttyS0 at I/O 0x3f8 (irq =
4, base_baud = 115200) is a 16550A > [ 3.625552] serial 0000:00:16.3: enabling device (0000 -> 0003) > [ 3.633034] 0000:00:16.3: ttyS1 at I/O 0xf0a0 (irq = 17, base_baud = 115200) is a 16550A > [ 3.642451] Linux agpgart interface v0.103 > [ 3.647141] ACPI: bus type drm_connector registered > [ 3.653261] Console: switching to colour dummy device 80x25 > [ 3.659092] nouveau 0000:03:00.0: vgaarb: deactivate vga console > [ 3.665174] nouveau 0000:03:00.0: NVIDIA GT218 (0a8c00b1) > [ 3.784585] nouveau 0000:03:00.0: bios: version 70.18.83.00.08 > [ 3.792244] nouveau 0000:03:00.0: fb: 512 MiB DDR3 > [ 3.948786] nouveau 0000:03:00.0: DRM: VRAM: 512 MiB > [ 3.953755] nouveau 0000:03:00.0: DRM: GART: 1048576 MiB > [ 3.959073] nouveau 0000:03:00.0: DRM: TMDS table version 2.0 > [ 3.964808] nouveau 0000:03:00.0: DRM: DCB version 4.0 > [ 3.969938] nouveau 0000:03:00.0: DRM: DCB outp 00: 02000360 00000000 > [ 3.976367] nouveau 0000:03:00.0: DRM: DCB outp 01: 02000362 00020010 > [ 3.982792] nouveau 0000:03:00.0: DRM: DCB outp 02: 028003a6 0f220010 > [ 3.989223] nouveau 0000:03:00.0: DRM: DCB outp 03: 01011380 00000000 > [ 3.995647] nouveau 0000:03:00.0: DRM: DCB outp 04: 08011382 00020010 > [ 4.002076] nouveau 0000:03:00.0: DRM: DCB outp 05: 088113c6 0f220010 > [ 4.008511] nouveau 0000:03:00.0: DRM: DCB conn 00: 00101064 > [ 4.014151] nouveau 0000:03:00.0: DRM: DCB conn 01: 00202165 > [ 4.021710] nvkm_uevent_add: uevent: 0xffff888100242100, event: 0xffff8881022de1a0, id: 0x0, bits: 0x1, func: 0x0000000000000000 > [ 4.033680] nvkm_uevent_add: uevent: 0xffff888100242300, event: 0xffff8881022de1a0, id: 0x0, bits: 0x1, func: 0x0000000000000000 > [ 4.045429] nouveau 0000:03:00.0: DRM: MM: using COPY for buffer copies > [ 4.052059] stackdepot: allocating hash table of 1048576 entries via kvcalloc > [ 4.067191] nvkm_uevent_add: uevent: 0xffff888100242800, event: 0xffff888104b3e260, id: 0x0, bits: 0x1, func: 0x0000000000000000 > [ 4.078936] nvkm_uevent_add: uevent: 0xffff888100242900, 
event: 0xffff888104b3e260, id: 0x1, bits: 0x1, func: 0x0000000000000000 > [ 4.090514] nvkm_uevent_add: uevent: 0xffff888100242a00, event: 0xffff888102091f28, id: 0x1, bits: 0x3, func: 0xffffffff8177b700 > [ 4.102118] tsc: Refined TSC clocksource calibration: 3591.345 MHz > [ 4.108342] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x33c4635c383, max_idle_ns: 440795314831 ns > [ 4.108401] nvkm_uevent_add: uevent: 0xffff8881020b6000, event: 0xffff888102091f28, id: 0xf, bits: 0x3, func: 0xffffffff8177b700 > [ 4.129864] clocksource: Switched to clocksource tsc > [ 4.131478] [drm] Initialized nouveau 1.3.1 20120801 for 0000:03:00.0 on minor 0 > [ 4.143806] BUG: kernel NULL pointer dereference, address: 0000000000000020

ahh, that would have been good to know :) Mind figuring out what's exactly NULL inside nvif_object_mthd? Or rather what line `nvif_object_mthd+0x136` belongs to, then it should be easy to figure out what's wrong here.

> [ 4.144676] #PF: supervisor read access in kernel mode > [ 4.144676] #PF: error_code(0x0000) - not-present page > [ 4.144676] PGD 0 P4D 0 > [ 4.144676] Oops: 0000 [#1] PREEMPT SMP PTI > [ 4.144676] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc5-dirty #1 > [ 4.144676] Hardware name: Dell Inc.
Precision T3600/0PTTT9, BIOS A13 05/11/2014 > [ 4.144676] RIP: 0010:nvif_object_mthd+0x136/0x1e0 > [ 4.144676] Code: f2 4c 89 ee 48 8d 7c 24 20 66 89 04 24 c6 44 24 18 00 e8 8d 04 4e 00 41 8d 56 20 49 8b 44 24 08 83 fa 17 76 7d 49 39 c4 74 45 <48> 8b 78 20 4c 89 64 24 10 48 8b 40 38 c6 44 24 06 ff 31 c9 48 89 > [ 4.144676] RSP: 0000:ffffc90000023888 EFLAGS: 00010282 > [ 4.144676] RAX: 0000000000000000 RBX: ffff8881003bc000 RCX: 0000000000000008 > [ 4.144676] RDX: 0000000000000028 RSI: ffffc90000023948 RDI: ffffc900000238a8 > [ 4.144676] RBP: ffff8881003bc620 R08: ffff888102170000 R09: ffff888102170000 > [ 4.144676] R10: 0000000000000002 R11: 0000000000000001 R12: ffff8881003bc620 > [ 4.144676] R13: ffffc90000023948 R14: 0000000000000008 R15: 0000000000000000 > [ 4.144676] FS: 0000000000000000(0000) GS:ffff88843a700000(0000) knlGS:0000000000000000 > [ 4.144676] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 4.144676] CR2: 0000000000000020 CR3: 000000000641e001 CR4: 00000000000606e0 > [ 4.144676] Call Trace: > [ 4.144676] <TASK> > [ 4.144676] ? __die+0x20/0x70 > [ 4.144676] ? page_fault_oops+0x14c/0x430 > [ 4.144676] ? fixup_exception+0x22/0x340 > [ 4.144676] ? kernelmode_fixup_or_oops+0x84/0x110 > [ 4.144676] ? exc_page_fault+0x66/0x1b0 > [ 4.144676] ? asm_exc_page_fault+0x22/0x30 > [ 4.144676] ? nvif_object_mthd+0x136/0x1e0 > [ 4.144676] ? nvif_object_mthd+0x123/0x1e0 > [ 4.144676] ? rcu_is_watching+0xd/0x40 > [ 4.144676] ? __mutex_lock+0xc9/0x790 > [ 4.144676] ? nouveau_dp_detect+0x67/0x4e0 > [ 4.144676] nvif_conn_hpd_status+0x22/0xd0 > [ 4.144676] nouveau_dp_detect+0x33b/0x4e0 > [ 4.144676] ? rt_mutex_unlock+0xf5/0x110 > [ 4.144676] nouveau_connector_detect+0x10f/0x470 > [ 4.144676] drm_helper_probe_detect+0x81/0xa0 > [ 4.144676] drm_helper_probe_single_connector_modes+0x441/0x510 > [ 4.144676] drm_client_modeset_probe+0x1f8/0xca0 > [ 4.144676] __drm_fb_helper_initial_config_and_unlock+0x34/0x560 > [ 4.144676] ? __mutex_lock+0xc9/0x790 > [ 4.144676] ? 
drm_client_register+0x22/0xa0 > [ 4.144676] drm_fbdev_generic_client_hotplug+0x66/0xc0 > [ 4.144676] drm_client_register+0x64/0xa0 > [ 4.144676] nouveau_drm_probe+0x20d/0x230 > [ 4.144676] local_pci_probe+0x46/0xa0 > [ 4.144676] pci_device_probe+0xaf/0x200 > [ 4.144676] really_probe+0xc2/0x2d0 > [ 4.144676] __driver_probe_device+0x73/0x120 > [ 4.144676] driver_probe_device+0x1e/0xe0 > [ 4.144676] __driver_attach+0x8a/0x190 > [ 4.144676] ? __pfx___driver_attach+0x10/0x10 > [ 4.144676] bus_for_each_dev+0x6a/0xb0 > [ 4.144676] bus_add_driver+0xeb/0x1f0 > [ 4.144676] driver_register+0x5c/0x120 > [ 4.144676] ? __pfx_nouveau_drm_init+0x10/0x10 > [ 4.144676] do_one_initcall+0x5b/0x280 > [ 4.144676] kernel_init_freeable+0x186/0x2f0 > [ 4.144676] ? __pfx_kernel_init+0x10/0x10 > [ 4.144676] kernel_init+0x16/0x1b0 > [ 4.144676] ret_from_fork+0x30/0x50 > [ 4.144676] ? __pfx_kernel_init+0x10/0x10 > [ 4.144676] ret_from_fork_asm+0x1b/0x30 > [ 4.144676] </TASK> > [ 4.144676] Modules linked in: > [ 4.144676] CR2: 0000000000000020 > [ 4.144676] ---[ end trace 0000000000000000 ]--- > [ 4.144676] RIP: 0010:nvif_object_mthd+0x136/0x1e0 > [ 4.144676] Code: f2 4c 89 ee 48 8d 7c 24 20 66 89 04 24 c6 44 24 18 00 e8 8d 04 4e 00 41 8d 56 20 49 8b 44 24 08 83 fa 17 76 7d 49 39 c4 74 45 <48> 8b 78 20 4c 89 64 24 10 48 8b 40 38 c6 44 24 06 ff 31 c9 48 89 > [ 4.144676] RSP: 0000:ffffc90000023888 EFLAGS: 00010282 > [ 4.144676] RAX: 0000000000000000 RBX: ffff8881003bc000 RCX: 0000000000000008 > [ 4.144676] RDX: 0000000000000028 RSI: ffffc90000023948 RDI: ffffc900000238a8 > [ 4.144676] RBP: ffff8881003bc620 R08: ffff888102170000 R09: ffff888102170000 > [ 4.144676] R10: 0000000000000002 R11: 0000000000000001 R12: ffff8881003bc620 > [ 4.144676] R13: ffffc90000023948 R14: 0000000000000008 R15: 0000000000000000 > [ 4.144676] FS: 0000000000000000(0000) GS:ffff88843a700000(0000) knlGS:0000000000000000 > [ 4.144676] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 4.144676] CR2: 0000000000000020 CR3: 
000000000641e001 CR4: 00000000000606e0 > [ 4.144676] note: swapper/0[1] exited with irqs disabled > [ 4.549714] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 > [ 4.550687] Kernel Offset: disabled > [ 4.550687] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]--- > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette >
Borislav Petkov
2023-Aug-08 13:47 UTC
[Nouveau] 2b5d1c29f6c4 ("drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts")
On Tue, Aug 08, 2023 at 12:39:32PM +0200, Karol Herbst wrote:
> ahh, that would have been good to know :)

Yeah, I didn't see it before - it would only freeze. Only after I added the printk you requested.

> Mind figuring out what's exactly NULL inside nvif_object_mthd? Or
> rather what line `nvif_object_mthd+0x136` belongs to, then it should
> be easy to figure out what's wrong here.

That looks like this:

ffffffff816ddfee: e8 8d 04 4e 00 callq ffffffff81bbe480 <__memcpy>
ffffffff816ddff3: 41 8d 56 20 lea 0x20(%r14),%edx
ffffffff816ddff7: 49 8b 44 24 08 mov 0x8(%r12),%rax
ffffffff816ddffc: 83 fa 17 cmp $0x17,%edx
ffffffff816ddfff: 76 7d jbe ffffffff816de07e <nvif_object_mthd+0x1ae>
ffffffff816de001: 49 39 c4 cmp %rax,%r12
ffffffff816de004: 74 45 je ffffffff816de04b <nvif_object_mthd+0x17b>

<--- RIP points here.

The 0x20 also fits the deref address: 0000000000000020.

Which means %rax is 0. Yap.

ffffffff816de006: 48 8b 78 20 mov 0x20(%rax),%rdi
ffffffff816de00a: 4c 89 64 24 10 mov %r12,0x10(%rsp)
ffffffff816de00f: 48 8b 40 38 mov 0x38(%rax),%rax
ffffffff816de013: c6 44 24 06 ff movb $0xff,0x6(%rsp)
ffffffff816de018: 31 c9 xor %ecx,%ecx
ffffffff816de01a: 48 89 e6 mov %rsp,%rsi
ffffffff816de01d: 48 8b 40 28 mov 0x28(%rax),%rax
ffffffff816de021: e8 3a 0c 4f 00 callq ffffffff81bcec60 <__x86_indirect_thunk_array>

Now, the preprocessed asm version of nvif/object.c says around here:

call memcpy #
# drivers/gpu/drm/nouveau/nvif/object.c:160: ret = nvif_object_ioctl(object, args, sizeof(*args) + size, NULL);
leal 32(%r14), %edx #, _108
# drivers/gpu/drm/nouveau/nvif/object.c:33: struct nvif_client *client = object->client;
movq 8(%r12), %rax # object_19(D)->client, client
# drivers/gpu/drm/nouveau/nvif/object.c:38: if (size >= sizeof(*args) && args->v0.version == 0) {
cmpl $23, %edx #, _108
jbe .L69 #,
# drivers/gpu/drm/nouveau/nvif/object.c:39: if (object != &client->object)
cmpq %rax, %r12 # client, object
je .L70 #,
# drivers/gpu/drm/nouveau/nvif/object.c:47: return client->driver->ioctl(client->object.priv, data, size, hack);
movq 32(%rax), %rdi # client_109->object.priv, client_109->object.priv

So I'd say that client is NULL. IINM.

movq %r12, 16(%rsp) # object, MEM[(union *)&stack].v0.object
# drivers/gpu/drm/nouveau/nvif/object.c:47: return client->driver->ioctl(client->object.priv, data, size, hack);
movq 56(%rax), %rax # client_109->driver, client_109->driver
# drivers/gpu/drm/nouveau/nvif/object.c:43: args->v0.owner = NVIF_IOCTL_V0_OWNER_ANY;
movb $-1, 6(%rsp) #, MEM[(union *)&stack].v0.owner
.L64:
# drivers/gpu/drm/nouveau/nvif/object.c:47: return client->driver->ioctl(client->object.priv, data, size, hack);
xorl %ecx, %ecx #
movq %rsp, %rsi #,
movq 40(%rax), %rax #, _77->ioctl
call __x86_indirect_thunk_rax
# drivers/gpu/drm/nouveau/nvif/object.c:161: memcpy(data, args->mthd.data, size);

> > [ 4.144676] #PF: supervisor read access in kernel mode
> > [ 4.144676] #PF: error_code(0x0000) - not-present page
> > [ 4.144676] PGD 0 P4D 0
> > [ 4.144676] Oops: 0000 [#1] PREEMPT SMP PTI
> > [ 4.144676] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc5-dirty #1
> > [ 4.144676] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A13 05/11/2014
> > [ 4.144676] RIP: 0010:nvif_object_mthd+0x136/0x1e0
> > [ 4.144676] Code: f2 4c 89 ee 48 8d 7c 24 20 66 89 04 24 c6 44 24 18 00 e8 8d 04 4e 00 41 8d 56 20 49 8b 44 24 08 83 fa 17 76 7d 49 39 c4 74 45 <48> 8b 78 20 4c 89 64 24 10 48 8b 40 38 c6 44 24 06 ff 31 c9 48 89

Opcode bytes around RIP look correct too:

./scripts/decodecode < /tmp/oops
[ 4.144676] Code: f2 4c 89 ee 48 8d 7c 24 20 66 89 04 24 c6 44 24 18 00 e8 8d 04 4e 00 41 8d 56 20 49 8b 44 24 08 83 fa 17 76 7d 49 39 c4 74 45 <48> 8b 78 20 4c 89 64 24 10 48 8b 40 38 c6 44 24 06 ff 31 c9 48 89
All code
========
   0: f2 4c 89 ee repnz mov %r13,%rsi
   4: 48 8d 7c 24 20 lea 0x20(%rsp),%rdi
   9: 66 89 04 24 mov %ax,(%rsp)
   d: c6 44 24 18 00 movb $0x0,0x18(%rsp)
  12: e8 8d 04 4e 00 callq 0x4e04a4
  17: 41 8d 56 20 lea 0x20(%r14),%edx
  1b: 49 8b 44 24 08 mov 0x8(%r12),%rax
  20: 83 fa 17 cmp $0x17,%edx
  23: 76 7d jbe 0xa2
  25: 49 39 c4 cmp %rax,%r12
  28: 74 45 je 0x6f
  2a:* 48 8b 78 20 mov 0x20(%rax),%rdi <-- trapping instruction
  2e: 4c 89 64 24 10 mov %r12,0x10(%rsp)
  33: 48 8b 40 38 mov 0x38(%rax),%rax
  37: c6 44 24 06 ff movb $0xff,0x6(%rsp)
  3c: 31 c9 xor %ecx,%ecx
  3e: 48 rex.W
  3f: 89 .byte 0x89

HTH.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
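As a side note on the offset arithmetic in the analysis above, here is a minimal userspace sketch of why a NULL `client` faults at address 0x20. The two structs are hypothetical mock-ups, not the real nouveau definitions: their padding is an assumption picked only so that, on an LP64 target, the field offsets match the ones visible in the disassembly (0x8 for `object->client`, 0x20 for `client->object.priv`, 0x38 for `client->driver`). The helper names are made up for illustration as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mock-ups, NOT the real nouveau structures. The padding is an
 * assumption chosen only so that, on an LP64 target, the field offsets match
 * the ones visible in the disassembly.
 */
struct mock_client;

struct mock_object {
	void *placeholder;           /* 0x00: unknown, filler only */
	struct mock_client *client;  /* 0x08: mov 0x8(%r12),%rax */
	char pad[16];                /* 0x10..0x1f: unknown, filler only */
	void *priv;                  /* 0x20: mov 0x20(%rax),%rdi */
};

struct mock_client {
	struct mock_object object;   /* first member, so object.priv sits at 0x20 */
	char pad[16];                /* 0x28..0x37: unknown, filler only */
	const void *driver;          /* 0x38: mov 0x38(%rax),%rax */
};

/* Address the CPU reads when loading client->object.priv. */
static inline uintptr_t priv_load_address(uintptr_t client)
{
	return client + offsetof(struct mock_client, object)
		      + offsetof(struct mock_object, priv);
}

/* Start of a function, given a "symbol+offset" location such as +0x136. */
static inline uintptr_t symbol_start(uintptr_t rip, uintptr_t offset)
{
	return rip - offset;
}
```

With `client` equal to 0, `priv_load_address(0)` evaluates to 0x20, which is the value reported in CR2. Likewise, `symbol_start(0xffffffff816de006, 0x136)` evaluates to 0xffffffff816dded0, so the trapping `mov` in the objdump listing is the instruction at `nvif_object_mthd+0x136`.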
Takashi Iwai
2023-Aug-09 09:21 UTC
[Nouveau] 2b5d1c29f6c4 ("drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts")
On Tue, 08 Aug 2023 12:39:32 +0200, Karol Herbst wrote: > > On Mon, Aug 7, 2023 at 5:05 PM Borislav Petkov <bp at alien8.de> wrote: > > > > On Mon, Aug 07, 2023 at 01:49:42PM +0200, Karol Herbst wrote: > > > in what way does it stop? Just not progressing? That would be kinda > > > concerning. Mind tracing with what arguments `nvkm_uevent_add` is > > > called with and without that patch? > > > > Well, me dumping those args I guess made the box not freeze before > > catching a #PF over serial. Does that help? > > > > .... > > [ 3.410135] Unpacking initramfs... > > [ 3.416319] software IO TLB: mapped [mem 0x00000000a877d000-0x00000000ac77d000] (64MB) > > [ 3.418227] Initialise system trusted keyrings > > [ 3.432273] workingset: timestamp_bits=56 max_order=22 bucket_order=0 > > [ 3.439006] ntfs: driver 2.1.32 [Flags: R/W]. > > [ 3.443368] fuse: init (API version 7.38) > > [ 3.447601] 9p: Installing v9fs 9p2000 file system support > > [ 3.453223] Key type asymmetric registered > > [ 3.457332] Asymmetric key parser 'x509' registered > > [ 3.462236] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250) > > [ 3.475865] efifb: probing for efifb > > [ 3.479458] efifb: framebuffer at 0xf9000000, using 1920k, total 1920k > > [ 3.485969] efifb: mode is 800x600x32, linelength=3200, pages=1 > > [ 3.491872] efifb: scrolling: redraw > > [ 3.495438] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0 > > [ 3.502349] Console: switching to colour frame buffer device 100x37 > > [ 3.509564] fb0: EFI VGA frame buffer device > > [ 3.514013] ACPI: \_PR_.CP00: Found 4 idle states > > [ 3.518850] ACPI: \_PR_.CP01: Found 4 idle states > > [ 3.523687] ACPI: \_PR_.CP02: Found 4 idle states > > [ 3.528515] ACPI: \_PR_.CP03: Found 4 idle states > > [ 3.533346] ACPI: \_PR_.CP04: Found 4 idle states > > [ 3.538173] ACPI: \_PR_.CP05: Found 4 idle states > > [ 3.543003] ACPI: \_PR_.CP06: Found 4 idle states > > [ 3.544219] Freeing initrd memory: 8196K > > [ 3.547844] ACPI: \_PR_.CP07: Found
4 idle states > > [ 3.609542] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled > > [ 3.616224] 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A > > [ 3.625552] serial 0000:00:16.3: enabling device (0000 -> 0003) > > [ 3.633034] 0000:00:16.3: ttyS1 at I/O 0xf0a0 (irq = 17, base_baud = 115200) is a 16550A > > [ 3.642451] Linux agpgart interface v0.103 > > [ 3.647141] ACPI: bus type drm_connector registered > > [ 3.653261] Console: switching to colour dummy device 80x25 > > [ 3.659092] nouveau 0000:03:00.0: vgaarb: deactivate vga console > > [ 3.665174] nouveau 0000:03:00.0: NVIDIA GT218 (0a8c00b1) > > [ 3.784585] nouveau 0000:03:00.0: bios: version 70.18.83.00.08 > > [ 3.792244] nouveau 0000:03:00.0: fb: 512 MiB DDR3 > > [ 3.948786] nouveau 0000:03:00.0: DRM: VRAM: 512 MiB > > [ 3.953755] nouveau 0000:03:00.0: DRM: GART: 1048576 MiB > > [ 3.959073] nouveau 0000:03:00.0: DRM: TMDS table version 2.0 > > [ 3.964808] nouveau 0000:03:00.0: DRM: DCB version 4.0 > > [ 3.969938] nouveau 0000:03:00.0: DRM: DCB outp 00: 02000360 00000000 > > [ 3.976367] nouveau 0000:03:00.0: DRM: DCB outp 01: 02000362 00020010 > > [ 3.982792] nouveau 0000:03:00.0: DRM: DCB outp 02: 028003a6 0f220010 > > [ 3.989223] nouveau 0000:03:00.0: DRM: DCB outp 03: 01011380 00000000 > > [ 3.995647] nouveau 0000:03:00.0: DRM: DCB outp 04: 08011382 00020010 > > [ 4.002076] nouveau 0000:03:00.0: DRM: DCB outp 05: 088113c6 0f220010 > > [ 4.008511] nouveau 0000:03:00.0: DRM: DCB conn 00: 00101064 > > [ 4.014151] nouveau 0000:03:00.0: DRM: DCB conn 01: 00202165 > > [ 4.021710] nvkm_uevent_add: uevent: 0xffff888100242100, event: 0xffff8881022de1a0, id: 0x0, bits: 0x1, func: 0x0000000000000000 > > [ 4.033680] nvkm_uevent_add: uevent: 0xffff888100242300, event: 0xffff8881022de1a0, id: 0x0, bits: 0x1, func: 0x0000000000000000 > > [ 4.045429] nouveau 0000:03:00.0: DRM: MM: using COPY for buffer copies > > [ 4.052059] stackdepot: allocating hash table of 1048576 entries via kvcalloc > > [ 
4.067191] nvkm_uevent_add: uevent: 0xffff888100242800, event: 0xffff888104b3e260, id: 0x0, bits: 0x1, func: 0x0000000000000000 > > [ 4.078936] nvkm_uevent_add: uevent: 0xffff888100242900, event: 0xffff888104b3e260, id: 0x1, bits: 0x1, func: 0x0000000000000000 > > [ 4.090514] nvkm_uevent_add: uevent: 0xffff888100242a00, event: 0xffff888102091f28, id: 0x1, bits: 0x3, func: 0xffffffff8177b700 > > [ 4.102118] tsc: Refined TSC clocksource calibration: 3591.345 MHz > > [ 4.108342] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x33c4635c383, max_idle_ns: 440795314831 ns > > [ 4.108401] nvkm_uevent_add: uevent: 0xffff8881020b6000, event: 0xffff888102091f28, id: 0xf, bits: 0x3, func: 0xffffffff8177b700 > > [ 4.129864] clocksource: Switched to clocksource tsc > > [ 4.131478] [drm] Initialized nouveau 1.3.1 20120801 for 0000:03:00.0 on minor 0 > > [ 4.143806] BUG: kernel NULL pointer dereference, address: 0000000000000020
>
> ahh, that would have been good to know :) Mind figuring out what's
> exactly NULL inside nvif_object_mthd? Or rather what line
> `nvif_object_mthd+0x136` belongs to, then it should be easy to figure
> out what's wrong here.

FWIW, we've hit the bug on openSUSE Tumbleweed 6.4.8 kernel:
https://bugzilla.suse.com/show_bug.cgi?id=1214073

Confirmed that reverting the patch cured the issue.

FWIW, loading nouveau showed a refcount_t warning just before the NULL dereference: [ 163.237655] ACPI Warning: \_SB.PCI0.IXVE.IGPU._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20230331/nsarguments-61) [ 163.237700] ACPI: \_SB_.PCI0.IXVE.IGPU: failed to evaluate _DSM [ 163.237755] nouveau 0000:02:00.0: enabling device (0002 -> 0003) [ 163.238089] ACPI: \_SB_.PCI0.LGPU: Enabled at IRQ 20 [ 163.249419] Console: switching to colour dummy device 80x25 [ 163.266174] nouveau 0000:02:00.0: vgaarb: deactivate vga console [ 163.266307] nouveau 0000:02:00.0: NVIDIA MCP79/MCP7A (0ac180b1) [ 163.287303] nouveau 0000:02:00.0: bios: version 62.79.40.00.01 [ 163.309529] nouveau 0000:02:00.0: fb: 256 MiB stolen system memory [ 163.383121] nouveau 0000:02:00.0: DRM: VRAM: 256 MiB [ 163.383132] nouveau 0000:02:00.0: DRM: GART: 1048576 MiB [ 163.383138] nouveau 0000:02:00.0: DRM: TMDS table version 2.0 [ 163.383142] nouveau 0000:02:00.0: DRM: DCB version 4.0 [ 163.383145] nouveau 0000:02:00.0: DRM: DCB outp 00: 01000123 00010014 [ 163.383150] nouveau 0000:02:00.0: DRM: DCB outp 01: 02021232 00000010 [ 163.383154] nouveau 0000:02:00.0: DRM: DCB outp 02: 02021286 0f220010 [ 163.383158] nouveau 0000:02:00.0: DRM: DCB conn 00: 00000040 [ 163.383162] nouveau 0000:02:00.0: DRM: DCB conn 01: 0000a146 [ 163.385635] nouveau 0000:02:00.0: DRM: MM: using M2MF for buffer copies [ 163.417977] ------------[ cut here ]------------ [ 163.417988] refcount_t: saturated; leaking memory. 
[ 163.418012] WARNING: CPU: 1 PID: 2873 at lib/refcount.c:19 refcount_warn_saturate+0x9b/0x110 [ 163.418022] Modules linked in: nouveau(+) button mxm_wmi i2c_algo_bit drm_display_helper drm_ttm_helper xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype iptable_filter bpfilter br_netfilter bridge stp llc overlay ccm af_packet bnep btusb btrtl btbcm btintel btmtk bluetooth ecdh_generic uvcvideo rtl8xxxu videobuf2_vmalloc mac80211 uvc videobuf2_memops videobuf2_v4l2 videodev libarc4 videobuf2_common mc cfg80211 hid_appleir hid_apple bcm5974 apple_mfi_fastcharge iscsi_ibft iscsi_boot_sysfs joydev rfkill qrtr z3fold snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio coretemp snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_intel snd_hda_core applesmc snd_hwdep kvm snd_pcm irqbypass pcspkr acpi_cpufreq snd_timer binfmt_misc snd forcedeth soundcore squashfs nls_iso8859_1 loop nls_cp437 vfat fat i2c_nforce2 acpi_als industrialio_triggered_buffer kfifo_buf industrialio sbs sbshc apple_bl ac [ 163.418129] tiny_power_button fuse efi_pstore configfs dmi_sysfs ip_tables x_tables hid_generic usbhid ttm video wmi cec ohci_pci ohci_hcd ehci_pci rc_core sr_mod ehci_hcd sha512_ssse3 cdrom usbcore nv_tco btrfs blake2b_generic libcrc32c xor raid6_pq sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr efivarfs [last unloaded: button] [ 163.418177] CPU: 1 PID: 2873 Comm: modprobe Not tainted 6.4.8-1-default #1 openSUSE Tumbleweed 5f0d78911475bf45bbeef64510275b9fba2542b1 [ 163.418183] Hardware name: Apple Inc.
MacBook5,1/Mac-F42D89C8, BIOS MB51.88Z.007D.B03.0904271443 04/27/09 [ 163.418187] RIP: 0010:refcount_warn_saturate+0x9b/0x110 [ 163.418192] Code: 01 01 e8 68 7b aa ff 0f 0b c3 cc cc cc cc 80 3d e6 e1 a8 01 00 75 a8 48 c7 c7 d0 ed 05 a4 c6 05 d6 e1 a8 01 01 e8 45 7b aa ff <0f> 0b c3 cc cc cc cc 80 3d c0 e1 a8 01 00 75 85 48 c7 c7 28 ee 05 [ 163.418196] RSP: 0018:ffffbae941613aa0 EFLAGS: 00010086 [ 163.418200] RAX: 0000000000000000 RBX: ffff951bc88c2000 RCX: 0000000000000027 [ 163.418204] RDX: ffff951cf81274c8 RSI: 0000000000000001 RDI: ffff951cf81274c0 [ 163.418207] RBP: 0000000000000246 R08: 0000000000000000 R09: ffffbae941613948 [ 163.418210] R10: 0000000000000003 R11: ffffffffa4958d48 R12: ffff951be0df3a58 [ 163.418213] R13: ffffbae941613ad8 R14: ffff951be0df3800 R15: 0000000000000000 [ 163.418216] FS: 00007f81c0247740(0000) GS:ffff951cf8100000(0000) knlGS:0000000000000000 [ 163.418220] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 163.418223] CR2: 00005653d818cd64 CR3: 000000012c72a000 CR4: 00000000000006e0 [ 163.418226] Call Trace: [ 163.418231] <TASK> [ 163.418234] ? refcount_warn_saturate+0x9b/0x110 [ 163.418238] ? __warn+0x81/0x130 [ 163.418248] ? refcount_warn_saturate+0x9b/0x110 [ 163.418252] ? report_bug+0x171/0x1a0 [ 163.418259] ? handle_bug+0x3c/0x80 [ 163.418264] ? exc_invalid_op+0x17/0x70 [ 163.418268] ? asm_exc_invalid_op+0x1a/0x20 [ 163.418275] ? refcount_warn_saturate+0x9b/0x110 [ 163.418279] drm_connector_list_iter_next+0x97/0xc0 [ 163.418289] drm_connector_register_all+0x3d/0xf0 [ 163.418296] drm_modeset_register_all+0x5f/0x80 [ 163.418302] drm_dev_register+0x114/0x240 [ 163.418307] nouveau_drm_probe+0x16a/0x280 [nouveau 7f21e95875a4a0137564007ae3277f6b641e9279] [ 163.418713] local_pci_probe+0x45/0xa0 [ 163.418719] pci_device_probe+0xc7/0x230 [ 163.418726] really_probe+0x19e/0x3e0 [ 163.418730] ? 
__pfx___driver_attach+0x10/0x10 [ 163.418734] __driver_probe_device+0x78/0x160 [ 163.418737] driver_probe_device+0x1f/0x90 [ 163.418741] __driver_attach+0xd2/0x1c0 [ 163.418745] bus_for_each_dev+0x77/0xc0 [ 163.418751] bus_add_driver+0x116/0x220 [ 163.418757] driver_register+0x59/0x100 [ 163.418762] ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau 7f21e95875a4a0137564007ae3277f6b641e9279] [ 163.418999] do_one_initcall+0x4a/0x220 [ 163.418999] ? kmalloc_trace+0x2a/0xa0 [ 163.418999] do_init_module+0x60/0x240 [ 163.418999] __do_sys_init_module+0x17f/0x1b0 [ 163.418999] do_syscall_64+0x60/0x90 [ 163.418999] ? syscall_exit_to_user_mode+0x1b/0x40 [ 163.418999] ? do_syscall_64+0x6c/0x90 [ 163.418999] ? count_memcg_events.constprop.0+0x1a/0x30 [ 163.418999] ? handle_mm_fault+0x9e/0x350 [ 163.418999] ? do_user_addr_fault+0x179/0x640 [ 163.418999] ? exc_page_fault+0x71/0x160 [ 163.418999] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 163.418999] RIP: 0033:0x7f81bfb19a5e [ 163.418999] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 7a 03 0d 00 f7 d8 64 89 01 48 [ 163.418999] RSP: 002b:00007ffddb1760c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af [ 163.418999] RAX: ffffffffffffffda RBX: 0000560bc842df00 RCX: 00007f81bfb19a5e [ 163.418999] RDX: 0000560bc8432900 RSI: 00000000006aef9b RDI: 00007f81bea94010 [ 163.418999] RBP: 0000560bc8432900 R08: 0000560bc8432c20 R09: 0000000000000000 [ 163.418999] R10: 0000000000012b71 R11: 0000000000000246 R12: 0000000000040000 [ 163.418999] R13: 0000000000000000 R14: 0000000000000009 R15: 0000560bc842d7b0 [ 163.418999] </TASK> [ 163.418999] ---[ end trace 0000000000000000 ]--- The full dmesg is found in https://bugzilla.suse.com/attachment.cgi?id=868688 thanks, Takashi