Christian König
2021-Mar-15 08:05 UTC
[Nouveau] [bisected] Re: nouveau: lockdep cli->mutex vs reservation_ww_class_mutex deadlock report
Hi Mike,

I'm pretty sure your bisection is a bit off. The patch you mentioned is
completely unrelated to Nouveau, and I think the code path in question is
not even used by this driver.

Regards,
Christian.

On 14.03.21 at 05:48, Mike Galbraith wrote:
> This little bugger bisected to...
>
> b73cd1e2ebfc "drm/ttm: stop destroying pinned ghost object"
>
> ...and (the second time around) was confirmed on the spot. However,
> while the fingered commit still reverts cleanly, doing so at HEAD does
> not make lockdep return to happy camper state (leading to bisection
> #2), ie the fingered commit is only the beginning of nouveau's 5.12
> cycle lockdep woes.
>
> homer:..kernel/linux-master # quilt applied|grep revert
> patches/revert-drm-ttm-Remove-pinned-bos-from-LRU-in-ttm_bo_move_to_lru_tail-v2.patch
> patches/revert-drm-ttm-cleanup-LRU-handling-further.patch
> patches/revert-drm-ttm-use-pin_count-more-extensively.patch
> patches/revert-drm-ttm-stop-destroying-pinned-ghost-object.patch
>
> That still ain't enough to appease lockdep at HEAD. I'm not going to
> muck about with it beyond that, since this looks a whole lot like yet
> another example of "fixing stuff exposes other busted stuff".
>
> On Wed, 2021-03-10 at 10:58 +0100, Mike Galbraith wrote:
>> [   29.966927] =====================================================
>> [   29.966929] WARNING: possible circular locking dependency detected
>> [   29.966932] 5.12.0.g05a59d7-master #2 Tainted: G W E
>> [   29.966934] ------------------------------------------------------
>> [   29.966937] X/2145 is trying to acquire lock:
>> [   29.966939] ffff888120714518 (&cli->mutex){+.+.}-{3:3}, at: nouveau_bo_move+0x11f/0x980 [nouveau]
>> [   29.967002]
>> but task is already holding lock:
>> [   29.967004] ffff888123c201a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: nouveau_bo_pin+0x2b/0x310 [nouveau]
>> [   29.967053]
>> which lock already depends on the new lock.
>>
>> [   29.967056]
>> the existing dependency chain (in reverse order) is:
>> [   29.967058]
>> -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
>> [   29.967063] __ww_mutex_lock.constprop.16+0xbe/0x10d0
>> [   29.967069] nouveau_bo_pin+0x2b/0x310 [nouveau]
>> [   29.967112] nouveau_channel_prep+0x106/0x2e0 [nouveau]
>> [   29.967151] nouveau_channel_new+0x4f/0x760 [nouveau]
>> [   29.967188] nouveau_abi16_ioctl_channel_alloc+0xdf/0x350 [nouveau]
>> [   29.967223] drm_ioctl_kernel+0x91/0xe0 [drm]
>> [   29.967245] drm_ioctl+0x2db/0x380 [drm]
>> [   29.967259] nouveau_drm_ioctl+0x56/0xb0 [nouveau]
>> [   29.967303] __x64_sys_ioctl+0x76/0xb0
>> [   29.967307] do_syscall_64+0x33/0x40
>> [   29.967310] entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [   29.967314]
>> -> #0 (&cli->mutex){+.+.}-{3:3}:
>> [   29.967318] __lock_acquire+0x1494/0x1ac0
>> [   29.967322] lock_acquire+0x23e/0x3b0
>> [   29.967325] __mutex_lock+0x95/0x9d0
>> [   29.967330] nouveau_bo_move+0x11f/0x980 [nouveau]
>> [   29.967377] ttm_bo_handle_move_mem+0x79/0x130 [ttm]
>> [   29.967384] ttm_bo_validate+0x156/0x1b0 [ttm]
>> [   29.967390] nouveau_bo_validate+0x48/0x70 [nouveau]
>> [   29.967438] nouveau_bo_pin+0x1de/0x310 [nouveau]
>> [   29.967487] nv50_wndw_prepare_fb+0x53/0x4d0 [nouveau]
>> [   29.967531] drm_atomic_helper_prepare_planes+0x8a/0x110 [drm_kms_helper]
>> [   29.967547] nv50_disp_atomic_commit+0xa9/0x1b0 [nouveau]
>> [   29.967593] drm_atomic_helper_update_plane+0x10a/0x150 [drm_kms_helper]
>> [   29.967606] drm_mode_cursor_universal+0x10b/0x220 [drm]
>> [   29.967627] drm_mode_cursor_common+0x190/0x200 [drm]
>> [   29.967648] drm_mode_cursor_ioctl+0x3d/0x50 [drm]
>> [   29.967669] drm_ioctl_kernel+0x91/0xe0 [drm]
>> [   29.967684] drm_ioctl+0x2db/0x380 [drm]
>> [   29.967699] nouveau_drm_ioctl+0x56/0xb0 [nouveau]
>> [   29.967748] __x64_sys_ioctl+0x76/0xb0
>> [   29.967752] do_syscall_64+0x33/0x40
>> [   29.967756] entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [   29.967760]
>> other info that might help us debug this:
>>
>> [   29.967764] Possible unsafe locking scenario:
>>
>> [   29.967767]        CPU0                    CPU1
>> [   29.967770]        ----                    ----
>> [   29.967772]   lock(reservation_ww_class_mutex);
>> [   29.967776]                                lock(&cli->mutex);
>> [   29.967779]                                lock(reservation_ww_class_mutex);
>> [   29.967783]   lock(&cli->mutex);
>> [   29.967786]
>> *** DEADLOCK ***
>>
>> [   29.967790] 3 locks held by X/2145:
>> [   29.967792] #0: ffff88810365bcf8 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_mode_cursor_common+0x87/0x200 [drm]
>> [   29.967817] #1: ffff888108d9e098 (crtc_ww_class_mutex){+.+.}-{3:3}, at: drm_modeset_lock+0xc3/0xe0 [drm]
>> [   29.967841] #2: ffff888123c201a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: nouveau_bo_pin+0x2b/0x310 [nouveau]
>> [   29.967896]
>> stack backtrace:
>> [   29.967899] CPU: 6 PID: 2145 Comm: X Kdump: loaded Tainted: G W E 5.12.0.g05a59d7-master #2
>> [   29.967904] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
>> [   29.967908] Call Trace:
>> [   29.967911] dump_stack+0x6d/0x89
>> [   29.967915] check_noncircular+0xe7/0x100
>> [   29.967919] ? nvkm_vram_map+0x48/0x50 [nouveau]
>> [   29.967959] ? __lock_acquire+0x1494/0x1ac0
>> [   29.967963] __lock_acquire+0x1494/0x1ac0
>> [   29.967967] lock_acquire+0x23e/0x3b0
>> [   29.967971] ? nouveau_bo_move+0x11f/0x980 [nouveau]
>> [   29.968020] __mutex_lock+0x95/0x9d0
>> [   29.968024] ? nouveau_bo_move+0x11f/0x980 [nouveau]
>> [   29.968070] ? nvif_vmm_map+0xf4/0x110 [nouveau]
>> [   29.968093] ? nouveau_bo_move+0x11f/0x980 [nouveau]
>> [   29.968137] ? lock_release+0x160/0x280
>> [   29.968141] ? nouveau_bo_move+0x11f/0x980 [nouveau]
>> [   29.968184] nouveau_bo_move+0x11f/0x980 [nouveau]
>> [   29.968226] ? up_write+0x17/0x130
>> [   29.968229] ? unmap_mapping_pages+0x53/0x110
>> [   29.968234] ttm_bo_handle_move_mem+0x79/0x130 [ttm]
>> [   29.968240] ttm_bo_validate+0x156/0x1b0 [ttm]
>> [   29.968247] nouveau_bo_validate+0x48/0x70 [nouveau]
>> [   29.968289] nouveau_bo_pin+0x1de/0x310 [nouveau]
>> [   29.968330] nv50_wndw_prepare_fb+0x53/0x4d0 [nouveau]
>> [   29.968372] drm_atomic_helper_prepare_planes+0x8a/0x110 [drm_kms_helper]
>> [   29.968384] ? lockdep_init_map_type+0x58/0x240
>> [   29.968388] nv50_disp_atomic_commit+0xa9/0x1b0 [nouveau]
>> [   29.968430] drm_atomic_helper_update_plane+0x10a/0x150 [drm_kms_helper]
>> [   29.968442] drm_mode_cursor_universal+0x10b/0x220 [drm]
>> [   29.968463] ? lock_is_held_type+0xdd/0x130
>> [   29.968468] drm_mode_cursor_common+0x190/0x200 [drm]
>> [   29.968486] ? drm_mode_setplane+0x190/0x190 [drm]
>> [   29.968502] drm_mode_cursor_ioctl+0x3d/0x50 [drm]
>> [   29.968518] drm_ioctl_kernel+0x91/0xe0 [drm]
>> [   29.968533] drm_ioctl+0x2db/0x380 [drm]
>> [   29.968548] ? drm_mode_setplane+0x190/0x190 [drm]
>> [   29.968570] ? _raw_spin_unlock_irqrestore+0x30/0x60
>> [   29.968574] ? lockdep_hardirqs_on+0x79/0x100
>> [   29.968578] ? _raw_spin_unlock_irqrestore+0x3b/0x60
>> [   29.968582] nouveau_drm_ioctl+0x56/0xb0 [nouveau]
>> [   29.968632] __x64_sys_ioctl+0x76/0xb0
>> [   29.968636] ? lockdep_hardirqs_on+0x79/0x100
>> [   29.968640] do_syscall_64+0x33/0x40
>> [   29.968644] entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [   29.968648] RIP: 0033:0x7f1ccfb4e9e7
>> [   29.968652] Code: b3 66 90 48 8b 05 b1 14 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 14 2c 00 f7 d8 64 89 01 48
>> [   29.968659] RSP: 002b:00007ffca9596058 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> [   29.968663] RAX: ffffffffffffffda RBX: 000055da9d0c6470 RCX: 00007f1ccfb4e9e7
>> [   29.968667] RDX: 00007ffca9596090 RSI: 00000000c01c64a3 RDI: 000000000000000e
>> [   29.968670] RBP: 00007ffca9596090 R08: 0000000000000040 R09: 000055da9d0f6310
>> [   29.968674] R10: 0000000000000093 R11: 0000000000000246 R12: 00000000c01c64a3
>> [   29.968677] R13: 000000000000000e R14: 0000000000000000 R15: 0000000000000000
>>
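The "Possible unsafe locking scenario" table in the report above is a textbook
ABBA inversion: one path takes the reservation lock and then cli->mutex, the
other takes the same two locks in the opposite order. Purely as an
illustration, not nouveau code, the same pattern boils down to a few lines of
userspace C; the names lock_resv and lock_cli below are hypothetical stand-ins
for the two kernel locks:

/* Compile with: cc -pthread abba.c */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock_resv = PTHREAD_MUTEX_INITIALIZER; /* reservation_ww_class_mutex stand-in */
static pthread_mutex_t lock_cli  = PTHREAD_MUTEX_INITIALIZER; /* cli->mutex stand-in */

/* CPU0 column: reservation lock first, then the client mutex, as in
 * the nouveau_bo_pin() -> nouveau_bo_move() path of chain #0. */
static void *cpu0(void *unused)
{
	pthread_mutex_lock(&lock_resv);
	pthread_mutex_lock(&lock_cli);	/* blocks if cpu1 holds lock_cli */
	pthread_mutex_unlock(&lock_cli);
	pthread_mutex_unlock(&lock_resv);
	return NULL;
}

/* CPU1 column: client mutex first, then the reservation lock, as in
 * the channel-allocation path of chain #1. */
static void *cpu1(void *unused)
{
	pthread_mutex_lock(&lock_cli);
	pthread_mutex_lock(&lock_resv);	/* blocks if cpu0 holds lock_resv */
	pthread_mutex_unlock(&lock_resv);
	pthread_mutex_unlock(&lock_cli);
	return NULL;
}

int main(void)
{
	pthread_t t0, t1;

	/* With unlucky timing each thread grabs its first lock and then
	 * waits forever for the other: the *** DEADLOCK *** above. */
	pthread_create(&t0, NULL, cpu0, NULL);
	pthread_create(&t1, NULL, cpu1, NULL);
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	puts("no deadlock this run (timing dependent)");
	return 0;
}

Most runs complete, but nothing prevents the stall; that window is exactly
what lockdep flags from the dependency chains alone, without the deadlock
ever having to happen.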
Mike Galbraith
2021-Mar-15 08:53 UTC
[Nouveau] [bisected] Re: nouveau: lockdep cli->mutex vs reservation_ww_class_mutex deadlock report
On Mon, 2021-03-15 at 09:05 +0100, Christian König wrote:
> Hi Mike,
>
> I'm pretty sure your bisection is a bit off.

(huh?) Ah crap, yup, the spew from hell you plugged obliterated the
lockdep gripe I was grepping for as go/nogo, and off into lala land we
go.. twice.. whee :)

Oh well, the ordering gripe is clear enough without a whodoneit.

	-Mike
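For context on that ordering gripe: when no single acquisition order can be
imposed across all paths, the kernel's general answer is the wound/wait
machinery behind reservation_ww_class_mutex. A rough userspace approximation
of the idea, again a hedged sketch with the hypothetical lock names from the
earlier example rather than the actual nouveau fix, is trylock with full
backoff:

#include <pthread.h>
#include <sched.h>

static pthread_mutex_t lock_resv = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_cli  = PTHREAD_MUTEX_INITIALIZER;

/* Take both locks without relying on a global order: block on the
 * first, only trylock the second, and on failure drop everything so
 * whoever holds the contended lock can finish, then retry. */
static void lock_both(void)
{
	for (;;) {
		pthread_mutex_lock(&lock_cli);
		if (pthread_mutex_trylock(&lock_resv) == 0)
			return;	/* holding both, no cycle possible */
		pthread_mutex_unlock(&lock_cli);	/* back off completely */
		sched_yield();
	}
}

The ww_mutex classes implement this with proper wait/wound arbitration
instead of a raw retry loop, which is why the reservation lock appears in
the report as reservation_ww_class_mutex in the first place.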