Stefano Garzarella
2019-Sep-30 13:51 UTC
[PATCH net v2] vsock: Fix a lockdep warning in __vsock_release()
On Fri, Sep 27, 2019 at 05:37:20AM +0000, Dexuan Cui wrote:
> > From: linux-hyperv-owner at vger.kernel.org
> > <linux-hyperv-owner at vger.kernel.org> On Behalf Of Stefano Garzarella
> > Sent: Thursday, September 26, 2019 12:48 AM
> >
> > Hi Dexuan,
> >
> > On Thu, Sep 26, 2019 at 01:11:27AM +0000, Dexuan Cui wrote:
> > > ...
> > > NOTE: I only tested the code on Hyper-V. I can not test the code for
> > > virtio socket, as I don't have a KVM host. :-( Sorry.
> > >
> > > @Stefan, @Stefano: please review & test the patch for virtio socket,
> > > and let me know if the patch breaks anything. Thanks!
> >
> > Comment below, I'll test it ASAP!
>
> Stefano, Thank you!
>
> BTW, this is how I tested the patch:
> 1. write a socket server program in the guest. The program calls listen()
> and then calls sleep(10000 seconds). Note: accept() is not called.
>
> 2. create some connections to the server program in the guest.
>
> 3. kill the server program by Ctrl+C, and "dmesg" will show the scary
> call-trace, if the kernel is built with
> CONFIG_LOCKDEP=y
> CONFIG_LOCKDEP_SUPPORT=y
>
> 4. Apply the patch, do the same test and we should no longer see the call-trace.
>

Hi Dexuan,
I tested on virtio socket and it works as expected!

With your patch applied I don't have issues and call-trace.
Without the patch I have a very similar call-trace (as expected):

    ===========================================
    WARNING: possible recursive locking detected
    5.3.0-vsock #17 Not tainted
    --------------------------------------------
    python3/872 is trying to acquire lock:
    ffff88802b650110 (sk_lock-AF_VSOCK){+.+.}, at: virtio_transport_release+0x34/0x330 [vmw_vsock_virtio_transport_common]

    but task is already holding lock:
    ffff88803597ce10 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x3f/0x130 [vsock]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(sk_lock-AF_VSOCK);
      lock(sk_lock-AF_VSOCK);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    2 locks held by python3/872:
     #0: ffff88802c957180 (&sb->s_type->i_mutex_key#8){+.+.}, at: __sock_release+0x2d/0xb0
     #1: ffff88803597ce10 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x3f/0x130 [vsock]

    stack backtrace:
    CPU: 0 PID: 872 Comm: python3 Not tainted 5.3.0-vsock #17
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
    Call Trace:
     dump_stack+0x85/0xc0
     __lock_acquire.cold+0xad/0x22b
     lock_acquire+0xc4/0x1a0
     ? virtio_transport_release+0x34/0x330 [vmw_vsock_virtio_transport_common]
     lock_sock_nested+0x5d/0x80
     ? virtio_transport_release+0x34/0x330 [vmw_vsock_virtio_transport_common]
     virtio_transport_release+0x34/0x330 [vmw_vsock_virtio_transport_common]
     ? mark_held_locks+0x49/0x70
     ? _raw_spin_unlock_irqrestore+0x44/0x60
     __vsock_release+0x2d/0x130 [vsock]
     __vsock_release+0xb9/0x130 [vsock]
     vsock_release+0x12/0x30 [vsock]
     __sock_release+0x3d/0xb0
     sock_close+0x14/0x20
     __fput+0xc1/0x250
     task_work_run+0x93/0xb0
     exit_to_usermode_loop+0xd3/0xe0
     syscall_return_slowpath+0x205/0x310
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

Feel free to add:

Tested-by: Stefano Garzarella <sgarzare at redhat.com>
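
For anyone who wants to repeat the test, the quoted steps 1 and 2 could be sketched in Python roughly as below. This is only a sketch under assumptions, not the exact program either of us ran: the port number 1234, the backlog, the connection count and all function names are made up for illustration, and socket.AF_VSOCK needs Python 3.7+ on Linux.

```python
import socket
import time

# Hypothetical port number; any free vsock port would do.
PORT = 1234


def listen_without_accept(family, address, backlog=8):
    """Step 1: bind and listen, but never call accept()."""
    srv = socket.socket(family, socket.SOCK_STREAM)
    srv.bind(address)
    srv.listen(backlog)
    return srv


def connect_clients(family, address, count=3, timeout=5.0):
    """Step 2: open `count` connections that stay queued on the listener."""
    clients = []
    for _ in range(count):
        c = socket.socket(family, socket.SOCK_STREAM)
        c.settimeout(timeout)
        c.connect(address)
        clients.append(c)
    return clients


def hold_listener(seconds=10000):
    """Guest side: listen on AF_VSOCK and sleep until Ctrl+C (step 3)."""
    srv = listen_without_accept(
        socket.AF_VSOCK, (socket.VMADDR_CID_ANY, PORT)
    )
    try:
        time.sleep(seconds)
    except KeyboardInterrupt:
        pass
    finally:
        # Closing the listener while connections are still queued and
        # were never accepted is the case described in the quoted steps.
        srv.close()
```

In the guest, call hold_listener(); on the peer, call connect_clients(socket.AF_VSOCK, (guest_cid, PORT)) with the guest's CID and keep the returned sockets open. Then press Ctrl+C on the server and check dmesg on a kernel built with the lockdep options listed above.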