Harry Schmalzbauer
2017-Mar-07 19:45 UTC
unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905]
Regarding Harry Schmalzbauer's message of 07.03.2017 19:44 (localtime):
> Regarding Harry Schmalzbauer's message of 07.03.2017 13:42 (localtime):
> ...
>> Something ufs related seems to have tightened the unionfs locking
>> problem in stable/11. Now the machine instantaneously panics during
>> boot after mounting root with Rick's latest patch.
>>
>> Unfortunately I don't have SWAP available on that machine (yet), but
>> maybe this is a hint for anybody.
>>
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00982220e0
>> vpanic() at vpanic+0x186/frame 0xfffffe0098222160
>> kassert_panic() at kassert_panic+0x126/frame 0xfffffe00982221d0
>> witness_assert() at witness_assert+0x35a/frame 0xfffffe0098222230
>> __lockmgr_args() at __lockmgr_args+0x517/frame 0xfffffe00982222d0
>> vop_stdunlock() at vop_stdunlock+0x3b/frame 0xfffffe00982222f0
>> VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfffffe0098222320
>> unionfs_unlock() at unionfs_unlock+0x112/frame 0xfffffe0098222390
>> VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfffffe00982223c0
>> unionfs_nodeget() at unionfs_nodeget+0x3ef/frame 0xfffffe0098222470
>> unionfs_domount() at unionfs_domount+0x518/frame 0xfffffe00982226b0
>> vfs_donmount() at vfs_donmount+0xe37/frame 0xfffffe00982228f0
>> sys_nmount() at sys_nmount+0x72/frame 0xfffffe0098222930
>> amd64_syscall() at amd64_syscall+0x2f9/frame 0xfffffe0098222ab0
>> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0098222ab0
>> --- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x80086ecea, rsp = 0x7fffffffe318, rbp = 0x7fffffffeca0 ---
>
> New discovery:
> Rick's latest patch causes a panic only with KDB. If I compile a kernel
> without WITNESS and KDB, the machine boots fine!
> Also, it's at least not so easy anymore to trigger the deadlock :-) . I
> need to do more testing, but until now Rick's approach seems very
> promising :-) .

My unionfs deadlock problem isn't really solved with Rick's latest
patch; I can still reproduce it:
krb5.conf and krb5.keytab are files on unionfs referenced by /etc.
libexec/negotiate_kerberos_auth reads these, and if I have enough helper
processes handling requests, the deadlock occurs.

_But_: If I move the files outside the unionfs and create a symlink, I
cannot reproduce the deadlock anymore, whereas it was similarly easy to
reproduce without this or any of the other workarounds.

So it looks like I have an acceptable solution for now, although it's
only usable under certain conditions.

Unfortunately I can't do tests with a debug kernel, since the patch
prevents the system with the debug kernel from starting up. But if this
were ironed out, I'd happily provide more info.

Thanks,

-Harry
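
For readers who want to try the symlink workaround described above, here is a minimal sketch of the "move the file out and leave a symlink behind" step. It is only an illustration: the helper name relocate_with_symlink and the directory layout in demo() are assumptions (temporary directories stand in for the unionfs mount and for a plain filesystem), and it makes no claim about whether or why this avoids the deadlock on any particular system.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(path, target_dir):
    """Move `path` into `target_dir` and leave a symlink at the old location.

    Hypothetical helper for illustration; not part of any FreeBSD tool.
    """
    os.makedirs(target_dir, exist_ok=True)
    new_path = os.path.join(target_dir, os.path.basename(path))
    # shutil.move falls back to copy-and-delete when crossing filesystems.
    shutil.move(path, new_path)
    # The old name now points at the relocated file.
    os.symlink(new_path, path)
    return new_path


def demo():
    # Temporary directories stand in for the unionfs-backed /etc and for
    # a location outside the unionfs; both layouts are assumptions.
    with tempfile.TemporaryDirectory() as root:
        union_etc = os.path.join(root, "union", "etc")
        plain_dir = os.path.join(root, "plain", "krb5")
        os.makedirs(union_etc)
        conf = os.path.join(union_etc, "krb5.conf")
        with open(conf, "w") as f:
            f.write("[libdefaults]\n")

        new_path = relocate_with_symlink(conf, plain_dir)

        with open(conf) as f:
            content = f.read()
        same_file = os.path.realpath(conf) == os.path.realpath(new_path)
        return os.path.islink(conf), same_file, content


if __name__ == "__main__":
    print(demo())
```

On a real system the same step would be applied to krb5.conf and krb5.keytab, with the target directory chosen on a filesystem that is not part of the unionfs mount.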
Rick Macklem
2017-Mar-07 22:49 UTC
unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905]
Hmm, this is going to sound dumb, but I don't recall generating any
unionfs patch ;-)
I'll go look for it. Maybe it was Kostik's?

rick