Harald Schmalzbauer
2010-Feb-05 11:32 UTC
8.0-RELEASE hangs with lighttpd, unionfs related? Some traces included
Hello, when I start lighttpd at boot time, the system half-locks in a way, that any process, which accesses /usr/local/etc stalls. It's also impossible to shut down. /usr/local/etc is unionfs mounted. I compiled a kernel with debug options. When mounting unionfs at boot time, here's the firt LOR with trace: lock order reversal: 1st 0xffffff00018b47f8 unionfs (unionfs) @ /usr/src/sys/fs/unionfs/union_subr.c:356 2nd 0xffffff00018d9d80 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2188 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x49 witness_checkorder() at witness_checkorder+0x7ea __lockmgr_args() at __lockmgr_args+0xcbd ffs_lock() at ffs_lock+0x8c VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x50 vrele() at vrele+0x120 unionfs_noderem() at unionfs_noderem+0x1c4 unionfs_reclaim() at unionfs_reclaim+0x11 vgonel() at vgonel+0xf1 vrecycle() at vrecycle+0x58 unionfs_inactive() at uniougen2.2: <vendor 0x1267> at usbus2 nfs_inactive+ukbd0: 0<vendor 0x1267 product 0x0103, class 0/0, rev 1.10/1.01, addr 2> on usbus2 x20 vinactive() at vinactive+0x6b vput() at vput+0x216 kern_statatkbd1 at ukbd0 _vnhook() at kern_statat_vnhook+0xe9 kern_statat() at kern_statat+0x15 stat() at stat+0x22 syscall() at syscall+0x1af Xfast_syscall() at Xfast_syscall+0xe1 --- suhid0: y<vendor 0x1267 product 0x0103, class 0/0, rev 1.10/1.01, addr 2> on usbus2 scall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = 0x7fffffffe5b8, rbp = 0x800b312c0 --- KDB: enter: witness_checkorder [thread pid 27 tid 100068 ] Stopped at kdb_enter+0x3d: movq $0,0x4c04dc(%rip) Tracing pid 27 tid 100068 td 0xffffff00016fe720 kdb_enter() at kdb_enter+0x3d witness_checkorder() at witness_checkorder+0x7ea __lockmgr_args() at __lockmgr_args+0xcbd ffs_lock() at ffs_lock+0x8c VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x50 vrele() at vrele+0x120 unionfs_noderem() at unionfs_noderem+0x1c4 unionfs_reclaim() at unionfs_reclaim+0x11 vgonel() at vgonel+0xf1 vrecycle() at vrecycle+0x58 unionfs_inactive() at unionfs_inactive+0x20 vinactive() at vinactive+0x6b vput() at vput+0x216 kern_statat_vnhook() at kern_statat_vnhook+0xe9 kern_statat() at kern_statat+0x15 stat() at stat+0x22 syscall() at syscall+0x1af Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = 0x7fffffffe5b8, rbp = 0x800b312c0 - Like mentioned, there is that strange problem with lighttpd started at boot time. Other /urs/local/etc/rc.d startups don't lead to a /usr/local/etc deadlock. Unfortunately I don't get any panic or anything else when the hang happens. How can I aquire more information? It's no problem to log in and to do everything else outside /usr/local/etc... ===================================================================== Here's a LOR at shutdown with trace: lock order reversal: 1st 0xffffff0001bc2098 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1200 2nd 0xffffff0001bc1ba8 devfs (devfs) @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1194 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x49 witness_checkorder() at witness_checkorder+0x7ea __lockmgr_args() at __lockmgr_args+0xcbd vop_stdlock() at vop_stdlock+0x39 VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x50 ffs_flushfiles() at ffs_flushfiles+0x93 ffs_unmount() at ffs_unmount+0x48 dounmount() at dounmount+0x2ac vfs_unmountall() at vfs_unmountall+0x54 boot() at boot+0x814 mkdumpheader() at mkdumpheader syscall() at syscall+0x1af Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = 0x7fffffffe738, rbp = 0x402290 --- KDB: enter: witness_checkorder [thread pid 1 tid 100002 ] Stopped at kdb_enter+0x3d: movq $0,0x4c04dc(%rip) Tracing pid 1 tid 100002 td 0xffffff0001310ab0 kdb_enter() at kdb_enter+0x3d witness_checkorder() at witness_checkorder+0x7ea __lockmgr_args() at __lockmgr_args+0xcbd vop_stdlock() at vop_stdlock+0x39 VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x50 ffs_flushfiles() at ffs_flushfiles+0x93 ffs_unmount() at ffs_unmount+0x48 dounmount() at dounmount+0x2ac vfs_unmountall() at vfs_unmountall+0x54 boot() at boot+0x814 mkdumpheader() at mkdumpheader syscall() at syscall+0x1af Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = 0x7fffffffe738, rbp = 0x402290 --- Any Help highly appreciated! Thanks, -Harry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 196 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20100205/cf3ee525/signature-0001.pgp
Harald Schmalzbauer
2010-Feb-05 11:40 UTC
Reboot Loop: ffs_snapshot/bufwait LORs [Was: Re: 8.0-RELEASE hangs with lighttpd, unionfs related? Some traces included]
Harald Schmalzbauer schrieb am 05.02.2010 12:31 (localtime):> Hello, > > when I start lighttpd at boot time, the system half-locks in a way, that > any process, which accesses /usr/local/etc stalls. It's also impossible > to shut down. > /usr/local/etc is unionfs mounted. > I compiled a kernel with debug options. > > When mounting unionfs at boot time, here's the firt LOR with trace: > > lock order reversal: > 1st 0xffffff00018b47f8 unionfs (unionfs) @ > /usr/src/sys/fs/unionfs/union_subr.c:356 > 2nd 0xffffff00018d9d80 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2188 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a > _witness_debugger() at _witness_debugger+0x49 > witness_checkorder() at witness_checkorder+0x7ea > __lockmgr_args() at __lockmgr_args+0xcbd > ffs_lock() at ffs_lock+0x8c > VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b > _vn_lock() at _vn_lock+0x50 > vrele() at vrele+0x120 > unionfs_noderem() at unionfs_noderem+0x1c4 > unionfs_reclaim() at unionfs_reclaim+0x11 > vgonel() at vgonel+0xf1 > vrecycle() at vrecycle+0x58 > unionfs_inactive() at uniougen2.2: <vendor 0x1267> at usbus2 > nfs_inactive+ukbd0: 0<vendor 0x1267 product 0x0103, class 0/0, rev > 1.10/1.01, addr 2> on usbus2 > x20 > vinactive() at vinactive+0x6b > vput() at vput+0x216 > kern_statatkbd1 at ukbd0 > _vnhook() at kern_statat_vnhook+0xe9 > kern_statat() at kern_statat+0x15 > stat() at stat+0x22 > syscall() at syscall+0x1af > Xfast_syscall() at Xfast_syscall+0xe1 > --- suhid0: y<vendor 0x1267 product 0x0103, class 0/0, rev 1.10/1.01, > addr 2> on usbus2 > scall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = > 0x7fffffffe5b8, rbp = 0x800b312c0 --- > KDB: enter: witness_checkorder > [thread pid 27 tid 100068 ] > Stopped at kdb_enter+0x3d: movq $0,0x4c04dc(%rip) > Tracing pid 27 tid 100068 td 0xffffff00016fe720 > kdb_enter() at kdb_enter+0x3d > witness_checkorder() at witness_checkorder+0x7ea > __lockmgr_args() at __lockmgr_args+0xcbd > ffs_lock() at ffs_lock+0x8c > VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b > _vn_lock() at _vn_lock+0x50 > vrele() at vrele+0x120 > unionfs_noderem() at unionfs_noderem+0x1c4 > unionfs_reclaim() at unionfs_reclaim+0x11 > vgonel() at vgonel+0xf1 > vrecycle() at vrecycle+0x58 > unionfs_inactive() at unionfs_inactive+0x20 > vinactive() at vinactive+0x6b > vput() at vput+0x216 > kern_statat_vnhook() at kern_statat_vnhook+0xe9 > kern_statat() at kern_statat+0x15 > stat() at stat+0x22 > syscall() at syscall+0x1af > Xfast_syscall() at Xfast_syscall+0xe1 > --- syscall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = > 0x7fffffffe5b8, rbp = 0x800b312c0 - > > > Like mentioned, there is that strange problem with lighttpd started at > boot time. Other /urs/local/etc/rc.d startups don't lead to a > /usr/local/etc deadlock. > Unfortunately I don't get any panic or anything else when the hang happens. > How can I aquire more information? > It's no problem to log in and to do everything else outside > /usr/local/etc... > > > =====================================================================> > Here's a LOR at shutdown with trace: > > lock order reversal: > 1st 0xffffff0001bc2098 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1200 > 2nd 0xffffff0001bc1ba8 devfs (devfs) @ > /usr/src/sys/ufs/ffs/ffs_vfsops.c:1194 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a > _witness_debugger() at _witness_debugger+0x49 > witness_checkorder() at witness_checkorder+0x7ea > __lockmgr_args() at __lockmgr_args+0xcbd > vop_stdlock() at vop_stdlock+0x39 > VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b > _vn_lock() at _vn_lock+0x50 > ffs_flushfiles() at ffs_flushfiles+0x93 > ffs_unmount() at ffs_unmount+0x48 > dounmount() at dounmount+0x2ac > vfs_unmountall() at vfs_unmountall+0x54 > boot() at boot+0x814 > mkdumpheader() at mkdumpheader > syscall() at syscall+0x1af > Xfast_syscall() at Xfast_syscall+0xe1 > --- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = > 0x7fffffffe738, rbp = 0x402290 --- > KDB: enter: witness_checkorder > [thread pid 1 tid 100002 ] > Stopped at kdb_enter+0x3d: movq $0,0x4c04dc(%rip) > > Tracing pid 1 tid 100002 td 0xffffff0001310ab0 > kdb_enter() at kdb_enter+0x3d > witness_checkorder() at witness_checkorder+0x7ea > __lockmgr_args() at __lockmgr_args+0xcbd > vop_stdlock() at vop_stdlock+0x39 > VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b > _vn_lock() at _vn_lock+0x50 > ffs_flushfiles() at ffs_flushfiles+0x93 > ffs_unmount() at ffs_unmount+0x48 > dounmount() at dounmount+0x2ac > vfs_unmountall() at vfs_unmountall+0x54 > boot() at boot+0x814 > mkdumpheader() at mkdumpheader > syscall() at syscall+0x1af > Xfast_syscall() at Xfast_syscall+0xe1 > --- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = > 0x7fffffffe738, rbp = 0x402290 --- > > Any Help highly appreciated! > > Thanks, > > -HarryAdditional LORs while regular machine operation (background fsck) which leads to reboot! I have access over the serail console, but the machine is unresponsive after that. So I'm now in a endelss reboot loop with the debug kernel... lock order reversal: 1st 0xffffff0001b899d0 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:423 2nd 0xffffff802970fc28 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2559 3rd 0xffffff00018b4448 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:544 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x49 witness_checkorder() at witness_checkorder+0x7ea __lockmgr_args() at __lockmgr_args+0xcbd ffs_lock() at ffs_lock+0x8c VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x50 ffs_snapshot() at ffs_snapshot+0x1b70 ffs_mount() at ffs_mount+0x651 vfs_donmount() at vfs_donmount+0xcd4 nmount() at nmount+0x74 syscall() at syscall+0x1af Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 0x7fffffffe988, rbp = 0x800a028e- KDB: enter: witness_checkorder [thread pid 947 tid 100073 ] Stopped at kdb_enter+0x3d: movq $0,0x4c04dc(%rip) db> lock order reversal: 1st 0xffffff00018b4470 vnode interlock (vnode interlock) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:523 2nd 0xffffff8000422028 uhci2 (uhci2) @ /usr/src/sys/dev/usb/controller/uhci.c:1551 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x49 witness_checkorder() at witness_checkorder+0x7ea _mtx_lock_flags() at _mtx_lock_flags+0x68 uhci_do_poll() at uhci_do_poll+0x2e usbd_transfer_poll() at usbd_transfer_poll+0x18d ukbd_do_poll() at ukbd_do_poll+0x63 ukbd_get_key() at ukbd_get_key+0xa8 ukbd_read_char() at ukbd_read_char+0xaa scgetc() at scgetc+0x5b sc_cngetc() at sc_cngetc+0xf2 cncheckc() at cncheckc+0x65 cngetc() at cngetc+0x1c db_readline() at db_readline+0x79 db_read_line() at db_read_line+0x15 db_command_loop() at db_command_loop+0x38 db_trap() at db_trap+0x87 kdb_trap() at kdb_trap+0x82 trap() at trap+0x18f calltrap() at calltrap+0x8 --- trap 0x3, rip = 0xffffffff80381141, rsp = 0xffffff803e959000, rbp = 0xffffff803e959020 --- kdb_enter() at kdb_enter+0x3d witness_checkorder() at witness_checkorder+0x7ea __lockmgr_args() at __lockmgr_args+0xcbd ffs_lock() at ffs_lock+0x8c VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x50 ffs_snapshot() at ffs_snapshot+0x1b70 ffs_mount() at ffs_mount+0x651 vfs_donmount() at vfs_donmount+0xcd4 nmount() at nmount+0x74 syscall() at syscall+0x1af Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 0x7fffffffe988, rbp = 0x800a028e- lock order reversal: 1st 0xffffff00018b4470 vnode interlock (vnode interlock) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:523 2nd 0xffffff0001747890 USB device mutex (USB device mutex) @ /usr/src/sys/dev/usb/usb_device.c:1410 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x49 witness_checkorder() at witness_checkorder+0x7ea _mtx_lock_flags() at _mtx_lock_flags+0x68 usbd_clear_stall_proc() at usbd_clear_stall_proc+0x49 usbd_transfer_poll() at usbd_transfer_poll+0x1c0 ukbd_do_poll() at ukbd_do_poll+0x63 ukbd_get_key() at ukbd_get_key+0xa8 ukbd_read_char() at ukbd_read_char+0xaa scgetc() at scgetc+0x5b sc_cngetc() at sc_cngetc+0xf2 cncheckc() at cncheckc+0x65 cngetc() at cngetc+0x1c db_readline() at db_readline+0x79 db_read_line() at db_read_line+0x15 db_command_loop() at db_command_loop+0x38 db_trap() at db_trap+0x87 kdb_trap() at kdb_trap+0x82 trap() at trap+0x18f calltrap() at calltrap+0x8 --- trap 0x3, rip = 0xffffffff80381141, rsp = 0xffffff803e959000, rbp = 0xffffff803e959020 --- kdb_enter() at kdb_enter+0x3d witness_checkorder() at witness_checkorder+0x7ea __lockmgr_args() at __lockmgr_args+0xcbd ffs_lock() at ffs_lock+0x8c VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x50 ffs_snapshot() at ffs_snapshot+0x1b70 ffs_mount() at ffs_mount+0x651 vfs_donmount() at vfs_donmount+0xcd4 nmount() at nmount+0x74 syscall() at syscall+0x1af Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 0x7fffffffe988, rbp = 0x800a028e- lock order reversal: (Giant after non-sleepable) 1st 0xffffff00018b4470 vnode interlock (vnode interlock) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:523 2nd 0xffffffff80820780 Giant (Giant) @ /usr/src/sys/dev/usb/usb_transfer.c:1952 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x49 witness_checkorder() at witness_checkorder+0x7ea _mtx_lock_flags() at _mtx_lock_flags+0x68 usb_callback_proc() at usb_callback_proc+0x48 usbd_transfer_poll() at usbd_transfer_poll+0x1e9 ukbd_do_poll() at ukbd_do_poll+0x63 ukbd_get_key() at ukbd_get_key+0xa8 ukbd_read_char() at ukbd_read_char+0xaa scgetc() at scgetc+0x5b sc_cngetc() at sc_cngetc+0xf2 cncheckc() at cncheckc+0x65 cngetc() at cngetc+0x1c db_readline() at db_readline+0x79 db_read_line() at db_read_line+0x15 db_command_loop() at db_command_loop+0x38 db_trap() at db_trap+0x87 kdb_trap() at kdb_trap+0x82 trap() at trap+0x18f calltrap() at calltrap+0x8 --- trap 0x3, rip = 0xffffffff80381141, rsp = 0xffffff803e959000, rbp = 0xffffff803e959020 --- kdb_enter() at kdb_enter+0x3d witness_checkorder() at witness_checkorder+0x7ea __lockmgr_args() at __lockmgr_args+0xcbd ffs_lock() at ffs_lock+0x8c VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x50 ffs_snapshot() at ffs_snapshot+0x1b70 ffs_mount() at ffs_mount+0x651 vfs_donmount() at vfs_donmount+0xcd4 nmount() at nmount+0x74 syscall() at syscall+0x1af Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 0x7fffffffe988, rbp = 0x800a028e- lock order reversal: 1st 0xffffff802970fc28 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2559 2nd 0xffffff00440ef6b0 snaplk (snaplk) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:793 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x49 witness_checkorder() at witness_checkorder+0x7ea __lockmgr_args() at __lockmgr_args+0xcbd ffs_lock() at ffs_lock+0x8c VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x50 ffs_snapshot() at ffs_snapshot+0x17d7 ffs_mount() at ffs_mount+0x651 vfs_donmount() at vfs_donmount+0xcd4 nmount() at nmount+0x74 syscall() at syscall+0x1af Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 0x7fffffffe988, rbp = 0x800a028e- KDB: enter: witness_checkorder [thread pid 947 tid 100073 ] Stopped at kdb_enter+0x3d: movq $0,0x4c04dc(%rip) Thanks for any help, -Harry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 196 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20100205/400bc058/signature.pgp