I think I may have found a clue regarding some of the hangs I'm seeing
on FreeBSD 7.1.
I have a program (kvoop), compiled under FreeBSD 6 and using
compatibility libraries under FreeBSD 7, that seems to be consistently
involved during these hangs. This time, I noticed that many processes
are stopped, waiting on the ufs lock. I can't tell for certain, but I
assume this kvoop process 33203 is blocking the other processes. The
kvoop process looks to me like it is waiting for a mutex, but nothing
shows up being deadlocked. Am I on the right track? Is there something
else I should be looking for?
(kgdb) proc 33203
[Switching to thread 129 (Thread 100241)]#0 sched_switch (
td=0xffffff0019109a50, newtd=0x0, flags=1)
at ../../../kern/sched_ule.c:1944
1944 cpuid = PCPU_GET(cpuid);
(kgdb) where
#0 sched_switch (td=0xffffff0019109a50, newtd=0x0, flags=1)
at ../../../kern/sched_ule.c:1944
#1 0xffffffff803a573b in mi_switch (flags=1, newtd=0x0)
at ../../../kern/kern_synch.c:440
#2 0xffffffff803d1214 in turnstile_wait (ts=Variable "ts" is not
available.
)
at ../../../kern/subr_turnstile.c:748
#3 0xffffffff80392db0 in _mtx_lock_sleep (m=0xffffffff8083c220,
tid=18446742974618442320, opts=Variable "opts" is not available.
) at ../../../kern/kern_mutex.c:420
#4 0xffffffff80392ea6 in _mtx_lock_flags (m=0xffffffff8083c220, opts=0,
file=0xffffffff805bd438 "../../../kern/vfs_cache.c", line=327)
at ../../../kern/kern_mutex.c:186
#5 0xffffffff80403bc6 in cache_lookup (dvp=0xffffff00013e0dc8,
vpp=0xffffffffa2d93a18, cnp=0xffffffffa2d93a40)
at ../../../kern/vfs_cache.c:327
#6 0xffffffff80404093 in vfs_cache_lookup (ap=Variable "ap" is not
available.
)
at ../../../kern/vfs_cache.c:674
#7 0xffffffff805628a0 in VOP_LOOKUP_APV (vop=0xffffffff8076e440,
a=0xffffffffa2d936b0) at vnode_if.c:99
#8 0xffffffff80409afd in lookup (ndp=0xffffffffa2d939f0) at vnode_if.h:57
#9 0xffffffff8040a824 in namei (ndp=0xffffffffa2d939f0)
at ../../../kern/vfs_lookup.c:219
#10 0xffffffff8041dc4f in vn_open_cred (ndp=0xffffffffa2d939f0,
flagp=0xffffffffa2d9393c, cmode=0, cred=0xffffff0001588600,
fp=0xffffff0001964200) at ../../../kern/vfs_vnops.c:188
#11 0xffffffff8041ccc7 in kern_open (td=0xffffff0019109a50,
path=0xffffac30 <Address 0xffffac30 out of bounds>, pathseg=Variable
"pathseg" is not available.
)
at ../../../kern/vfs_syscalls.c:1032
#12 0xffffffff80544660 in ia32_syscall (frame=0xffffffffa2d93c80)
at ../../../amd64/ia32/ia32_syscall.c:182
#13 0xffffffff805014d0 in Xint0x80_syscall () at ia32_exception.S:65
#14 0x0000000028284037 in ?? ()
(kgdb) frame 4
#4 0xffffffff80392ea6 in _mtx_lock_flags (m=0xffffffff8083c220, opts=0,
file=0xffffffff805bd438 "../../../kern/vfs_cache.c", line=327)
at ../../../kern/kern_mutex.c:186
186 _get_sleep_lock(m, curthread, opts, file, line);
(kgdb) list
181 ("mtx_lock() of spin mutex %s @ %s:%d",
m->lock_object.lo_name,
182 file, line));
183 WITNESS_CHECKORDER(&m->lock_object, opts | LOP_NEWORDER
| LOP_EXCLUSIVE,
184 file, line);
185
186 _get_sleep_lock(m, curthread, opts, file, line);
187 LOCK_LOG_LOCK("LOCK", &m->lock_object, opts,
m->mtx_recurse, file,
188 line);
189 WITNESS_LOCK(&m->lock_object, opts | LOP_EXCLUSIVE,
file, line);
190 curthread->td_locks++;
(kgdb) print *m
$2 = {lock_object = {lo_name = 0xffffffff805bd5d2 "Name Cache",
lo_type = 0xffffffff805bd5d2 "Name Cache", lo_flags = 16973824,
lo_witness_data = {lod_list = {stqe_next = 0xffffffff807cd600},
lod_witness = 0xffffffff807cd600}}, mtx_lock = 4, mtx_recurse = 0}
> Date: Tue, 24 Feb 2009 15:46:28 -0600 > From: Guy Helmer <ghelmer@palisadesys.com> > Sender: owner-freebsd-stable@freebsd.org > > I think I may have found a clue regarding some of the hangs I'm seeing > on FreeBSD 7.1. > I have a program (kvoop), compiled under FreeBSD 6 and using > compatibility libraries under FreeBSD 7, that seems to be consistently > involved during these hangs. This time, I noticed that many processes > are stopped, waiting on the ufs lock. I can't tell for certain, but I > assume this kvoop process 33203 is blocking the other processes. The > kvoop process looks to me like it is waiting for a mutex, but nothing > shows up being deadlocked. Am I on the right track? Is there something > else I should be looking for?For whatever it is worth, I have seen this three times over the past couple of weeks. My system is RELENG_7 of Feb. 7. I have no real information to add other than that I have also had this problem. When it happens, the only way out is a reboot. It hit me twice while updating Xorg. Thanks for your efforts! I hope someone can track down what is triggering with this. -- R. Kevin Oberman, Network Engineer Energy Sciences Network (ESnet) Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab) E-mail: oberman@es.net Phone: +1 510 486-8634 Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
On Tuesday 24 February 2009 4:46:28 pm Guy Helmer wrote:> I think I may have found a clue regarding some of the hangs I'm seeing > on FreeBSD 7.1. > I have a program (kvoop), compiled under FreeBSD 6 and using > compatibility libraries under FreeBSD 7, that seems to be consistently > involved during these hangs. This time, I noticed that many processes > are stopped, waiting on the ufs lock. I can't tell for certain, but I > assume this kvoop process 33203 is blocking the other processes. The > kvoop process looks to me like it is waiting for a mutex, but nothing > shows up being deadlocked. Am I on the right track? Is there something > else I should be looking for? > > (kgdb) proc 33203 > [Switching to thread 129 (Thread 100241)]#0 sched_switch ( > td=0xffffff0019109a50, newtd=0x0, flags=1) > at ../../../kern/sched_ule.c:1944 > 1944 cpuid = PCPU_GET(cpuid); > (kgdb) where > #0 sched_switch (td=0xffffff0019109a50, newtd=0x0, flags=1) > at ../../../kern/sched_ule.c:1944 > #1 0xffffffff803a573b in mi_switch (flags=1, newtd=0x0) > at ../../../kern/kern_synch.c:440 > #2 0xffffffff803d1214 in turnstile_wait (ts=Variable "ts" is not available. > ) > at ../../../kern/subr_turnstile.c:748 > #3 0xffffffff80392db0 in _mtx_lock_sleep (m=0xffffffff8083c220, > tid=18446742974618442320, opts=Variable "opts" is not available. > ) at ../../../kern/kern_mutex.c:420 > #4 0xffffffff80392ea6 in _mtx_lock_flags (m=0xffffffff8083c220, opts=0, > file=0xffffffff805bd438 "../../../kern/vfs_cache.c", line=327) > at ../../../kern/kern_mutex.c:186 > #5 0xffffffff80403bc6 in cache_lookup (dvp=0xffffff00013e0dc8, > vpp=0xffffffffa2d93a18, cnp=0xffffffffa2d93a40) > at ../../../kern/vfs_cache.c:327 > #6 0xffffffff80404093 in vfs_cache_lookup (ap=Variable "ap" is not > available. > ) > at ../../../kern/vfs_cache.c:674 > #7 0xffffffff805628a0 in VOP_LOOKUP_APV (vop=0xffffffff8076e440, > a=0xffffffffa2d936b0) at vnode_if.c:99 > #8 0xffffffff80409afd in lookup (ndp=0xffffffffa2d939f0) at vnode_if.h:57 > #9 0xffffffff8040a824 in namei (ndp=0xffffffffa2d939f0) > at ../../../kern/vfs_lookup.c:219 > #10 0xffffffff8041dc4f in vn_open_cred (ndp=0xffffffffa2d939f0, > flagp=0xffffffffa2d9393c, cmode=0, cred=0xffffff0001588600, > fp=0xffffff0001964200) at ../../../kern/vfs_vnops.c:188 > #11 0xffffffff8041ccc7 in kern_open (td=0xffffff0019109a50, > path=0xffffac30 <Address 0xffffac30 out of bounds>, pathseg=Variable > "pathseg" is not available. > ) > at ../../../kern/vfs_syscalls.c:1032 > #12 0xffffffff80544660 in ia32_syscall (frame=0xffffffffa2d93c80) > at ../../../amd64/ia32/ia32_syscall.c:182 > #13 0xffffffff805014d0 in Xint0x80_syscall () at ia32_exception.S:65 > #14 0x0000000028284037 in ?? () > > (kgdb) frame 4 > #4 0xffffffff80392ea6 in _mtx_lock_flags (m=0xffffffff8083c220, opts=0, > file=0xffffffff805bd438 "../../../kern/vfs_cache.c", line=327) > at ../../../kern/kern_mutex.c:186 > 186 _get_sleep_lock(m, curthread, opts, file, line); > (kgdb) list > 181 ("mtx_lock() of spin mutex %s @ %s:%d", > m->lock_object.lo_name, > 182 file, line)); > 183 WITNESS_CHECKORDER(&m->lock_object, opts | LOP_NEWORDER > | LOP_EXCLUSIVE, > 184 file, line); > 185 > 186 _get_sleep_lock(m, curthread, opts, file, line); > 187 LOCK_LOG_LOCK("LOCK", &m->lock_object, opts, > m->mtx_recurse, file, > 188 line); > 189 WITNESS_LOCK(&m->lock_object, opts | LOP_EXCLUSIVE, > file, line); > 190 curthread->td_locks++; > > (kgdb) print *m > $2 = {lock_object = {lo_name = 0xffffffff805bd5d2 "Name Cache", > lo_type = 0xffffffff805bd5d2 "Name Cache", lo_flags = 16973824, > lo_witness_data = {lod_list = {stqe_next = 0xffffffff807cd600}, > lod_witness = 0xffffffff807cd600}}, mtx_lock = 4, mtx_recurse = 0}Is this from a coredump or while the system is live? mtx_lock = 4 indicates the mutex is unlocked, so there shouldn't be any threads waiting on it. -- John Baldwin