I have the following panic occurring several times a week. The machine is an
NFS server, and it usually panics early in the morning, when first people try
to access it. After reboot it may work OK for 1-2 days, and then panics again.
I have tried changing memory and replacing disk which was exported via NFS,
but nothing helped :(

Any suggestion on how to fix this panic will be very much appreciated !

/Yuri

[root@XXX][/var/crash]# uname -a
FreeBSD XXX.irfu.se 6.0-STABLE FreeBSD 6.0-STABLE #0: Tue Nov 29 13:31:15 CET 2005 root@XXX.irfu.se:/usr/obj/usr/src/sys/HEM i386

[root@XXX][/var/crash]# kgdb /usr/obj/usr/src/sys/HEM/kernel.debug vmcore.7
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x74
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc053a426
stack pointer           = 0x28:0xd56c0b88
frame pointer           = 0x28:0xd56c0b8c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 77 (vnlru)
trap number             = 12
panic: page fault
Uptime: 2d12h22m11s
Dumping 511 MB (2 chunks)
  chunk 0: 1MB (160 pages) ... ok
  chunk 1: 511MB (130800 pages) 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15

#0  doadump () at pcpu.h:165
165     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:165
#1  0xc051577a in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc0515a84 in panic (fmt=0xc06ce475 "%s")
    at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc06b4815 in trap_fatal (frame=0xd56c0b48, eva=0)
    at /usr/src/sys/i386/i386/trap.c:836
#4  0xc06b3f2d in trap (frame=
      {tf_fs = 1133445128, tf_es = 40, tf_ds = 40, tf_edi = -1017997312, tf_esi = -1020120704, tf_ebp = -714339444, tf_isp = -714339468, tf_ebx = -1012942272, tf_edx = -1020120704, tf_ecx = 0, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1068260314, tf_cs = 32, tf_eflags = 589831, tf_esp = -1020120704, tf_ss = -714339408})
    at /usr/src/sys/i386/i386/trap.c:269
#5  0xc06a24fa in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#6  0xc053a426 in turnstile_setowner (ts=0xc39fba40, owner=0x0)
    at /usr/src/sys/kern/subr_turnstile.c:417
#7  0xc053a752 in turnstile_wait (lock=0xc461fe00, owner=0x0)
    at /usr/src/sys/kern/subr_turnstile.c:576
#8  0xc050b511 in _mtx_lock_sleep (m=0xc461fe00, tid=3274846592, opts=0, file=0x0, line=0)
    at /usr/src/sys/kern/kern_mutex.c:555
#9  0xc064becd in ufsdirhash_free (ip=0xc4a33840)
    at /usr/src/sys/ufs/ufs/ufs_dirhash.c:289
#10 0xc064de66 in ufs_reclaim (ap=0x0) at /usr/src/sys/ufs/ufs/ufs_inode.c:175
#11 0xc06bef38 in VOP_RECLAIM_APV (vop=0x0, a=0xc3323180) at vnode_if.c:1589
#12 0xc057adfe in vgonel (vp=0xc3cf3aa0) at vnode_if.h:818
#13 0xc0577530 in vtryrecycle (vp=0xc3cf3aa0)
    at /usr/src/sys/kern/vfs_subr.c:840
#14 0xc0576ec6 in vnlru_free (count=1376) at /usr/src/sys/kern/vfs_subr.c:668
#15 0xc0577019 in vnlru_proc () at /usr/src/sys/kern/vfs_subr.c:703
#16 0xc04fc310 in fork_exit (callout=0xc0576f24 <vnlru_proc>, arg=0x0, frame=0x0)
    at /usr/src/sys/kern/kern_fork.c:789
#17 0xc06a255c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208
(kgdb) quit

-- 
Dr. Yuri Khotyaintsev
Institutet för rymdfysik (IRF), Uppsala

On Friday 02 December 2005 05:00 am, Yuri Khotyaintsev wrote:
> I have the following panic occurring several times a week. The machine is
> an NFS server, and it usually panics early in the morning, when first
> people try to access it. After reboot it may work OK for 1-2 days, and then
> panics again. I have tried changing memory and replacing disk which was
> exported via NFS, but nothing helped :(
>
> Any suggestion on how to fix this panic will be very much appreciated !

This panic (in propagate_priority) is usually caused when a thread goes to
sleep while holding a mutex (which is forbidden). If you enable INVARIANTS
and/or WITNESS you should get a better panic, and with WITNESS you will
even be warned when a thread goes to sleep while holding a mutex. However,
these options do introduce considerable execution overhead, and sometimes
that overhead changes the timing enough to hide the race. :(

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
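
[Note: the debugging options named above are switched on in the kernel
configuration file. The fragment below is a minimal sketch, assuming a custom
config file is used (the thread's kernel is called HEM; the exact file is not
shown in the thread). INVARIANT_SUPPORT is listed because INVARIANTS depends
on it; WITNESS_SKIPSPIN is an optional extra, not something the thread asks
for, that reduces WITNESS overhead by skipping spin mutexes.]

```
# Lines to add to the custom kernel config (file name is an assumption)
options         INVARIANTS          # extra run-time consistency checks
options         INVARIANT_SUPPORT   # support code required by INVARIANTS
options         WITNESS             # lock order and sleep-with-mutex checks
options         WITNESS_SKIPSPIN    # optional: do not run WITNESS on spin mutexes
```

[After editing the config, the kernel has to be rebuilt and installed before
the new checks take effect.]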

On Friday 02 December 2005 14.54, John Baldwin wrote:
> On Friday 02 December 2005 05:00 am, Yuri Khotyaintsev wrote:
> > I have the following panic occurring several times a week. The machine is
> > an NFS server, and it usually panics early in the morning, when first
> > people try to access it. After reboot it may work OK for 1-2 days, and
> > then panics again. I have tried changing memory and replacing disk which
> > was exported via NFS, but nothing helped :(
> >
> > Any suggestion on how to fix this panic will be very much appreciated !
>
> This panic (in propagate_priority) is usually caused when a thread goes to
> sleep while holding a mutex (which is forbidden). If you enable INVARIANTS
> and/or WITNESS you should get a better panic, and with WITNESS you will
> even be warned when a thread goes to sleep while holding a mutex. However,
> these options do introduce considerable execution overhead, and sometimes
> that overhead changes the timing enough to hide the race. :(

I am compiling a new kernel with INVARIANTS and WITNESS now. Will wait for a
"better" panic ;-)

-- 
Dr. Yuri Khotyaintsev
Institutet för rymdfysik (IRF), Uppsala

On Friday 02 December 2005 14.54, John Baldwin wrote:
> On Friday 02 December 2005 05:00 am, Yuri Khotyaintsev wrote:
> > I have the following panic occurring several times a week. The machine is
> > an NFS server, and it usually panics early in the morning, when first
> > people try to access it. After reboot it may work OK for 1-2 days, and
> > then panics again. I have tried changing memory and replacing disk which
> > was exported via NFS, but nothing helped :(
> >
> > Any suggestion on how to fix this panic will be very much appreciated !
>
> This panic (in propagate_priority) is usually caused when a thread goes to
> sleep while holding a mutex (which is forbidden). If you enable INVARIANTS
> and/or WITNESS you should get a better panic, and with WITNESS you will
> even be warned when a thread goes to sleep while holding a mutex. However,
> these options do introduce considerable execution overhead, and sometimes
> that overhead changes the timing enough to hide the race. :(

Here are the two panics which I got with INVARIANTS and WITNESS enabled.

# kgdb /usr/obj/usr/src/sys/HEM.DEBUG/kernel.debug vmcore.8
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
Memory modified after free 0xc4759e00(508) val=0 @ 0xc4759e00
panic: Most recently used by UFS dirhash

Uptime: 11h8m36s
Dumping 511 MB (2 chunks)
  chunk 0: 1MB (160 pages) ... ok
  chunk 1: 511MB (130800 pages) 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15

#0  doadump () at pcpu.h:165
165     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:165
#1  0xc050fd4f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc0510043 in panic (fmt=0xc06dccbb "Most recently used by %s\n")
    at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc0648ccf in mtrash_ctor (mem=0xc4759e00, size=0, arg=0x0, flags=2)
    at /usr/src/sys/vm/uma_dbg.c:137
#4  0xc06469c1 in uma_zalloc_arg (zone=0xc104d980, udata=0x0, flags=2)
    at /usr/src/sys/vm/uma_core.c:1850
#5  0xc05043cd in malloc (size=400, mtp=0xc06fb700, flags=2) at uma.h:275
#6  0xc063fba9 in ufs_readdir (ap=0xd56eaaec)
    at /usr/src/sys/ufs/ufs/ufs_vnops.c:1846
#7  0xc06a61cc in VOP_READDIR_APV (vop=0x0, a=0xd56eaaec) at vnode_if.c:1427
#8  0xc0607716 in nfsrv_readdir (nfsd=0xc4368c00, slp=0x0, td=0xc3326780, mrq=0xd56eac80)
    at vnode_if.h:746
#9  0xc060fa5b in nfssvc_nfsd (td=0x0)
    at /usr/src/sys/nfsserver/nfs_syscalls.c:472
#10 0xc060f280 in nfssvc (td=0xc3326780, uap=0xd56ead04)
    at /usr/src/sys/nfsserver/nfs_syscalls.c:181
#11 0xc069b6b0 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 0, tf_esi = 0, tf_ebp = -1077941464, tf_isp = -714166940, tf_ebx = 0, tf_edx = -1077936144, tf_ecx = 1, tf_eax = 155, tf_trapno = 12, tf_err = 2, tf_eip = 671852067, tf_cs = 51, tf_eflags = 582, tf_esp = -1077941492, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:981
#12 0xc068947f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#13 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) quit

# kgdb /usr/obj/usr/src/sys/HEM.DEBUG/kernel.debug vmcore.9
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
Memory modified after free 0xc5172800(508) val=0 @ 0xc5172800
panic: Most recently used by UFS dirhash

Uptime: 1d1h7m17s
Dumping 511 MB (2 chunks)
  chunk 0: 1MB (160 pages) ... ok
  chunk 1: 511MB (130800 pages) 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15

#0  doadump () at pcpu.h:165
165     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:165
#1  0xc050fd4f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc0510043 in panic (fmt=0xc06dccbb "Most recently used by %s\n")
    at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc0648ccf in mtrash_ctor (mem=0xc5172800, size=0, arg=0x0, flags=257)
    at /usr/src/sys/vm/uma_dbg.c:137
#4  0xc06469c1 in uma_zalloc_arg (zone=0xc104d980, udata=0x0, flags=257)
    at /usr/src/sys/vm/uma_core.c:1850
#5  0xc05043cd in malloc (size=368, mtp=0xc070eb60, flags=257) at uma.h:275
#6  0xc063729b in ufsdirhash_build (ip=0xc55664a4)
    at /usr/src/sys/ufs/ufs/ufs_dirhash.c:184
#7  0xc0639441 in ufs_lookup (ap=0xd57c283c)
    at /usr/src/sys/ufs/ufs/ufs_lookup.c:192
#8  0xc06a4e0a in VOP_CACHEDLOOKUP_APV (vop=0x0, a=0xd57c283c) at vnode_if.c:150
#9  0xc0565e3b in vfs_cache_lookup (ap=0x0) at vnode_if.h:82
#10 0xc06a4d2f in VOP_LOOKUP_APV (vop=0xc070eee0, a=0xd57c28e4) at vnode_if.c:99
#11 0xc056a8d0 in lookup (ndp=0xd57c2bec) at vnode_if.h:56
#12 0xc060df58 in nfs_namei (ndp=0xd57c2bec, fhp=0x0, len=0, slp=0x0, nam=0x0, mdp=0xd57c2a04, dposp=0xd57c2a08, retdirp=0xd57c29f0, v3=8, retdirattrp=0x0, retdirattr_retp=0x0, td=0xc350a780, pubflag=0)
    at /usr/src/sys/nfsserver/nfs_srvsubs.c:780
#13 0xc05fd284 in nfsrv_lookup (nfsd=0xc5764100, slp=0x0, td=0xc350a780, mrq=0xd57c2c80)
    at /usr/src/sys/nfsserver/nfs_serv.c:517
#14 0xc060fa5b in nfssvc_nfsd (td=0x0)
    at /usr/src/sys/nfsserver/nfs_syscalls.c:472
#15 0xc060f280 in nfssvc (td=0xc350a780, uap=0xd57c2d04)
    at /usr/src/sys/nfsserver/nfs_syscalls.c:181
#16 0xc069b6b0 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 0, tf_esi = 0, tf_ebp = -1077941464, tf_isp = -713282204, tf_ebx = 0, tf_edx = -1077936144, tf_ecx = 1, tf_eax = 155, tf_trapno = 12, tf_err = 2, tf_eip = 671852067, tf_cs = 51, tf_eflags = 582, tf_esp = -1077941492, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:981
#17 0xc068947f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#18 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) exit
Undefined command: "exit". Try "help".
(kgdb) quit

-- 
Dr. Yuri Khotyaintsev
Institutet för rymdfysik (IRF), Uppsala

Hello.

While copying a few directories from one machine to my new notebook (tar over
ssh over a wireless connection [if_iwi]), the notebook panicked with the
following:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x52535307
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc078bc08
stack pointer           = 0x28:0xde4ae95c
frame pointer           = 0x28:0xde4ae984
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 761 (bsdtar)
trap number             = 12
panic: page fault
Uptime: 11m20s
Dumping 502 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 502MB (128464 pages) 486 470 454 438 422 406 390 374 358 342 326 310 294 278 262 246 230 214 198 182 166 150 134 118 102 86 70 54 38 22 6

(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc0638202 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc0638498 in panic (fmt=0xc084e5a2 "%s")
    at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc0807c30 in trap_fatal (frame=0xde4ae91c, eva=1381192455)
    at /usr/src/sys/i386/i386/trap.c:831
#4  0xc080799b in trap_pfault (frame=0xde4ae91c, usermode=0, eva=1381192455)
    at /usr/src/sys/i386/i386/trap.c:742
#5  0xc08075d9 in trap (frame=
      {tf_fs = 8, tf_es = -565575640, tf_ds = -1065943000, tf_edi = -565515340, tf_esi = -1043806720, tf_ebp = -565515900, tf_isp = -565515960, tf_ebx = -1039299392, tf_edx = 170, tf_ecx = 1, tf_eax = 1381191775, tf_trapno = 12, tf_err = 0, tf_eip = -1065829368, tf_cs = 32, tf_eflags = 66051, tf_esp = -1064527936, tf_ss = -565515812})
    at /usr/src/sys/i386/i386/trap.c:432
#6  0xc07f6dca in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc078bc08 in ufsdirhash_lookup (ip=0xc20ec318, name=0xc1c45810 "UPCII.TTF", namelen=9, offp=0x5253505f, bpp=0x5253505f, prevoffp=0x0)
    at /usr/src/sys/ufs/ufs/ufs_dirhash.c:409
#8  0xc078d480 in ufs_lookup (ap=0xde4aea80)
    at /usr/src/sys/ufs/ufs/ufs_lookup.c:209
#9  0xc0816d64 in VOP_CACHEDLOOKUP_APV (vop=0x5253505f, a=0xaa)
    at vnode_if.c:150
#10 0xc0682c9e in vfs_cache_lookup (ap=0x5253505f) at vnode_if.h:82
#11 0xc0816cf3 in VOP_LOOKUP_APV (vop=0xc08fbf40, a=0xde4aeb18) at vnode_if.c:99
#12 0xc068722d in lookup (ndp=0xde4aeba0) at vnode_if.h:56
#13 0xc0686b6e in namei (ndp=0xde4aeba0) at /usr/src/sys/kern/vfs_lookup.c:203
#14 0xc0694367 in kern_lstat (td=0xc1fea900, path=0xaa <Address 0xaa out of bounds>, pathseg=170, sbp=0xde4aec74)
    at /usr/src/sys/kern/vfs_syscalls.c:2102
#15 0xc0694303 in lstat (td=0xc1fea900, uap=0xde4aed04)
    at /usr/src/sys/kern/vfs_syscalls.c:2086
#16 0xc0807f47 in syscall (frame=
      {tf_fs = 59, tf_es = 4259899, tf_ds = -1078001605, tf_edi = -1077941792, tf_esi = -1077941248, tf_ebp = -1077941560, tf_isp = -565514908, tf_ebx = 134672409, tf_edx = 134586905, tf_ecx = 25, tf_eax = 190, tf_trapno = 0, tf_err = 2, tf_eip = 672111379, tf_cs = 51, tf_eflags = 658, tf_esp = -1077941860, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:976
#17 0xc07f6e1f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#18 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

The notebook runs the GENERIC kernel of 6.0-RELEASE. I don't know whether this
is a known issue, nor whether it is reproducible. If dmesg would be helpful, I
can post it as well. I will keep the vmcore.0 for a while, too, just in case.

-- 
Krzysztof Kowalik        | () ASCII Ribbon Campaign
Computer Center, AGH UST | /\ Support plain text e-mail