On Wednesday, October 07, 2015 08:52:30 AM Christian Kratzer wrote:
> Hi,
>
> On Tue, 6 Oct 2015, John Baldwin wrote:
> <snipp/>
> >> This crash is occurring when doing an mtx_unlock(&Giant). Unfortunately, I'm not
> >> conversant w.r.t. this code. I've cc'd jhb@ in case he has some insight.
> >> If you don't get any responses, I'd suggest reposting to freebsd-current@ with
> >> "crashes in mtx_unlock(&Giant)" in the subject line.
> >>
> >> Btw John, the code does tsleep() in a loop before the mtx_unlock(&Giant). I do
> >> remember that was once allowed, but am not sure if it still is (ie a tsleep() call
> >> while holding Giant)?
> >>
> >> Hopefully someone who knows what is special about Giant that might cause this will
> >> respond.
> >>
> >> Good luck with it, rick
> >
> > tsleep() with Giant is still allowed. However, this sort of panic usually means
> > you unlocked a mutex you didn't hold (but without INVARIANTS enabled or you'd get
> > an assertion failure earlier).
> >
> > I don't see anything obviously wrong in smb_iod_thread() however.
> >
> > If you have the crashdump, can you please run this in kgdb:
> >
> > frame 9
> > p (struct mtx *)c
> > p *(struct mtx *)c
>
> yes I have.
> Here we go:
>
> --snipp--
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x20
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80996c7c
> stack pointer = 0x28:0xfffffe004e79bac0
> frame pointer = 0x28:0xfffffe004e79baf0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = resume, IOPL = 0
> current process = 12235 (smbiod172)
> trap number = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff80984e30 at kdb_backtrace+0x60
> #1 0xffffffff809489e6 at vpanic+0x126
> #2 0xffffffff809488b3 at panic+0x43
> #3 0xffffffff80d4aadb at trap_fatal+0x36b
> #4 0xffffffff80d4addd at trap_pfault+0x2ed
> #5 0xffffffff80d4a47a at trap+0x47a
> #6 0xffffffff80d307f2 at calltrap+0x8
> #7 0xffffffff8092ebe0 at __mtx_unlock_sleep+0x60
> #8 0xffffffff8092eb69 at __mtx_unlock_flags+0x69
> #9 0xffffffff81a1b724 at smb_iod_thread+0xb4
> #10 0xffffffff8091244a at fork_exit+0x9a
> #11 0xffffffff80d30d2e at fork_trampoline+0xe
> Uptime: 1d18h34m4s
> Dumping 161 out of 999 MB:..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
>
> Reading symbols from /boot/kernel/smbfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/smbfs.ko.symbols
> Reading symbols from /boot/kernel/libiconv.ko.symbols...done.
> Loaded symbols for /boot/kernel/libiconv.ko.symbols
> Reading symbols from /boot/kernel/libmchain.ko.symbols...done.
> Loaded symbols for /boot/kernel/libmchain.ko.symbols
> #0 doadump (textdump=<value optimized out>) at pcpu.h:219
> 219 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) frame 9
> #9 0xffffffff8092ebe0 in __mtx_unlock_sleep (c=0xfffff8002f531790, opts=<value optimized out>,
> file=0xffffffff81a25801 "%s: Can't handle disordered parameters %d:%d\n", line=1) at /usr/src/sys/kern/kern_mutex.c:791
> 791 /usr/src/sys/kern/kern_mutex.c: No such file or directory.
> in /usr/src/sys/kern/kern_mutex.c
> Current language: auto; currently minimal
> (kgdb) p (struct mtx *)c
> $1 = (struct mtx *) 0xfffff8002f531790
> (kgdb) p *(struct mtx *)c
> $2 = {lock_object = {lo_name = 0x6 <Address 0x6 out of bounds>, lo_flags = 0, lo_data = 0, lo_witness = 0xfffff8002f531798},
> mtx_lock = 1444181401}

Ok, so that is a destroyed mutex. This means it is probably not Giant, and
it might be some mutex in smb_iod_main() that shows up in smb_iod_thread() due
to inlining.

Actually, we know this from your earlier mail:

    if (evp->ev_type & SMBIOD_EV_SYNC) {
        SMB_IOD_EVLOCK(iod);
        wakeup(evp);
        SMB_IOD_EVUNLOCK(iod);

Line 624 is that SMB_IOD_EVUNLOCK().

Hmm, does 'p *evp' work at frame 10? If not, can you try building the
devel/gdb port from a recent ports tree with the 'KGDB' option enabled and
use 'kgdb710' instead of 'kgdb' to see if you can print out '*evp'?

> (kgdb)
> --snipp--
>
> I can build a GENERIC kernel with INVARIANTS enabled on the box to see if we get a better assertions next time this happens.

That would be great, but please keep the existing core and kernel. We might
be able to figure this out from that still.

Also, go ahead and put this patch in and let me know if you ever see the
printf logged. If you do, that could explain this panic (and we might need
a more involved fix to avoid memory leaks).

Index: smb_iod.c
==================================================================
--- smb_iod.c (revision 288952)
+++ smb_iod.c (working copy)
@@ -624,6 +624,13 @@
             SMB_IOD_EVUNLOCK(iod);
         } else
             free(evp, M_SMBIOD);
+        if (iod->iod_flags & SMBIOD_SHUTDOWN) {
+            if (!STAILQ_EMPTY(&iod->iod_evlist))
+                printf("%s: shutdown with pending events\n",
+                    __func__);
+            }
+            return;
+        }
     }
 #if 0
     if (iod->iod_state == SMBIOD_ST_VCACTIVE) {

--
John Baldwin
Hi John,

On Wed, 7 Oct 2015, John Baldwin wrote:
>> mtx_lock = 1444181401}
>
> Ok, so that is a destroyed mutex. This means it is probably not Giant, and
> it might be some mutex in smb_iod_main() that shows up in smb_iod_thread() due
> to inlining.
>
> Actually, we know this from your earlier mail:
>
>     if (evp->ev_type & SMBIOD_EV_SYNC) {
>         SMB_IOD_EVLOCK(iod);
>         wakeup(evp);
>         SMB_IOD_EVUNLOCK(iod);
>
> Line 624 is that SMB_IOD_EVUNLOCK().
>
> Hmm, does 'p *evp' work at frame 10? If not, can you try building the
> devel/gdb port from a recent ports tree with the 'KGDB' option enabled and
> use 'kgdb710' instead of 'kgdb' to see if you can print out '*evp'?

kgdb hangs when changing to frame 10.

I will build the port later (svn ports checkout in progress).

I have cloned the VM so that I have this isolated from my production network.

>> (kgdb)
>> --snipp--
>>
>> I can build a GENERIC kernel with INVARIANTS enabled on the box to see if we get a better assertions next time this happens.
>
> That would be great, but please keep the existing core and kernel. We might
> be able to figure this out from that still.
>
> Also, go ahead and put this patch in and let me know if you ever see the
> printf logged. If you do, that could explain this panic (and we might need
> a more involved fix to avoid memory leaks).
>
> Index: smb_iod.c
> ==================================================================
> --- smb_iod.c (revision 288952)
> +++ smb_iod.c (working copy)
> @@ -624,6 +624,13 @@
>              SMB_IOD_EVUNLOCK(iod);
>          } else
>              free(evp, M_SMBIOD);
> +        if (iod->iod_flags & SMBIOD_SHUTDOWN) {
> +            if (!STAILQ_EMPTY(&iod->iod_evlist))
> +                printf("%s: shutdown with pending events\n",
> +                    __func__);
> +            }
> +            return;
> +        }
>      }
>  #if 0
>      if (iod->iod_state == SMBIOD_ST_VCACTIVE) {

The VM is now running the latest 10-stable kernel with the above patch and INVARIANTS enabled.

Give it about 2 days to produce the next crash.

Greetings
Christian

--
Christian Kratzer                   CK Software GmbH
Email:   ck at cksoft.de            Wildberger Weg 24/2
Phone:   +49 7032 893 997 - 0       D-71126 Gaeufelden
Fax:     +49 7032 893 997 - 9       HRB 245288, Amtsgericht Stuttgart
Mobile:  +49 171 1947 843           Geschaeftsfuehrer: Christian Kratzer
Web:     http://www.cksoft.de/

Hi John,

the box crashed again running a 10-stable kernel with the following patch of yours:

On Wed, 7 Oct 2015, John Baldwin wrote:
> Index: smb_iod.c
> ==================================================================
> --- smb_iod.c (revision 288952)
> +++ smb_iod.c (working copy)
> @@ -624,6 +624,13 @@
>              SMB_IOD_EVUNLOCK(iod);
>          } else
>              free(evp, M_SMBIOD);
> +        if (iod->iod_flags & SMBIOD_SHUTDOWN) {
> +            if (!STAILQ_EMPTY(&iod->iod_evlist))
> +                printf("%s: shutdown with pending events\n",
> +                    __func__);
> +            }
> +            return;
> +        }
>      }
>  #if 0
>      if (iod->iod_state == SMBIOD_ST_VCACTIVE) {
>

Here is what I got on the KVM console:

login: panic: Assertion mtx_unowned(m) failed at /usr/src/sys/kern/kern_mutex.c:955
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff80975bb0 at kdb_backtrace+0x60
#1 0xffffffff8093baa6 at vpanic+0x126
#2 0xffffffff8093b979 at kassert_panic+0x139
#3 0xffffffff80921c47 at _mtx_destroy+0x77
#4 0xffffffff81a1c114 at smb_iod_destroy+0xc4
#5 0xffffffff81a12eea at smb_vc_free+0x1a
#6 0xffffffff81a13e24 at sdp_trydestroy+0xb4
#7 0xffffffff81a1cb36 at smbfs_unmount+0xd6
#8 0xffffffff809d9e84 at dounmount+0x524
#9 0xffffffff809d9936 at sys_unmount+0x3c6
#10 0xffffffff80d42235 at amd64_syscall+0x265
#11 0xffffffff80d25cfb at Xfast_syscall+0xfb
Uptime: 19h48m28s
Dumping 179 out of 999 MB:..9%..18%..27%..36%..45%..54%..63%..72%..81%..98%
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort

Here's the crashinfo:

panic: Assertion mtx_unowned(m) failed at /usr/src/sys/kern/kern_mutex.c:955

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: Assertion mtx_unowned(m) failed at /usr/src/sys/kern/kern_mutex.c:955
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff80975bb0 at kdb_backtrace+0x60
#1 0xffffffff8093baa6 at vpanic+0x126
#2 0xffffffff8093b979 at kassert_panic+0x139
#3 0xffffffff80921c47 at _mtx_destroy+0x77
#4 0xffffffff81a1c114 at smb_iod_destroy+0xc4
#5 0xffffffff81a12eea at smb_vc_free+0x1a
#6 0xffffffff81a13e24 at sdp_trydestroy+0xb4
#7 0xffffffff81a1cb36 at smbfs_unmount+0xd6
#8 0xffffffff809d9e84 at dounmount+0x524
#9 0xffffffff809d9936 at sys_unmount+0x3c6
#10 0xffffffff80d42235 at amd64_syscall+0x265
#11 0xffffffff80d25cfb at Xfast_syscall+0xfb
Uptime: 19h48m28s
Dumping 179 out of 999 MB:..9%..18%..27%..36%..45%..54%..63%..72%..81%..98%

Reading symbols from /boot/kernel/smbfs.ko.symbols...done.
Loaded symbols for /boot/kernel/smbfs.ko.symbols
Reading symbols from /boot/kernel/libiconv.ko.symbols...done.
Loaded symbols for /boot/kernel/libiconv.ko.symbols
Reading symbols from /boot/kernel/libmchain.ko.symbols...done.
Loaded symbols for /boot/kernel/libmchain.ko.symbols
#0 doadump (textdump=<value optimized out>) at pcpu.h:219
219 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:219
#1 0xffffffff8093b5f2 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:451
#2 0xffffffff8093bae5 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:758
#3 0xffffffff8093b979 in kassert_panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:646
#4 0xffffffff80921c47 in _mtx_destroy (c=0xfffff80009284690) at /usr/src/sys/kern/kern_mutex.c:955
#5 0xffffffff81a1c114 in smb_iod_destroy (iod=0xfffff80009284600) at /usr/src/sys/modules/smbfs/../../netsmb/smb_iod.c:706
#6 0xffffffff81a12eea in smb_vc_free (cp=0xfffff8003a602a00) at /usr/src/sys/modules/smbfs/../../netsmb/smb_conn.c:499
#7 0xffffffff81a13e24 in sdp_trydestroy (sdp=0xfffff8000a7cbc80) at /usr/src/sys/modules/smbfs/../../netsmb/smb_dev.c:166
#8 0xffffffff81a1cb36 in smbfs_unmount (mp=0xfffff80039f88330, mntflags=<value optimized out>) at /usr/src/sys/modules/smbfs/../../fs/smbfs/smbfs_vfsops.c:297
#9 0xffffffff809d9e84 in dounmount (mp=0xfffff80039f88330, flags=134217728, td=0xfffff8000f2b0000) at /usr/src/sys/kern/vfs_mount.c:1313
#10 0xffffffff809d9936 in sys_unmount (td=0xfffff8000f2b0000, uap=0xfffffe003d67fb80) at /usr/src/sys/kern/vfs_mount.c:1205
#11 0xffffffff80d42235 in amd64_syscall (td=0xfffff8000f2b0000, traced=0) at subr_syscall.c:134
#12 0xffffffff80d25cfb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#13 0x000000080089190a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)

I have kgdb710 from ports set up in case you need me to check something.

Greetings
Christian
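
As a rough illustration of the two panics reported in this thread (not a diagnosis of the smbfs bug): the first trace faults in __mtx_unlock_sleep on what John identifies as a destroyed mutex, and the INVARIANTS kernel later trips the mtx_unowned(m) assertion in _mtx_destroy. The userland C sketch below models both situations with a toy mutex that carries an explicit state. Everything in it (struct toy_mtx, the toy_* functions, the thread ids, the scenario_* helpers) is hypothetical and is not FreeBSD's mtx(9) implementation. It only shows the kind of check an ownership assertion performs, under the assumption that teardown and the iod thread touch the same lock.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a kernel mutex. NOT FreeBSD's struct mtx; every name
 * in this file is hypothetical and exists only for illustration. */
enum toy_state { TOY_UNOWNED, TOY_OWNED, TOY_DESTROYED };

enum toy_result {
    TOY_OK,
    TOY_ERR_BUSY,        /* lock attempted while another thread holds it */
    TOY_ERR_NOT_OWNER,   /* unlock by a thread that does not hold the lock */
    TOY_ERR_DESTROYED,   /* any use after toy_destroy() */
    TOY_ERR_STILL_OWNED  /* destroy while some thread still holds the lock */
};

struct toy_mtx {
    enum toy_state state;
    int owner;           /* hypothetical thread id, -1 when unowned */
};

void toy_init(struct toy_mtx *m)
{
    assert(m != NULL);
    m->state = TOY_UNOWNED;
    m->owner = -1;
}

enum toy_result toy_lock(struct toy_mtx *m, int tid)
{
    if (m->state == TOY_DESTROYED)
        return TOY_ERR_DESTROYED;
    if (m->state == TOY_OWNED)
        return TOY_ERR_BUSY;     /* a real mutex would block or spin here */
    m->state = TOY_OWNED;
    m->owner = tid;
    return TOY_OK;
}

enum toy_result toy_unlock(struct toy_mtx *m, int tid)
{
    if (m->state == TOY_DESTROYED)
        return TOY_ERR_DESTROYED;
    if (m->state != TOY_OWNED || m->owner != tid)
        return TOY_ERR_NOT_OWNER;
    m->state = TOY_UNOWNED;
    m->owner = -1;
    return TOY_OK;
}

enum toy_result toy_destroy(struct toy_mtx *m)
{
    if (m->state == TOY_DESTROYED)
        return TOY_ERR_DESTROYED;
    if (m->state == TOY_OWNED)
        return TOY_ERR_STILL_OWNED;  /* analogous to an unowned-check firing */
    m->state = TOY_DESTROYED;
    return TOY_OK;
}

/* Teardown destroys the lock first, then the worker (tid 1) unlocks it. */
enum toy_result scenario_unlock_after_destroy(void)
{
    struct toy_mtx m;
    toy_init(&m);
    (void)toy_destroy(&m);
    return toy_unlock(&m, 1);
}

/* The worker (tid 1) still holds the lock when teardown destroys it. */
enum toy_result scenario_destroy_while_owned(void)
{
    struct toy_mtx m;
    toy_init(&m);
    (void)toy_lock(&m, 1);
    return toy_destroy(&m);
}

/* The worker releases the lock before teardown destroys it. */
enum toy_result scenario_ordered_teardown(void)
{
    struct toy_mtx m;
    enum toy_result r;
    toy_init(&m);
    if ((r = toy_lock(&m, 1)) != TOY_OK)
        return r;
    if ((r = toy_unlock(&m, 1)) != TOY_OK)
        return r;
    return toy_destroy(&m);
}
```

In this model scenario_unlock_after_destroy() returns TOY_ERR_DESTROYED, scenario_destroy_while_owned() returns TOY_ERR_STILL_OWNED, and scenario_ordered_teardown() returns TOY_OK. The checks return error codes instead of panicking so that the three orderings can be compared side by side; a kernel built without such checks would simply proceed on the stale lock, which is one way to end up with a page fault instead of an assertion.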