Chris Ross
2015-Jul-01 02:27 UTC
New FreeBSD snapshots available: stable/10 (20150625 r284813)
Yeah, this is the same panic you, I, and others have been seeing on sparc64's
with bge's, or at least v240's (and one other IIRC) for many many months. Thanks
for grabbing a core!

When I was trying to search for a commit that caused the change of behavior,
I had difficulty doing it, but it was well back in 2014. The "boots sometimes"
makes this a hard one to track, but as I only have my production v240, also
makes it one I haven't spent as much time trying to find as I'd like.

Thank you for letting me know this issue isn't fixed, though, despite the other
success with this code. :-)

Hopefully your stacktrace can help figure out what is wrong.

- Chris

On Jun 30, 2015, at 22:14 , Kurt Lidl <lidl at pix.net> wrote:
> I got all excited and decided to give it a try on my dual-cpu
> V240 as well. I was able to get it installed, but it panics
> when booting off the mirrored ZFS drives. (Note: I have no
> reason to believe this is ZFS related.)
>
> ---- snip, snip ----
> Setting hostname: spork.pix.net.
> bge0: link state changed to DOWN
> spin lock 0xc0cb9e38 (smp rendezvous) held by 0xfffff80003e93240 (tid 100340) too long
> timeout stopping cpus
> panic: spin lock held too long
> cpuid = 1
> KDB: stack backtrace:
> #0 0xc0575380 at panic+0x20
> #1 0xc0558e10 at _mtx_lock_spin_failed+0x50
> #2 0xc0558ed8 at _mtx_lock_spin_cookie+0xb8
> #3 0xc08d7b9c at tick_get_timecount_mp+0xdc
> #4 0xc0583c88 at binuptime+0x48
> #5 0xc08a3b8c at timercb+0x6c
> #6 0xc08d7f00 at tick_intr+0x220
> Uptime: 29s
> Dumping 8192 MB (4 chunks)
>   chunk at 0: 2147483648 bytes ... ok
>   chunk at 0x100000000: 2147483648 bytes ... ok
>   chunk at 0x1000000000: 2147483648 bytes ... ok
>   chunk at 0x1100000000: 2147483648 bytes ... ok
>
> Dump complete
> ---- snip, snip ----
>
> Now the thing that amazes me is that this happened
> the first three times after I did the install, and
> on the fourth boot, it didn't panic. And it was
> able to 'savecore' the crashdump.
>
> Here's the stacktrace from the core.txt.0 file:
>
> -Kurt
>
> Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/zfs.ko.symbols
> Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
> Loaded symbols for /boot/kernel/opensolaris.ko.symbols
> Reading symbols from /boot/kernel/geom_mirror.ko.symbols...done.
> Loaded symbols for /boot/kernel/geom_mirror.ko.symbols
> Reading symbols from /boot/kernel/tmpfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/tmpfs.ko.symbols
> #0  0x00000000c05745bc in doadump (textdump=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:262
> 262             savectx(&dumppcb);
> (kgdb) #0  0x00000000c05745bc in doadump (textdump=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:262
> #1  0x00000000c0574fb0 in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:451
> #2  0x00000000c0575358 in vpanic (fmt=0xc0b22fe0 "spin lock held too long",
>     ap=0x1fa2da638) at /usr/src/sys/kern/kern_shutdown.c:758
> #3  0x00000000c0575388 in panic (fmt=0xc0b22fe0 "spin lock held too long")
>     at /usr/src/sys/kern/kern_shutdown.c:687
> #4  0x00000000c0558e18 in _mtx_lock_spin_failed (m=0xc0cb9e38)
>     at /usr/src/sys/kern/kern_mutex.c:561
> #5  0x00000000c0558ee0 in _mtx_lock_spin_cookie (c=0xfffff80003e93240,
>     tid=18446735277669594832, opts=0, file=0x0, line=0)
>     at /usr/src/sys/kern/kern_mutex.c:608
> #6  0x00000000c08d7ba4 in tick_get_timecount_mp (tc=0xc0d13378) at smp.h:206
> #7  0x00000000c0583c90 in binuptime (bt=0x1fa2da980)
>     at /usr/src/sys/kern/kern_tc.c:188
> #8  0x00000000c08a3b94 in timercb (et=0xc0d13308, arg=<value optimized out>)
>     at time.h:418
> #9  0x00000000c08d7f08 in tick_intr (tf=0x1fa2dab20)
>     at /usr/src/sys/sparc64/sparc64/tick.c:252
> #10 0x00000000c00a11bc in tl1_intr ()
> #11 0x00000000c08c934c in spinlock_exit ()
>     at /usr/src/sys/sparc64/sparc64/machdep.c:244
> #12 0x00000000c08c9330 in spinlock_exit ()
>     at /usr/src/sys/sparc64/sparc64/machdep.c:240
> #13 0x00000000c051a194 in cnputs (p=0x1fa2db11a "")
>     at /usr/src/sys/kern/kern_cons.c:530
> #14 0x00000000c05c06e0 in putchar (c=10, arg=0x1fa2db0c8)
>     at /usr/src/sys/kern/subr_prf.c:437
> #15 0x00000000c05bee90 in kvprintf (fmt=0xc0b2fb95 "",
>     func=0xc05c02e0 <putchar>, arg=0x1fa2db0c8, radix=10, ap=0x1fa2db300)
>     at /usr/src/sys/kern/subr_prf.c:655
> #16 0x00000000c05bfe80 in _vprintf (level=5, flags=1,
>     fmt=0xc0b2fb78 "%s: link state changed to %s\n", ap=0x1fa2db2f0)
>     at /usr/src/sys/kern/subr_prf.c:281
> #17 0x00000000c05c0270 in log (level=5,
>     fmt=0xc0b2fb78 "%s: link state changed to %s\n")
>     at /usr/src/sys/kern/subr_prf.c:308
> #18 0x00000000c064ec28 in do_link_state_change (arg=0xfffff80003396800,
>     pending=1) at /usr/src/sys/net/if.c:2131
> #19 0x00000000c05cab38 in taskqueue_run_locked (queue=0xfffff80003288000)
>     at /usr/src/sys/kern/subr_taskqueue.c:342
> #20 0x00000000c05cacec in taskqueue_run (queue=0xfffff80003288000)
>     at /usr/src/sys/kern/subr_taskqueue.c:358
> #21 0x00000000c05cae20 in taskqueue_swi_run (dummy=0x0)
>     at /usr/src/sys/kern/subr_taskqueue.c:471
> #22 0x00000000c0539cc4 in intr_event_execute_handlers (p=0xfffff80003295860,
>     ie=0xfffff80003287e00) at /usr/src/sys/kern/kern_intr.c:1264
> #23 0x00000000c053b86c in ithread_loop (arg=0xfffff8000324c080)
>     at /usr/src/sys/kern/kern_intr.c:1277
> #24 0x00000000c0536428 in fork_exit (callout=0xc053b780 <ithread_loop>,
>     arg=0xfffff8000324c080, frame=0x1fa2db880)
>     at /usr/src/sys/kern/kern_fork.c:1018
> #25 0x00000000c00a1270 in fork_trampoline ()
> #26 0x00000000c00a1270 in fork_trampoline ()
> Previous frame identical to this frame (corrupt stack?)
> (kgdb)
Glen Barber
2015-Jul-01 02:36 UTC
New FreeBSD snapshots available: stable/10 (20150625 r284813)
On Tue, Jun 30, 2015 at 10:27:21PM -0400, Chris Ross wrote:
>
> Yeah, this is the same panic you, I, and others have been seeing on sparc64's
> with bge's, or at least v240's (and one other IIRC) for many many months. Thanks
> for grabbing a core!
>
> When I was trying to search for a commit that caused the change of behavior,
> I had difficulty doing it, but it was well back in 2014. The "boots sometimes"
> makes this a hard one to track, but as I only have my production v240, also
> makes it one I haven't spent as much time trying to find as I'd like.
>
> Thank you for letting me know this issue isn't fixed, though, despite the other
> success with this code. :-)
>
> Hopefully your stacktrace can help figure out what is wrong.
>

A quick search through the PR system returned zero results for this.
Did you file a PR previously? (If not, do you know of one that already
exists that Kurt can update?)

Glen
Fabian Keil
2015-Jul-01 09:18 UTC
New FreeBSD snapshots available: stable/10 (20150625 r284813)
Chris Ross <cross+freebsd at distal.com> wrote:

> Yeah, this is the same panic you, I, and others have been seeing on
> sparc64's with bge's, or at least v240's (and one other IIRC) for many
> many months. Thanks for grabbing a core!

Does it make a difference if you boot with hw.bge.allow_asf=0?

According to the man page it is known to "cause system lockup problems
on a small number of systems". It's not obvious to me why it's enabled
by default on FreeBSD and I disable it on all my systems.

Fabian
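
For anyone wanting to try this, a minimal sketch of how the tunable would typically be set, assuming the stock FreeBSD loader and the usual /boot/loader.conf location. Nothing in this thread confirms that it avoids the panic; it is only a way to run the test suggested above.

```
# /boot/loader.conf
# Test setting only: disable ASF support in bge(4) at boot.
# Not a confirmed fix for the "spin lock held too long" panic.
hw.bge.allow_asf="0"
```

For a one-off test without editing the file, the same value can be entered at the loader prompt with `set hw.bge.allow_asf=0` followed by `boot`. After the system is up, `kenv hw.bge.allow_asf` should show whether the loader passed the value to the kernel.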