Glen Barber
2015-Jul-01 02:36 UTC
New FreeBSD snapshots available: stable/10 (20150625 r284813)
On Tue, Jun 30, 2015 at 10:27:21PM -0400, Chris Ross wrote:
>
> Yeah, this is the same panic you, I, and others have been seeing on sparc64's
> with bge's, or at least v240's (and one other IIRC) for many many months. Thanks
> for grabbing a core!
>
> When I was trying to search for a commit that caused the change of behavior,
> I had difficulty doing it, but it was well back in 2014. The "boots sometimes"
> makes this a hard one to track, but as I only have my production v240, also
> makes it one I haven't spent as much time trying to find as I'd like.
>
> Thank you for letting me know this issue isn't fixed, though, despite the other
> success with this code. :-)
>
> Hopefully your stacktrace can help figure out what is wrong.
>

A quick search through the PR system returned zero results for this.
Did you file a PR previously? (If not, do you know of one that already
exists that Kurt can update?)

Glen

> - Chris
>
> On Jun 30, 2015, at 22:14 , Kurt Lidl <lidl at pix.net> wrote:
> > I got all excited and decided to give it a try on my dual-cpu
> > V240 as well. I was able to get it installed, but it panics
> > when booting off the mirrored ZFS drives. (Note: I have no
> > reason to believe this is ZFS related.)
> >
> > ---- snip, snip ----
> > Setting hostname: spork.pix.net.
> > bge0: link state changed to DOWN
> > spin lock 0xc0cb9e38 (smp rendezvous) held by 0xfffff80003e93240 (tid 100340) too long
> > timeout stopping cpus
> > panic: spin lock held too long
> > cpuid = 1
> > KDB: stack backtrace:
> > #0 0xc0575380 at panic+0x20
> > #1 0xc0558e10 at _mtx_lock_spin_failed+0x50
> > #2 0xc0558ed8 at _mtx_lock_spin_cookie+0xb8
> > #3 0xc08d7b9c at tick_get_timecount_mp+0xdc
> > #4 0xc0583c88 at binuptime+0x48
> > #5 0xc08a3b8c at timercb+0x6c
> > #6 0xc08d7f00 at tick_intr+0x220
> > Uptime: 29s
> > Dumping 8192 MB (4 chunks)
> > chunk at 0: 2147483648 bytes ... ok
> > chunk at 0x100000000: 2147483648 bytes ... ok
> > chunk at 0x1000000000: 2147483648 bytes ... ok
> > chunk at 0x1100000000: 2147483648 bytes ... ok
> >
> > Dump complete
> > ---- snip, snip ----
> >
> > Now the thing that amazes me is that this happened
> > the first three times after I did the install, and
> > on the fourth boot, it didn't panic. And it was
> > able to 'savecore' the crashdump.
> >
> > Here's the stacktrace from the core.txt.0 file:
> >
> > -Kurt
> >
> > Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> > Loaded symbols for /boot/kernel/zfs.ko.symbols
> > Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
> > Loaded symbols for /boot/kernel/opensolaris.ko.symbols
> > Reading symbols from /boot/kernel/geom_mirror.ko.symbols...done.
> > Loaded symbols for /boot/kernel/geom_mirror.ko.symbols
> > Reading symbols from /boot/kernel/tmpfs.ko.symbols...done.
> > Loaded symbols for /boot/kernel/tmpfs.ko.symbols
> > #0 0x00000000c05745bc in doadump (textdump=<value optimized out>)
> >     at /usr/src/sys/kern/kern_shutdown.c:262
> > 262 savectx(&dumppcb);
> > (kgdb) #0 0x00000000c05745bc in doadump (textdump=<value optimized out>)
> >     at /usr/src/sys/kern/kern_shutdown.c:262
> > #1 0x00000000c0574fb0 in kern_reboot (howto=260)
> >     at /usr/src/sys/kern/kern_shutdown.c:451
> > #2 0x00000000c0575358 in vpanic (fmt=0xc0b22fe0 "spin lock held too long",
> >     ap=0x1fa2da638) at /usr/src/sys/kern/kern_shutdown.c:758
> > #3 0x00000000c0575388 in panic (fmt=0xc0b22fe0 "spin lock held too long")
> >     at /usr/src/sys/kern/kern_shutdown.c:687
> > #4 0x00000000c0558e18 in _mtx_lock_spin_failed (m=0xc0cb9e38)
> >     at /usr/src/sys/kern/kern_mutex.c:561
> > #5 0x00000000c0558ee0 in _mtx_lock_spin_cookie (c=0xfffff80003e93240,
> >     tid=18446735277669594832, opts=0, file=0x0, line=0)
> >     at /usr/src/sys/kern/kern_mutex.c:608
> > #6 0x00000000c08d7ba4 in tick_get_timecount_mp (tc=0xc0d13378) at smp.h:206
> > #7 0x00000000c0583c90 in binuptime (bt=0x1fa2da980)
> >     at /usr/src/sys/kern/kern_tc.c:188
> > #8 0x00000000c08a3b94 in timercb (et=0xc0d13308, arg=<value optimized out>)
> >     at time.h:418
> > #9 0x00000000c08d7f08 in tick_intr (tf=0x1fa2dab20)
> >     at /usr/src/sys/sparc64/sparc64/tick.c:252
> > #10 0x00000000c00a11bc in tl1_intr ()
> > #11 0x00000000c08c934c in spinlock_exit ()
> >     at /usr/src/sys/sparc64/sparc64/machdep.c:244
> > #12 0x00000000c08c9330 in spinlock_exit ()
> >     at /usr/src/sys/sparc64/sparc64/machdep.c:240
> > #13 0x00000000c051a194 in cnputs (p=0x1fa2db11a "")
> >     at /usr/src/sys/kern/kern_cons.c:530
> > #14 0x00000000c05c06e0 in putchar (c=10, arg=0x1fa2db0c8)
> >     at /usr/src/sys/kern/subr_prf.c:437
> > #15 0x00000000c05bee90 in kvprintf (fmt=0xc0b2fb95 "",
> >     func=0xc05c02e0 <putchar>, arg=0x1fa2db0c8, radix=10, ap=0x1fa2db300)
> >     at /usr/src/sys/kern/subr_prf.c:655
> > #16 0x00000000c05bfe80 in _vprintf (level=5, flags=1,
> >     fmt=0xc0b2fb78 "%s: link state changed to %s\n", ap=0x1fa2db2f0)
> >     at /usr/src/sys/kern/subr_prf.c:281
> > #17 0x00000000c05c0270 in log (level=5,
> >     fmt=0xc0b2fb78 "%s: link state changed to %s\n")
> >     at /usr/src/sys/kern/subr_prf.c:308
> > #18 0x00000000c064ec28 in do_link_state_change (arg=0xfffff80003396800,
> >     pending=1) at /usr/src/sys/net/if.c:2131
> > #19 0x00000000c05cab38 in taskqueue_run_locked (queue=0xfffff80003288000)
> >     at /usr/src/sys/kern/subr_taskqueue.c:342
> > #20 0x00000000c05cacec in taskqueue_run (queue=0xfffff80003288000)
> >     at /usr/src/sys/kern/subr_taskqueue.c:358
> > #21 0x00000000c05cae20 in taskqueue_swi_run (dummy=0x0)
> >     at /usr/src/sys/kern/subr_taskqueue.c:471
> > #22 0x00000000c0539cc4 in intr_event_execute_handlers (p=0xfffff80003295860,
> >     ie=0xfffff80003287e00) at /usr/src/sys/kern/kern_intr.c:1264
> > #23 0x00000000c053b86c in ithread_loop (arg=0xfffff8000324c080)
> >     at /usr/src/sys/kern/kern_intr.c:1277
> > #24 0x00000000c0536428 in fork_exit (callout=0xc053b780 <ithread_loop>,
> >     arg=0xfffff8000324c080, frame=0x1fa2db880)
> >     at /usr/src/sys/kern/kern_fork.c:1018
> > #25 0x00000000c00a1270 in fork_trampoline ()
> > #26 0x00000000c00a1270 in fork_trampoline ()
> > Previous frame identical to this frame (corrupt stack?)
> > (kgdb)
> >
> > _______________________________________________
> > freebsd-stable at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
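
To check whether a new report shows the same panic as earlier ones (for example before filing or updating a PR), the function names in the KDB console backtrace can be compared. The following is only a minimal Python sketch, not a FreeBSD tool: `parse_kdb_backtrace` and `panic_signature` are hypothetical helper names, and the regular expression assumes the `#N 0xADDR at func+0xOFF` line format seen in Kurt's console output. kgdb frames (which use `in` rather than `at`) are deliberately not matched.

```python
import re

# Assumed line format, taken from the console output quoted in this thread:
#   "#3 0xc08d7b9c at tick_get_timecount_mp+0xdc"
FRAME_RE = re.compile(
    r"#(\d+)\s+0x[0-9a-fA-F]+\s+at\s+([A-Za-z_]\w*)\+(0x[0-9a-fA-F]+)"
)


def parse_kdb_backtrace(text):
    """Return (frame_number, function, offset) for each KDB frame in text.

    Quote prefixes such as "> > " are harmless because the pattern is
    searched anywhere in the text, not anchored to the start of a line.
    """
    return [(int(num), func, int(off, 16))
            for num, func, off in FRAME_RE.findall(text)]


def panic_signature(text):
    """Hypothetical helper: function names only, innermost frame first."""
    return tuple(func for _num, func, _off in parse_kdb_backtrace(text))


# The seven frames from the console output quoted in this thread.
SAMPLE = """\
KDB: stack backtrace:
#0 0xc0575380 at panic+0x20
#1 0xc0558e10 at _mtx_lock_spin_failed+0x50
#2 0xc0558ed8 at _mtx_lock_spin_cookie+0xb8
#3 0xc08d7b9c at tick_get_timecount_mp+0xdc
#4 0xc0583c88 at binuptime+0x48
#5 0xc08a3b8c at timercb+0x6c
#6 0xc08d7f00 at tick_intr+0x220
Uptime: 29s
"""

if __name__ == "__main__":
    print(" <- ".join(panic_signature(SAMPLE)))
```

Two reports whose signatures match are good candidates for the same PR. Addresses are ignored on purpose, since they change from one kernel build to the next.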
Chris Ross
2015-Jul-01 02:48 UTC
New FreeBSD snapshots available: stable/10 (20150625 r284813)
On Jun 30, 2015, at 22:36 , Glen Barber <gjb at FreeBSD.org> wrote:
> On Tue, Jun 30, 2015 at 10:27:21PM -0400, Chris Ross wrote:
>>
>> Yeah, this is the same panic you, I, and others have been seeing on sparc64's
>> with bge's, or at least v240's (and one other IIRC) for many many months. Thanks
>> for grabbing a core!
>>
>> When I was trying to search for a commit that caused the change of behavior,
>> I had difficulty doing it, but it was well back in 2014. The "boots sometimes"
>> makes this a hard one to track, but as I only have my production v240, also
>> makes it one I haven't spent as much time trying to find as I'd like.
>>
>> Thank you for letting me know this issue isn't fixed, though, despite the other
>> success with this code. :-)
>>
>> Hopefully your stacktrace can help figure out what is wrong.
>>
>
> A quick search through the PR system returned zero results for this.
> Did you file a PR previously? (If not, do you know of one that already
> exists that Kurt can update?)

The "long" thread I see in my email has the subject "FreeBSD 10-STABLE/sparc64
panic". It ran in May/June, and then later in September and October, and I
don't see that anyone created a PR. Reading back, I think I got confused and
dismayed in June, and then never got around to trying hard again.

The first report I see is from Kurt,
http://lists.freebsd.org/pipermail/freebsd-sparc64/2014-March/009261.html,
so well over a year ago. But there is no mention of a PR in that thread
either. I think you may be right, Glen, that there isn't one, and that's on
me as well as others.

Hopefully, some of the searching through various revisions of stable/10 that
I documented in the "FreeBSD 10-STABLE/sparc64 panic" thread in May 2014 may
help in the end, though.

Thanks. tl;dr: I don't know of an existing PR.

- Chris
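
The "boots sometimes" behavior is what makes a revision search hard: Kurt's machine panicked on the first three boots and came up cleanly on the fourth, so a single successful boot does not prove a revision is good. One possible approach is to bisect while booting each candidate several times. The sketch below assumes that the oldest revision in the list is known good and the newest known bad; `first_bad_revision`, `boot_panics`, and the default of five trials are hypothetical choices for illustration, not something taken from this thread or from any FreeBSD tooling.

```python
def first_bad_revision(revisions, boot_panics, trials=5):
    """Bisect for the first revision that panics, tolerating flaky boots.

    revisions   -- list ordered oldest to newest; revisions[0] is assumed
                   good and revisions[-1] is assumed bad.
    boot_panics -- callable(rev) -> bool, performs ONE boot attempt and
                   returns True if that boot panicked (supplied by caller).
    trials      -- boots per candidate; a revision is judged good only if
                   every one of them succeeds.
    """
    if len(revisions) < 2:
        raise ValueError("need at least one good and one bad revision")

    def is_bad(rev):
        # any() stops at the first panic, so bad revisions are cheap to
        # classify; good revisions always cost `trials` boots.
        return any(boot_panics(rev) for _ in range(trials))

    lo, hi = 0, len(revisions) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]
```

A "good" verdict here is only probabilistic. If a bad kernel boots cleanly with probability p, it is misjudged as good with probability p to the power of `trials`, so the trial count should be raised when panics are rare, and the final result is worth re-checking by hand.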