FreeBSD 8.1-STABLE #2: Thu Oct 21 15:30:45 UTC 2010
    root@rip.psg.com:/usr/obj/usr/src/sys/RIP amd64

console recording

em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
em0: discard frame w/o packet header
panic: sbflush_internal: cc 4294965301 || mb 0 || mbcnt 0
cpuid = 0
panic: bufwrite: buffer is not busy???


cpuid = 0
Fatal trap 12: page fault while in kernel mode
Uptime: cpuid = 2; 48mapic id = 02
36s
fault virtual address = 0xffff804000000000
Physical memory: 4086 MB
fault code = supervisor read data, page not present
Dumping 1647 MB:instruction pointer = 0x20:0xffffffff804c22ae
 (CTRL-C to abort) stack pointer = 0x28:0xffffff80000de9a0
frame pointer = 0x28:0xffffff80000de9b0
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 taskq)
trap number = 12
 1632 1616 1600 1584 1568 1552 1536 1520 1504 1488 1472 1456 1440 1424 1408 1392 1376 1360 1344 1328 1312 1296 1280 1264 1248 1232 1216 1200 1184 1168 1152 1136 1120 1104 1088 1072 1056 1040 1024 1008 992 976 960 944 928 912 896 880 864 848 832 816 800 784 768 752 736 720 704 688 672 656 640 624 608 592 576 560 544 528 512 496 480 464 448 432 416 400 384 368 352 336 320 304 288 272 256 240 224 208 192 176 160 144 128 112 96 80 64 48 32 16Attempt to write outside dump device boundaries.

** DUMP FAILED (ERROR 6) **
Automatic reboot in 15 seconds - press a key on the console to abort
em0: Watchdog timeout -- resetting

and locked up.  required power cycle to reboot

randy
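
A side note on the first panic string: the value printed for cc, 4294965301, sits just below 2^32. Assuming the counter is an unsigned 32-bit quantity (an assumption here, the thread does not say), that would be consistent with a byte count that was decremented below zero and wrapped around. A minimal Python sketch of the arithmetic, with as_signed32 being a hypothetical helper written only for this illustration:

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    # If the top bit is set, the two's-complement reading is negative.
    return value - (1 << 32) if value & 0x80000000 else value


# Value copied from the "sbflush_internal" panic message above.
cc_from_panic = 4294965301
print(as_signed32(cc_from_panic))  # -1995
```

Read that way, the count would be 1995 bytes below zero. This only explains the odd-looking number; it says nothing about which code path did the extra subtraction.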
On Thu, Oct 21, 2010 at 12:08:23PM -0700, Randy Bush wrote:
> FreeBSD 8.1-STABLE #2: Thu Oct 21 15:30:45 UTC 2010
>     root@rip.psg.com:/usr/obj/usr/src/sys/RIP amd64
>
> console recording
>
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> panic: sbflush_internal: cc 4294965301 || mb 0 || mbcnt 0
> cpuid = 0
> panic: bufwrite: buffer is not busy???
>
>
> cpuid = 0
> Fatal trap 12: page fault while in kernel mode
> Uptime: cpuid = 2; 48mapic id = 02
> 36s
> fault virtual address = 0xffff804000000000
> Physical memory: 4086 MB
> fault code = supervisor read data, page not present
> Dumping 1647 MB:instruction pointer = 0x20:0xffffffff804c22ae
>  (CTRL-C to abort) stack pointer = 0x28:0xffffff80000de9a0
> frame pointer = 0x28:0xffffff80000de9b0
> code segment = base 0x0, limit 0xfffff, type 0x1b
>  = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 0 (em0 taskq)
> trap number = 12
>  1632 1616 1600 1584 1568 1552 1536 1520 1504 1488 1472 1456 1440 1424 1408 1392 1376 1360 1344 1328 1312 1296 1280 1264 1248 1232 1216 1200 1184 1168 1152 1136 1120 1104 1088 1072 1056 1040 1024 1008 992 976 960 944 928 912 896 880 864 848 832 816 800 784 768 752 736 720 704 688 672 656 640 624 608 592 576 560 544 528 512 496 480 464 448 432 416 400 384 368 352 336 320 304 288 272 256 240 224 208 192 176 160 144 128 112 96 80 64 48 32 16Attempt to write outside dump device boundaries.
>
> ** DUMP FAILED (ERROR 6) **
> Automatic reboot in 15 seconds - press a key on the console to abort
> em0: Watchdog timeout -- resetting
>
> and locked up.  required power cycle to reboot

CC'ing Jack Vogel of Intel, who is currently re-working portions of the
em(4) driver.  I think the taskq issue might be the thing he's fixing,
so he might have a workaround for you.

But we're going to need to know exactly what em(4) model you have.
Please provide "dmesg" output relevant to em0, and also "pciconf -lvc"
output for the em0@<xxx> device.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
On 10/21/10 21:08, Randy Bush wrote:
> FreeBSD 8.1-STABLE #2: Thu Oct 21 15:30:45 UTC 2010
>     root@rip.psg.com:/usr/obj/usr/src/sys/RIP amd64
>
> console recording
>
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> em0: discard frame w/o packet header
> panic: sbflush_internal: cc 4294965301 || mb 0 || mbcnt 0
> cpuid = 0
> panic: bufwrite: buffer is not busy???

What does the machine do? Does it perhaps have 6to4 (stf) enabled?
At 03:08 PM 10/21/2010, Randy Bush wrote:
>FreeBSD 8.1-STABLE #2: Thu Oct 21 15:30:45 UTC 2010
>     root@rip.psg.com:/usr/obj/usr/src/sys/RIP amd64
>
>console recording

>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header

Hi Randy,
         Do you know how this panic is triggered? Are you able to
create it on demand?

         ---Mike

>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>em0: discard frame w/o packet header
>panic: sbflush_internal: cc 4294965301 || mb 0 || mbcnt 0
>cpuid = 0
>panic: bufwrite: buffer is not busy???
>
>
>cpuid = 0
>Fatal trap 12: page fault while in kernel mode
>Uptime: cpuid = 2; 48mapic id = 02
>36s
>fault virtual address = 0xffff804000000000
>Physical memory: 4086 MB
>fault code = supervisor read data, page not present
>Dumping 1647 MB:instruction pointer = 0x20:0xffffffff804c22ae
> (CTRL-C to abort) stack pointer = 0x28:0xffffff80000de9a0
>frame pointer = 0x28:0xffffff80000de9b0
>code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
>processor eflags = interrupt enabled, resume, IOPL = 0
>current process = 0 (em0 taskq)
>trap number = 12
> 1632 1616 1600 1584 1568 1552 1536 1520 1504 1488 1472 1456 1440
> 1424 1408 1392 1376 1360 1344 1328 1312 1296 1280 1264 1248 1232
> 1216 1200 1184 1168 1152 1136 1120 1104 1088 1072 1056 1040 1024
> 1008 992 976 960 944 928 912 896 880 864 848 832 816 800 784 768
> 752 736 720 704 688 672 656 640 624 608 592 576 560 544 528 512 496
> 480 464 448 432 416 400 384 368 352 336 320 304 288 272 256 240 224
> 208 192 176 160 144 128 112 96 80 64 48 32 16Attempt to write
> outside dump device boundaries.
>
>** DUMP FAILED (ERROR 6) **
>Automatic reboot in 15 seconds - press a key on the console to abort
>em0: Watchdog timeout -- resetting
>
>and locked up.  required power cycle to reboot
>
>randy
>_______________________________________________
>freebsd-stable@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike
At 09:11 PM 10/22/2010, Mike Tancsa wrote:
>At 08:01 PM 10/22/2010, Chris Morrow wrote:
>>Note, Warren and I attempted to test this this evening on a 10.04 Ubuntu
>>box, no crashy-crashy...
>

I was able to trigger the issue on box (c).  I was ping6ing box (a)
when I did a hard down of (d)'s connected interface.  The box then
dropped to debugger

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff80740a50
stack pointer = 0x28:0xffffff800005a890
frame pointer = 0x28:0xffffff800005a930
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock)
[thread pid 12 tid 100007 ]
Stopped at in6_cksum+0x410: movzwl (%rsi),%r10d
db> bt
Tracing pid 12 tid 100007 td 0xffffff00025083e0
in6_cksum() at in6_cksum+0x410
icmp6_reflect() at icmp6_reflect+0x312
icmp6_error() at icmp6_error+0x1ec
nd6_llinfo_timer() at nd6_llinfo_timer+0x208
softclock() at softclock+0x2a6
intr_event_execute_handlers() at intr_event_execute_handlers+0x66
ithread_loop() at ithread_loop+0xb2
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff800005ad30, rbp = 0 ---
db>

>I was able to do it, but not the box I expected
>
>4 boxes
>
>(a) Attacking host 2001:db8:1:1/64
>(b) victim, not on a connected interface with a).  Outside interface
>- em0 - 2001:db8::2:1/64, inside interface - em1 - 2001:db8::3:1/64
>(c) a host behind (b) 2001:db8::3:c/64
>(d) a host behind (b), 2001:db8::3:d/64
>
>
>hosts (c) and (d) have default gateways to b).  (c) however, has a
>next hop for (a) via (d).  So rather than go out its normal default
>gateway, it takes an extra hop via (d).
>
>Start a ping6 from (a) to (c).  Then down (d)'s interface so that
>the ping6 fails.  Let the ping keep running for an hour or
>two.  Eventually (b) gets error messages like
>
>Oct 22 18:38:32 zoo kernel: em1: discard frame w/o packet header
>
>and crashes.
>
>Unfortunately, I thought it would be (c) that crapped out, not (b)
>and I didnt have crash dumps enabled on the host.  Just in the
>process of setting up a better environment.
>
>         ---Mike
>
>>-chris
>>
>>On 10/22/10 16:27, Joel Jaeggli wrote:
>> > Ok I'll try testing that on some box I can reach with both hands.
>> >
>> > fyi nagasaki is:
>> >
>> > [root@nagasaki ~]# uname -a
>> > FreeBSD nagasaki.bogus.com 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #13:
>> > Sun May 30 22:19:23 UTC 2010
>> > root@nagasaki.bogus.com:/usr/obj/usr/src/sys/GENERIC  i386
>> > [root@nagasaki ~]#
>> >
>> >
>> > On 10/22/10 1:17 PM, Randy Bush wrote:
>> >>>>>>> Do you know how this panic is triggered ? Are you able to
>> >>>>>>> create it on demand ?
>> >>>>>>
>> >>>>>> no i do not.  bring server up and it'll happen in half an hour.
>> >>>>>> and the server was happy for two months.  so i am thinking hardware.
>> >>>>>
>> >>>>> Perhaps. The reason I ask is that I had a box go down last night with
>> >>>>> the same set of errors. The box has a number of ipv6 routes, but its
>> >>>>> next hop was down and the problems started soon after.  So I wonder if
>> >>>>> it has something to do with that.  Do you have ipv6 on this box and
>> >>>>> are all the next hop addresses correct / reachable ?
>> >>>>>
>> >>>>> Oct 22 02:06:02 i4 kernel: em1: discard frame w/o packet header
>> >>>>> Oct 22 02:06:10 i4 kernel: em2: discard frame w/o packet header
>> >>>>> Oct 22 02:06:21 i4 kernel: em1: discard frame w/o packet header
>> >>>>
>> >>>> it was co-incident with a border router being taken down for new router
>> >>>> install.  that router was the v6 exit the servers was using.  i have now
>> >>>> pointed default6 to a different exit.  the server seems happy.
>> >>>
>> >>>
>> >>> Are you servers still up ?  I guess the question now is how to
>> >>> trigger this problem on demand.  Perhaps lots of inbound ipv6 traffic
>> >>> with a bad next hop out ?  How recent are you sources ? The kernel
>> >>> said Oct 21st.  Were the sources from then too ?
>> >>
>> >> yes, kernel and world from 21 oct
>> >>
>> >> chris had an idea on retrigger, install a static for a small dest that
>> >> points to a hole.  send a packet to the small dest.
>> >>
>> >> randy
>> >>
>> >
>
>--------------------------------------------------------------------
>Mike Tancsa,                                      tel +1 519 651 3400
>Sentex Communications,                            mike@sentex.net
>Providing Internet since 1994                    www.sentex.net
>Cambridge, Ontario Canada                         www.sentex.net/mike

--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike
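
For readers trying to picture the test setup quoted above: host (c) has a default route via (b), plus a more specific route for (a) that points at (d). With ordinary longest-prefix matching, replies to (a) then leave via (d), so taking (d) down leaves (c) with a dead next hop for that destination. The sketch below models just that lookup in Python. The prefixes and the ROUTES_C table are hypothetical, chosen only for illustration (they are not the addresses from the thread), and this is not how the kernel implements routing.

```python
import ipaddress

# Hypothetical routing table for host (c): (prefix, next hop label).
ROUTES_C = [
    (ipaddress.ip_network("::/0"), "b"),             # default gateway is (b)
    (ipaddress.ip_network("2001:db8:a::/64"), "d"),  # specific route for (a) via (d)
]


def next_hop(dest, routes):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, hop) for net, hop in routes if addr in net]
    # The longest prefix wins, as in a normal routing lookup.
    return max(matches)[1] if matches else None


print(next_hop("2001:db8:a::1", ROUTES_C))     # traffic to (a) goes via d
print(next_hop("2001:db8:ffff::1", ROUTES_C))  # everything else goes via b
```

The point of the model is only that the more specific route hides the working default route, which matches the "bad next hop" condition discussed earlier in the thread.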