*Sometimes* when booting 9.0-RC2 on *some* of my machines, I'll get one of
the following two panics during multiuser startup, usually while running
the /usr/local/etc/rc.d scripts. (The instruction pointer is always
exactly one of these two, and they look fairly related.) If it manages to
boot without panicking, which can take two or three reboots, the system
then runs perfectly stably.
For some probably-unrelated reason, the dump never finishes in either case.
First panic (note em0 warning before it):
-----
em0: discard frame w/o packet header
Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff805e4fc5
stack pointer = 0x28:0xffffff80003299e0
frame pointer = 0x28:0xffffff8000329a00
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (irq256: em0:rx 0)
trap number = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
trap_fatal() at trap_fatal+0x290
trap() at trap+0x10a
calltrap() at calltrap+0x8
--- trap 0x9, rip = 0xffffffff805e4fc5, rsp = 0xffffff80003299e0, rbp =
0xffffff8000329a00 ---
m_freem() at m_freem+0x25
ether_nh_input() at ether_nh_input+0x82
netisr_dispatch_src() at netisr_dispatch_src+0x20b
em_rxeof() at em_rxeof+0x1ca
em_msix_rx() at em_msix_rx+0x24
intr_event_execute_handlers() at intr_event_execute_handlers+0x104
ithread_loop() at ithread_loop+0xa4
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8000329d00, rbp = 0 ---
Uptime: 49s
Dumping 679 out of 12263 MB:
-----
Second panic (no em0 discard warning this time):
-----
Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff8063c0e4
stack pointer = 0x28:0xffffff8000329a00
frame pointer = 0x28:0xffffff8000329a40
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (irq256: em0:rx 0)
trap number = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
trap_fatal() at trap_fatal+0x290
trap() at trap+0x10a
calltrap() at calltrap+0x8
--- trap 0x9, rip = 0xffffffff8063c0e4, rsp = 0xffffff8000329a00, rbp =
0xffffff8000329a40 ---
ether_nh_input() at ether_nh_input+0x94
netisr_dispatch_src() at netisr_dispatch_src+0x20b
em_rxeof() at em_rxeof+0x1ca
em_msix_rx() at em_msix_rx+0x24
intr_event_execute_handlers() at intr_event_execute_handlers+0x104
ithread_loop() at ithread_loop+0xa4
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8000329d00, rbp = 0 ---
Uptime: 46s
Dumping 657 out of 12263 MB:..3%
On Mon, 28 Nov 2011 23:37:27 +0100, Mike Andrews <mandrews@bit0.com> wrote:
> *Sometimes* when booting 9.0-RC2 on *some* of my machines, I'll get one
> of the following two panics during multiuser startup, usually while
> running the /usr/local/etc/rc.d scripts.
> [...]

Does it help if you disable msix on your em0?
Google for 'sysctl em msix'. Or run 'sysctl -a | grep msix'.

NB: I know nothing about the details of em or msix, so hopefully somebody
with more clue responds also.

Ronald.
On Mon, Nov 28, 2011 at 05:37:27PM -0500, Mike Andrews wrote:
> *Sometimes* when booting 9.0-RC2 on *some* of my machines, I'll get
> one of the following two panics during multiuser startup, usually
> while running the /usr/local/etc/rc.d scripts.
> [...]

We need the following things:

* uname -a output
* dmesg output (only details specific to emX NICs please)
* pciconf -lvcb output (only details specific to emX NICs please)

CC'ing Jack Vogel (driver author) who can hopefully shed some light on
this.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
On Monday, November 28, 2011 5:37:27 pm Mike Andrews wrote:
> [...]
> First panic (note em0 warning before it):
> -----
> em0: discard frame w/o packet header

This is odd.  I see one bug that could possibly trigger this, but not on x86:

Index: if_em.c
===================================================================
--- if_em.c	(revision 228074)
+++ if_em.c	(working copy)
@@ -4305,8 +4305,10 @@ em_rxeof(struct rx_ring *rxr, int count, int *done
 #ifndef __NO_STRICT_ALIGNMENT
 			if (adapter->max_frame_size >
 			    (MCLBYTES - ETHER_ALIGN) &&
-			    em_fixup_rx(rxr) != 0)
-				goto skip;
+			    em_fixup_rx(rxr) != 0) {
+				sendmp = NULL;
+				goto next_desc;
+			}
 #endif
 			if (status & E1000_RXD_STAT_VP) {
 				sendmp->m_pkthdr.ether_vtag
@@ -4318,9 +4320,6 @@ em_rxeof(struct rx_ring *rxr, int count, int *done
 			sendmp->m_pkthdr.flowid = rxr->msix;
 			sendmp->m_flags |= M_FLOWID;
 #endif
-#ifndef __NO_STRICT_ALIGNMENT
-skip:
-#endif
 			rxr->fmp = rxr->lmp = NULL;
 		}
 next_desc:
@@ -4426,6 +4425,7 @@ em_fixup_rx(struct rx_ring *rxr)
 			adapter->dropped_pkts++;
 			m_freem(rxr->fmp);
 			rxr->fmp = NULL;
+			rxr->lmp = NULL;
 			error = ENOMEM;
 		}
 	}

-- 
John Baldwin
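
For readers following the patch above, here is a minimal userland sketch of
the pattern it appears to guard against. It is not the driver code: the
toy_* names and types are hypothetical stand-ins for struct mbuf and
struct rx_ring. The assumption (not stated in the thread) is that once a
failed fixup has freed the partial frame, every pointer that still refers
to it, whether the ring's fmp/lmp or a local such as sendmp, has to be
cleared before anything else touches it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel types; NOT the real struct mbuf / rx_ring. */
struct toy_mbuf {
	struct toy_mbuf *m_next;
};

struct toy_ring {
	struct toy_mbuf *fmp;	/* first mbuf of the frame being assembled */
	struct toy_mbuf *lmp;	/* last mbuf of that frame */
};

/* Free a whole chain, in the spirit of m_freem(). */
static void
toy_freem(struct toy_mbuf *m)
{
	while (m != NULL) {
		struct toy_mbuf *next = m->m_next;

		free(m);
		m = next;
	}
}

/* Append a fresh mbuf to the frame under assembly; returns 0 or ENOMEM. */
static int
toy_ring_append(struct toy_ring *r)
{
	struct toy_mbuf *m = calloc(1, sizeof(*m));

	if (m == NULL)
		return (ENOMEM);
	if (r->fmp == NULL) {
		r->fmp = m;
	} else {
		/* Invariant: a non-empty frame always has a valid tail. */
		assert(r->lmp != NULL);
		r->lmp->m_next = m;
	}
	r->lmp = m;
	return (0);
}

/*
 * Simulated fixup step. On failure it frees the partial frame and clears
 * BOTH ring pointers, so neither is left dangling into freed memory.
 */
static int
toy_fixup(struct toy_ring *r, int fail)
{
	if (!fail)
		return (0);
	toy_freem(r->fmp);
	r->fmp = NULL;
	r->lmp = NULL;
	return (ENOMEM);
}

/*
 * End-of-packet handling. Returns the chain to deliver, or NULL when the
 * fixup failed and the frame was dropped.
 */
static struct toy_mbuf *
toy_rx_eop(struct toy_ring *r, int fixup_fails)
{
	struct toy_mbuf *sendmp = r->fmp;

	if (toy_fixup(r, fixup_fails) != 0)
		return (NULL);	/* chain is gone: never hand back stale sendmp */

	r->fmp = r->lmp = NULL;	/* ownership passes to the caller */
	return (sendmp);
}
```

With fixup_fails set, toy_rx_eop() returns NULL and leaves both ring
pointers cleared, so nothing downstream can reach the freed chain. In the
success case the caller owns the returned chain and releases it with
toy_freem().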
On 11/28/11 5:48 PM, Ronald Klop wrote:
> On Mon, 28 Nov 2011 23:37:27 +0100, Mike Andrews <mandrews@bit0.com> wrote:
>> [...]
> Does it help if you disable msix on your em0?
> Google for 'sysctl em msix'. Or run 'sysctl -a | grep msix'.

OK, setting hw.em.enable_msix=0 in /boot/loader.conf does NOT help.