thr3ads.net - freebsd stable - Sporadic 9.0-RC2 boot-time panic [Nov 2011]

Mike Andrews

2011-Nov-28 22:37 UTC

Sporadic 9.0-RC2 boot-time panic

*Sometimes* when booting 9.0-RC2 on *some* of my machines, I'll get one of 
the following two panics during multiuser startup, usually while running 
the /usr/local/etc/rc.d scripts.  (The instruction pointer is always 
exactly one of these two, and they look fairly related.)  If after two or 
three reboots it manages to not panic, the system will run perfectly 
stable.

For some probably-unrelated reason, the dump never finishes in either case.

First panic (note em0 warning before it):
-----
em0: discard frame w/o packet header


Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff805e4fc5
stack pointer           = 0x28:0xffffff80003299e0
frame pointer           = 0x28:0xffffff8000329a00
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq256: em0:rx 0)
trap number             = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
trap_fatal() at trap_fatal+0x290
trap() at trap+0x10a
calltrap() at calltrap+0x8
--- trap 0x9, rip = 0xffffffff805e4fc5, rsp = 0xffffff80003299e0, rbp = 0xffffff8000329a00 ---
m_freem() at m_freem+0x25
ether_nh_input() at ether_nh_input+0x82
netisr_dispatch_src() at netisr_dispatch_src+0x20b
em_rxeof() at em_rxeof+0x1ca
em_msix_rx() at em_msix_rx+0x24
intr_event_execute_handlers() at intr_event_execute_handlers+0x104
ithread_loop() at ithread_loop+0xa4
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8000329d00, rbp = 0 ---
Uptime: 49s
Dumping 679 out of 12263 MB:

-----

Second panic (no em0 discard warning this time):

-----

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff8063c0e4
stack pointer           = 0x28:0xffffff8000329a00
frame pointer           = 0x28:0xffffff8000329a40
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq256: em0:rx 0)
trap number             = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
trap_fatal() at trap_fatal+0x290
trap() at trap+0x10a
calltrap() at calltrap+0x8
--- trap 0x9, rip = 0xffffffff8063c0e4, rsp = 0xffffff8000329a00, rbp = 0xffffff8000329a40 ---
ether_nh_input() at ether_nh_input+0x94
netisr_dispatch_src() at netisr_dispatch_src+0x20b
em_rxeof() at em_rxeof+0x1ca
em_msix_rx() at em_msix_rx+0x24
intr_event_execute_handlers() at intr_event_execute_handlers+0x104
ithread_loop() at ithread_loop+0xa4
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8000329d00, rbp = 0 ---
Uptime: 46s
Dumping 657 out of 12263 MB:..3%

Ronald Klop

2011-Nov-28 23:05 UTC

On Mon, 28 Nov 2011 23:37:27 +0100, Mike Andrews <mandrews@bit0.com>
wrote:
> *Sometimes* when booting 9.0-RC2 on *some* of my machines, I'll get one
> of the following two panics during multiuser startup, usually while  
> running the /usr/local/etc/rc.d scripts.  (The instruction pointer is  
> always exactly one of these two, and they look fairly related.)  If  
> after two or three reboots it manages to not panic, the system will run  
> perfectly stable.
> [...]

Does it help if you disable msix on your em0?
Google for 'sysctl em msix'. Or run 'sysctl -a | grep msix'.

NB: I know nothing about the details of em or msix, so hopefully somebody
with more clue responds also.

Ronald.

Jeremy Chadwick

2011-Nov-29 06:50 UTC

On Mon, Nov 28, 2011 at 05:37:27PM -0500, Mike Andrews wrote:
> *Sometimes* when booting 9.0-RC2 on *some* of my machines, I'll get
> one of the following two panics during multiuser startup, usually
> while running the /usr/local/etc/rc.d scripts.  (The instruction
> pointer is always exactly one of these two, and they look fairly
> related.)  If after two or three reboots it manages to not panic,
> the system will run perfectly stable.
> [...]

We need the following things:

* uname -a output
* dmesg output (only details specific to emX NICs please)
* pciconf -lvcb output (only details specific to emX NICs please)

CC'ing Jack Vogel (driver author) who can hopefully shed some light on
this.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |

John Baldwin

2011-Nov-29 15:50 UTC

On Monday, November 28, 2011 5:37:27 pm Mike Andrews wrote:
> *Sometimes* when booting 9.0-RC2 on *some* of my machines, I'll get one of
> the following two panics during multiuser startup, usually while running 
> the /usr/local/etc/rc.d scripts.  (The instruction pointer is always 
> exactly one of these two, and they look fairly related.)  If after two or 
> three reboots it manages to not panic, the system will run perfectly 
> stable.
> 
> For some probably-unrelated reason, the dump never finishes in either case.
> 
> First panic (note em0 warning before it):
> -----
> em0: discard frame w/o packet header

This is odd.  I see one bug that could possibly trigger this, but not on
x86:

Index: if_em.c
===================================================================
--- if_em.c	(revision 228074)
+++ if_em.c	(working copy)
@@ -4305,8 +4305,10 @@ em_rxeof(struct rx_ring *rxr, int count, int *done
 #ifndef __NO_STRICT_ALIGNMENT
 			if (adapter->max_frame_size >
 			    (MCLBYTES - ETHER_ALIGN) &&
-			    em_fixup_rx(rxr) != 0)
-				goto skip;
+			    em_fixup_rx(rxr) != 0) {
+				sendmp = NULL;
+				goto next_desc;
+			}
 #endif
 			if (status & E1000_RXD_STAT_VP) {
 				sendmp->m_pkthdr.ether_vtag =
@@ -4318,9 +4320,6 @@ em_rxeof(struct rx_ring *rxr, int count, int *done
 			sendmp->m_pkthdr.flowid = rxr->msix;
 			sendmp->m_flags |= M_FLOWID;
 #endif
-#ifndef __NO_STRICT_ALIGNMENT
-skip:
-#endif
 			rxr->fmp = rxr->lmp = NULL;
 		}
 next_desc:
@@ -4426,6 +4425,7 @@ em_fixup_rx(struct rx_ring *rxr)
 			adapter->dropped_pkts++;
 			m_freem(rxr->fmp);
 			rxr->fmp = NULL;
+			rxr->lmp = NULL;
 			error = ENOMEM;
 		}
 	}


-- 
John Baldwin

Mike Andrews

2011-Dec-01 23:04 UTC

On 11/28/11 5:48 PM, Ronald Klop wrote:
> On Mon, 28 Nov 2011 23:37:27 +0100, Mike Andrews <mandrews@bit0.com> wrote:
>
>> *Sometimes* when booting 9.0-RC2 on *some* of my machines, I'll get
>> one of the following two panics during multiuser startup, usually
>> while running the /usr/local/etc/rc.d scripts. (The instruction
>> pointer is always exactly one of these two, and they look fairly
>> related.) If after two or three reboots it manages to not panic, the
>> system will run perfectly stable.
>> [...]
>
> Does it help if you disable msix on your em0?
> Google for 'sysctl em msix'. Or run 'sysctl -a | grep msix'.

OK, setting hw.em.enable_msix=0 in /boot/loader.conf does NOT help.
