Chris Brentano
2009-Oct-20 22:25 UTC
[asterisk-users] Kernel panic w/ DAHDI 2.x/Digium TE220B
I've seen this consistently on three systems, with three different cards, and multiple versions of DAHDI. At first I thought the issue only occurred on newer, Nehalem-based systems, but I reproduced it on a Core 2 Duo box as well. I've tested with dahdi-linux 2.2.0.2 and dahdi-linux-complete 2.0.0+2.0.0, 2.1.0.2+2.1.0.2, and 2.2.0.2+2.2.0. The card is a Digium TE220B, which uses the wct4xxp module. This does not happen, on the same systems and kernel version, with a TE121 using the wcte12xp module, nor does it happen with a T100P using wct1xxp. The OS is CentOS 5.3, and it happens with kernel versions 2.6.18-164.el5 and 2.6.18-128.el5. I'm posting this wondering if anyone else has seen similar behavior.

/etc/dahdi/system.conf:

span=1,1,0,esf,b8xs
bchan=1-23
dchan=24
loadzone=us
defaultzone=us

/etc/dahdi/modules:

wct4xxp
wcte12xp
wct1xxp

---

When I start dahdi, I see the following:

# /etc/init.d/dahdi start
Loading DAHDI hardware modules:
wct4xxp: [ OK ]
wcte12xp: [ OK ]
wct1xxp: [ OK ]

Running dahdi_cfg: VPM400: Not Present
VPM450: Not Present
[ OK ]

Syslog output:

Oct 20 15:20:54 redbox-ast16 kernel: dahdi: Telephony Interface Registered on major 196
Oct 20 15:20:54 redbox-ast16 kernel: dahdi: Version: 2.2.0.2
Oct 20 15:20:54 redbox-ast16 kernel: ACPI: PCI Interrupt 0000:03:08.0[A] -> GSI 16 (level, low) -> IRQ 169
Oct 20 15:20:54 redbox-ast16 kernel: Found TE2XXP at base address dfbfff80, remapped to ffffc20000022f80
Oct 20 15:20:54 redbox-ast16 kernel: TE2XXP version c01a016c, burst ON
Oct 20 15:20:54 redbox-ast16 kernel: Octasic optimized!
Oct 20 15:20:54 redbox-ast16 kernel: FALC version: 00000005, Board ID: 00
Oct 20 15:20:54 redbox-ast16 kernel: Reg 0: 0x056af400
Oct 20 15:20:54 redbox-ast16 kernel: Reg 1: 0x056af000
Oct 20 15:20:54 redbox-ast16 kernel: Reg 2: 0x00000000
Oct 20 15:20:54 redbox-ast16 kernel: Reg 3: 0x00000000
Oct 20 15:20:54 redbox-ast16 kernel: Reg 4: 0x0000ff01
Oct 20 15:20:54 redbox-ast16 kernel: Reg 5: 0x00000000
Oct 20 15:20:54 redbox-ast16 kernel: Reg 6: 0xc01a016c
Oct 20 15:20:54 redbox-ast16 kernel: Reg 7: 0x00001000
Oct 20 15:20:54 redbox-ast16 kernel: Reg 8: 0x00000000
Oct 20 15:20:54 redbox-ast16 kernel: Reg 9: 0x00ff00ff
Oct 20 15:20:54 redbox-ast16 kernel: Reg 10: 0x0000004a
Oct 20 15:20:54 redbox-ast16 kernel: Found a Wildcard: Wildcard TE220 (4th Gen)
Oct 20 15:20:54 redbox-ast16 kernel: TE2XXP: Launching card: 0
Oct 20 15:20:54 redbox-ast16 kernel: TE2XXP: Setting up global serial parameters
Oct 20 15:20:55 redbox-ast16 kernel: About to enter spanconfig!
Oct 20 15:20:55 redbox-ast16 kernel: Done with spanconfig!
Oct 20 15:20:55 redbox-ast16 kernel: dahdi: Registered tone zone 0 (United States / North America)
Oct 20 15:20:55 redbox-ast16 kernel: About to enter startup!
Oct 20 15:20:55 redbox-ast16 kernel: TE2XXP: Span 1 configured for ESF/B8ZS
Oct 20 15:20:55 redbox-ast16 kernel: wct2xxp: Setting yellow alarm on span 1
Oct 20 15:20:55 redbox-ast16 kernel: timing source auto card 0!
Oct 20 15:20:55 redbox-ast16 kernel: SPAN 1: Primary Sync Source
Oct 20 15:20:55 redbox-ast16 kernel: VPM400: Not Present
Oct 20 15:20:55 redbox-ast16 kernel: VPM450: Not Present
Oct 20 15:20:55 redbox-ast16 kernel: Completed startup!

---

Now if I either start asterisk or stop dahdi, it will panic:

# /etc/init.d/dahdi stop
Unloading DAHDI hardware modules: TE4XXP: Version Syncronization Error!
TE4XXP: Version Syncronization Error!
TE4XXP: Version Syncronization Error!
TE4XXP: Version Syncronization Error!

HARDWARE ERROR
CPU 1: Machine Check Exception: 4 Bank 8: 00000000000000
TSC 0
This is not a software problem!
Run through mcelog --ascii to decode and contact your hardware vendor
Kernel panic - not syncing: Uncorrected machine check

Syslog output (not much before restart):

Oct 20 07:11:54 localhost kernel: TE4XXP: Version Synchronization Error!
Oct 20 07:14:24 localhost syslogd 1.4.1: restart.
...

---

I only see the machine check exception on the two Nehalem boxes (HP ProLiant ML350 G6, Z800 Workstation); on a Core 2 Duo (Dell Optiplex 745) it just hard freezes after the "Version Syncronization Error!" messages. If there are any further details I can provide, I'm happy to do so. I'd like to figure out what's happening here, if anyone can help shed any light, as this is completely holding up our migration to Asterisk 1.6 and DAHDI. Thanks.

- Chris
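
A side note on the config above: the span line as posted ends in "b8xs", while the kernel log reports "Span 1 configured for ESF/B8ZS", so the "x" is probably a typo in the email rather than in the real file. A minimal Python sketch of a sanity check for span= lines follows. It is not part of DAHDI; the function name is hypothetical, and the sets of accepted framing and coding values are an assumption based on the commonly documented T1/E1 options.

```python
# Hypothetical helper, not part of DAHDI: sanity-check a span= line
# from /etc/dahdi/system.conf before running dahdi_cfg.

# Assumed value sets for T1/E1 spans (not taken from the thread).
KNOWN_FRAMING = {"d4", "esf", "cas", "ccs"}
KNOWN_CODING = {"ami", "b8zs", "hdb3"}


def check_span_line(line):
    """Return a list of problems found in a 'span=...' line (empty if none)."""
    key, _, value = line.strip().partition("=")
    if key.strip() != "span":
        return ["not a span= line"]
    fields = [f.strip().lower() for f in value.split(",")]
    if len(fields) < 5:
        return ["expected at least 5 fields: span,timing,lbo,framing,coding"]
    problems = []
    framing, coding = fields[3], fields[4]
    if framing not in KNOWN_FRAMING:
        problems.append("unknown framing: " + framing)
    if coding not in KNOWN_CODING:
        problems.append("unknown coding: " + coding)
    return problems


if __name__ == "__main__":
    # The line exactly as it appears in the post.
    print(check_span_line("span=1,1,0,esf,b8xs"))
    # The same line with the coding the kernel log reports.
    print(check_span_line("span=1,1,0,esf,b8zs"))
```

This only checks spelling of the framing and coding fields; it says nothing about whether the values match what the carrier actually provisions on the circuit.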
Chris Brentano
2009-Oct-22 17:01 UTC
[asterisk-users] (SOLVED) Kernel panic w/ DAHDI 2.x/Digium TE220B
FYI, in case anyone else encounters this issue: the card I could reproduce this with was hardware revision B4. I RMAed the card with Digium support and got a newer, revision C card, and the issue is gone.
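
For anyone who wants to check whether other hosts show the same symptom before contacting support, here is a minimal Python sketch that picks the "Version Synchronization Error!" lines out of syslog text. The function name is hypothetical, and the pattern is an assumption built only from the two spellings quoted in this thread (with and without the "h"); other driver versions may word the message differently.

```python
import re

# Matches both spellings seen in the thread ("Syncronization" on the
# console, "Synchronization" in syslog). The pattern is an assumption
# derived from those two samples only.
SYNC_ERROR = re.compile(r"TE\dXXP: Version Synch?ronization Error!")


def find_sync_errors(lines):
    """Return the log lines that contain the version sync error message."""
    return [line.rstrip("\n") for line in lines if SYNC_ERROR.search(line)]


if __name__ == "__main__":
    # Sample lines copied from the syslog excerpt in the original post.
    sample = [
        "Oct 20 07:11:54 localhost kernel: TE4XXP: Version Synchronization Error!\n",
        "Oct 20 07:14:24 localhost syslogd 1.4.1: restart.\n",
    ]
    for hit in find_sync_errors(sample):
        print(hit)
```

To scan a real log, pass an open file object instead of the sample list, for example open("/var/log/messages") on a CentOS 5 box. A match only indicates the same symptom; per the post above, the actual fix was replacing the revision B4 card.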
John Knight
2009-Oct-22 20:33 UTC
[asterisk-users] (SOLVED) Kernel panic w/ DAHDI 2.x/Digium TE220B