Olivier
2011-Dec-09 15:52 UTC
[asterisk-users] Issue with dahdi 2.5.0 and Digium HA8-B400M
Hi,

I'm trying to debug an Asterisk 1.6.1.18 powered system equipped with an HA-8 and 2 B400M modules (8 BRI ports).

Config is:
libpri-1.4.12
dahdi-2.5.0

Only 4 BRI ports were connected, in PtP mode, to 4 telco lines.

This system operated correctly until it suddenly started to spit out 245 lines like these (from 18:55:20 to 00:15:31):

[Dec 5 18:55:20] NOTICE[2608] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 4
[Dec 5 18:55:31] NOTICE[2605] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Dec 5 19:15:31] NOTICE[2605] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Dec 5 19:35:31] NOTICE[2605] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 1

then it printed more than 40k lines like these:

[Dec 5 23:55:31] NOTICE[2605] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Dec 6 00:06:52] NOTICE[2606] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 2
[Dec 6 00:13:24] NOTICE[2607] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 3
[Dec 6 00:15:31] NOTICE[2605] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 1
[Dec 6 00:26:43] NOTICE[2606] chan_dahdi.c: PRI got event: HDLC Overrun (7) on Primary D-channel of span 2
[Dec 6 00:26:43] NOTICE[2606] chan_dahdi.c: PRI got event: HDLC Overrun (7) on Primary D-channel of span 2
[Dec 6 00:26:44] NOTICE[2606] chan_dahdi.c: PRI got event: HDLC Overrun (7) on Primary D-channel of span 2
[Dec 6 00:26:44] NOTICE[2606] chan_dahdi.c: PRI got event: HDLC Overrun (7) on Primary D-channel of span 2

At the same time, /var/log/kern.log showed 5 occurrences of these lines:

Dec 6 18:49:57 foo kernel: [4264680.548042] INFO: task b400m-0:13594 blocked for more than 120 seconds.
Dec 6 18:49:57 foo kernel: [4264680.548052] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 6 18:49:57 foo kernel: [4264680.548061] b400m-0 D eff4c1e9 0 13594 2 0x00000000
Dec 6 18:49:57 foo kernel: [4264680.548066] f62a8cc0 00000046 f6c4a200 eff4c1e9 000003ff c1419100 c1419100 c14146ac
Dec 6 18:49:57 foo kernel: [4264680.548070] f62a8e7c c2488100 00000001 6b13f298 000f2687 00000002 000018d4 00000000
Dec 6 18:49:57 foo kernel: [4264680.548074] c24836ac f62a8e7c 3f8a9c68 00000400 00000078 00000000 00000000 00000000
Dec 6 18:49:57 foo kernel: [4264680.548078] Call Trace:
Dec 6 18:49:57 foo kernel: [4264680.548084] [<c126d7b1>] ? schedule_timeout+0x20/0xb0
Dec 6 18:49:57 foo kernel: [4264680.548089] [<c10139f0>] ? lapic_next_event+0x13/0x16
Dec 6 18:49:57 foo kernel: [4264680.548092] [<c126d6ba>] ? wait_for_common+0xa4/0x100
Dec 6 18:49:57 foo kernel: [4264680.548095] [<c102d50a>] ? default_wake_function+0x0/0x8
Dec 6 18:49:57 foo kernel: [4264680.548100] [<f8cf8d0c>] ? wctdm_getreg+0x11c/0x13b [wctdm24xxp]
Dec 6 18:49:57 foo kernel: [4264680.548104] [<f8cfc85e>] ? b400m_getreg+0x65/0x75 [wctdm24xxp]
Dec 6 18:49:57 foo kernel: [4264680.548107] [<f8cfdd6c>] ? xhfc_work+0x33/0x5f8 [wctdm24xxp]
Dec 6 18:49:57 foo kernel: [4264680.548111] [<c10412f7>] ? worker_thread+0x141/0x1bd
Dec 6 18:49:57 foo kernel: [4264680.548114] [<f8cfdd39>] ? xhfc_work+0x0/0x5f8 [wctdm24xxp]
Dec 6 18:49:57 foo kernel: [4264680.548117] [<c104403a>] ? autoremove_wake_function+0x0/0x2d
Dec 6 18:49:57 foo kernel: [4264680.548119] [<c10411b6>] ? worker_thread+0x0/0x1bd
Dec 6 18:49:57 foo kernel: [4264680.548122] [<c1043e08>] ? kthread+0x61/0x66
Dec 6 18:49:57 foo kernel: [4264680.548124] [<c1043da7>] ? kthread+0x0/0x66
Dec 6 18:49:57 foo kernel: [4264680.548127] [<c1003d47>] ? kernel_thread_helper+0x7/0x10

Summing all this:
at 18:49:57 INFO: task b400m-0:13594 blocked for more than 120 seconds.
at 18:55:20 NOTICE[2608] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 4
at 00:26:43 NOTICE[2606] chan_dahdi.c: PRI got event: HDLC Overrun (7) on Primary D-channel of span 2

What is the story that these logs are telling?

All I can see is that my system started to malfunction (no calls in or out) and never recovered, despite rebooting it.

For the record, I replaced this system with another one together with a Patton 4638 box, and it is now working OK, so I wouldn't bet too much on something really new coming from the telco, though I would be very happy to be proven wrong.

In which direction should I look next?

Regards
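With more than 40k such lines, a per-span tally may make the timeline above easier to read. The following is only a minimal Python sketch, assuming the NOTICE lines look exactly like the ones quoted in this post; `tally_pri_events` and `SAMPLE` are illustrative names, not part of Asterisk, libpri or DAHDI.

```python
import re
from collections import Counter

# Matches the chan_dahdi NOTICE lines quoted in this thread (assumed format).
EVENT_RE = re.compile(
    r"^\[(?P<stamp>[A-Z][a-z]{2}\s+\d+ \d{2}:\d{2}:\d{2})\] "
    r"NOTICE\[\d+\] chan_dahdi\.c: "
    r"PRI got event: (?P<event>.+?) \((?P<code>\d+)\) "
    r"on Primary D-channel of span (?P<span>\d+)$"
)


def tally_pri_events(lines):
    """Count events per (span, event name) and record the first timestamp seen."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match is None:
            continue  # ignore anything that is not a PRI event line
        key = (int(match["span"]), match["event"])
        counts[key] += 1
        first_seen.setdefault(key, match["stamp"])
    return counts, first_seen


# A few lines copied from the log excerpt above.
SAMPLE = [
    "[Dec 5 18:55:20] NOTICE[2608] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 4",
    "[Dec 5 18:55:31] NOTICE[2605] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 1",
    "[Dec 5 19:15:31] NOTICE[2605] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 1",
    "[Dec 6 00:26:43] NOTICE[2606] chan_dahdi.c: PRI got event: HDLC Overrun (7) on Primary D-channel of span 2",
    "[Dec 6 00:26:43] NOTICE[2606] chan_dahdi.c: PRI got event: HDLC Overrun (7) on Primary D-channel of span 2",
]

counts, first_seen = tally_pri_events(SAMPLE)
for (span, event), n in sorted(counts.items()):
    print(f"span {span}: {event} x{n}, first seen {first_seen[(span, event)]}")
```

To run it against a real log, the `SAMPLE` list could be replaced by an open file handle on the Asterisk log; the path to that file depends on the installation.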
Shaun Ruffell
2011-Dec-09 16:38 UTC
[asterisk-users] Issue with dahdi 2.5.0 and Digium HA8-B400M
On Fri, Dec 09, 2011 at 04:52:53PM +0100, Olivier wrote:
> Summing all this:
> at 18:49:57 INFO: task b400m-0:13594 blocked for more than 120 seconds.
> at 18:55:20 NOTICE[2608] chan_dahdi.c: PRI got event: HDLC Abort (6) on Primary D-channel of span 4
> at 00:26:43 NOTICE[2606] chan_dahdi.c: PRI got event: HDLC Overrun (7) on Primary D-channel of span 2
>
> What is the story that these logs are telling ?
>
> All I can see is my system started to dysfunction (no call in or out)
> and never recovered from that despite rebooting it.

I would recommend updating to DAHDI-Linux 2.5.0.2 certainly (especially if you made any other changes to devices / kernel version on the system). But otherwise, since the system was working, and now stopped working regardless of rebooting, I believe your best course of action is to contact Digium's support department [1].

[1] http://www.digium.com/support

--
Shaun Ruffell
Digium, Inc. | Linux Kernel Developer
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: www.digium.com & www.asterisk.org