Loic Didelot
2009-Aug-26 06:53 UTC
[asterisk-users] PRI worked fine for months, now it stops working after 2-3 hours
Hello,

we have several customers with a PRI line and a Wildcard TE121. Everything worked fine, but now at one customer the PRI stops working after a few hours.

"zap show channel 1" shows that the channel is InAlarm. In the Asterisk log files I see that the D-channel seems to be down. Restarting Asterisk does not help, but rebooting the whole server resolves the problem for another 2-3 hours.

"pri intense debug" does not give me much information apart from some regular SABME messages.

I am using Asterisk 1.4.24.1, Zaptel 1.4.12.1 and libpri 1.4.3.

Any help is appreciated.

Best regards,
Loïc
Tzafrir Cohen
2009-Aug-26 07:34 UTC
On Wed, Aug 26, 2009 at 08:53:18AM +0200, Loic Didelot wrote:
> Hello,
> we have several customers with a PRI line and a Wildcard TE121.
> Everything worked fine, but now are one customer the PRI stops working
> after a few hours.
>
> "zap show channel 1" shows that the channel is InAlarm. I the log files
> of asterisk I see that the D-Channel seems down. Restarting asterisk
> does not help, but rebooting the whole server resolves the problem for
> another 2-3 hours.

The obvious stupid question: Does Zaptel (The kernel) report that the span is in alarm?

    cat /proc/zaptel/1

--
Tzafrir Cohen
icq#16849755 jabber:tzafrir.cohen at xorcom.com
+972-50-7952406 mailto:tzafrir.cohen at xorcom.com
http://www.xorcom.com iax:guest at local.xorcom.com/tzafrir
Loic Didelot
2009-Aug-26 09:43 UTC
Ok, the problem just reappeared.

Last lines of dmesg:

    [ 2936.169191] wcte12xp0: Missed interrupt. Increasing latency to 6 ms in order to compensate.
    [ 4734.685566] wcte12xp0: Missed interrupt. Increasing latency to 7 ms in order to compensate.

cat /proc/zaptel/1

    Span 1: WCT1/0 "Wildcard TE121 Card 0" (MASTER) HDB3/CCS/CRC4
    IRQ misses: 3

     1 WCT1/0/1 Clear (In use)
     2 WCT1/0/2 Clear (In use)
     3 WCT1/0/3 Clear (In use)
     4 WCT1/0/4 Clear (In use)
     5 WCT1/0/5 Clear (In use)
     6 WCT1/0/6 Clear (In use)
     7 WCT1/0/7 Clear (In use)
     8 WCT1/0/8 Clear (In use)
     9 WCT1/0/9 Clear (In use)
    10 WCT1/0/10 Clear (In use)
    11 WCT1/0/11 Clear (In use)
    12 WCT1/0/12 Clear (In use)
    13 WCT1/0/13 Clear (In use)
    14 WCT1/0/14 Clear (In use)
    15 WCT1/0/15 Clear (In use)
    16 WCT1/0/16 HDLCFCS (In use)
    17 WCT1/0/17 Clear (In use)
    18 WCT1/0/18 Clear (In use)
    19 WCT1/0/19 Clear (In use)
    20 WCT1/0/20 Clear (In use)
    21 WCT1/0/21 Clear (In use)
    22 WCT1/0/22 Clear (In use)
    23 WCT1/0/23 Clear (In use)
    24 WCT1/0/24 Clear (In use)
    25 WCT1/0/25 Clear (In use)
    26 WCT1/0/26 Clear (In use)
    27 WCT1/0/27 Clear (In use)
    28 WCT1/0/28 Clear (In use)
    29 WCT1/0/29 Clear (In use)
    30 WCT1/0/30 Clear (In use)
    31 WCT1/0/31 Clear (In use)

Best regards,
Loïc

--
Loïc DIDELOT
MIXvoip S.a.
Tel: +352 20 3333 20
Fax: +352 20 3333 90
ldidelot at mixvoip.com
http://www.mixvoip.com
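For readers who want to check this kind of output programmatically, here is a minimal Python sketch that is not part of the original thread. It assumes the layout shown in the post: an "IRQ misses:" counter, then one line per channel with a number, a channel name and a signalling/flag field. The names SAMPLE, parse_span and non_clear are hypothetical, and SAMPLE is only an excerpt of the posted output.

```python
import re

# Excerpt of the /proc/zaptel/1 output posted above (not the full listing).
SAMPLE = """\
Span 1: WCT1/0 "Wildcard TE121 Card 0" (MASTER) HDB3/CCS/CRC4
IRQ misses: 3
 1 WCT1/0/1 Clear (In use)
15 WCT1/0/15 Clear (In use)
16 WCT1/0/16 HDLCFCS (In use)
17 WCT1/0/17 Clear (In use)
31 WCT1/0/31 Clear (In use)
"""

# A channel line starts with the channel number, then the name, then one flag word.
CHANNEL_RE = re.compile(r"^[ \t]*(\d+)[ \t]+(\S+)[ \t]+(\S+)", re.MULTILINE)
IRQ_MISS_RE = re.compile(r"IRQ misses:\s*(\d+)")


def parse_span(text):
    """Return (irq_misses, {channel_number: (name, flag)}) from the span text."""
    match = IRQ_MISS_RE.search(text)
    irq_misses = int(match.group(1)) if match else None
    channels = {
        int(number): (name, flag)
        for number, name, flag in CHANNEL_RE.findall(text)
    }
    return irq_misses, channels


def non_clear(channels):
    """List the channel numbers whose flag field is anything other than 'Clear'."""
    return sorted(n for n, (_, flag) in channels.items() if flag != "Clear")


irq_misses, channels = parse_span(SAMPLE)
print("IRQ misses:", irq_misses)
print("Channels not marked Clear:", non_clear(channels))
```

On the posted output the only channel not marked Clear is 16 (HDLCFCS), which is the D-channel timeslot on an E1 PRI.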
Loic Didelot
2009-Aug-26 09:53 UTC
Here is some more information:

    [ 2936.169191] wcte12xp0: Missed interrupt. Increasing latency to 6 ms in order to compensate.
    [ 4734.685566] wcte12xp0: Missed interrupt. Increasing latency to 7 ms in order to compensate.
    [ 4893.695402] zaptel Disabled echo canceller because of tone (rx) on channel 56
    [ 5248.845635] wcte12xp: NMF workaround on!
    [ 5248.845640] wcte12xp: Setting yellow alarm
    [ 5248.845658] Zaptel: Master changed to XBUS-00/XPD-00
    [ 5248.845777] wcte12xp0: Missed interrupt. Increasing latency to 8 ms in order to compensate.
    [ 5248.910831] wcte12xp: NMF workaround off!
    [ 5253.908078] Zaptel: Master changed to WCT1/0
    [ 5253.964028] wcte12xp: Clearing yellow alarm
    [ 5335.053981] wcte12xp0: Missed interrupt. Increasing latency to 9 ms in order to compensate.

cat /proc/interrupts

               CPU0       CPU1
      0:         83          0   IO-APIC-edge      timer
      1:          2          0   IO-APIC-edge      i8042
      3:     100887          0   IO-APIC-edge      serial
      7:          0          0   IO-APIC-edge      parport0
      8:          3          0   IO-APIC-edge      rtc
      9:          1          0   IO-APIC-fasteoi   acpi
     16:    5381227          0   IO-APIC-fasteoi   uhci_hcd:usb3, wcte12xp0, eth5
     17:     570760          0   IO-APIC-fasteoi   eth4
     18:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb7
     19:   20413905          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb5
     20:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
     21:      33904          0   IO-APIC-fasteoi   uhci_hcd:usb6, libata, libata
    NMI:          0          0
    LOC:     604815     708592
    ERR:          0
    MIS:          0

Loic

--
Loïc DIDELOT
MIXvoip S.a.
Tel: +352 20 3333 20
Fax: +352 20 3333 90
ldidelot at mixvoip.com
http://www.mixvoip.com
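The /proc/interrupts listing above shows wcte12xp0 on the same IRQ line as uhci_hcd:usb3 and eth5. The thread does not establish whether that sharing is related to the missed interrupts, but it is easy to check for. The following Python sketch is not from the original posts. It assumes the usual layout of numbered IRQ lines (IRQ number, per-CPU counters, controller type, comma-separated device list); SAMPLE, shared_irqs and irq_of are hypothetical names, and SAMPLE is an excerpt of the posted listing.

```python
import re

# Excerpt of the /proc/interrupts listing posted above.
SAMPLE = """\
           CPU0       CPU1
  0:         83          0   IO-APIC-edge      timer
 16:    5381227          0   IO-APIC-fasteoi   uhci_hcd:usb3, wcte12xp0, eth5
 17:     570760          0   IO-APIC-fasteoi   eth4
 19:   20413905          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb5
NMI:          0          0
"""

# IRQ number, one or more per-CPU counters, controller type, device list.
IRQ_RE = re.compile(r"^\s*(\d+):\s+((?:\d+\s+)+)(\S+)\s+(.+)$")


def parse_irqs(text):
    """Map each numbered IRQ to the list of devices registered on it."""
    irqs = {}
    for line in text.splitlines():
        match = IRQ_RE.match(line)
        if match:
            devices = [d.strip() for d in match.group(4).split(",")]
            irqs[int(match.group(1))] = devices
    return irqs


def shared_irqs(text):
    """Return only the IRQs that have more than one device attached."""
    return {irq: devs for irq, devs in parse_irqs(text).items() if len(devs) > 1}


def irq_of(text, device):
    """Return the IRQ number a device is registered on, or None if absent."""
    for irq, devs in parse_irqs(text).items():
        if device in devs:
            return irq
    return None


card_irq = irq_of(SAMPLE, "wcte12xp0")
print("wcte12xp0 is on IRQ", card_irq)
print("devices on that IRQ:", shared_irqs(SAMPLE).get(card_irq))
```

To run it against a live system, replace SAMPLE with the contents of /proc/interrupts read from the machine in question.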
Tzafrir Cohen
2009-Aug-26 10:27 UTC
On Wed, Aug 26, 2009 at 11:53:25AM +0200, Loic Didelot wrote:
> Here is some more information:
>
> [ 2936.169191] wcte12xp0: Missed interrupt. Increasing latency to 6 ms in order to compensate.
> [ 4734.685566] wcte12xp0: Missed interrupt. Increasing latency to 7 ms in order to compensate.
> [ 4893.695402] zaptel Disabled echo canceller because of tone (rx) on channel 56
> [ 5248.845635] wcte12xp: NMF workaround on!
> [ 5248.845640] wcte12xp: Setting yellow alarm
> [ 5248.845658] Zaptel: Master changed to XBUS-00/XPD-00
> [ 5248.845777] wcte12xp0: Missed interrupt. Increasing latency to 8 ms in order to compensate.
> [ 5248.910831] wcte12xp: NMF workaround off!
> [ 5253.908078] Zaptel: Master changed to WCT1/0
> [ 5253.964028] wcte12xp: Clearing yellow alarm

This should have caused events to be sent on each channel of the span to clear the alarm. You should be able to see those events on e.g. the "full" log of Asterisk. Do you see them?

> [ 5335.053981] wcte12xp0: Missed interrupt. Increasing latency to 9 ms in order to compensate.

--
Tzafrir Cohen
icq#16849755 jabber:tzafrir.cohen at xorcom.com
+972-50-7952406 mailto:tzafrir.cohen at xorcom.com
http://www.xorcom.com iax:guest at local.xorcom.com/tzafrir
Loic Didelot
2009-Aug-26 12:13 UTC
Hi,

finally I got even more logs from dmesg that could be useful.

    [ 1380.149049] wcte12xp0: Missed interrupt. Increasing latency to 5 ms in order to compensate.
    [ 1446.621237] wcte12xp: turning off tone detection
    [ 1802.551501] wcte12xp: turning off tone detection
    [ 1979.801185] wcte12xp0: Missed interrupt. Increasing latency to 6 ms in order to compensate.
    [ 2579.233403] wcte12xp0: Missed interrupt. Increasing latency to 7 ms in order to compensate.
    [ 3179.055821] wcte12xp: NMF workaround on!
    [ 3179.055825] wcte12xp: Setting yellow alarm
    [ 3179.055847] Zaptel: Master changed to XBUS-00/XPD-00
    [ 3179.055941] wcte12xp0: Missed interrupt. Increasing latency to 8 ms in order to compensate.
    [ 3179.120989] wcte12xp: NMF workaround off!
    [ 3184.118231] Zaptel: Master changed to WCT1/0
    [ 3184.174181] wcte12xp: Clearing yellow alarm
    [ 3778.467882] wcte12xp0: Missed interrupt. Increasing latency to 9 ms in order to compensate.

Does zaptel really need to increase the latency and to change the master?

Best regards,
Loïc Didelot.

--
Loïc DIDELOT
MIXvoip S.a.
Tel: +352 20 3333 20
Fax: +352 20 3333 90
ldidelot at mixvoip.com
http://www.mixvoip.com
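To see how quickly the driver is raising its latency, the "Missed interrupt" lines can be pulled out of dmesg with their timestamps. This is a minimal Python sketch, not something posted in the thread. It assumes the message wording shown above, and the names SAMPLE, latency_bumps and gaps are hypothetical. SAMPLE contains only a subset of the posted lines.

```python
import re

# Subset of the dmesg lines posted above, one kernel message per line.
SAMPLE = """\
[ 1380.149049] wcte12xp0: Missed interrupt. Increasing latency to 5 ms in order to compensate.
[ 1446.621237] wcte12xp: turning off tone detection
[ 1979.801185] wcte12xp0: Missed interrupt. Increasing latency to 6 ms in order to compensate.
[ 2579.233403] wcte12xp0: Missed interrupt. Increasing latency to 7 ms in order to compensate.
[ 3179.055825] wcte12xp: Setting yellow alarm
[ 3179.055941] wcte12xp0: Missed interrupt. Increasing latency to 8 ms in order to compensate.
[ 3778.467882] wcte12xp0: Missed interrupt. Increasing latency to 9 ms in order to compensate.
"""

# Kernel timestamp, device name, and the new latency in milliseconds.
BUMP_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+(\S+): Missed interrupt\. Increasing latency to (\d+) ms"
)


def latency_bumps(text):
    """Return (timestamp_seconds, device, latency_ms) for each latency increase."""
    return [(float(ts), dev, int(ms)) for ts, dev, ms in BUMP_RE.findall(text)]


def gaps(bumps):
    """Seconds elapsed between consecutive latency increases."""
    times = [ts for ts, _, _ in bumps]
    return [later - earlier for earlier, later in zip(times, times[1:])]


bumps = latency_bumps(SAMPLE)
for ts, dev, ms in bumps:
    print(f"{ts:10.1f}s  {dev}  latency -> {ms} ms")
print("gaps between bumps (s):", [round(g, 1) for g in gaps(bumps)])
```

In the lines posted in this message, the five increases are each a little under 600 seconds apart. That is an observation about this excerpt only, not a diagnosis.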
Loic Didelot
2009-Aug-26 15:07 UTC
Yes, I see the clearing events, but the InAlarm flag stays at 1. Is there a way to restart Zaptel without restarting the server? I tried restarting Asterisk, which did not help.

Loic

--
Loïc DIDELOT
MIXvoip S.a.
Tel: +352 20 3333 20
Fax: +352 20 3333 90
ldidelot at mixvoip.com
http://www.mixvoip.com