Andrew McRory
2005-Jan-11 12:23 UTC
[Asterisk-Users] PRI Errors (HDLC Abort (6) on Primary D-channel)
UPDATE. The circuit has run clean since the 7th. It seems the telco found
a problem after all, but they can't tell me what they did because they
think it was fixed as part of another case. Grrr.

Well, I never doubted the T400P too much, but I went ahead and bought a
different card just in case. Haven't had to use it so far.

One of the biggest reasons I did not think the Wildcard was at fault is
that only the LEC PRI span would crash. The other spans stayed up
throughout the ordeal. Also, the errors followed the PRI no matter what
port it was plugged into.

For anyone else seeing these errors, here is a checklist of things to look
for:

 1) Run the latest STABLE or CVS release of asterisk/zaptel/libpri.
    If your problems started after an upgrade, revert to the
    revision that used to work.
 2) Check and recheck your dialplan for errors. Make backups
    often so you can revert to a known good version.
 3) Verify the line framing in /etc/zaptel.conf
 4) Verify the switch type in /etc/asterisk/zapata.conf
 5) Wildcard must be on its own IRQ. Example:

    [root@phone asterisk]# cat /proc/interrupts
                 CPU0        CPU1
            0:   52175299           0   local-APIC-edge  timer
            1:       7513        7309      IO-APIC-edge  keyboard
            8:          0           1      IO-APIC-edge  rtc
            9:          0           0     IO-APIC-level  acpi
           10:          0           0     IO-APIC-level  usb-ohci
           15:         19           3      IO-APIC-edge  ide1
    ---->  18:  260753631   260932719     IO-APIC-level  tor2
           20:     815258      827358     IO-APIC-level  ide2, ide3
           24:          4           8     IO-APIC-level  mvSata
           27:     836366      849040     IO-APIC-level  eth0
           28:         38          39     IO-APIC-level  aic7xxx
           31:    2429141     2497473     IO-APIC-level  eth2
          NMI:          0           0
          LOC:   52176184    52176197
          ERR:          0
          MIS:      12907

    NOTE: I ran for over 6 months on a system that shared the IRQ
    with the Ethernet and USB - just to see what would happen.
    It worked very well, but when the errors started, this was
    the first thing I HAD to fix. Sharing an IRQ with the Wildcard
    is NOT recommended. Put the card on its own IRQ! With today's
    highly integrated motherboards, it's getting to where you
    can't buy one without an APIC - unless it is a cheapie.

    NOTE 2: To enable APIC you have to compile it into the kernel,
    or if running the FC1 kernel you must use the SMP version. I am
    not sure why APIC is not in the standard FC1 kernel. Planning
    to research this further but have not had time so far.
 6) Motherboard should support APIC (advanced programmable
    interrupt controller)
    Use a quality motherboard (no el-cheapo chipsets)
    Use quality memory (run memtest86 to verify)
    Use quality peripherals
 7) Compile a kernel from plain sources
    ftp://ftp.kernel.org/pub/linux/kernel

    NOTE: The _Fedora_Core_1_ nptl.2199 kernel available on my
    site has been used here with great success. Can't speak for
    other precompiled sources since I haven't used them. Also, your
    hardware may not work as well as mine with this kernel.
 8) Make sure you have good power / power supply. Run a UPS!
 9) If a multiport card, move the PRI/T1 span to another port to
    see if the errors follow.
10) If you have extra equipment to install on an unused span,
    configure it and see if it crashes at the same time the
    problem circuit does. If all circuits crash at the same time,
    that would indicate a hardware problem such as IRQ or noise on
    the PCI bus (bad motherboard or PCI controller??).

If all of the above checks out, these Red Alarms and HDLC errors are most
likely due to problems in provisioning the circuit. At this point, become
the Telco's biggest squeaky wheel or wait weeks for them to fix it by
accident, if at all.

Once more, here is my configuration, and it works :-)

                    (T400P) <--> <ASTERISK> <--> SIP/VoIP/etc.
                       |
    LEC-PRI <-----> (Port1)
                    (Port2) <---> <Max 40000 / V.90 Dial-up users>
                    (Port3) <---> <Microcom isPorte / Fax Test>
                    (Port4) <---> <unused / test>

Thanks to all who offered help!! I hope to not have to post about this
error again!

--
Andrew McRory - President/CTO
Linux Systems Engineers, Inc. - http://www.linuxsys.com
Located in beautiful Tallahassee, Florida
Office 850-224-5737 Office 850-575-7213 Mobile 850-294-7567
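The IRQ check in step 5 of the checklist can be scripted instead of read by eye. What follows is only a minimal Python sketch and was not part of the original post. It assumes the 2.4-era /proc/interrupts layout shown in the example (one count column per CPU, then a single-token controller type, then a comma-separated device list); newer kernels print extra columns and would need a different parser. The function names and the SAMPLE text are made up for illustration; the driver name tor2 is the one from the post.

```python
# Sketch: report whether a driver (here "tor2") shares its IRQ line.
# Assumes the 2.4-era /proc/interrupts layout quoted in the post.

SAMPLE = """
           CPU0       CPU1
 10:          0          0   IO-APIC-level  usb-ohci
 18:  260753631  260932719   IO-APIC-level  tor2
 20:     815258     827358   IO-APIC-level  ide2, ide3
 27:     836366     849040   IO-APIC-level  eth0
NMI:          0          0
ERR:          0
"""


def parse_interrupts(text):
    """Map each numeric IRQ to the list of device names registered on it."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    irqs = {}
    if not lines:
        return irqs
    # The header row names the CPUs, so its width gives the number of
    # per-CPU count columns on every IRQ row.
    ncpu = len(lines[0].split())
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip NMI, LOC, ERR, MIS and anything malformed
        fields = rest.split()
        # fields = <ncpu counts> <controller type> <device list...>
        devices = " ".join(fields[ncpu + 1:])
        irqs[int(label)] = [d.strip() for d in devices.split(",") if d.strip()]
    return irqs


def sharing_report(text, driver="tor2"):
    """Return (irq, other_devices) for the driver, or None if it is absent."""
    for irq, devices in parse_interrupts(text).items():
        if driver in devices:
            return irq, [d for d in devices if d != driver]
    return None


print(sharing_report(SAMPLE))  # prints (18, [])
```

An empty second element means the card has the line to itself. On a live box the same function could be fed the contents of /proc/interrupts, with the driver name changed to match whatever the card's driver registers.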
Scott Stingel
2005-Jan-11 12:40 UTC
[Asterisk-Users] PRI Errors (HDLC Abort (6) on Primary D-channel)
Andrew-

Thanks for posting your update and troubleshooting checklist. Most people
on the forum don't take the time to re-post when a problem has been
resolved - but that's the thing that helps people the most!

regards
Scott Stingel
President
Emerging Voice Technology, Inc.
www.evtmedia.com
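Step 3 of the checklist (verify the line framing in /etc/zaptel.conf) comes down to comparing each span line against what the telco says it provisioned. Here is a small Python sketch of that comparison, offered as an illustration only. It assumes span lines of the form span=<number>,<timing>,<LBO>,<framing>,<coding> and '#' comments. The sample config and the expected values (esf/b8zs) are hypothetical; they are not taken from the poster's system or circuit order.

```python
# Sketch: pull framing/coding out of zaptel.conf span lines and compare
# them with the values on the circuit order. Sample data is hypothetical.

SAMPLE_CONF = """
# hypothetical T1 PRI span, not the poster's actual file
span=1,1,0,esf,b8zs
bchan=1-23
dchan=24
"""


def parse_spans(conf_text):
    """Return one dict per span= line found in the config text."""
    spans = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        key, sep, value = line.partition("=")
        if not sep or key.strip() != "span":
            continue
        parts = [p.strip() for p in value.split(",")]
        if len(parts) < 5:
            continue  # not enough fields to hold framing and coding
        spans.append({
            "span": int(parts[0]),
            "timing": int(parts[1]),
            "lbo": int(parts[2]),
            "framing": parts[3].lower(),
            "coding": parts[4].lower(),
        })
    return spans


def mismatches(spans, expected):
    """List span numbers whose (framing, coding) differ from `expected`.

    `expected` maps span number -> (framing, coding) as given by the telco.
    Spans with no entry in `expected` are not checked.
    """
    bad = []
    for s in spans:
        want = expected.get(s["span"])
        if want is not None and (s["framing"], s["coding"]) != want:
            bad.append(s["span"])
    return bad


spans = parse_spans(SAMPLE_CONF)
print(mismatches(spans, {1: ("esf", "b8zs")}))  # prints []
print(mismatches(spans, {1: ("d4", "ami")}))    # prints [1]
```

A non-empty result only says the file disagrees with the values you typed in; the circuit order from the telco is still the authority on what those values should be.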