I'm running Asterisk 1.2.26.1 svn rev 79171 on Trixbox 2.2, with libpri 1.2.7 and zaptel 1.2.22.1. The hardware is an HP DL360, single CPU, with a TE220B. The system load is below 0.10.

I moved the server into production, with one PRI, on Friday. On that day we handled a couple thousand calls and I only saw one HDLC abort message. On Saturday we handled half as many calls and saw two abort messages, an hour apart. On Sunday, after 1500, when there were only a couple of calls, the HDLC messages went crazy.

We're getting non-stop Abort messages, with Bad FCS thrown in about every tenth message. They come in bunches, with short 10-30 second breaks. Then every once in a while there is a 30-minute break, sometimes a 3-hour break. The messages seem completely unrelated to system load. The system will be idle and get the messages, and then have no messages when I load up dozens of calls on it (using call files to complete calls).

After reading the mailing list and various websites (asteriskguru.com has a couple of articles), the first thing I did was look for IRQ conflicts. The module for the USB bus (no USB devices attached) was on the same IRQ. Disabling USB had no effect. zttool shows no IRQ misses.

The second PRI was installed on Monday. That day, with only two calls, the message came 11 times. It came three times on Tuesday with no calls. Then late at night I loaded it up with calls for testing (having call files call out on the second PRI to the first PRI) and no messages were generated. Again today it's had a few messages with only a couple of calls.

I'm not sure what to try next, other than calling the telco and asking them to check their equipment. Does anyone have a suggestion before I do that?

Thanks.
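One way to pin down the burst pattern described above is to tally the HDLC events per hour and compare the counts against call volume. The following is a minimal sketch, not a tested tool: the log path, the syslog-style timestamp prefix, and the exact message wording (`HDLC Abort`, `HDLC Bad FCS`) are assumptions and should be adjusted to match the lines actually seen in the Asterisk log.

```python
import re
import sys
from collections import Counter

# Assumed line shape (adjust to the real log):
#   "Jan 13 15:02:11 NOTICE[2345]: ... HDLC Abort (6) on Primary D-channel of span 1"
EVENT_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):\d{2}:\d{2}\b.*HDLC (Abort|Bad FCS)"
)

def count_hdlc_events(lines):
    """Return a Counter keyed by (hour bucket, event kind)."""
    counts = Counter()
    for line in lines:
        m = EVENT_RE.match(line)
        if m:
            month, day, hour, kind = m.groups()
            bucket = f"{month} {int(day)} {hour}:00"
            counts[(bucket, kind)] += 1
    return counts

if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python hdlc_count.py /var/log/asterisk/messages
    with open(sys.argv[1], errors="replace") as log:
        totals = count_hdlc_events(log)
    # Print in the order the buckets first appeared in the log.
    for (bucket, kind), n in totals.items():
        print(f"{bucket}  {kind:8s} {n}")
```

Hours with many events and few calls, or the reverse, would support the observation that the messages do not track load.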
Steven wrote:
> I'm not sure what to try next, other than calling the telco and asking
> them to check their equipment. Does any one have a suggestion before I
> do that?

I have a suggestion. Have you contacted Digium technical support for assistance with resolving this issue?

--
Russell Bryant
Senior Software Engineer
Open Source Team Lead
Digium, Inc.
Trixbox 2.2... I assume you are using the latest version. Normally I ignore messages from Trixbox users because they ask kindergarten stuff, but you seem to be knowledgeable, and I'll assume you chose Trixbox to make your life easier when it comes to dealing with others regarding the PBX.

I also assume the PRI is delivered via some sort of HDSL terminated at an NIU ("SmartJack"), which is a box that will usually have 2 or 4 positions for line cards and 2 or 4 jacks marked "CPE1" etc., usually at the bottom. Usually you can also look through the window at the top and see various lights.

What is between the SmartJack and your T1 card? What sort and length of cable? Any splices? Punchdown or patch panels?

Also, I'm not sure if Trixbox has this, but ssh in and see if there is an application called zttool. What statistics is it providing?
On Wed, 2008-01-16 at 15:52 -0800, Steven wrote:
> I'm running Asterisk 1.2.26.1 svn rev 79171 on Trixbox 2.2. libpri
> 1.2.7 and zaptel 1.2.22.1. The hardware is a HP dl360 single cpu with a
> TE220B. The system load is below 0.10.
>
> I moved the server into production, with one PRI, on Friday. On that
> day we handled a couple thousand calls and I only saw one HDLC abort
> message. On Saturday half the calls and two abort messages an hour
> apart. On Sunday, after 1500 when there was only a couple calls, the
> HDLC messages went crazy.
>
> We're getting non-stop Abort messages, with Bad FCS thrown in about
> every tenth message. They come in bunches, with short 10-30 second
> breaks. Then every once and awhile there is an 30 minute break,
> sometimes a 3 hour break. The messages seems completely separate from
> system load. The system will be idle and get the messages and have no
> messages when I load up dozens of calls on it (using call files to
> complete calls)
>
> I'm not sure what to try next, other than calling the telco and asking
> them to check their equipment. Does any one have a suggestion before I
> do that?
>

Hi Steven,

Some quick remarks.

Generally, you would get this kind of message if the signal level is either too low or distorted. The clock signal for the RX is regenerated from the RX signal.

Nowadays, these lower-level protocols are (or should be) handled completely by the hardware of the board (not sure, I don't have one here), not by the system. So system load should not be an issue.

Those DL360s are not capable of hosting many power-hungry PCI boards, so I would not suspect your PSU.

You wrote that you previously handled a couple thousand calls without much of a problem. So although ESD damage to the board cannot be entirely ruled out, it's unlikely.

You mention "moved into production". Did this involve moving the system from a testing room to a server location? Other (longer) cables?
Perhaps you can check with your telco whether they receive bad frames coming from you.

hw
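To illustrate why a weak or distorted signal shows up as "Bad FCS": every HDLC frame carries a 16-bit frame check sequence, and a single bit flipped on the line makes the receiver's recomputed value disagree with the one in the frame. The sketch below is plain Python for illustration only, not code from zaptel or libpri; it assumes the standard 16-bit HDLC FCS (CRC-16/X-25), and the payload bytes are arbitrary.

```python
def fcs16(data: bytes) -> int:
    """16-bit HDLC FCS (CRC-16/X-25): reflected polynomial 0x8408,
    initial value 0xFFFF, result inverted at the end."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial when that bit was set.
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF

def frame_ok(payload: bytes, received_fcs: int) -> bool:
    """True when the FCS recomputed over the payload matches the received one."""
    return fcs16(payload) == received_fcs

# Arbitrary example payload (not a real Q.921 frame).
payload = b"\x02\x01\x01\x7f"
sent_fcs = fcs16(payload)

# Simulate line noise flipping one bit of the second byte.
corrupted = bytes([payload[0], payload[1] ^ 0x04]) + payload[2:]

print(frame_ok(payload, sent_fcs))    # intact frame passes the check
print(frame_ok(corrupted, sent_fcs))  # damaged frame fails: reported as bad FCS
```

Because the check is done per frame, errors on an otherwise idle D-channel are still reported, which fits messages appearing with no calls up.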