David McGarry
2005-Jul-15 11:30 UTC
[Asterisk-Users] Channels being lost/disconnected using Q.SIG
Hi,

We have integrated an Asterisk system (built from CVS HEAD) with an Avaya IP Office switch running Q.SIG on an E1 interface, using a Digium Wildcard TE110P. The system handles both inbound and outbound call traffic to a set of Asterisk agents who are connected via extensions on the IP Office. There is a maximum of 15 agents at any one time, leaving at least another 15 channels on the E1 for inbound/outbound calls.

This works most of the time, but once every day or two, all channels to the IP Office appear to be lost, in the sense that the agents suddenly hear nothing (even if they were in the middle of a call). What's strange is that Asterisk still shows the channels as being up; there is no sign of any calls having been disconnected.

Having looked at the Asterisk log, we see this chain of events at the time it happens:

Jul 12 13:31:52 NOTICE[20029] chan_zap.c: PRI got event: HDLC Bad FCS (8) on Primary D-channel of span 1
Jul 12 13:31:52 DEBUG[20029] chan_zap.c: Got event HDLC Bad FCS (8) on D-channel for span 1
Jul 12 13:31:53 WARNING[20029] chan_zap.c: [Span 0 D-Channel 0] PRI: !! Got reject for frame 37, but we only have others!
Jul 12 13:31:53 VERBOSE[20029] logger.c: == Primary D-Channel on span 1 up

The same set of log entries appears whenever this incident happens. If we disconnect the channels which Asterisk still regards as connected (using a soft hangup), they are cleared down and the agents can log back in and start again, but obviously this isn't ideal if they were in the middle of a call.

We also sometimes see Asterisk enter a state in which it stops logging and stops sending events via the Manager API. When we type 'show channels' on the CLI in this state, it prints an "Avoiding initial deadlock" message at the end of the channel summary every time. The only way to resolve this is to kill -9 the process and restart Asterisk (using 'stop now' just hangs).
I'm not sure whether this second issue is related to the Q.SIG one described above.

Is anyone able to offer any suggestions as to what might be the cause of these problems, or advise what we can do to extract further information that might help troubleshooting?

Regards,
David.
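
One way to gather more data on how often this happens is to group the relevant log entries into incidents. The following is a minimal Python sketch, assuming the log line format quoted above. The function names, the 5-second grouping window, and the default year (the log timestamps carry none) are illustrative assumptions, not anything taken from Asterisk itself.

```python
import re
from datetime import datetime, timedelta

# Matches lines in the format quoted in this post, e.g.
# "Jul 12 13:31:52 NOTICE[20029] chan_zap.c: PRI got event: ..."
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+)\[(?P<tid>\d+)\] (?P<src>[\w.]+): (?P<msg>.*)$"
)

# Substrings taken from the log excerpt above.
MARKERS = ("HDLC Bad FCS", "Got reject for frame", "D-Channel on span")


def parse_line(line, year=2005):
    """Return (timestamp, level, source, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    # The log has no year, so one is supplied (assumption).
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["level"], m["src"], m["msg"]


def find_incidents(lines, window=timedelta(seconds=5)):
    """Group marker lines that occur within `window` of each other into incidents."""
    incidents = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None or not any(k in parsed[3] for k in MARKERS):
            continue
        # Compare against the timestamp of the last entry in the latest incident.
        if incidents and parsed[0] - incidents[-1][-1][0] <= window:
            incidents[-1].append(parsed)
        else:
            incidents.append([parsed])
    return incidents
```

Feeding the lines of the Asterisk log file to `find_incidents` gives one list per burst of D-channel events, which can then be compared against the times at which the agents report losing audio.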