Hi,

This is a bit of a long shot, and I don't have much information on what is actually happening...

Our production Asterisk system: ~2,000 SIP handsets, (2) Digium TE220s, Asterisk 1.6.2.18, RHEL 5 x86_64

Every few weeks or months, at no predictable interval, the SIP portion of Asterisk seems to hang/die. There is nothing abnormal in the usual Asterisk logs, and the "rest" of Asterisk still seems to function fine -- incoming DAHDI calls come in and are queued in the various queues. 'sip set debug on' gives NO messages/errors/information at all; 'module reload chan_sip.so' does nothing (it never seems to unload/load the module), and again nothing appears in the logs.

It just seems like chan_sip.so (or everything SIP) hangs, with no errors related to SIP. When Asterisk is in this "state" we are unable to stop it gracefully -- it always requires a kill -9.

The times this happens seem totally random, and the issue has persisted through several different versions of the Asterisk 1.6.2.x branch. Sometimes it's weeks between occurrences, other times it's months. This Asterisk host easily processes several thousand calls per day.

Any ideas, or even a pointer in the right direction, would be greatly appreciated. When this does happen and we need to intervene, we try to poke around for a few seconds and test different things, but this is a production system, so the quicker it's back up, the better. =)

Thanks,

Marc

On Wednesday 29 Jun 2011, Marc Smith wrote:
> Hi,
>
> This is a bit of a long shot and I don't have much information on what
> is actually happening...
>
> Our production Asterisk system: ~2,000 SIP handsets, (2) Digium
> TE220s, Asterisk 1.6.2.18, RHEL 5 x86_64
>
> Every few weeks, or few months, or X amount of time, the SIP portion
> of Asterisk seems to hang/die. Nothing abnormal in the typical
> Asterisk logs; the "rest" of Asterisk still seems to function fine --
> incoming DAHDI calls come in and are queued in the various queues.
> 'sip set debug on' gives NO messages/errors/information at all;
> 'module reload chan_sip.so' does nothing (never seems to unload/load
> it), again nothing in the logs.
>
> It just seems like the chan_sip.so (or everything SIP) just hangs --
> no errors related to SIP; when Asterisk is in this "state" we are
> unable to stop it gracefully -- it always requires a kill -9.

Have you got extensions referenced in your dialplan that either don't have a corresponding entry in sip.conf, or where there isn't a device on the network?

I inherited a system with a similar problem. The previously-installed Asterisk version (some flavour of 1.2) didn't seem to mind trying to dial non-existent devices; but the 1.6.2.9 that replaced it objected fatally.

I ended up making major (i.e., stopping just shy of rewriting from scratch) edits to extensions.conf and sip.conf, pruning out sections that had lost their relevance, and everything was fine after that.

-- 
AJS

Answers come *after* questions.
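
For anyone wanting to check the first half of that question (dialplan references with no matching sip.conf entry) without reading both files by hand, here is a rough Python sketch. It is only an illustration, not anything shipped with Asterisk: the function names are made up, it assumes peers are referenced literally as SIP/<name> in extensions.conf and defined as [<name>] sections in sip.conf, and it ignores #include files, realtime peers, and references built from variables such as SIP/${EXTEN}. It also cannot tell whether a device is actually present on the network.

```python
import re

# A literal peer reference such as SIP/1001. References built from
# variables (SIP/${EXTEN}) do not match and are deliberately skipped.
PEER_RE = re.compile(r"SIP/([A-Za-z0-9_.-]+)")

# A section header such as [1001] or [1001](template).
SECTION_RE = re.compile(r"^\s*\[([^\]]+)\]")

# sip.conf sections that are not peers (assumption: no others are in use).
NON_PEER_SECTIONS = {"general", "authentication"}


def strip_comment(line):
    """Drop everything after the first ';' (escaped '\\;' is not handled)."""
    return line.split(";", 1)[0]


def referenced_peers(extensions_text):
    """Return the set of literal SIP/<name> references in dialplan text."""
    found = set()
    for line in extensions_text.splitlines():
        found.update(PEER_RE.findall(strip_comment(line)))
    return found


def defined_peers(sip_conf_text):
    """Return the set of section names in sip.conf text, minus non-peers."""
    peers = set()
    for line in sip_conf_text.splitlines():
        match = SECTION_RE.match(strip_comment(line))
        if match and match.group(1) not in NON_PEER_SECTIONS:
            peers.add(match.group(1))
    return peers


def missing_peers(extensions_text, sip_conf_text):
    """Peers the dialplan refers to that have no section in sip.conf."""
    return sorted(referenced_peers(extensions_text) - defined_peers(sip_conf_text))
```

To use it, read the two files (typically /etc/asterisk/extensions.conf and /etc/asterisk/sip.conf) into strings and pass them to missing_peers(). Treat the result as a list of candidates to look at: a SIP/number@trunk style reference will be reported as a missing peer named after the number, and template sections in sip.conf are counted as if they were peers.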