Hello All,

We have been experiencing ongoing reliability problems with Asterisk for quite some time, and I am trying to find out whether anyone else has experienced the same problems.

We are running asterisk 1.4.17~dfsg-2+b1 on Debian Lenny, with a Digium PRI card, and have approximately 120 SIP peers, mostly Snom 360s, with a few Grandstream GXP2000s and a handful of Handytone 486 units.

The symptoms, when they occur, are as follows:

- Incoming calls to our ISDN PRI fail (callers get a busy tone). This starts off intermittent but becomes permanent.
- Asterisk CLI commands (sip show peers, show channels, etc.) work once, but then no longer return any data until disconnecting and reconnecting to the CLI.
- Internal SIP calls stop working.
- Calls remain stuck in queues; the queue members do not ring, and show as Busy when issuing a 'queue show' command.

We've actually had this sort of problem for many months now. It originally started when we were running Asterisk 1.2 on Gentoo. We have done a large amount of fault finding and testing, which has involved a replacement ISDN card, a reinstall on completely different server hardware, and a change to Asterisk 1.4 on Debian Lenny.

I believe there may be two separate issues here. We did track down one problem to our cacti and nagios monitoring scripts, which were connecting to and disconnecting from the manager interface several times per minute. This eventually caused asterisk to give the above symptoms; in addition, asterisk would consume 100% CPU on the box and eventually need a hard reboot of the server. I posted about this to the list a few weeks ago, and it was confirmed that this could cause such a problem. After stopping these services the problems were much reduced.

However, we have now completely disabled the manager interface (enabled=no in manager.conf), and yesterday the problem occurred again. A restart of asterisk got everything going again.

So really I'm at a loss as to where to go from here. A colleague of mine has the same problem at his site running Asterisk 1.4 on Debian Lenny. He has never used the manager interface, and has completely different server hardware and ISDN card, so I wonder if it's a Debian-specific problem?

One option is to try reverting to Asterisk 1.2, but that isn't really a long-term solution. We also had major problems with 1.2 and our Snom 360 phones: with any Snom firmware > 6.2.2 there was a serious problem whereby the channels were not cleared down on hangup, meaning we had many outgoing ISDN calls held open for many hours until we realised the problem. This problem does not occur in Asterisk 1.4, although we have many log messages such as:

chan_sip.c: Remote host can't match request BYE to call <callid>

so I don't know if this is anything to worry about?

Any help would be gratefully received!

Thanks,
Ben
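On the chan_sip.c warning quoted above: one way to judge whether it is worth worrying about is to see how often it appears and whether it clusters on a few calls. The following is only a minimal Python sketch, not anything from the Asterisk tools. It assumes the log contains the message text exactly as quoted in this post, and that the call ID is the token that follows it (possibly wrapped in quotes or followed by a period). The function name count_bye_mismatches is hypothetical.

```python
import re
from collections import Counter

# Message text taken from the warning quoted in the post. Assumption: the
# call ID is the first whitespace-delimited token after this text.
BYE_MISMATCH = re.compile(r"Remote host can't match request BYE to call\s+(\S+)")


def count_bye_mismatches(lines):
    """Count BYE-mismatch warnings per call ID in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = BYE_MISMATCH.search(line)
        if match:
            # Assumption: the ID may be quoted or followed by a period,
            # so strip those characters from both ends of the token.
            call_id = match.group(1).strip("'\".")
            counts[call_id] += 1
    return counts
```

To use it, pass an open log file (or any list of lines) and look at counts.most_common(10). Many distinct call IDs with one warning each would suggest routine retransmitted BYEs, while a few IDs with large counts might point at specific phones.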
Curious, you mention "a number of problems" that have "gone on for months".

Question: have you reported any or all of them to Digium, and if so, what has been their response on each of these problems?

Ben Willcox wrote:
> We've actually had these sort of problems for many months now, which
> originally started when we were running Asterisk 1.2 on Gentoo.
Ben Willcox wrote:
> One option is to try reverting back to Asterisk 1.2, but that isn't
> really a long-term solution.

Two things:

1.) On your queue setup, avoid using AgentCallbackLogin; it's known to cause deadlocked channels.
2.) Restart the Asterisk service once a week. I do this via a cron job at 3am on Sundays.

Doug

--
Ben Franklin quote: "Those who would give up Essential Liberty to purchase a little Temporary Safety, deserve neither Liberty nor Safety."
On Tue, Mar 18, 2008 at 5:40 AM, Ben Willcox <ben.willcox at british-gymnastics.org> wrote:
> However, we have now completely disabled the manager interface
> (enabled=no in manager.conf), and yesterday the problem occurred again -
> a restart of asterisk got everything going again.

I have seen this when banging on the AMI, but you eliminated that. Why not try a different OS such as CentOS for now? That would be my next step.

Thanks,
Steve Totaro
Check around on bugs.digium.com. You'll find a number of issues reported that sound similar. I'm hoping that 1.4.19 will fix a lot of stuff, since the release candidates seem much more stable to me. I couldn't keep Asterisk up for more than a few days before on 1.4.18. I've also applied a few SIP-related patches from various bug reports and things are much, much more stable.

1.4.17, which you mentioned, is also very buggy. 1.4.18 fixed many issues.

Norman Franke
Answering Service for Directors, Inc.
www.myasd.com

On Mar 18, 2008, at 7:40 AM, asterisk-users-request at lists.digium.com wrote:
> We have been experiencing some ongoing reliability problems with
> Asterisk for quite some time, and I am trying to find out if anyone
> else has experienced the same problems.
I believe most of them will be in 1.4.19-rc3 (and in SVN), but I applied patches to 1.4.19-rc2 from bug reports 11712 and 12098, plus another one I reported as 12162.

Norman Franke
Answering Service for Directors, Inc.
www.myasd.com

On Mar 18, 2008, at 12:11 PM, asterisk-users-request at lists.digium.com wrote:
> On Tue, 2008-03-18 at 11:05 -0400, Norman Franke wrote:
>> I've also applied a few SIP-related patches from various bug reports
>> and things are much, much more stable.
>
> Mind sharing which patches you have applied?
Have you tried disabling highpriority=yes in asterisk.conf?

--
Kind Regards,

Matt Riddell
Director
http://www.venturevoip.com (Great new VoIP end to end solution)
http://www.venturevoip.com/news.php (Daily Asterisk News - html)
http://www.venturevoip.com/newrssfeed.php (Daily Asterisk News - rss)