Mark W. Stoddard
2006-Jun-19 08:23 UTC
[Asterisk-Users] Asterisk 1.07 crash under Debian Sarge
I have just finished implementing an Asterisk system for my place of business (first one), and after three days of flawless usage, Asterisk seems to have crashed. I wasn't running with '-g', so I don't have a core dump. Here's the sequence of events leading up to the crash:

1. call comes in on our TDM2400P
2. all of our phones (about 26 Polycoms) ring. (it's after biz. hours, so all phones ring)
3. an employee answers the call.
4. the employee attempts a page (autoanswer + meetme AGI script with Polycoms)
5. about half the phones make it to the meeting, then the system crashes.
6. an executive calls my manager, who's on vacation, my manager calls me, autopsy begins.

here's a few important snippets:

===========extensions.conf================
[system-page]
exten => 999,1,Macro(system-page,${CALLERIDNUM})

; The first variable is the originating caller, the others are phones I
; wish to exclude from the system-wide paging.
[macro-system-page]
exten => s,1,AGI(allpage.agi|SIP/${CALLERIDNUM}) ;@TODO make more robust, not only SIP
exten => s,2,MeetMe(999,Adqt)
;exten => s,2,Hangup

[add-to-page]
exten => listener,1,MeetMe(999,dmqx)
==========================================

==========/var/log/asterisk/debug=========
Jun 12 17:44:12 DEBUG[17975]: Building dynamic conference '999'
Jun 12 17:44:12 DEBUG[17975]: Placed channel SIP/302-6188 in ZAP conf 1023
Jun 12 17:44:12 DEBUG[17979]: Manager received command 'Originate'
Jun 12 17:44:12 DEBUG[17979]: Manager received command 'Originate'
Jun 12 17:44:12 DEBUG[17979]: Manager received command 'Originate'
Jun 12 17:44:12 DEBUG[17979]: Manager received command 'Originate'
...
Jun 12 17:44:18 DEBUG[17975]: Hangup: channel: -2 index = 0, normal 51, callwait = -1, thirdcall = -1
Jun 12 17:44:18 DEBUG[17975]: Set option TDD MODE, value: OFF(0) on Zap/pseudo-1321090091
Jun 12 17:44:18 DEBUG[17975]: Updated conferencing on -2, with 0 conference users
Jun 12 17:44:19 DEBUG[17975]: update_user_counter(302) - decrement inUse counter
Jun 12 17:44:19 DEBUG[18016]: Building dynamic conference '999'
Jun 12 17:44:20 DEBUG[18016]: Placed channel SIP/508-af01 in ZAP conf 1023
Jun 12 17:44:20 DEBUG[18016]: Hangup: channel: -2 index = 0, normal 41, callwait = -1, thirdcall = -1
Jun 12 17:44:20 DEBUG[18016]: Set option TDD MODE, value: OFF(0) on Zap/pseudo-1583015986
Jun 12 17:44:20 DEBUG[18016]: Updated conferencing on -2, with 0 conference users
Jun 12 17:44:21 DEBUG[18016]: update_user_counter(508) - decrement outUse counter
Jun 12 17:44:21 DEBUG[23992]: Stopping retransmission on '27725371050cbea5171801fc66d895a3@172.31.1.10' of Request 103: Found
Jun 12 17:44:21 DEBUG[18017]: Building dynamic conference '999'
Jun 12 17:44:22 DEBUG[18017]: Placed channel SIP/804-677b in ZAP conf 1023
Jun 12 17:44:22 DEBUG[18017]: Hangup: channel: -2 index = 0, normal 41, callwait = -1, thirdcall = -1
Jun 12 17:44:22 DEBUG[18017]: Set option TDD MODE, value: OFF(0) on Zap/pseudo-1132503448
Jun 12 17:44:22 DEBUG[18017]: Updated conferencing on -2, with 0 conference users
Jun 12 17:44:23 DEBUG[18017]: update_user_counter(804) - decrement outUse counter
...
Jun 12 17:44:32 DEBUG[18041]: Building dynamic conference '999'
Jun 12 17:44:32 DEBUG[18019]: Building dynamic conference '999'
Jun 12 17:44:32 DEBUG[18021]: Building dynamic conference '999'
Jun 12 17:44:32 DEBUG[18028]: update_user_counter(404) - decrement outUse counter
Jun 12 17:44:32 DEBUG[18042]: Placed channel SIP/401-1bec in ZAP conf 1023
Jun 12 17:44:32 DEBUG[18043]: Placed channel SIP/601-d011 in ZAP conf 1023
Jun 12 17:44:32 DEBUG[18043]: Hangup: channel: -2 index = 0, normal 41, callwait = -1, thirdcall = -1
Jun 12 17:44:32 DEBUG[18043]: Set option TDD MODE, value: OFF(0) on Zap/pseudo-726361999
Jun 12 17:44:32 DEBUG[18043]: Updated conferencing on -2, with 0 conference users
Jun 12 17:44:32 DEBUG[18041]: Placed channel SIP/203-6116 in ZAP conf 1023
CRASH
=================================

==========/var/log/asterisk/messages=============
Jun 12 17:40:49 WARNING[17955]: No such host: 806
Jun 12 17:40:49 NOTICE[17955]: Unable to create channel of type 'SIP'
Jun 12 17:40:53 WARNING[17955]: Unable to request echo training on channel 1
Jun 12 17:43:42 WARNING[17958]: No such host: 806
Jun 12 17:43:42 NOTICE[17958]: Unable to create channel of type 'SIP'
Jun 12 17:43:44 WARNING[17958]: Unable to request echo training on channel 1
Jun 12 17:44:12 NOTICE[18001]: Unable to request channel SIP/595
Jun 12 17:44:12 NOTICE[18004]: Unable to request channel SIP/808
Jun 12 17:44:12 NOTICE[18008]: Unable to request channel SIP/201
Jun 12 17:44:12 NOTICE[18011]: Unable to request channel SIP/212
Jun 12 17:44:12 NOTICE[17980]: Unable to request channel SIP/704
Jun 12 17:44:12 NOTICE[17984]: Unable to request channel SIP/802
Jun 12 17:44:12 NOTICE[17982]: Unable to request channel SIP/803
Jun 12 17:44:12 NOTICE[17985]: Unable to request channel SIP/801
Jun 12 17:44:32 WARNING[18041]: Conference not found
CRASH
===========================================

I have not been able to get the system to crash the same way again.
It looks like Asterisk got into some odd loop, creating the same conference over and over again instead of adding extensions to it.

Being able to diagnose this bug will make or break both the installation at my work and our plans to roll this out to customers. Any help is much appreciated. Let me know if further information is required.

Mark Stoddard
Techteriors
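One way to check the "same conference built over and over" impression against the debug log is to count, per second, how many threads logged "Building dynamic conference", and to list the threads that built a conference but had not yet logged "Placed channel" when the log ends. The following is only a rough sketch: the function name `summarize` and the sample text are illustrative, the sample lines are copied from the excerpt above, and the line format is assumed to match that excerpt. It shows what the log contains; it does not prove what caused the crash.

```python
import re
from collections import Counter

# Lines copied from the debug log excerpt in this message.
SAMPLE = """\
Jun 12 17:44:32 DEBUG[18041]: Building dynamic conference '999'
Jun 12 17:44:32 DEBUG[18019]: Building dynamic conference '999'
Jun 12 17:44:32 DEBUG[18021]: Building dynamic conference '999'
Jun 12 17:44:32 DEBUG[18042]: Placed channel SIP/401-1bec in ZAP conf 1023
Jun 12 17:44:32 DEBUG[18043]: Placed channel SIP/601-d011 in ZAP conf 1023
Jun 12 17:44:32 DEBUG[18041]: Placed channel SIP/203-6116 in ZAP conf 1023
"""

# Assumed line layout: "<Mon> <day> <hh:mm:ss> <LEVEL>[<thread>]: <message>"
LINE_RE = re.compile(r"^(\w{3} +\d+ [\d:]+) (\w+)\[(\d+)\]: (.*)$")


def summarize(log_text):
    """Return (conference builds per second, threads still waiting to be placed)."""
    builds_per_second = Counter()
    pending = set()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip "...", "CRASH", and separator lines
        stamp, _level, thread, message = match.groups()
        if message.startswith("Building dynamic conference"):
            builds_per_second[stamp] += 1
            pending.add(thread)
        elif message.startswith("Placed channel"):
            pending.discard(thread)
    return builds_per_second, sorted(pending)


builds, waiting = summarize(SAMPLE)
print(builds.most_common(3))
print("built but not placed:", waiting)
```

Run against the full /var/log/asterisk/debug file instead of the sample, this would show whether the last second before the crash really had more simultaneous builds than the earlier, successful joins.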
Julian Lyndon-Smith
2006-Jun-19 08:46 UTC
[Asterisk-Users] Asterisk 1.07 crash under Debian Sarge
I suspect that the majority of the advice that you are going to get would be to upgrade to the latest version of Asterisk, as so many changes and bug fixes have been made since the 1.07 release.

Julian.
Mark W. Stoddard
2006-Jun-19 14:59 UTC
[Asterisk-Users] Asterisk 1.07 crash under Debian Sarge
As far as hardware is concerned, I am using the following:

* Dell PowerEdge 2850
* 2GB RAM
* 2x 73GB 10,000 RPM SCSI drives, mirrored
* 1x Intel Xeon at 3.8GHz
* 1x Digium TDM2400P
* Dual redundant power supplies. (is saying "dual redundant" redundant?)
* Stock cooling
* UPS

I was curious what version of Asterisk Debian testing is up to: apparently 1.2.7. I must have been living on Mars to have missed that. I'll attempt an upgrade from stable to testing on a test machine (might even give it a try using Xen). If the upgrade goes well, I'll consider upgrading the production system to testing, or at least the Asterisk packages and their dependencies. I believe that Debian testing is scheduled to go stable in a few (<6?) months, so it will be a good idea to at least see what I'm up against.

How long does it take to restart Asterisk? I know there is a way to start Asterisk so that if it goes down, it comes back up immediately; that could be part of a solution right there. If the downtime is a second or two every few days, that's still adequate uptime for a commercial phone system.

I'll let you know how things go.

Mark Stoddard
Techteriors
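The "comes back up immediately" idea is essentially a restart loop around the Asterisk process. Asterisk source distributions include a wrapper script, safe_asterisk, for this purpose; whether the Debian package installs it is not something this thread confirms. The sketch below only illustrates the general technique: the function name `supervise` and its parameters are made up for the example, and the Asterisk command line shown in the comment is an assumption, not a tested configuration.

```python
import subprocess
import time


def supervise(command, max_restarts=3, delay_seconds=0.0):
    """Run command, restarting it whenever it exits with a non-zero status.

    A process killed by a signal (for example a segfault) reports a
    negative exit status, so it is restarted as well.
    Returns the list of exit codes observed, in order.
    """
    exit_codes = []
    while True:
        code = subprocess.call(command)
        exit_codes.append(code)
        if code == 0:
            break  # clean shutdown: do not restart
        if len(exit_codes) > max_restarts:
            break  # give up rather than restart forever
        time.sleep(delay_seconds)
    return exit_codes


# Hypothetical invocation (not run here): keep Asterisk in the foreground
# with -f and ask for a core dump on crash with -g, so that the next crash
# leaves something to debug:
#
#     supervise(["asterisk", "-f", "-g"], max_restarts=5, delay_seconds=4)
```

The restart cap is a deliberate choice: if Asterisk crashes immediately on every start, looping forever would only hide the problem and fill the disk with core files.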