Udo Schacht-Wiegand
2009-Jan-24 11:50 UTC
[asterisk-users] Asterisk freezes with Fixup failed on channel SIP/...<MASQ>
On a production system running Asterisk 1.4.17 (compiled from bristuff-0.4.0-test6-xr1), we have hit this strange issue twice in the last few weeks:

[2009-01-13 13:58:30] WARNING[1213] channel.c: Fixup failed on channel SIP/2332-081d0108<MASQ>, strange things may happen.
[2009-01-13 13:58:30] WARNING[1213] channel.c: Hangup failed! Strange things may happen!
[2009-01-13 13:58:30] WARNING[1213] channel.c: Failed to perform masquerade
[2009-01-13 13:58:30] WARNING[1213] channel.c: Channel 'SIP/2332-081d0108' may not have been hung up properly

and:

[2009-01-23 14:27:17] WARNING[21528] channel.c: Fixup failed on channel SIP/2332-083c3778<MASQ>, strange things may happen.
[2009-01-23 14:27:17] WARNING[21528] channel.c: Hangup failed! Strange things may happen!
[2009-01-23 14:27:17] WARNING[21528] channel.c: Failed to perform masquerade
[2009-01-23 14:27:17] WARNING[21528] channel.c: Channel 'SIP/2332-083c3778' may not have been hung up properly

Both times all SIP channels got stuck and the CLI became unresponsive. Existing calls continued for a while, but new SIP calls could not be established.

The second time this happened, the SIP phones could no longer subscribe to Asterisk, and a few minutes later the log filled with:

[2009-01-23 14:43:21] ERROR[22319] chan_sip.c: Call to peer '2333' rejected due to usage limit of 10

On the CLI one could see hundreds of (rejected) calls to these SIP phones.

The phones that show up in the ERROR messages are in a group call made by a Dial(Local/...&Local.../&Local/...) construct, but other SIP phones were affected as well. It seemed as if the whole chan_sip module had become stuck. I also could not "unload chan_sip.so", but I can't remember the exact error message it gave.

The only option left was to restart Asterisk.

Can someone give me a clue as to what the 'Fixup failed ...' and 'masquerade' warnings actually mean?

Any help appreciated.
Udo
Grygoriy Dobrovolskyy
2009-Jan-24 14:20 UTC
[asterisk-users] Asterisk freezes with Fixup failed on channel SIP/...<MASQ>
Copied and pasted from freeswitch.org:

Asterisk uses a modular design in which a central core loads shared objects, known as "modules", to extend its functionality. Modules are used to implement specific protocols such as SIP, to add applications such as custom IVRs, and to tie in external interfaces such as the Manager Interface.

The core of Asterisk uses a threading model, but a very conservative one. Only originating channels and channels executing an application have threads. The B leg of any call operates within the same thread as the A leg, and when something like a call transfer happens, the channel must first be moved to a threaded mode. This often involves a practice called channel masquerade, a process in which all the internals of a channel are torn from one dynamic memory object and placed into another. It is a practice that was once described in the code comments as "nasty". The same went for the opposite operation: the thread was discarded by cloning the channel and letting the original hang up, which also required hacking the CDR structure so that the result was not seen as a new call. Because of this, one will often see 3 or 4 channels up for a single call during a call transfer.

/* XXX This is a seriously wacked out operation. We're essentially putting the guts of the clone channel into the original channel. Start by killing off the original channel's backend. I'm not sure we're going to keep this function, because while the features are nice, the cost is very high in terms of pure nastiness. XXX */

This became the de facto way to pull a channel out of the grip of another thread, and the source of many headaches for application developers. This uncertain threading scheme was one of the motivating factors for a rewrite.

Asterisk uses linked lists to manage its open channels.
A linked list is a series of dynamically allocated objects chained together by a structure that has a pointer to its own type as one of its members, which lets you chain objects endlessly and keep track of them. Linked lists are indeed a useful programming practice, but in a threaded application they become very difficult to manage. One must use mutexes, a kind of traffic light for threads, to make sure only one thread ever has write access to the list; otherwise you risk one thread tearing a link out of the list while another is traversing it.

This also leads to horrible situations where one thread may be destroying or masquerading a channel while another is accessing it. The result is a segmentation fault, a fatal error that halts the program instantly, which in most cases means all your calls are lost. We've all seen the infamous "Avoiding initial deadlock" message, which is essentially an attempt to lock a channel 10 times and, if it still won't lock, to go ahead and forget about the lock.
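To make the linked-list and mutex description above more concrete, here is a minimal sketch in C using POSIX threads. It is not Asterisk's code: `struct chan`, `chan_add`, `chan_remove`, `chan_count` and `chan_lock_bounded` are hypothetical names invented for illustration. It only shows the general pattern of a list whose links are guarded by one mutex, plus a lock attempt that gives up after a fixed number of tries. Unlike the behaviour described above, this sketch reports failure to the caller instead of carrying on without the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical channel record; NOT Asterisk's struct ast_channel. */
struct chan {
    char name[64];
    pthread_mutex_t lock;  /* protects this channel's own fields */
    struct chan *next;     /* pointer to its own type: the list link */
};

/* List head and the single mutex that guards every link in the list. */
static struct chan *chan_list = NULL;
static pthread_mutex_t chan_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate a channel and push it onto the list under the list lock. */
struct chan *chan_add(const char *name)
{
    struct chan *c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    strncpy(c->name, name, sizeof(c->name) - 1);  /* calloc keeps it NUL-terminated */
    pthread_mutex_init(&c->lock, NULL);

    pthread_mutex_lock(&chan_list_lock);
    c->next = chan_list;
    chan_list = c;
    pthread_mutex_unlock(&chan_list_lock);
    return c;
}

/* Count channels; traversal holds the list lock so no link can vanish mid-walk. */
int chan_count(void)
{
    int n = 0;
    pthread_mutex_lock(&chan_list_lock);
    for (struct chan *c = chan_list; c; c = c->next)
        n++;
    pthread_mutex_unlock(&chan_list_lock);
    return n;
}

/* Unlink and free a channel. Returns 0 if it was found, -1 otherwise.
 * The caller must make sure no other thread still holds victim->lock. */
int chan_remove(struct chan *victim)
{
    int res = -1;
    pthread_mutex_lock(&chan_list_lock);
    for (struct chan **pp = &chan_list; *pp; pp = &(*pp)->next) {
        if (*pp == victim) {
            *pp = victim->next;
            res = 0;
            break;
        }
    }
    pthread_mutex_unlock(&chan_list_lock);
    if (res == 0) {
        pthread_mutex_destroy(&victim->lock);
        free(victim);
    }
    return res;
}

/* Try to lock a channel at most max_tries times, pausing 1 ms between tries.
 * Returns 0 with the lock held, or -1 if the lock was never obtained. */
int chan_lock_bounded(struct chan *c, int max_tries)
{
    struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 };
    for (int i = 0; i < max_tries; i++) {
        if (pthread_mutex_trylock(&c->lock) == 0)
            return 0;
        nanosleep(&pause, NULL);
    }
    return -1;
}
```

A caller would use `chan_lock_bounded(c, 10)` and treat a return of -1 as "leave this channel alone for now". Carrying on without the lock, as the "Avoiding initial deadlock" description suggests, is exactly the kind of shortcut that lets one thread modify a channel while another is still using it.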