Stefan Viljoen
2015-Nov-03 06:54 UTC
[asterisk-users] 1.8.32.3 - no timing indicated, tens of thousands of __sip_autodestruct error messages
Hi list

Just to let everybody know, I think I've got to the bottom of the above problem / error.

It turns out that the issues described in my previous post were caused by problems in an MSSQL database that the Asterisk 1.8.32.3 instance was writing to! Once I dropped the FreeTDS driver (while trying to diagnose what was going on), the Asterisk instance immediately started performing normally and the __sip_autodestruct error messages disappeared. Channels linked to calls that were hung up closed normally once again and everything was golden.

The problem on MSSQL turned out to be that some of the indices on the table Asterisk was writing to needed rebuilding, as the table had grown immensely since the Asterisk instance was created. An insert that would normally have taken about 500 milliseconds was now taking 30 seconds to a minute. So when a call finished, Asterisk attempted to write the call into the CDR database, but was extremely delayed by the slow database performance.

The __sip_autodestruct error messages apparently come from a watchdog thread / process that detects that channels which should be closed and gone have not been closed / disposed of. It posts a warning and then later tries to close the channels involved, which are still hanging around after the call they are linked to has finished.

The effect was that users could only make one call, then had to wait until the channels disappeared (sometimes up to a minute later, once the DB managed to finish inserting that call's details into the table with the index problems) before they could make a second call. Meanwhile the watchdog process complained loudly about channels that were hanging around when they should not be, as Asterisk waited for MSSQL.

So the fix was simply to rebuild the indexes on the relevant table in MSSQL. This definitively solved the problem: tens of thousands of calls have been made since the reindex and everything is fine.
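To illustrate the effect, here is a toy Python model. It is not Asterisk's actual code: every name in it (slow_insert, hangup_sync, hangup_queued, cdr_worker) is made up for this sketch, and the 0.2 second delay stands in for the 30 to 60 second inserts described above. The queued variant is only an assumption about what a decoupled design could look like, shown for contrast.

```python
import queue
import threading
import time


def slow_insert(record, delay=0.2):
    """Hypothetical stand-in for a CDR INSERT into a table with degraded indexes."""
    time.sleep(delay)
    return record


def hangup_sync(channel, active_channels, insert=slow_insert):
    """Release the channel only after the CDR write has returned."""
    start = time.monotonic()
    insert({"channel": channel})       # blocks for as long as the DB takes
    active_channels.discard(channel)   # channel lingers until this point
    return time.monotonic() - start


def hangup_queued(channel, active_channels, cdr_queue):
    """Hand the CDR to a background writer and release the channel at once."""
    start = time.monotonic()
    cdr_queue.put({"channel": channel})
    active_channels.discard(channel)
    return time.monotonic() - start


def cdr_worker(cdr_queue, written, insert=slow_insert):
    """Drain queued CDRs; a None record tells the worker to stop."""
    while True:
        record = cdr_queue.get()
        if record is None:
            break
        written.append(insert(record))


def demo():
    channels = {"SIP/100-0001", "SIP/100-0002"}

    # Synchronous path: the hangup takes as long as the insert does.
    sync_time = hangup_sync("SIP/100-0001", channels)

    # Queued path: the hangup returns immediately, the write happens later.
    cdr_queue = queue.Queue()
    written = []
    worker = threading.Thread(target=cdr_worker, args=(cdr_queue, written))
    worker.start()
    queued_time = hangup_queued("SIP/100-0002", channels, cdr_queue)
    cdr_queue.put(None)
    worker.join()

    return sync_time, queued_time, written, channels


if __name__ == "__main__":
    sync_time, queued_time, written, channels = demo()
    print(f"sync hangup:   {sync_time:.3f}s")
    print(f"queued hangup: {queued_time:.3f}s")
```

In the synchronous path the channel stays in the active set for the full duration of the insert, which matches what I saw: the slower the database, the longer finished calls kept their channels.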
This is interesting, as it seems that Asterisk does NOT use FreeTDS asynchronously, but synchronously as calls progress...

Anyway, hope this helps someone who runs into something similar - the databases you touch from the CDR subsystem in Asterisk must be -FAST- and able to accept inserts in milliseconds. Apparently any insert taking a second or more in a FreeTDS-accessed DB linked to Asterisk can start causing problems in Asterisk itself.

Regards
Stefan