Thanks for the info, Joshua.

Does PJSIP handle database access the same way chan_sip did? We had a number of boxes running chan_sip referencing the same MySQL server without issue.

We're going to attempt to get a backtrace on the next occurrence. We're also going to run a local copy of the database on the same physical Asterisk instance and have the system reference it, just to "throw everything at the wall".

*Nick Olsen*
Network Engineer
Office: 321-408-5000 x103
Mobile: 321-794-0763

On Mon, Mar 2, 2020 at 1:58 PM Joshua C. Colp <jcolp at sangoma.com> wrote:

> On Mon, Mar 2, 2020 at 2:52 PM Nick Olsen <nick at floridavirtualsolutions.com> wrote:
>
>> Hello All,
>>
>> I'm using Asterisk 16.8.0 on a CentOS 7 box. It was previously on 16.5.0,
>> but I recently upgraded to attempt to resolve this issue. Using bundled
>> PJSIP. The PBX is using MySQL realtime for most functions. The MySQL
>> server is on the same LAN as the Asterisk box.
>>
>> As more users have been moved to this box, it has become unstable.
>> Randomly, I'll start seeing "WARNING[12667] taskprocessor.c: The
>> 'pjsip/distributor-00000173' task processor queue reached 500 scheduled
>> tasks."
>>
>> At that time, running "pjsip show contacts" and "pjsip show endpoints"
>> returns nothing, and the box stops responding to all SIP.
>>
>> The only way I've found thus far to resolve the issue is a "service
>> asterisk restart".
>>
>> I can confirm that at the time of the issue, running "asterisk -x 'core
>> show taskprocessors' | grep 'distributor'" does show many items pending
>> across all queues, and the number just increases. Normally, when all is
>> fine, they're all at 0.
>>
>> Google-fu hasn't produced anything for me outside issues from 13.x that
>> claim to be resolved. Since Asterisk isn't fully crashing, I don't think
>> I can get a backtrace. Someone please correct me if I'm wrong.
>>
>> Any ideas? Tips?
>
> The wiki[1] has instructions for getting a backtrace for a deadlock from a
> running process. It can be used to isolate why things are blocked.
> Generally, though, when realtime is involved I've found that it usually
> ends up being the database or that interaction in some way. Any hiccup or
> issue there can result in blocking in Asterisk.
>
> [1] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace#GettingaBacktrace-GettingInformationForADeadlock
>
> --
> Joshua C. Colp
> Asterisk Technical Lead
> Sangoma Technologies
> Check us out at www.sangoma.com and www.asterisk.org
> --
> _____________________________________________________________________
> -- Bandwidth and Colocation Provided by http://www.api-digital.com --
>
> Check out the new Asterisk community forum at:
> https://community.asterisk.org/
>
> New to Asterisk? Start here:
> https://wiki.asterisk.org/wiki/display/AST/Getting+Started
>
> asterisk-users mailing list
> To UNSUBSCRIBE or update options visit:
> http://lists.digium.com/mailman/listinfo/asterisk-users
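For anyone who wants to catch this condition before the box stops answering SIP, the manual check from the original report ("core show taskprocessors" filtered for "distributor") could be scripted roughly as below. This is only a sketch: it assumes each data row of the CLI output is a processor name followed by five integer columns (processed, in queue, max depth, low water, high water), and the SAMPLE text is made-up illustration data, not output captured from the affected system. The function names are hypothetical.

```python
import subprocess


def parse_taskprocessors(output, name_filter="distributor"):
    """Return {processor_name: in_queue} for rows whose name contains name_filter.

    Assumed row layout (not confirmed in this thread):
    name, processed, in_queue, max_depth, low_water, high_water
    """
    depths = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a six-column data row.
        if len(fields) != 6 or name_filter not in fields[0]:
            continue
        try:
            depths[fields[0]] = int(fields[2])
        except ValueError:
            continue  # unexpected row; ignore rather than crash
    return depths


def backlogged(depths, threshold=0):
    """Keep only the processors whose queue depth exceeds threshold."""
    return {name: depth for name, depth in depths.items() if depth > threshold}


def read_taskprocessors():
    """Run the same CLI command quoted in the report (needs a running Asterisk)."""
    result = subprocess.run(
        ["asterisk", "-x", "core show taskprocessors"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample input, for illustration only.
SAMPLE = """\
Processor                                      Processed   In Queue  Max Depth  Low water High water
pjsip/distributor-00000172                          1042          0          3        450        500
pjsip/distributor-00000173                          9981        512        512        450        500
"""

if __name__ == "__main__":
    print(backlogged(parse_taskprocessors(SAMPLE)))
```

On a live system, `backlogged(parse_taskprocessors(read_taskprocessors()))` run from cron would give an early signal. Since the report says the queues normally sit at 0, a threshold of 0 sustained over a couple of consecutive runs seems a reasonable starting point.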
On Mon, Mar 2, 2020 at 4:24 PM Nick Olsen <nick at floridavirtualsolutions.com> wrote:

> Thanks for the info, Joshua.
>
> Does PJSIP handle database access the same way chan_sip did? We had a
> number of boxes running chan_sip referencing the same MySQL server without
> issue.
>
> We're going to attempt to get a backtrace on the next occurrence. We're
> also going to run a local copy of the database on the same physical
> Asterisk instance and have the system reference it, just to "throw
> everything at the wall".

It uses the same underlying API and layer. It can access the database more frequently, though, due to queries and because PJSIP is multithreaded.

--
Joshua C. Colp
Asterisk Technical Lead
Sangoma Technologies
Check us out at www.sangoma.com and www.asterisk.org
We ultimately found this to be a voicemail issue. The voicemail is held in MySQL as well (via ODBC). We found that the deadlock occurred when Asterisk attempted to play back a customer's voicemail unavail greeting. It happened immediately, every time, throwing the same "task processor" warnings and making PJSIP completely unresponsive.

We had imported a number of greetings from a legacy Asterisk system, and the vast majority of them worked. When we deleted the row containing the customer's unavail greeting (making Asterisk revert to reading out the mailbox number), all issues went away. If we re-record the customer's unavail greeting, it works fine and the problem doesn't recur. This was one out of ~250 voicemails imported.

Since then we've done a few more migrations, and they've all gone smoothly with the exception of the most recent one, where ~50% of the imported greetings caused Asterisk to deadlock. We now check them at the time of migration.

What I can't figure out is what it doesn't like about the greeting. It worked fine on the previous Asterisk system, and the row looks identical to a working one. The only thing I can guess is that something goes wrong with the blob for the recording. It would be nice if Asterisk handled that more gracefully.

I post this mostly for internet history, to hopefully help the next guy who has this same issue.

*Nick Olsen*
Network Engineer
Office: 321-408-5000 x103
Mobile: 321-794-0763
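The cause was never pinned down in this thread, so the following is only one way to sanity-check imported greeting blobs at migration time, following the guess above that something is wrong with the recording blob. It is a sketch that assumes the greetings are stored as WAV-family (RIFF) data; other storage formats have no RIFF header and would need a different check. The function name check_riff_blob is hypothetical, and fetching the blob from the database is left out. It verifies the RIFF/WAVE magic bytes and compares the size declared in the header with the actual blob length, which would flag a blob that was truncated during import.

```python
import io
import struct
import wave


def check_riff_blob(blob):
    """Return a list of problems found in a RIFF/WAVE blob (empty list = none found)."""
    if len(blob) < 12:
        return ["shorter than a 12-byte RIFF header"]
    problems = []
    # Header layout: 4-byte magic, little-endian uint32 size, 4-byte form type.
    magic, declared, form = struct.unpack("<4sI4s", blob[:12])
    if magic != b"RIFF":
        problems.append("missing RIFF magic")
    if form != b"WAVE":
        problems.append("form type is not WAVE")
    # The RIFF size field counts every byte after the first 8.
    if declared != len(blob) - 8:
        problems.append(
            f"header declares {declared} bytes, blob has {len(blob) - 8}"
        )
    return problems


def make_test_wav(seconds=1, rate=8000):
    """Build a small silent 16-bit mono WAV in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * rate * seconds)
    return buf.getvalue()


if __name__ == "__main__":
    good = make_test_wav()
    print(check_riff_blob(good))         # intact file
    print(check_riff_blob(good[:1000]))  # simulated truncation
```

A clean result here does not prove a greeting is safe to play, because the check only covers the header and overall length. It is just a cheap first filter to run before test-playing each imported greeting.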