Douglas Mortensen
2008-Nov-10 17:10 UTC
[asterisk-users] Asterisk daemon dies about once per day
I have an Asterisk system where the asterisk daemon dies, typically at least once per day. It runs under the safe_asterisk wrapper, which automatically starts the daemon back up. We find this unacceptable, though, because active calls usually drop when the daemon dies, and we sometimes have to run asterisk -r -x "module reload" after the daemon restarts before everything works well again. Any help or insight would be greatly appreciated.

Here's an overview of our system.

Software
========
Distro: Trixbox CE 2.6.1.1 (CentOS 5)
Linux Kernel: 2.6.18-53.1.4.el5
Asterisk version: 1.4.21.2-2 (trixbox RPM)
asterisk-addons: 1.4.6 (trixbox RPM)
zaptel 1.4.11-1 (trixbox RPM)
zaptel-modules 1.4.11-1.2.6.18_53.1.4.el5 (trixbox RPM)

Hardware
========
Rhino Ceros III (2U short-depth server)
- Intel Desktop Board DG33FB
- Intel Pentium D 2.2GHz (E2200)
- 1GB RAM
- 80GB SATA HostRAID Mirror (RAID1)
- Rhino R1T1-EC Single T1 card (as PRI, using 4 channels + D)
- Rhino RCB8FXX/1 w 1 FXO Module (2 FXO ports total)

Zaptel
========
The cards we are using are mentioned above. Other than that, if it helps, here's what we're doing with our trunks. We are using 4 channels of the PRI (channels 1-4), plus the D-channel for signaling. The PRI is a U.S.-based T1. With the FXO ports, we are sharing one with a fax & credit card machine, and the other one is shared with a different fax, coming off of the fax's phone port (so there is pretty much no way for it to ever see or feel anything fax-related).

I've looked a bit at the asterisk/full and messages logs, but so far nothing stands out.

I did see in one forum where someone was having a similar problem, and it ended up being a problem with the FXO card sometimes detecting fax tones on the line. This is definitely a possibility with one of our FXO ports, but it surprises me that this could kill the asterisk daemon.

I did speak with Rhino tech support about these daemon restarts, and he told me this was normal. He said, "I've got an asterisk system at my house, and the daemon dies every few days. Just make sure it gets run from safe_asterisk, and you'll be fine". Initially I thought we could live with that, but because it does give us dropped calls, and we sometimes have to reload the asterisk modules afterward, I would like to see if we can pin it down and resolve it.

Based on forums I've seen, it seems to me that not everyone out there accepts their asterisk daemon restarting every day or two, and that it may not be a widespread problem. The only 2 other asterisk systems I've dealt with didn't seem to have this problem. One is my company's system, which is also Trixbox, which we've been on since May 2008. We don't use Zaptel there. We use SIP-based trunks with bandwidth.com. I wonder if there is something with our Zaptel interfaces causing this.

So, not being an asterisk guru here, I'd really appreciate any help.

Sincerely,
-
Doug Mortensen
Impala Networks Inc.
Steve Murphy
2008-Nov-10 17:50 UTC
[asterisk-users] Asterisk daemon dies about once per day
On Mon, 2008-11-10 at 10:10 -0700, Douglas Mortensen wrote:
> I have an asterisk system where the asterisk daemon dies typically at
> least once per day. It is running in the wrapper safe_asterisk, which
> automatically starts the daemon back up. But we find this unacceptable
> because when the daemon dies, we usually have active calls drop, and
> sometimes we have to run asterisk -r -x "module reload" after the
> daemon starts back up before everything is working well again. Any
> help or insight would be greatly appreciated.

Douglas--

Are you getting core files in /tmp? Getting a stack trace from them could be very.... informative!

If not, or there is no debug info in your asterisk, then I encourage you to recompile asterisk so that DONT_OPTIMIZE is turned off; and so your safe_asterisk script uses the g option to start asterisk, so it dumps core on a crash.

murf

--
Steve Murphy
Software Developer
Digium
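One way to pull a stack trace out of a core file is to run gdb in batch mode against the binary that crashed. The following is only a minimal sketch: the helper name gdb_backtrace_command and the default binary path /usr/sbin/asterisk are assumptions for illustration, and /tmp/core.12345 is a placeholder rather than a file from this thread.

```python
import shlex


def gdb_backtrace_command(core_file, binary="/usr/sbin/asterisk"):
    """Build a gdb batch command that prints backtraces from a core file.

    The default binary path is an assumption; point it at the asterisk
    executable that actually produced the core.
    """
    return [
        "gdb",
        "-batch",                      # run the -ex commands, then exit
        "-ex", "bt full",              # crashing thread, including locals
        "-ex", "thread apply all bt",  # one backtrace per thread
        binary,
        core_file,
    ]


if __name__ == "__main__":
    # Print the command line so it can be reviewed before running it.
    print(shlex.join(gdb_backtrace_command("/tmp/core.12345")))
```

The backtrace is far more readable when the binary carries debug information, which is why a build with debugging symbols matters here.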
We've found the Trixbox 2.6 series to be quite unstable, which is why we're still running 2.4 for those installations that require Trixbox.

Having someone tell you "your daemon crashing and restarting is normal" doesn't seem very bright. Wouldn't the proper path be to find out why it is crashing? I mean seriously, if the asterisk installation at *MY* house was crashing and restarting, who cares. But *YOURS*? That's a different story...

You should really dive into your logs to see where the problem is coming from. /var/log/asterisk/full should have what you want, but you'll need to do some heavy sifting, as it contains very verbose output.

Also, post this on the Trixbox forums. I'm sure someone over there can lend a hand as well.

Good luck!

Tim Nelson
Systems/Network Support
Rockbochs Inc.
(218)727-4332 x105
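The sifting through /var/log/asterisk/full can be partly automated. The sketch below keeps only the WARNING and ERROR entries from the tail of the log. It assumes each entry looks roughly like "[timestamp] LEVEL[id] file.c: message"; the function name notable_lines and the default of 200 lines are arbitrary choices for illustration, not anything prescribed in this thread.

```python
import re
from collections import deque

# Assumed entry shape: "[timestamp] LEVEL[id] file.c: message".
LEVEL_RE = re.compile(r"\]\s+(WARNING|ERROR)\[")


def notable_lines(lines, tail=200):
    """Return the WARNING/ERROR entries found in the last `tail` lines."""
    last = deque(lines, maxlen=tail)  # retains only the final `tail` lines
    return [line.rstrip("\n") for line in last if LEVEL_RE.search(line)]
```

To use it, pass an open file object, for example notable_lines(open("/var/log/asterisk/full", errors="replace")), and print the returned lines. Narrowing the window to the lines just before a crash usually makes the verbose output manageable.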
Douglas Mortensen
2008-Nov-26 19:59 UTC
[asterisk-users] Asterisk daemon dies about once per day
OK. I know it's been a few weeks since my original post. Things have been busy ;-)

Based on help from the trixbox forums and the asterisk-users mailing list, I have located 30 asterisk core dump files in /tmp. These date from 10/30/08 to 11/24/08. Today is 11/26/08. So this does agree with the intermittent nature of the problem. Many days there are no dumps. Other days there are 5, 7, 6, 1, 4, or 2 dumps.

I have used the viewcore tool as indicated on http://www.voip-info.org/wiki-Asterisk+debugging on one of the most recent dump files and posted the output here: http://kgotsi.com/static.php?page=static-asterisk-core-dumps

I really don't know what all of the output it produced means, so I'm relying on others with more expertise here to take a look and tell me what insight it may provide.

FYI, I have not recompiled asterisk yet. I'm a little nervous about the downtime that could result if the process doesn't go smoothly, but I'm definitely willing to do so if that's what is needed to fix this problem.

Also, today I changed a setting in FreePBX which may have disabled the fax functionality (as previously mentioned, at least one other trixbox user who had a problem similar to mine got it fixed by disabling faxing). The setting I changed was in the General Settings, under Fax Machine, changing the "Extension of fax machine for receiving faxes" from system to disabled. I do not know whether this effectively disables faxing, but it looks like it may.

Also, I have not yet used the strace tool. I did install it from the CentOS repositories and tried to use it, but I didn't have the syntax right. I may want to see if someone can assist me with the correct syntax to run it against my dump files (I believe this is what I'm supposed to do. Please let me know if I'm wrong here).

I have also looked at one of the asterisk full log files up to the point that the daemon died. It appears to me that this could also point to the fax functionality causing the issue. I have posted the 200 lines prior to the daemon dying at: http://kgotsi.com/static.php?page=static-asterisk-full-log

Again, any insight that someone could provide by examining this would be greatly appreciated.

So I guess this is about all of the information I have to post for now. Thanks in advance for any assistance.

Sincerely,
-
Doug Mortensen
Network Consultant
Impala Networks

Original Message
------------------------------
Message: 8
Date: Mon, 10 Nov 2008 12:52:22 -0700
From: Steve Murphy <murf at digium.com>
Subject: Re: [asterisk-users] Asterisk daemon dies about once per day
To: murf at digium.com, Asterisk Users Mailing List - Non-Commercial Discussion <asterisk-users at lists.digium.com>
Message-ID: <1226346742.13961.462.camel at digium2>
Content-Type: text/plain; charset="us-ascii"

On Mon, 2008-11-10 at 10:50 -0700, Steve Murphy wrote:
> Douglas--
>
> Are you getting core files in /tmp? Getting a stack trace from them
> could be very.... informative!
>
> If not, or there is no debug info in your asterisk, then I encourage you
> to recompile asterisk so that DONT_OPTIMIZE is turned off; and so your

uhhh, I mean "turned ON"... sorry

> safe_asterisk script uses the g option to start asterisk, so it dumps
> core on a crash.
>
> murf

--
Steve Murphy
Software Developer
Digium
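The day-by-day tally of core dumps described above can be reproduced with a short script, which makes it easier to line crash dates up against log entries. This is a minimal sketch: the helper names and the "core*" filename pattern are assumptions for illustration, and it groups files by modification time in the local time zone.

```python
import glob
import os
from collections import Counter
from datetime import date


def core_files(directory="/tmp", pattern="core*"):
    """List candidate core files; the "core*" pattern is an assumption."""
    return glob.glob(os.path.join(directory, pattern))


def dumps_per_day(paths):
    """Count files per calendar day, based on each file's modification time."""
    days = Counter()
    for path in paths:
        days[date.fromtimestamp(os.path.getmtime(path))] += 1
    return dict(sorted(days.items()))  # oldest day first
```

Calling dumps_per_day(core_files()) returns a mapping from date to the number of dumps written that day; days with no dumps simply do not appear.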