I've been using Asterisk for 3+ years now. I love it, but it doesn't love me back. :-)

It was crashing frequently and seemingly randomly prior to this latest upgrade. Not sure what version it was running prior to the upgrade (it was probably an old CVS HEAD from 2+ years ago). Anyway, it is currently running 1.4.21.2.

== Problem ==

The problem is that it's crashing for seemingly no reason at all: no errors on the console, no logs (that I can find), nothing in /var/lib/messages - it's puzzling! Management is screaming like banshees, calls are dropping like flies, and all hell is about to break loose if I can't stop Asterisk from crashing every couple of hours, taking down any Zaptel calls with it.

I've been thinking of switching over to CallWeaver, but I haven't got another Zaptel card to plug in to my testing box, so I'd like to just get Asterisk stabilized right now - but I'm at a loss as to even where to start.

== System Details ==

Running FC3, 2.6.9-1.667 kernel, 32 bit, with 256 MB RAM and a 20G hard drive. I've got two 4-port FXO cards in PCI slots.

lspci reports:
02:08.0 Communication controller: Tiger Jet Network Inc. Tiger3XX Modem/ISDN interface
02:0a.0 Communication controller: Tiger Jet Network Inc. Tiger3XX Modem/ISDN interface

[root@asterisk ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 1
model name      : Intel(R) Pentium(R) 4 CPU 1.50GHz
stepping        : 2
cpu MHz         : 1483.679
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 2924.54

Thanks for any help or advice anyone may have. Cheers!
-josiah

--
Josiah Bryan
IT Manager
Productive Concepts, Inc.
jbryan at productiveconcepts.com
(765) 964-6009, ext. 224
I've been running 1.4.21.2 on SUSE 11.0 for about 4 months. In my experience, the fewer database interfaces you can use, the more stable it will be.

-----Original Message-----
From: asterisk-users-bounces at lists.digium.com [mailto:asterisk-users-bounces at lists.digium.com] On Behalf Of Josiah Bryan
Sent: Thursday, February 05, 2009 8:57 AM
To: Asterisk Users Mailing List - Non-Commercial Discussion
Subject: [asterisk-users] Crash Hard, Crash Often

<snip>
Josiah Bryan wrote:
> I've been using asterisk for 3+ years now, I love it, but it doesn't love
> me back. :-)

The first place I usually start is with memtest86.

Doug

--
Ben Franklin quote:
"Those who would give up Essential Liberty to purchase a little Temporary Safety, deserve neither Liberty nor Safety."
<snip>
Problem is that it's crashing for seemingly no reason at all, no errors on the console, no logs (that I can find), nothing in /var/lib/messages - it's puzzling! Management is screaming like banshees, calls are dropping like flies, and all hell is about to break loose if I can't stop asterisk from crashing every couple of hours, taking down any Zaptel calls with it.
</snip>

I am assuming you have debug turned on so that you can see what's going on when it crashes? If not, open the * console (asterisk -r) and type (core set verbose 100) and (core set debug 100). Then leave the console open so you can see if * was doing anything special when it crashed.

--Dave
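The same two commands can also be issued without sitting at the interactive console. This is only a minimal sketch, assuming a root shell on the Asterisk box and that `asterisk -rx` is available to run a single CLI command against the running daemon; the function name `raise_logging` is made up for illustration and is not part of Asterisk.

```shell
#!/bin/sh
# Hypothetical helper (not part of Asterisk): raise the verbose and debug
# levels of a running Asterisk using the 1.4-style CLI commands quoted above.
# The level defaults to 100 when no argument is given.
raise_logging() {
    level="${1:-100}"
    asterisk -rx "core set verbose $level"
    asterisk -rx "core set debug $level"
}

# Usage (as root on the Asterisk host):
#   raise_logging 100
```

Raising the levels this way does not capture anything by itself: a console still has to stay open, or logger.conf has to send verbose and debug output to a file, for the last lines before a crash to be visible.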
Roderick A. Anderson wrote:
> Doug Lytle wrote:
>> Josiah Bryan wrote:
>>> I've been using asterisk for 3+ years now, I love it, but it doesn't love
>>> me back. :-)
>>
>> The first place I usually start is with memtest86
>
> Hear, hear!
>
> Every time I have had problems with a system (not just Asterisk)
> crashing and there is nothing in the logs it turns out to be hardware.
>
> One slight exception was where a UPS would brown out every so often
> causing the system to go out to lunch. Even though there were three
> power supplies in that system someone (not me) had _forgot_ to put them
> on separate UPS' ... they were all on one.
>
> Actually, that is hardware, just not in the system case.
>
> So check your hardware.

I must admit, I've suspected hardware as well - the individual FXO "chips" on the two TDM400P have slowly gone dead till I only have four working FXO chips between 8 slots - that's fine, since I only have four POTS lines right now, but still a bit annoying. They are 2 - 3 yrs old, so I guess it's just their time.

How would I go about pinpointing / diagnosing the hardware fault? Not sure exactly what to do with memtest86 - any pointers?

Thanks!
-josiah
David Gibbons wrote:
> <snip>
> Problem is that it's crashing for seemingly no reason at all, no errors
> on the console, no logs (that I can find), nothing in /var/lib/messages
> - it's puzzling! Management is screaming like banshees, calls are
> dropping like flies, and all hell is about to break loose if I can't
> stop asterisk from crashing every couple of hours, taking down any
> Zaptel calls with it.
> </snip>
>
> I am assuming you have debug turned on so that you can see what's going on when it crashes? If not, open the * console (asterisk -r) and type (core set verbose 100) and (core set debug 100). Then leave the console open so you can see if * was doing anything special when it crashed.

I've run with verbose quite high lately, but haven't left debug on. Well, I just opened the console and turned debug on to 100, so we'll wait and see what it shows next time it crashes. It's due for another any time now...

-josiah
It *is* doing MySQL CDR and a whole host of custom AGI scripts: AGI to munge the CID, AGI to handle receptionist routing/selections, AGI for voicemail (not using the builtin vm app). All the AGI scripts open MySQL connections.

Would the CDR connection be a problem?

-josiah

Danny Nicholas wrote:
> I've been running 1.4.21.2 on SUSE 11.0 for about 4 months. In my
> experience, the fewer database interfaces you can use, the more stable it
> will be.
>
> <snip>
Josiah Bryan wrote:
> David Gibbons wrote:
>> <snip>
>> I am assuming you have debug turned on so that you can see what's going on when it crashes? If not, open the * console (asterisk -r) and type (core set verbose 100) and (core set debug 100). Then leave the console open so you can see if * was doing anything special when it crashed.
>
> I've run with verbose quite high lately, but haven't left debug on. Well,
> I just opened the console and turned debug on to 100, so we'll wait and see
> what it shows next time it crashes. It's due for another any time now...

Alright, the latest console output right before the latest crash shows:

  == Parsing '/etc/asterisk/manager.conf': Found
  == Manager 'script' logged on from 10.10.9.5
    -- Executing [954@playground:1] AGI("Local/954@playground-604a,2", "paging-hack.pl") in new stack
    -- Launched AGI Script /var/lib/asterisk/agi-bin/paging-hack.pl
       > Channel Local/954@playground-604a,1 was answered.
  == Manager 'script' logged off from 10.10.9.5
    -- Executing [664@playground:1] Answer("Local/954@playground-604a,1", "") in new stack
    -- Executing [664@playground:2] PlayTones("Local/954@playground-604a,1", "750+440+1030+3000+5000+15000") in new stack
    -- Executing [664@playground:3] Wait("Local/954@playground-604a,1", "2") in new stack
  == Parsing '/etc/asterisk/manager.conf': Found
  == Manager 'script' logged on from 127.0.0.1
    -- Executing [952@paging:1] Playback("Local/952@paging-7883,2", "beep") in new stack
       > Channel Local/952@paging-7883,1 was answered.
    -- Executing [951@playground:1] MeetMe("Local/952@paging-7883,1", "951|qaA") in new stack
  == Manager 'script' logged off from 127.0.0.1
    -- AGI Script paging-hack.pl completed, returning 0
    -- Executing [954@playground:2] Goto("Local/954@playground-604a,2", "conferences|951|1") in new stack
    -- Goto (conferences,951,1)
    -- Executing [951@conferences:1] MeetMe("Local/954@playground-604a,2", "951|qaA") in new stack
  == Parsing '/etc/asterisk/meetme.conf': Found
  == Parsing '/etc/asterisk/meetme.conf': Found
    -- Created MeetMe conference 1023 for conference '951'
    -- <Local/952@paging-7883,2> Playing 'beep' (language 'en')
[Feb 5 11:29:03] WARNING[24824]: file.c:1204 waitstream_core: Unexpected control subclass '-1'
    -- Executing [952@paging:2] Dial("Local/952@paging-7883,2", "Console/dsp") in new stack
 << Call placed to 'dsp' on console >>
 << Auto-answered >>
    -- Called dsp
    -- ALSA/default answered Local/952@paging-7883,2
asterisk*CLI>
Disconnected from Asterisk server
Executing last minute cleanups
Asterisk cleanly ending (0).
[root@asterisk ~]#

I know that all looks a bit weird, but it's related to this problem I had last September:
http://lists.digium.com/pipermail/asterisk-users/2008-September/217822.html

My extensions.conf has the following notes:

; PAGING HACK
; AGI script: paging-hack.pl is called when user dials 249
; The script puts the user in 951, then calls the Console into
; 951, and starts a fork monitoring the users leg of the call -
; as soon as the user hangs up, the fork automatically
; hangs up the Console.
; ????? WHY ?????
; Well, simple, as of version 1.4.21.2 of asterisk,
; when a user dialed 249 and got the Console directly,
; after the user hung up, ringing tone was heard over
; the console until I manually typed 'hangup' in the
; asterisk console - even then, asterisk said 'no calls to hangup'
; The mailing list was no help, so I wrote paging-hack.pl as a,
; well, a hack to get it to a point where paging still worked.
exten => 951,1,MeetMe(951|qaA)

So, 249 does AGI(paging-hack.pl), and from there, the user and the Console are dragged into a MeetMe conference for the user to speak his/her page. (The script doesn't actually do the hangup on the console - it just leaves the console active for the next page.)

So, anyway, that's the output right before the last crash - any ideas as to why, based on that info?

Thanks!
-josiah
Doug Lytle wrote:
> Josiah Bryan wrote:
>> How would I go about pinpointing / diagnosing the hardware fault? Not
>> sure exactly what to do with memtest86 - any pointers?
>
> A lot of distros have memtest86 as a boot option on the CD/DVD. You
> select it and let it run. It'll scan for bad memory. And, shoot lots
> of red errors when encountered. If the memory checks fail, you'll know
> that you need to replace the chip.

Ahhhhh I see! Gotcha. I'll try to run that tonight or this weekend, when the plant is closed.

Thanks!
-josiah
Wilton Helm wrote:
> One relevant question that hasn't been addressed is whether just the
> application is crashing or the whole computer (Linux).
>
> I would second the hardware idea, with emphasis on generic hardware,
> especially RAM. I had a Suse 10 box that kept crashing and doing funny
> stuff. I ended up running an extended RAM test on it--one of those
> pattern sensitivity tests that takes an hour or two to run. Turned out
> that one of the SIMMs I had just bought and installed had a subtle
> problem. It would never show up on a straightforward test, but certain
> address ranges would fail on one or two of the exotic pattern tests.
>
> It came from a reputable vendor who does 100% testing themselves, so it
> was apparently subtle enough to slip through their test. They took it
> back and replaced it. I ran for a few weeks without the module with no
> crashes and when I put the replacement in everything was still fine.
>
> Wilton

Just the application crashes. I'll try changing the RAM SIMMs to see if that helps.

Thanks!
-josiah
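Wilton's question (application or whole machine?) can be answered from a log rather than from memory. The following is a rough sketch only, assuming a Linux box with `pgrep` (procps) and `/proc/uptime`; the function name `log_status` and the log path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one line saying whether a process named exactly
# "asterisk" exists, plus the kernel uptime in seconds (first field of
# /proc/uptime). "unknown" is printed if /proc/uptime cannot be read.
log_status() {
    if pgrep -x asterisk >/dev/null 2>&1; then
        state="running"
    else
        state="NOT running"
    fi
    up="$(cut -d' ' -f1 /proc/uptime 2>/dev/null)"
    echo "$(date '+%Y-%m-%d %H:%M:%S') asterisk=$state uptime_s=${up:-unknown}"
}

# Usage: append a line every minute, e.g. from a loop or cron:
#   while true; do log_status >> /tmp/asterisk-status.log; sleep 60; done
```

If the uptime keeps growing while the state flips to "NOT running", only the Asterisk process died; if the uptime drops back to a small number, the whole machine restarted.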
Josiah Bryan wrote:
> <snip>
> Problem is that it's crashing for seemingly no reason at all, no errors
> on the console, no logs (that I can find), nothing in /var/lib/messages
> - it's puzzling! Management is screaming like banshees, calls are
> dropping like flies, and all hell is about to break loose if I can't
> stop asterisk from crashing every couple of hours, taking down any
> Zaptel calls with it.
> </snip>

That description reminds me of a problem I ran into a while back. One fan had quietly failed, and the temperature would slowly creep up inside the box until things started 'acting funny' and the box would lock up soon after. It'd run fine for 3-4 hours, then just keel over and die. The logs didn't show anything consistent just before the event.

The failing FXO modules are also an interesting symptom. Perhaps your power supply is misbehaving? Is the power supply in that machine of good quality?

I've never experienced it, but a friend has had two motherboards become unstable within a couple of months of each other, after running fine for 3-4 years. When he examined the motherboards, both had capacitors around the CPU that had visibly 'ballooned' like a leaking alkaline battery would. A long shot, but another example of previously stable hardware ceasing to be so.

Do you have another PC you can swap the drive and cards into, to try to rule out hardware instability? Could you run lm_sensors (along with one of the logging/alarm packages that support it)?

-- Paul
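For the lm_sensors suggestion, even without a dedicated logging/alarm package a timestamped dump of the readings can show whether temperatures or voltages drift before a crash. This is a minimal sketch, assuming lm_sensors is installed and already configured (for example via `sensors-detect`); the function name `sensor_snapshot` and the log path are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a timestamp header followed by whatever the
# `sensors` command reports. The output format of `sensors` depends on the
# hardware, so it is logged as-is rather than parsed.
sensor_snapshot() {
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
    sensors 2>&1 || echo "sensors command failed or is not installed"
}

# Usage: take a snapshot every five minutes:
#   while true; do sensor_snapshot >> /tmp/sensors.log; sleep 300; done
```

Comparing the last few snapshots before a crash against those from a quiet period is a cheap first check for the failed-fan or misbehaving-power-supply scenarios described above.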
Paul Chambers wrote:
> Josiah Bryan wrote:
>> <snip>
>> Problem is that it's crashing for seemingly no reason at all, no errors
>> on the console, no logs (that I can find), nothing in /var/lib/messages
>> - it's puzzling! Management is screaming like banshees, calls are
>> dropping like flies, and all hell is about to break loose if I can't
>> stop asterisk from crashing every couple of hours, taking down any
>> Zaptel calls with it.
>> </snip>

<snip>

> That description reminds me of a problem I ran into a while back. One
> fan had quietly failed, and the temperature would slowly creep up inside
> the box until things started 'acting funny' and the box would lock up
> soon after. It'd run fine for 3-4 hours, then just keel over and die.
> The logs didn't show anything consistent just before the event.

The weird thing is that it's *just* the asterisk process that dies - the rest of the system stays solidly up...

<snip>

> Do you have another PC you can swap the drive and cards into, to try to
> rule out hardware instability? could you run lm_sensors? (along with one
> of the logging/alarm packages that support it).

Well, Paul, it looks like that was indeed the problem (hardware instability). I came into the office last night after everyone left in order to swap out the RAM in the server - lo and behold, I didn't have any of that type of RAM around (RIMMs ??), so I had to do an emergency hard drive & PCI card transplant to a similar chassis. After a bit of tweaking to get ALSA to work right and the NIC to play nice in the new chassis, asterisk came online and worked beautifully. (And, shockingly enough, the zaptel cards just *worked* - no tweaking needed!)

So far, no crashes today (by this time, normally it's crashed two or three times already in a day). So, we'll see how she runs - if I were a betting man, I'd say that something in that old chassis was going out - probably the RAM as stated before, but not sure.

As far as the power supply being "good", I believe it was - didn't check. The server was a re-purposed high-end CAD workstation - the dismal RAM and CPU belie the solid construction of the chassis and the quality of the workmanship in the way the server was put together.

Now that I've waxed weird, I'll just say the hardware seems to have been the problem and I'll keep an eye on it. This may have yet saved me from converting over to the callweaver fork - we'll see. :-)

Cheers!
-josiah