Muro, Sam
2010-Feb-09 04:42 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
Hi Team

Can someone advise me on how I can lower the load average on my Asterisk server?

dahdi-linux-2.1.0.4
dahdi-tools-2.1.0.2
libpri-1.4.10.1
asterisk-1.4.25.1

2 x TE412P Digium cards on ISDN PRI

I'm using the system as an IVR without any transcoding or bridging.

**************************************
top - 10:27:57 up 199 days,  5:18,  2 users,  load average: 67.75, 62.55, 55.75
Tasks: 149 total,   1 running, 148 sleeping,   0 stopped,   0 zombie
Cpu0  : 10.3%us, 32.0%sy,  0.0%ni, 57.3%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu1  : 10.6%us, 34.6%sy,  0.0%ni, 54.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 13.3%us, 36.5%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu3  :  8.6%us, 39.5%sy,  0.0%ni, 51.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  7.3%us, 38.0%sy,  0.0%ni, 54.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  : 17.9%us, 37.5%sy,  0.0%ni, 44.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  : 13.3%us, 37.2%sy,  0.0%ni, 49.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  : 12.7%us, 37.3%sy,  0.0%ni, 50.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3961100k total,  3837920k used,   123180k free,   108944k buffers
Swap:   779144k total,       56k used,   779088k free,  3602540k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  683 root      15   0 97968  36m 5616 S 307.7  0.9  41457:34 asterisk
17176 root      15   0  2196 1052  800 R  0.7  0.0   0:00.32 top
    1 root      15   0  2064  592  512 S  0.0  0.0   0:13.96 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   5:27.80 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.11 ksoftirqd/0
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
    5 root      RT  -5     0    0    0 S  0.0  0.0   1:07.67 migration/1
    6 root      34  19     0    0    0 S  0.0  0.0   0:00.09 ksoftirqd/1
    7 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/1
    8 root      RT  -5     0    0    0 S  0.0  0.0   1:16.92 migration/2
    9 root      34  19     0    0    0 S  0.0  0.0   0:00.03 ksoftirqd/2
   10 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/2
   11 root      RT  -5     0    0    0 S  0.0  0.0   1:34.54 migration/3
   12 root      34  19     0    0    0 S  0.0  0.0   0:00.15 ksoftirqd/3
   13 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/3
   14 root      RT  -5     0    0    0 S  0.0  0.0   0:54.66 migration/4
   15 root      34  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/4
   16 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/4
   17 root      RT  -5     0    0    0 S  0.0  0.0   1:39.64 migration/5
   18 root      39  19     0    0    0 S  0.0  0.0   0:00.21 ksoftirqd/5
   19 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/5
   20 root      RT  -5     0    0    0 S  0.0  0.0   1:06.27 migration/6
   21 root      34  19     0    0    0 S  0.0  0.0   0:00.03 ksoftirqd/6
   22 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/6
   23 root      RT  -5     0    0    0 S  0.0  0.0   1:23.24 migration/7
   24 root      34  19     0    0    0 S  0.0  0.0   0:00.17 ksoftirqd/7
   25 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/7
   26 root      10  -5     0    0    0 S  0.0  0.0   0:25.70 events/0
   27 root      10  -5     0    0    0 S  0.0  0.0   0:37.83 events/1
   28 root      10  -5     0    0    0 S  0.0  0.0   0:15.67 events/2
   29 root      10  -5     0    0    0 S  0.0  0.0   0:40.36 events/3
   30 root      10  -5     0    0    0 S  0.0  0.0   0:16.45 events/4
*********************************************

Thanks
Sam
Alex Balashov
2010-Feb-09 05:07 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
Do you want the advice in ALL CAPS?

--
Alex Balashov - Principal
Evariste Systems LLC
Tel : +1 678-954-0670
Direct : +1 678-954-0671
Web : http://www.evaristesys.com/
Steve Totaro
2010-Feb-09 05:19 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
On Mon, Feb 8, 2010 at 11:42 PM, Muro, Sam <research at businesstz.com> wrote:
> Hi Team
>
> Can someone advise me on how I can lower the load average on my Asterisk
> server?

Even though you shouldn't have to, have you rebooted? 200 days of uptime and this just started?

Have you recently updated the box?

ksoftirqd seems to have issues in some kernels. That is where I would start after restarting Asterisk and/or the server.

http://tinyurl.com/ygd2eha

Thanks,
Steve T
Tzafrir Cohen
2010-Feb-09 14:23 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
On Tue, Feb 09, 2010 at 07:42:48AM +0300, Muro, Sam wrote:
> top - 10:27:57 up 199 days,  5:18,  2 users,  load average: 67.75, 62.55, 55.75
> Tasks: 149 total,   1 running, 148 sleeping,   0 stopped,   0 zombie
> Cpu0  : 10.3%us, 32.0%sy,  0.0%ni, 57.3%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
> Cpu1  : 10.6%us, 34.6%sy,  0.0%ni, 54.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu2  : 13.3%us, 36.5%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
> Cpu3  :  8.6%us, 39.5%sy,  0.0%ni, 51.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu4  :  7.3%us, 38.0%sy,  0.0%ni, 54.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu5  : 17.9%us, 37.5%sy,  0.0%ni, 44.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu6  : 13.3%us, 37.2%sy,  0.0%ni, 49.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu7  : 12.7%us, 37.3%sy,  0.0%ni, 50.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st

The system is fairly loaded, but there are still plenty of idle CPU cycles. If we were in a storm of CPU-intensive processes, we would have expected many more "running" processes. Right now we have none (the single running process is 'top' itself).

> Mem:   3961100k total,  3837920k used,   123180k free,   108944k buffers
> Swap:   779144k total,       56k used,   779088k free,  3602540k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>   683 root      15   0 97968  36m 5616 S 307.7  0.9  41457:34 asterisk
> 17176 root      15   0  2196 1052  800 R  0.7  0.0   0:00.32 top
>     1 root      15   0  2064  592  512 S  0.0  0.0   0:13.96 init
>     2 root      RT  -5     0    0    0 S  0.0  0.0   5:27.80 migration/0

Processes seem to be sorted by size. You should have pressed 'p' to go back to sorting by CPU. Now we don't even see the worst offenders.

>     3 root      34  19     0    0    0 S  0.0  0.0   0:00.11 ksoftirqd/0
>     4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
>     5 root      RT  -5     0    0    0 S  0.0  0.0   1:07.67 migration/1
>     6 root      34  19     0    0    0 S  0.0  0.0   0:00.09 ksoftirqd/1
>     7 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/1
>     8 root      RT  -5     0    0    0 S  0.0  0.0   1:16.92 migration/2
>     9 root      34  19     0    0    0 S  0.0  0.0   0:00.03 ksoftirqd/2
>    10 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/2
>    11 root      RT  -5     0    0    0 S  0.0  0.0   1:34.54 migration/3
>    12 root      34  19     0    0    0 S  0.0  0.0   0:00.15 ksoftirqd/3
>    13 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/3
>    14 root      RT  -5     0    0    0 S  0.0  0.0   0:54.66 migration/4
>    15 root      34  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/4
>    16 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/4
>    17 root      RT  -5     0    0    0 S  0.0  0.0   1:39.64 migration/5
>    18 root      39  19     0    0    0 S  0.0  0.0   0:00.21 ksoftirqd/5
>    19 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/5
>    20 root      RT  -5     0    0    0 S  0.0  0.0   1:06.27 migration/6
>    21 root      34  19     0    0    0 S  0.0  0.0   0:00.03 ksoftirqd/6
>    22 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/6
>    23 root      RT  -5     0    0    0 S  0.0  0.0   1:23.24 migration/7
>    24 root      34  19     0    0    0 S  0.0  0.0   0:00.17 ksoftirqd/7
>    25 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/7
>    26 root      10  -5     0    0    0 S  0.0  0.0   0:25.70 events/0
>    27 root      10  -5     0    0    0 S  0.0  0.0   0:37.83 events/1
>    28 root      10  -5     0    0    0 S  0.0  0.0   0:15.67 events/2
>    29 root      10  -5     0    0    0 S  0.0  0.0   0:40.36 events/3
>    30 root      10  -5     0    0    0 S  0.0  0.0   0:16.45 events/4

Those are all kernel threads rather than real processes.

So I suspect one of two things:

1. You're right after such a storm. The load average will decrease sharply.

2. There are many processes hung in state 'D' (uninterruptible system call). If a process is hung in such a system call for long, it normally means a problem, e.g. disk-access issues which cause all processes trying to access a certain file to hang.

--
Tzafrir Cohen
icq#16849755              jabber:tzafrir.cohen at xorcom.com
+972-50-7952406           mailto:tzafrir.cohen at xorcom.com
http://www.xorcom.com     iax:guest at local.xorcom.com/tzafrir
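The second hypothesis above can be checked without scrolling through top. What follows is only a minimal sketch, not something posted in the thread: `count_states` is a hypothetical helper name, and it assumes a procps-style `ps` that accepts `-eLo stat=` (one state code per thread, no header). Threads are listed rather than processes because Asterisk is multithreaded and a single process row hides what its threads are doing.

```shell
#!/bin/sh
# count_states: hypothetical helper, not part of Asterisk or procps.
# Reads ps STAT codes on stdin (one per line) and counts them by their
# first letter: R = runnable, D = uninterruptible sleep, S = sleeping.
count_states() {
    awk 'NF { n[substr($1, 1, 1)]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Intended use on the affected box (assumes procps ps):
#   ps -eLo stat= | count_states
```

A large count next to `D` would support the hung-system-call theory, while a large count next to `R` would point at many runnable threads instead. On the sorting question, procps top sorts by %CPU with an uppercase `P` (Shift+P), and `top -H` shows individual threads.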
Stephen Davies
2010-Feb-09 15:56 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
On 9 February 2010 06:42, Muro, Sam <research at businesstz.com> wrote:
> Hi Team
>
> Can someone advise me on how I can lower the load average on my Asterisk
> server?
>
> dahdi-linux-2.1.0.4
> dahdi-tools-2.1.0.2
> libpri-1.4.10.1
> asterisk-1.4.25.1
>
> 2 x TE412P Digium cards on ISDN PRI
>
> I'm using the system as an IVR without any transcoding or bridging.
>
> **************************************
> top - 10:27:57 up 199 days,  5:18,  2 users,  load average: 67.75, 62.55, 55.75

Hi Sam!

Are there any side-effects from the high load average? The system doesn't seem to be CPU or disk bound from the look of the CPU stats. System %age is high, by the way (software echo cancellation?), and Asterisk is using a lot of CPU, which isn't surprising.

I'm guessing you are running 8 spans and 200+ calls into your IVR?

If the system is actually performing fine then I'd just say that there is something about the Asterisk threads that makes them look runnable and that accounts for the high load average. Is the IVR an AGI or FastAGI or what? The code path may have a "spinlock" logic to it that means that many threads are runnable but when scheduled just go back to sleep. That would account for a high load average with lots of spare CPU. If that's what is happening then I wouldn't worry much more about it.

Regards,
Steve

PS: Alex - why the dig about ALL CAPS? The post wasn't in caps?
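The gap between a high load average and idle CPUs is easier to see when the load figure is divided by the number of CPUs. On Linux the load average counts tasks that are runnable or in uninterruptible sleep, so it measures queue length, not CPU utilisation. The sketch below is illustrative only; `load_per_cpu` is a made-up helper name, and the live-use comment assumes `/proc/loadavg` and `getconf _NPROCESSORS_ONLN` are available on the host.

```shell
#!/bin/sh
# load_per_cpu: hypothetical helper. Prints LOAD / CPUS to two decimals.
#   $1 = a load-average figure, $2 = number of CPUs
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# With the figures from the posted top output (8 CPUs, 1-minute load 67.75):
#   load_per_cpu 67.75 8
#
# Live use on a Linux host (assumed paths and tools):
#   read one five fifteen rest < /proc/loadavg
#   load_per_cpu "$one" "$(getconf _NPROCESSORS_ONLN)"
```

For the posted numbers this comes to roughly 8.5 queued tasks per CPU while each CPU reports about 50% idle, which is the mismatch the reply above is pointing at.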
Robert Grignon
2010-Feb-09 17:24 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
Come on, was that necessary? He was asking for help and considers it an important issue... If you want to chastise the guy at least offer up a solution for him...

Sam - I have never tried this solution but Sangoma has a reference to this.

http://wiki.sangoma.com/files/wanpipe-linux-asterisk-tutorials/How_to_Reduce_Asterisk_System_Loads.pdf

-----Original Message-----
From: asterisk-users-bounces at lists.digium.com [mailto:asterisk-users-bounces at lists.digium.com] On Behalf Of Alex Balashov
Sent: Monday, February 08, 2010 11:07 PM
To: Asterisk Users Mailing List - Non-Commercial Discussion
Subject: Re: [asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75

Do you want the advice in ALL CAPS?
Muro, Sam
2010-Feb-10 06:37 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
Hi Steve

> Even though you shouldn't have to, have you rebooted? 200 days of
> uptime and this just started?

It seems this problem is common, as I have three boxes of the same capacity with exactly the same problem. So a reboot would only solve the problem for a while.

> Have you recently updated the box?

No.

> ksoftirqd seems to have issues in some kernels. That is where I would
> start after restarting Asterisk and/or the server.

Allow me to look at it and revert.

> http://tinyurl.com/ygd2eha
>
> Thanks,
> Steve T
Muro, Sam
2010-Feb-10 07:12 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
> System is fairly loaded, but there's still plenty of idle CPU cycles. If
> we were in a storm of CPU-intensive processes, we would have expected
> many more "running" processes. Right now we have none (the single
> process is 'top' itself).
>
> Processes seem to be sorted by size. You should have pressed 'p' to go
> back to sorting by CPU. Now we don't even see the worst offenders.

Tried option 'p' but it doesn't seem to exist. CentOS 5.3, kernel 2.6.18-128.

> Those are all kernel threads rather than real processes.
>
> So I suspect one of two things:
>
> 1. You're right after such a storm. The load average will decrease
> sharply.

What do you mean, Tzafrir? It's obvious that the effect increases with the number of active channels, e.g. at 90 channels the load average is 4, but at 235 channels the load average is 60+.

> 2. There are many processes hung in state 'D' (uninterruptible system
> call). If a process is hung in such a system call for long, it normally
> means a problem, e.g. disk-access issues which cause all processes
> trying to access a certain file to hang.

I presume this should happen if there is IRQ sharing between disks and cards, which isn't my case.
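The two data points in the message above can be put side by side as load per active channel. This is only a back-of-the-envelope sketch: `load_per_channel` is a hypothetical helper, 60 is taken as a lower bound for "60+", and two measurements are too few to establish a trend.

```shell
#!/bin/sh
# load_per_channel: hypothetical helper. Prints LOAD / CHANNELS to
# three decimals.
#   $1 = load average, $2 = number of active channels
load_per_channel() {
    awk -v load="$1" -v ch="$2" 'BEGIN { printf "%.3f\n", load / ch }'
}

load_per_channel 4 90     # prints 0.044
load_per_channel 60 235   # prints 0.255
```

If the load grew in proportion to the channel count, the two ratios would be about equal. Here the second is several times the first, which hints that something gets more expensive per call as concurrency rises, though it does not say what.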
Muro, Sam
2010-Feb-10 07:25 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
> Hi Sam!

Hello Steve!

> Are there any side-effects from the high load average? The system
> doesn't seem to be CPU or disk bound from the look of the CPU stats.
> System %age is high, by the way (software echo cancellation?), and
> Asterisk is using a lot of CPU, which isn't surprising.

Yes. Audio quality issues. I have enabled the hardware echo cancellation and configured:

echocancel=yes
echocancelwhenbridged=yes
echotraining=yes

> I'm guessing you are running 8 spans and 200+ calls into your IVR?

You are correct. 8 spans, which process up to 240 calls at peak time.

> If the system is actually performing fine then I'd just say that there
> is something about the Asterisk threads that makes them look runnable
> and that accounts for the high load average. Is the IVR an AGI or
> FastAGI or what?

I have the AGI scripts not as the IVR but to help populate the required information into a MySQL db. Probably here is where the problem lies: I have to connect to and disconnect from MySQL each time a call is made or a specific menu is selected.

Here is the script
*****
#!/usr/bin/perl -w
use strict;
use DBI();
use Scalar::Util qw/weaken/;

my $cdr_log_file = "/var/log/asterisk/ivr_log";
my $mysql_host = "cdr01";
my $mysql_db = "ivrcdrdb";
my $mysql_table = "tbl_ivrcdr_details";
my $mysql_user = "ivruser";
my $mysql_pwd = "a09876a";

my $sth;

my $data0= $ARGV[0];
my $data1= $ARGV[1];
my $data2= $ARGV[2];
my $data3= $ARGV[3];
my $data4= $ARGV[4];
my $data5= $ARGV[5];
my $data6= $ARGV[6];
my $data7= $ARGV[7];

# Connect to database
# print "Connecting to database...\n\n";
my $dbh = DBI->connect("DBI:mysql:database=$mysql_db;host=$mysql_host","$mysql_user","$mysql_pwd",{'RaiseError' => 1});

my $insert_str = "insert into $mysql_table (calldate, language, src, duration, accountcode, uniqueid, currentmenu, nextmenu) values (\"$data0\", \"$data1\", \"$data2\", \"$data3\", \"$data4\", \"$data5\", \"$data6\", \"$data7\");\n";
$sth = $dbh->prepare($insert_str);
$sth->execute();

# print "\n\nOK.\n";

$sth->finish();
$dbh->disconnect();

# Trying to resolve memory leak should it happen
delete($ARGV[0]);
delete($ARGV[1]);
delete($ARGV[2]);
delete($ARGV[3]);
delete($ARGV[4]);
delete($ARGV[5]);
delete($ARGV[6]);
delete($ARGV[7]);

exit;
*********************

> The code path may have a "spinlock" logic to it that means that many
> threads are runnable but when scheduled just go back to sleep. That
> would account for a high load average with lots of spare CPU. If that's
> what is happening then I wouldn't worry much more about it.
>
> Regards,
> Steve

Regards
Sam
RESEARCH
2010-Feb-10 10:23 UTC
[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
>> I have the AGI scripts not as the IVR but to help populate the required
>> information into a MySQL db. Probably here is where the problem lies: I
>> have to connect to and disconnect from MySQL each time a call is made or
>> a specific menu is selected.
>
> If I were you, and I am not and never will be, I would move over to
> fastagi and offload all that Perl and database stuff off to a
> designated server just to handle that stuff.
>
> I have had the EXACT same problem and that is how it was fixed,
> fastagi running to a Windows box that had a process developed (written
> in C something) by the M$ developers to hit the M$SQL databases.
>
> We were also doing a ton of things with the AMI which we figured out
> how to do the same end result without banging on the AMI, such as
> using call files rather than AMI to originate a call.
>
> Load avg dropped to one or under if I remember correctly.
>
> Thanks,
> Steve Totaro

Thank you Steve for your recommendation. Of course I have a separate server that is hosting the db, and I will consider doing FastAGI and see if it will help.

@Phil. The credentials displayed there are dummy, so don't worry unless you mean something else.

@Steve Edward. Can you share your C AGI code? I presume what you want me to do is rewrite the script in C and use it as a compiled binary.

@Tzafrir. How about this:

[ivr4 ~]# ps aux | grep D
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root      1975  0.0  0.0  3920  688 pts/4    S+   13:17   0:00 grep D
root      3413  0.0  0.0  1832  576 ?        Ss    2009  80:58 /usr/sbin/mDNSResponder -b -f /etc/services_mDNS

I have killed that process but no changes.

@All, looks like the conclusion has been made that this is to do with AGI. Let me address it and see how it reacts. I shall give feedback.

Thanks
Sam
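A note on the `ps aux | grep D` check above: `grep D` matches a capital D anywhere in the line, so it returned the header (PID, COMMAND), the grep command itself, and mDNSResponder, none of which is in state D (their STAT values are S+ and Ss). A stricter filter tests only the STAT column. The following is a sketch under assumptions: `dstate_rows` is a hypothetical helper name, and it relies on the usual procps `ps aux` layout in which STAT is the 8th field.

```shell
#!/bin/sh
# dstate_rows: hypothetical helper. Reads `ps aux` output on stdin and
# keeps the header line plus any row whose STAT column (field 8 in the
# usual procps layout) starts with D, i.e. uninterruptible sleep.
dstate_rows() {
    awk 'NR == 1 || $8 ~ /^D/'
}

# Intended use:
#   ps aux | dstate_rows
```

If only the header comes back, no process was in state D at that instant. Since the state is sampled at a single moment, repeating the command a few times under peak call volume gives a fairer picture than a single run.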