I'm trying to figure out what's causing an average system load of 3+ to 5+
on an Intel quad core. The server hosts 2 KVM guests (assigned 1 core and
2 cores respectively), each of which is lightly loaded (0.1~0.4). Both the
guests and the host are running 64-bit CentOS 5.6.

Originally I suspected maybe it's i/o, but on checking there is very little
i/o wait % as well. There is plenty of free disk space available on all
physical drives, memory is sufficient for the usage, and barely any swap is
in use.

While performance/responsiveness of the host and guests doesn't seem to be
affected, I'm still concerned about these odd load figures. Would
appreciate it if anybody can suggest what else I should be looking at?

Output of various commands on the host:

top
===
top - 15:15:39 up 5 days, 8:48, 1 user, load average: 4.76, 4.18, 4.43
Tasks: 210 total, 1 running, 209 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.1%us, 2.5%sy, 0.0%ni, 76.4%id, 17.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem:   3759496k total, 2813076k used,  946420k free, 357300k buffers
Swap:  8193016k total,    5648k used, 8187368k free, 736640k cached

free -m
=======
             total       used       free     shared    buffers     cached
Mem:          3671       2750        921          0        348        719
-/+ buffers/cache:       1682       1989
Swap:         8000          5       7995

vmstat 3
========
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo    in    cs us sy id wa st
 2  0   5648 949072 357096 735788    0    0   105    50    17    22  2  1 95  1  0
 0  0   5648 949428 357104 735788    0    0    44  2682 14142 14697  6  4 84  6  0
 0  0   5648 949304 357104 735788    0    0     1   324 14047 14753  2  1 96  1  0
 1  1   5648 946916 357104 735788    0    0     0   215 14410 14496  2  2 90  6  0
 0  0   5648 946520 357104 735788    0    0    23   267 13703 14664  2  2 91  6  0

sar
===
02:40:01 PM  CPU  %user  %nice  %system  %iowait  %steal  %idle
02:50:01 PM  all   2.17   0.00     2.18     4.30    0.00  91.35
03:00:01 PM  all   2.47   0.00     2.23     3.57    0.00  91.73
03:10:01 PM  all   2.29   0.00     2.07     3.77    0.00  91.87
03:20:01 PM  all   2.63   0.00     2.07     3.28    0.00  92.03
Average:     all   2.39   0.00     1.95     4.77    0.00  90.89
Hello Emmanuel,

On Wed, 2011-06-08 at 15:26 +0800, Emmanuel Noobadmin wrote:
> Originally I suspected maybe it's i/o but on checking, there is very
> little i/o wait % as well.
> Cpu(s): 4.1%us, 2.5%sy, 0.0%ni, 76.4%id, 17.0%wa, 0.0%hi, 0.1%si, 0.0%st

17% i/o wait time seems a significant amount to me. I'm not sure if that's
unusual when running multiple VMs, but it's probably worth investigating a
bit further.

Regards,
Leonard.

--
mount -t life -o ro /dev/dna /genetic/research
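As a side note, the 17%wa in that single top snapshot can be spot-checked
without top at all. A minimal sketch (assumes Linux and a POSIX shell; it
reads the aggregate "cpu" counters from /proc/stat and ignores the
irq/softirq/steal fields, so it's an approximation):

```shell
#!/bin/sh
# Sample the aggregate "cpu" line in /proc/stat twice, one second apart,
# and compute the iowait percentage over that interval.
# /proc/stat fields: cpu user nice system idle iowait irq softirq steal ...
read -r _ u1 n1 s1 i1 w1 _rest < /proc/stat
sleep 1
read -r _ u2 n2 s2 i2 w2 _rest < /proc/stat
# Total jiffies elapsed across the five fields we sampled.
total=$(( (u2 + n2 + s2 + i2 + w2) - (u1 + n1 + s1 + i1 + w1) ))
# Guard against division by zero on a degenerate interval.
[ "$total" -gt 0 ] || total=1
iowait_pct=$(( 100 * (w2 - w1) / total ))
echo "iowait over last second: ${iowait_pct}%"
```

Running it in a loop gives a rough per-second iowait trend, which helps
tell a sustained i/o bottleneck apart from a momentary spike caught by top.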
On 06/08/11 02:26, Emmanuel Noobadmin wrote:
> Cpu(s): 4.1%us, 2.5%sy, 0.0%ni, 76.4%id, 17.0%wa, 0.0%hi, 0.1%si, 0.0%st
> 02:50:01 PM all 2.17 0.00 2.18 4.30 0.00 91.35
> 03:00:01 PM all 2.47 0.00 2.23 3.57 0.00 91.73

The top Cpu(s) line is averaged across all CPUs/cores. To display
individual CPUs/cores, press: 1
You'll likely see one CPU/core being pegged with iowait.

To identify the offending process within top, press: f j <Enter>
to display the P column (last used CPU). Watch top for a few minutes to see
what is using all of the disk i/o.

sar output is averaged over the 10-minute collection interval. For smaller
sar time slices, edit the cron file: /etc/cron.d/sysstat

Disks are often swamped by "two things happening at once":
  - backups
  - migrating a VM
  - database upgrades
  - .rrd average updates

--
Steven Tardy
Systems Analyst
Information Technology Infrastructure
Information Technology Services
Mississippi State University
sjt5 at its.msstate.edu
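For reference, the sysstat collector on CentOS 5 is driven by a cron entry
along these lines; shortening the interval gives sar finer-grained history.
A sketch only (the exact sa1 path varies by package and architecture, e.g.
/usr/lib/sa/sa1 on 32-bit vs /usr/lib64/sa/sa1 on 64-bit):

```
# /etc/cron.d/sysstat -- stock entry: one sample every 10 minutes
*/10 * * * * root /usr/lib64/sa/sa1 1 1

# finer-grained: one sample every minute
* * * * * root /usr/lib64/sa/sa1 1 1
```

After the change, `sar -u` (and `sar -d` for per-disk stats) will show
one-minute buckets instead of averaging bursts away over 10 minutes.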
On 06/08/2011 12:26 AM, Emmanuel Noobadmin wrote:
> I'm trying to figure out what's causing an average system load of 3+
> to 5+ on an Intel quad core

watch 'ps axf | awk "{ if ( \$3 !~ /S/ ) { print; } }"'

The processes that aren't sleeping count toward your load. The above
command will print non-sleeping processes. If I'm not mistaken, that will
tell you what's causing the load regardless of whether it's contention
over CPU, i/o, or other causes.

You can get similar data from "top" if you tell it to sort by process
status ( F, w, Enter ) and then reverse the sort ( R ).
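A slightly tighter variant of the same idea, as a sketch (uses ps's
standard -o output format; the trailing `=` suppresses the header): on
Linux, only runnable (R) and uninterruptible-sleep (D, typically disk
wait) processes are counted in the load average, so matching those two
states directly avoids false hits from other non-S codes:

```shell
# List processes in state R (runnable) or D (uninterruptible sleep).
# These are the two states Linux counts in the load average, so a load
# of 4-5 with mostly idle CPUs usually means lingering D-state processes.
ps -eo stat=,pid=,comm= | awk '$1 ~ /^[RD]/ { print }'
```

Run it a few times (or under watch) while the load is high; processes that
repeatedly show up in state D point at the i/o bottleneck.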