Günter Zimmermann
2009-Nov-20  08:13 UTC
[CentOS-virt] steadily increasing/high loadavg without i/o wait or cpu utilization
Hi all,

I just installed the CentOS 5.4 Xen kernel on an Intel Core i5 machine as dom0. After some hours of syncing a RAID10 array (8 SATA disks) I noticed a steadily increasing loadavg. I think that without significant I/O wait or CPU utilization the loadavg on this system should be much lower. If this loadavg is normal I would be grateful if someone could explain why. The output below shows that there is neither much I/O wait nor much CPU utilization.

top - 09:10:25 up 9:26, 1 user, load average: 7.24, 7.63, 7.72
Tasks: 116 total, 4 running, 112 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 42.5%sy, 0.0%ni, 57.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 1.3%sy, 0.0%ni, 97.7%id, 0.0%wa, 0.0%hi, 0.0%si, 1.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 10.0%sy, 0.0%ni, 89.0%id, 0.0%wa, 0.0%hi, 0.0%si, 1.0%st
Mem: 7805952k total, 809612k used, 6996340k free, 112092k buffers
Swap: 0k total, 0k used, 0k free, 341304k cached

[root@vserver ~]# iostat -d -x sda sdb sdc sdd sde sdf sg sdh
Linux 2.6.18-164.6.1.el5xen (vserver.zimmermann.com) 20.11.2009

Device:   rrqm/s  wrqm/s      r/s   w/s    rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda      1364,57    0,66  1820,12  0,89  25477,44   12,37     14,00      1,11   0,61   0,41  75,10
sdb      1167,12    0,68  2017,45  0,89  25476,53   12,47     12,63      1,16   0,57   0,39  79,49
sdc      1308,06    0,66  1876,65  0,91  25477,64   12,50     13,58      1,14   0,61   0,42  78,73
sdd      1267,27    0,66  1917,28  0,92  25476,32   12,57     13,29      1,09   0,57   0,42  80,16
sde      1146,76    0,63  2037,99  0,87  25477,94   11,96     12,50      0,98   0,48   0,31  63,80
sdf      1126,88    0,64  2057,62  0,87  25475,99   12,02     12,38      1,08   0,52   0,34  69,89
sdh       472,21    0,66  2712,31  0,92  25476,13   12,43      9,39      0,66   0,24   0,15  41,03

[root@vserver ~]# cat /proc/loadavg
7.22 7.58 7.69 8/129 23348

Best regards,
Günter Zimmermann
Christopher G. Stach II
2009-Nov-20  08:43 UTC
[CentOS-virt] steadily increasing/high loadavg without i/o wait or cpu utilization
----- "Günter Zimmermann" <guenter.zimmermann at gmx.at> wrote:
> Hi all,
>
> I just installed the CentOS 5.4 Xen kernel on an Intel Core i5 machine as
> dom0. After some hours of syncing a RAID10 array (8 SATA disks) I noticed
> a steadily increasing loadavg. I think that without significant I/O wait
> or CPU utilization the loadavg on this system should be much lower. If
> this loadavg is normal I would be grateful if someone could explain why.
> The output below shows that there is neither much I/O wait nor much CPU
> utilization.

Do you have any zombie or D state processes? Try:

    ps -eo stat,command | awk '$1 ~ /^[DRZ]/{print}'

If you have any in D, you can use SysRq-T and/or the pid's wchan in /proc to figure out what they're doing, or dmesg to figure out where they may have barfed.

--
Christopher G. Stach II
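The wchan lookup mentioned above could be scripted roughly as follows. This is only a sketch: the function names `blocked_procs` and `show_wchan` are made up for illustration, and it assumes a procps-style `ps` that accepts `-o stat=,pid=,comm=` and a kernel that exposes `/proc/<pid>/wchan`. Command names containing spaces are cut at the first space.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any standard tool.

# Read lines of "STAT PID COMMAND" on stdin and print "PID COMMAND"
# for every process whose state starts with D (uninterruptible sleep).
blocked_procs() {
    awk '$1 ~ /^D/ { print $2, $3 }'
}

# Read "PID COMMAND" lines and append the contents of /proc/<pid>/wchan,
# i.e. the kernel function the task is sleeping in. If the file is
# missing, unreadable, or empty, print "?" instead.
show_wchan() {
    while read -r pid comm; do
        wchan=$(cat "/proc/$pid/wchan" 2>/dev/null)
        printf '%s %s %s\n' "$pid" "$comm" "${wchan:-?}"
    done
}

# Usage on a live Linux system:
#   ps -eo stat=,pid=,comm= | blocked_procs | show_wchan
```

Splitting the filter from the `/proc` lookup keeps the first half usable on saved `ps` output from another machine.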
Christopher G. Stach II
2009-Nov-20  09:41 UTC
[CentOS-virt] steadily increasing/high loadavg without i/o wait or cpu utilization
----- "Günter Zimmermann" <guenter.zimmermann at gmx.at> wrote:
> Thank you for your reply. I could find just one process in D and this is
> a delayed resync from another raid device. It is delayed because of the
> big resync in progress. I think this is not the problem. Could the
> running resync cause the high loadavg without showing up i/o wait or cpu
> utilization in top?

It sure can. Processes in the D and R states count toward your load average (on Linux, zombies do not). It looks like you should be somewhere approaching 8.

--
Christopher G. Stach II
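To see why a steady number of tasks in R or D state pulls the load average toward that number, here is a simplified model. It assumes the Linux scheme of sampling every 5 seconds and decaying the 1-minute average by e^(-5/60) per sample. The kernel does this in fixed-point arithmetic, so this floating-point version is an approximation, and `simulate_load` is a made-up name.

```shell
#!/bin/sh
# simulate_load N STEPS
# Hypothetical helper: models the 1-minute load average after STEPS
# samples (one sample per 5 seconds), starting from 0, with N tasks
# constantly in R or D state.
simulate_load() {
    # LC_ALL=C forces a decimal point regardless of the system locale.
    LC_ALL=C awk -v n="$1" -v steps="$2" 'BEGIN {
        decay = exp(-5 / 60)      # per-sample decay of the 1-min average
        load = 0
        for (i = 0; i < steps; i++)
            load = load * decay + n * (1 - decay)
        printf "%.2f\n", load
    }'
}

# Usage:
#   simulate_load 8 12     # one minute with 8 active tasks
#   simulate_load 8 720    # one hour with 8 active tasks
```

With 8 active tasks the model gives 5.06 after one minute and 8.00 after an hour, so a load hovering just under 8 after several hours of resync is what this model predicts for roughly eight busy or blocked tasks.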
Dennis J.
2009-Nov-20  12:32 UTC
[CentOS-virt] steadily increasing/high loadavg without i/o wait or cpu utilization
On 11/20/2009 09:13 AM, Günter Zimmermann wrote:
> Hi all,
>
> I just installed the CentOS 5.4 Xen kernel on an Intel Core i5 machine as
> dom0. After some hours of syncing a RAID10 array (8 SATA disks) I noticed
> a steadily increasing loadavg. I think that without significant I/O wait
> or CPU utilization the loadavg on this system should be much lower. If
> this loadavg is normal I would be grateful if someone could explain why.
> The output below shows that there is neither much I/O wait nor much CPU
> utilization.
>
> top - 09:10:25 up 9:26, 1 user, load average: 7.24, 7.63, 7.72
> Tasks: 116 total, 4 running, 112 sleeping, 0 stopped, 0 zombie
> Cpu0 : 0.0%us, 42.5%sy, 0.0%ni, 57.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu1 : 0.0%us, 1.3%sy, 0.0%ni, 97.7%id, 0.0%wa, 0.0%hi, 0.0%si, 1.0%st
> Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu3 : 0.0%us, 10.0%sy, 0.0%ni, 89.0%id, 0.0%wa, 0.0%hi, 0.0%si, 1.0%st
> Mem: 7805952k total, 809612k used, 6996340k free, 112092k buffers
> Swap: 0k total, 0k used, 0k free, 341304k cached
>
> [root@vserver ~]# iostat -d -x sda sdb sdc sdd sde sdf sg sdh
> Linux 2.6.18-164.6.1.el5xen (vserver.zimmermann.com) 20.11.2009
>
> Device:   rrqm/s  wrqm/s      r/s   w/s    rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
> sda      1364,57    0,66  1820,12  0,89  25477,44   12,37     14,00      1,11   0,61   0,41  75,10
> sdb      1167,12    0,68  2017,45  0,89  25476,53   12,47     12,63      1,16   0,57   0,39  79,49
> sdc      1308,06    0,66  1876,65  0,91  25477,64   12,50     13,58      1,14   0,61   0,42  78,73
> sdd      1267,27    0,66  1917,28  0,92  25476,32   12,57     13,29      1,09   0,57   0,42  80,16
> sde      1146,76    0,63  2037,99  0,87  25477,94   11,96     12,50      0,98   0,48   0,31  63,80
> sdf      1126,88    0,64  2057,62  0,87  25475,99   12,02     12,38      1,08   0,52   0,34  69,89
> sdh       472,21    0,66  2712,31  0,92  25476,13   12,43      9,39      0,66   0,24   0,15  41,03
>
> [root@vserver ~]# cat /proc/loadavg
> 7.22 7.58 7.69 8/129 23348

Does the process list show anything out of the ordinary?

You can get a bogus load indication if, for example, you have mounted an NFS share, turn the server off, and then run a couple of "ls <nfs mountpoint>" commands. In that case each ls process will hang and drive up the load, even though the processes don't actually cause any CPU or I/O pressure on the system.

Regards,
Dennis
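For reference when comparing load against what is actually running, the /proc/loadavg line quoted in this thread can be split into its fields. The sketch below follows the field layout described in proc(5): three load averages, runnable/total scheduling entities, and the most recently created PID. `parse_loadavg` and the output labels are made-up names.

```shell
#!/bin/sh
# parse_loadavg (hypothetical helper): read one /proc/loadavg line on
# stdin and print each field with a label.
parse_loadavg() {
    awk '{
        split($4, ent, "/")          # "8/129" -> runnable / total entities
        print "load1="    $1         # 1-minute load average
        print "load5="    $2         # 5-minute load average
        print "load15="   $3         # 15-minute load average
        print "runnable=" ent[1]     # currently runnable scheduling entities
        print "total="    ent[2]     # scheduling entities that exist
        print "last_pid=" $5         # PID of the most recently created process
    }'
}

# Usage on a live Linux system:
#   parse_loadavg < /proc/loadavg
```

The runnable field does not include tasks in uninterruptible sleep, so hung processes such as the stuck `ls` calls described above can raise the averages without raising that count.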