Alfred von Campe
2008-Feb-13 02:15 UTC
[CentOS] Strange performance issues under CentOS 5.1
I am still running CentOS 4.6 on our production systems, but I am starting to plan the upgrade to CentOS 5.1. I have one test system running 5.1 that has the exact same hardware configuration as my 4.6 test system. One of our builds runs about 6 times slower on the 5.1 system, even though it uses less overall CPU time. I first suspected something wrong with the disk, but the results from bonnie++ show that the 5.1 system is slightly faster:

Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                   -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
centos4.6      16G           35933  10 21301   5           46507   6  41.8   0

Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                   -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
centos5.1      16G           42015  14 21179   5           49863   4  91.6   0

Then I ran the build with "/usr/bin/time --verbose", and here are the results (first 4.6, then 5.1):

	Command being timed: "make"
	User time (seconds): 32.15
	System time (seconds): 3.52
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:35.88

	Command being timed: "make"
	User time (seconds): 22.05
	System time (seconds): 3.11
	Percent of CPU this job got: 11%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 3:31.35

As you can see from the above, there is a lot of idle time on the 5.1 system.
Finally, I ran the build with "strace -c", and here are the top ten lines of that output (again, 4.6 first and then 5.1):

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 53.81   16.804147       54916       306        58 waitpid
 34.75   10.853461       82851       131           wait4
  5.29    1.650844           9    177706    154581 open
  1.61    0.503701          15     34408           read
  0.91    0.283706          15     18607           write
  0.60    0.185894          12     14919     10364 stat64
  0.52    0.163340          10     16495      9079 access
  0.47    0.146933           7     20581           mmap2

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 60.07   15.173924       52687       288        58 waitpid
 38.50    9.724412       83831       116           wait4
  0.54    0.135194           7     19199     10705 access
  0.36    0.090850          54      1681      1334 execve
  0.27    0.067686           5     14423     10570 stat64
  0.11    0.027676           1     24832           read
  0.09    0.022339           0    155810    135765 open
  0.03    0.007617         159        48           unlink

Any suggestions as to what could possibly be causing this? I am fresh out of other ideas to try.

Alfred
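The "Percent of CPU" figure that /usr/bin/time reported can be recomputed from the user, system, and elapsed times; this quick awk sketch (not a command from the thread, just the 5.1 numbers plugged in) shows how little of the wall-clock time the build actually spent on a CPU:

```shell
# Recompute "Percent of CPU this job got" for the 5.1 run:
# (user + system) / elapsed, with 3:31.35 converted to 211.35 seconds.
user=22.05; sys=3.11; elapsed=211.35
awk -v u="$user" -v s="$sys" -v e="$elapsed" \
    'BEGIN { printf "%d%%\n", (u + s) / e * 100 }'
# prints 11% -- the other ~89% of the elapsed time was spent waiting
```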
William L. Maltby
2008-Feb-13 02:57 UTC
[CentOS] Strange performance issues under CentOS 5.1
On Tue, 2008-02-12 at 21:15 -0500, Alfred von Campe wrote:
> I am still running CentOS 4.6 on our production systems, but I am
> starting to plan the upgrade to CentOS 5.1. I have one test system
> running 5.1 that has the exact same hardware configuration as my 4.6
> test system. One of our builds runs about 6 times slower on the 5.1
> system, even though it uses less overall CPU time. I first suspected
> something wrong with the disk, but the results from bonnie++ show
> that the 5.1 system is slightly faster:
>
> Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
>                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> centos4.6      16G           35933  10 21301   5           46507   6  41.8   0
>
> Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
>                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> centos5.1      16G           42015  14 21179   5           49863   4  91.6   0
>
> Then I ran the build with "/usr/bin/time --verbose", and here are the
> results (first 4.6, then 5.1):
>
> 	Command being timed: "make"
> 	User time (seconds): 32.15
> 	System time (seconds): 3.52
> 	Percent of CPU this job got: 99%
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:35.88
>
> 	Command being timed: "make"
> 	User time (seconds): 22.05
> 	System time (seconds): 3.11
> 	Percent of CPU this job got: 11%
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 3:31.35
>
> As you can see from the above, there is a lot of idle time on the 5.1
> system.
> Finally, I ran the build with "strace -c", and here are the
> top ten lines of that output (again, 4.6 first and then 5.1):
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  53.81   16.804147       54916       306        58 waitpid
>  34.75   10.853461       82851       131           wait4
>   5.29    1.650844           9    177706    154581 open
>   1.61    0.503701          15     34408           read
>   0.91    0.283706          15     18607           write
>   0.60    0.185894          12     14919     10364 stat64
>   0.52    0.163340          10     16495      9079 access
>   0.47    0.146933           7     20581           mmap2
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  60.07   15.173924       52687       288        58 waitpid
>  38.50    9.724412       83831       116           wait4
>   0.54    0.135194           7     19199     10705 access
>   0.36    0.090850          54      1681      1334 execve
>   0.27    0.067686           5     14423     10570 stat64
>   0.11    0.027676           1     24832           read
>   0.09    0.022339           0    155810    135765 open
>   0.03    0.007617         159        48           unlink
>
> Any suggestions as to what could possibly be causing this? I am
> fresh out of other ideas to try.

Check BIOS settings? Are the memory timings (CAS, etc.) the same? Is the disk hardware the same and specified identically?

Presuming nothing turns up there, install the system accounting packages and run some SAR reports. You may see a clue in them.

Any "tweaks" on the old system you forgot to apply to the new one? Elevator changes, buffer flush intervals, etc.? Anything else noticeable on there that might cause it?

Presume the slowdown is caused by a process you are not looking at: the build "hangs" while some other process is waiting or tying up the CPU. Try running top. I also notice an execve shows up on the new system that is not on the old one. That makes one say "hmmm...".

What does swapon -s show? Is the system "seeing" the same amount of memory, or have BIOS settings on one machine reduced what is available?

If the new machine is all new equipment, open her up and reseat all connections, PCI cards, and memory sticks. Make sure all power connectors are well seated to the motherboard and drives.
Front side bus and memory speeds set the same in BIOS? That's all I can think of that may be even remotely related ATM.

> Alfred
> <snip sig stuff>

HTH
--
Bill
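The checks suggested above can be sketched as a few shell commands. This is a minimal sketch, not commands from the thread: /dev/sda is an assumed device name, and sar requires the sysstat package to be installed:

```shell
# Hedged sketch of the suggested sanity checks; sda is an assumption.
swapon -s || true                    # swap devices in use
grep MemTotal /proc/meminfo          # how much RAM the kernel actually sees
cat /sys/block/sda/queue/scheduler \
    2>/dev/null || true              # active I/O elevator, e.g. [cfq]
```

Comparing the MemTotal and elevator lines between the 4.6 and 5.1 boxes would quickly confirm or rule out the BIOS-memory and elevator theories.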