Hello,

I have strange behaviour on a server that I can't get a handle on. I have a
reasonably powerful server running VMware Server 1.0.4-56528. It has a RAID5
built with mdadm on 5 SATA drives, masses of RAM and 2 Xeon CPUs. But it
stutters.

Example: fire up vi, press and keep a finger on "i". After filling 2-3 lines,
the display stops for 2-12 seconds, then the characters continue. This happens
even on the host OS, at the console.

Host system running CentOS 5.2 x86-64:

  CPU : 2x Xeon E5430 @ 2.66GHz
  RAM : 24GB
  Mobo : DSBV-DX
   HD : 5 x SATA ST3750330AS 750GB in RAID5

There are 5 VMs, detailed at http://www.awale.qc.ca/vmware/stj1.txt to keep
this mail shorter.

Seems to me this system should be more than adequate to handle the load.

This is what vmstat on the host looks like when the server is "unhappy":
  http://www.awale.qc.ca/vmware/vmstat.txt
It is spending a lot of time in 'wa', but 'bo' and 'bi' are minuscule.

This looks like a disk problem. I grow to suspect that SATA isn't ready for
the big time. I also grow to dislike RAID5.

Questions:

- Does anyone have a clue or a pointer on how to track down my bottleneck?

- SATA NCQ is limited to a queue depth of 15. Is this per SATA port or per
  SATA chip? Or does this question make no sense?

- I realise there are more recent versions of CentOS out. Are there specific
  items in the changelogs that would affect my problem?

Thank you for any help,

-Philip
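On the NCQ question: the kernel exposes a per-device queue depth under sysfs,
so one way to see what depth each drive is actually running at (and to
experiment with lowering it) is a sketch like the one below. The device names
sda-sde are only assumed names for the five array members, and whether the
file is writable depends on the SATA controller driver:

    # show the current queue depth for each member of the array
    for d in sda sdb sdc sdd sde; do
        echo -n "$d: "; cat /sys/block/$d/device/queue_depth
    done

    # temporarily lower it on one drive (as root) to see whether latency changes
    echo 4 > /sys/block/sda/device/queue_depth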
Philip Gwyn <liste at artware.qc.ca> writes:

> The problem seems like a disk problem. I grow to suspect that SATA isn't
> ready for the big time. I also grow to dislike RAID5.

Personally, I will use RAID5, and I will use SATA, but I will not use SATA
with RAID5 except in 'tape replacement' roles. The weak bits of RAID5 (the
read/write cycle on sub-stripe writes) are often exacerbated by the weak bits
of SATA (slow seek time, slow rotational speed), creating a perfect storm of
suck.

Not to say that's your primary problem. Actually, it sounds a whole lot like
the problems I get with Xen on heavily used servers if I don't assign a core
exclusively to the dom0 (or at least give it a very high priority). But I
have little knowledge of or experience with VMware, so I don't know if you
have a similar problem.

--
Luke S. Crawford
http://prgmr.com/xen/ - Hosting for the technically adept
http://nostarch.com/xen.htm - We don't assume you are stupid.
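For anyone hitting the Xen variant of this, giving dom0 a dedicated core or a
larger scheduler share can be done roughly as follows with the xm tooling and
credit scheduler that ship with CentOS 5 Xen; the weight value and CPU numbers
are only examples:

    # pin dom0's vcpus to physical CPU 0
    xm vcpu-pin Domain-0 all 0

    # or just give dom0 a much larger share of CPU time
    # (the credit scheduler's default weight is 256)
    xm sched-credit -d Domain-0 -w 512

    # to keep guests off CPU 0, their domain config files can restrict
    # which physical CPUs they may run on, e.g.:
    #   cpus = "1-7"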
I have been using a 3ware 9690SE SATA card with RAID5. I have been running
CentOS using Xen and have had no problems. I wrote my virtual machines to the
raw RAID5 device. It seems to have worked fine for me.

On Thu, Sep 24, 2009 at 9:18 PM, Philip Gwyn <liste at artware.qc.ca> wrote:

> I have strange behaviour on a server that I can't get a handle on. I have a
> reasonably powerful server running VMware server 1.0.4-56528. It has a RAID5
> built with mdadm on 5 SATA drives. Masses of ram and 2 XEON CPUs. But it
> stutters.
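For reference, pointing a Xen guest at a raw block device (rather than a file-
backed disk image) is done with a phy: entry in the domain config. The device
path below is only an example; the actual device depends on how the array is
partitioned or carved up with LVM:

    # in /etc/xen/<guest> - hand the raw array device to the guest as xvda
    disk = [ 'phy:/dev/md0,xvda,w' ]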
On Thu, Sep 24, 2009 at 8:18 PM, Philip Gwyn <liste at artware.qc.ca> wrote:

> - I realise there are more recent versions of CentOS out. Are there
>   specific items in the changelogs that would affect my problem?

VMware Server 1.0.x was never supported on RHEL/CentOS 5.x, especially a
release as early as 1.0.4. Not that it can't be made to work, but it just
wasn't made for newer kernel versions.

We run up to 10 guests in VMware Server 1.0.9 on a single quad-core Xeon with
the host running CentOS 4 on SATA hardware RAID1. Admittedly, our guests are
pretty low-CPU, low-throughput, but it works just fine for us. If your guests
are not really hammering the disk system, then you may be on a wild goose
chase blaming RAID5.

In my time on the VMware forums, it was always suggested to use single-CPU
guests running non-SMP kernels with Server 1.0.x. It might help to convert
the one SMP guest you have.

If you can afford some downtime, reconfigure the host to use compatible
CentOS/VMware versions (4.x with 1.0.x, or 5.x with 2.x). At the very least,
get the latest VMware Server 1.0.9.

--
Jeff
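Checking whether a CentOS 4 guest is running the SMP kernel, and switching it
to the uniprocessor one, looks roughly like this; the kernel version shown is
only an example, and the grub step assumes the stock /boot/grub/grub.conf:

    # inside the guest: an "smp" suffix here means the SMP kernel is running
    uname -r                    # e.g. 2.6.9-89.ELsmp

    # install the uniprocessor kernel package alongside it
    yum install kernel

    # then point the "default=" line in /boot/grub/grub.conf at the
    # non-smp entry and reboot the guest

    # on the host, the guest's .vmx can also be forced to a single vcpu:
    #   numvcpus = "1"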
Hi,

On Thu, Sep 24, 2009 at 21:18, Philip Gwyn <liste at artware.qc.ca> wrote:

> The problem seems like a disk problem. I grow to suspect that SATA isn't
> ready for the big time. I also grow to dislike RAID5.
>
> Questions :
> - Anyone have a clue or other on how to track down my bottle neck?

You can use the command "iostat -kx 1 /dev/sd?", which will give you more
information about what is happening. In particular it will show %util, which
tells you how often the drive is busy, and you can correlate that with rkB/s
and wkB/s to see how much data is being read from or written to that specific
drive. You also get averages for request size (to know whether you have many
small operations or a few big ones), queue size, service time and wait time.
See "man iostat" for more details. It's not installed by default on CentOS 5
but it's available from the base repositories; just run "yum install sysstat"
if you don't have it yet.

If you are using RAID5 you might want to check whether the chunk size you are
using is a good fit. You can specify it when you create a new array using the
"-c" option to mdadm; I don't think you can change it after the array is
created. The default is 64kB, which sounds sane enough, but you might want to
check whether yours was created with that value or not.

The problem is basically this: if you have operations that are larger than
the chunk size, they will require operations on all the disks, which means
all of them will have to seek to a specific position to complete your
request, and while they are doing that they will not be able to work on any
other requests. If you have high usage and random access, the disks will
spend a lot of time seeking. If that is the case, you might want to increase
the chunk size so that most operations can be fulfilled by one disk only,
leaving the others free to work on other requests at the same time.

On the other hand, if specific areas of your filesystem that are hit more
often always fall on the same disk, that disk will be used more than the
other ones, so your performance will effectively be limited by that one disk
instead of multiplied by the number of disks due to the striped access. In
that case it might make sense to reduce the chunk size in order to spread the
access more evenly across disks.

I read some time ago that ext2/ext3 has a way of allocating blocks that can
create such an unfair distribution when you are striping across a certain
number of disks. I don't know exactly how that works, but you might want to
look into it. I remember that when you create the ext2/ext3 filesystem you
can use an option such as "stride=..." to give a hint about the disk layout
so that the filesystem can spread those blocks enough to balance the load
across the disks. But I could never exactly figure out what "stride=..."
number would make sense for me... the documentation is kind of scarce in this
area, but check the mke2fs manpage anyway if you have a disk that is more
"hot" than the others and you think that might be the problem.

You can also experiment with other filesystems such as XFS, which is
available in the extras repository.

And of course, make sure "cat /proc/mdstat" shows everything OK; make sure
you aren't running a degraded array before you start investigating its
performance.

I'm sure there are performance tunings that can be done with, e.g., hdparm,
tweaking numbers in the /proc and /sys filesystems, or changing the kernel
I/O scheduler, but I'm not really experienced with that so I couldn't really
advise you on that.
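To make the stride hint concrete: for this particular array (5-disk RAID5, so
4 data disks) with the default 64kB chunk and 4kB ext3 blocks, the numbers
work out as in the sketch below. The mdadm line only shows where the chunk
size gets set (recreating an array destroys it, so this is for a rebuild, not
an in-place change), and the stripe-width option needs a newer e2fsprogs than
stock CentOS 5 may ship, so check the mke2fs manpage before relying on it:

    # chunk size is fixed at array creation time (64kB shown here)
    mdadm --create /dev/md0 --level=5 --raid-devices=5 --chunk=64 \
          /dev/sd[abcde]1

    # stride       = chunk size / filesystem block size = 64kB / 4kB = 16
    # stripe-width = stride * number of data disks      = 16 * 4     = 64
    mke2fs -j -b 4096 -E stride=16,stripe-width=64 /dev/md0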
I'm sure others will have such experience and will be able to give you
pointers on that. You might want to ask on the main list in that case,
instead of the -virt one.

HTH,
Filipe
Philip Gwyn wrote:

> I have strange behaviour on a server that I can't get a handle on. I have a
> reasonably powerful server running VMware server 1.0.4-56528. It has a RAID5
> built with mdadm on 5 SATA drives. Masses of ram and 2 XEON CPUs. But it
> stutters.

This will double your memory usage, but it should fix your I/O.

Take a look at http://vmfaq.com/?View=entry&EntryID=25

In particular, putting your temporary directory in a ramdisk will improve
your I/O profile immensely.

Edit /etc/vmware/config and add:

  tmpDirectory = "/tmp/vmware"
  mainMem.useNamedFile = "FALSE"
  sched.mem.pshare.enable = "FALSE"
  MemTrimRate = "0"
  MemAllowAutoScaleDown = "FALSE"
  prefvmx.useRecommendedLockedMemSize = "TRUE"
  prefvmx.minVmMemPct = "100"

Edit /etc/fstab and add:

  tmpfs /tmp/vmware tmpfs defaults,size=100% 0 0

Edit /etc/cron.daily/tmpwatch and add '-x /tmp/vmware' to the tmpwatch
command line for /tmp.

Make your mount point for /tmp/vmware, mount /tmp/vmware, and restart VMware.

That is how I run my systems.

--
Benjamin Franz
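Spelling out those last steps, the command sequence on the host would look
roughly like this (assuming VMware Server's stock "vmware" init script):

    mkdir -p /tmp/vmware
    mount /tmp/vmware            # picks up the new /etc/fstab entry
    df -h /tmp/vmware            # confirm it is mounted as tmpfs
    service vmware restart       # restart the VMware Server host services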
A short follow-up to indicate how I solved my problem:

- moved all the RAM files to /dev/shm
- downgraded the host to CentOS 4.8 (was 5.2)
- moved the virtual disks to RAID1 (was RAID5)
- spread the virtual disks over various raidsets (was all on the same raidset)

The first element alone was not helpful. I was not able to test RAID1 vs
RAID5 in isolation from 4.8 vs 5.2, which would have been nice. I might be
downgrading all the other hosts to 4.8, in which case I might be able to test
it in isolation.

-Philip
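For reference, splitting a set of SATA drives into independent RAID1 pairs so
that different guests' virtual disks can live on different raidsets can be
done with mdadm along these lines; the device names and pairings are only
illustrative, not the actual layout described above:

    # two independent RAID1 pairs instead of one large RAID5
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sd[ab]1
    mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sd[cd]1

    # then put different guests' virtual disks on different arrays, e.g.
    #   /vm/guest1 on /dev/md1, /vm/guest2 on /dev/md2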