Gergely Horváth
2013-Nov-13 15:31 UTC
[libvirt-users] Lots of threads and increased IO load
Hello guys,

We have a lot of small virtual machines on top of two servers running QEMU/KVM virtualisation with libvirt. Most of the VMs are not doing much work; some of them barely touch any "hardware" resources. I have two problems, and they might be connected.

When I do a bigger IO job on one of the VMs (like downloading files to the local disk), the IO load of all the other VMs increases. Even our LDAP server's IO load triggers an alert in our monitoring system, though that server barely does any IO throughout the day. The big IO job can NOT be considered big in terms of the host computer's IO capabilities: the host has two really fast SSD drives in a hardware RAID 0 configuration and can easily do hundreds of megabytes per second of random writes. The file copy I was running barely touched 50 MB/s. So this should not be a problem for the host's disks, and I also do not understand why the other VMs' IO load increases.

The second problem came when I looked at `iotop` to see how much IO read and write was going on. The VM that was doing the copying easily had some 20-30 threads writing pieces to the disk. Some of the threads are obviously "processor emulators" - they have been running since the birth of the VM - but what are these other threads?

TL;DR: Two questions then, to sum up everything:

* How can I control, and maybe decrease, the number of threads spawned for
  a virtual machine?
* How can I find out the default IO caching for a VM, and what caching
  mechanism/policy should I set to minimise IO latency? (I.e. I do not
  really care if the data is written to the disk later than the guest
  thinks.) A quick way to check the current setting is sketched below.

Thank you.

P.S.: We have been running Hypertable servers the whole time, and even those did not make such a mess of the IO...

-- 
Üdvözlettel / Best regards
Horváth Gergely | gergely.horvath@inepex.com
IneTrack - Tracking made simple | Inepex Kft.
Customer service: support@inetrack.hu | +36 30 825 7646 | support.inetrack.hu
Web: www.inetrack.hu | nyomkovetes-blog.hu | facebook.com/inetrack
Inepex - The White Label GPS fleet-tracking platform | www.inepex.com
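[Editor's note: as referenced in the second question above, a quick way to see what cache mode a given guest is actually using, with "web1" standing in for the real domain name (a placeholder, not taken from the thread):

  # If the <driver> element carries no cache attribute, the
  # hypervisor's default cache mode is in effect.
  virsh dumpxml web1 | grep '<driver'

  # Cross-check against the live qemu command line: look for
  # "cache=..." inside the -drive options.
  ps -ef | grep qemu
]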
Eric Blake
2013-Nov-13 15:38 UTC
Re: [libvirt-users] Lots of threads and increased IO load
On 11/13/2013 08:31 AM, Gergely Horváth wrote:

> The second problem came when I looked at `iotop` to see how much IO
> read and write was going on. The VM that was doing the copying easily
> had some 20-30 threads writing pieces to the disk. Some of the threads
> are obviously "processor emulators" - they have been running since the
> birth of the VM - but what are these other threads?

These are qemu threads, right? Then the qemu list is the more appropriate place to ask; my understanding is that qemu creates transient threads to handle aio requests on an as-needed basis, and that it is qemu that decides when these threads are needed or not (and not libvirt).

> Two questions then, to sum up everything:
> * How can I control, and maybe decrease, the number of threads spawned
>   for a virtual machine?

I don't know if qemu exposes a knob for limiting the number of aio helper threads it can spawn, or even if that is a good idea.

> * How can I find out the default IO caching for a VM, and what caching
>   mechanism/policy should I set to minimise IO latency?

What domain XML are you using? Yes, there are different disk cache policies (writethrough vs. none) which have definite performance vs. risk tradeoffs according to the amount of IO latency you want the guest to see; but again, the qemu list may be a better help in determining which policy is best for your needs. Once you know the policy you want, then we can help you figure out how to represent it in the domain XML.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
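[Editor's note: for illustration, the cache policy is expressed as a cache attribute on the disk's <driver> element in the domain XML. A minimal sketch follows; the values shown are example choices, not recommendations, and the file path is hypothetical. As an aside relevant to the thread-count question, libvirt also accepts an io attribute on the same element:

  <disk type='file' device='disk'>
    <!-- cache: 'writethrough', 'writeback', or 'none' (O_DIRECT,
         bypassing the host page cache) -->
    <!-- io: 'threads' (userspace thread pool) or 'native' (Linux AIO);
         this choice affects how qemu dispatches disk IO and hence how
         many helper threads it uses -->
    <driver name='qemu' type='raw' cache='none' io='native'/>
    <source file='/ssd/vmstorage/example.raw'/>
    <target dev='vda' bus='virtio'/>
  </disk>
]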
Gergely Horváth
2013-Nov-13 16:07 UTC
Re: [libvirt-users] Lots of threads and increased IO load
On 2013-11-13 16:38, Eric Blake wrote:

> I don't know if qemu exposes a knob for limiting the number of aio
> helper threads it can spawn, or even if that is a good idea.

Those threads do not cause any problems; the host is dealing with a lot of them without trouble (CPU usage and load are incredibly low in practice). Thank you for the clarification - I see now why there are sometimes more threads.

> What domain XML are you using? Yes, there are different disk cache
> policies (writethrough vs. none) which have definite performance vs.
> risk tradeoffs according to the amount of IO latency you want the
> guest to see; but again, the qemu list may be a better help in
> determining which policy is best for your needs. Once you know the
> policy you want, then we can help you figure out how to represent it
> in the domain XML.

I am not sure exactly what you are asking for, but here is the relevant part of one of the guests:

  <domain type='kvm' id='10'>
    <name>...</name>
    <uuid>...</uuid>
    <memory unit='KiB'>2097152</memory>
    <currentMemory unit='KiB'>1048576</currentMemory>
    <vcpu placement='static'>1</vcpu>
    <resource>
      <partition>/machine</partition>
    </resource>
    <os>
      <type arch='x86_64' machine='pc-i440fx-1.6'>hvm</type>
      <boot dev='hd'/>
    </os>
    <features>
      <acpi/>
      <apic/>
      <pae/>
    </features>
    <cpu mode='custom' match='exact'>
      <model fallback='allow'>SandyBridge</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='pbe'/>
      ...
      <feature policy='require' name='monitor'/>
    </cpu>
    <clock offset='utc'/>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>restart</on_crash>
    <devices>
      <emulator>/usr/bin/qemu-kvm</emulator>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw'/>
        <source file='/ssd/vmstorage/web1.raw'/>
        <target dev='vda' bus='virtio'/>
        <alias name='virtio-disk0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw'/>
        <source file='/ssd/vmstorage/web1-1.swap'/>
        <target dev='vdc' bus='virtio'/>
        <alias name='virtio-disk2'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
      </disk>
      ...
      <memballoon model='virtio'>
        <alias name='balloon0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
      </memballoon>
    </devices>
    <seclabel type='none'/>
  </domain>

Currently the running guests have no "cache" parameter passed to qemu, so I guess they are using qemu's default setting, which is writethrough according to the QEMU wiki (http://en.wikibooks.org/wiki/QEMU/Devices/Storage).

As I understand it, moving towards more risk and more "performance", I can then experiment with "writeback", i.e.:

  <driver name='qemu' type='raw' cache='writeback'/>

Cheers.

-- 
Üdvözlettel / Best regards
Horváth Gergely | gergely.horvath@inepex.com
IneTrack - Tracking made simple | Inepex Kft.
Customer service: support@inetrack.hu | +36 30 825 7646 | support.inetrack.hu
Web: www.inetrack.hu | nyomkovetes-blog.hu | facebook.com/inetrack
Inepex - The White Label GPS fleet-tracking platform | www.inepex.com
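[Editor's note: a sketch of how such a change is typically applied, with "web1" again standing in for the actual domain name. The cache mode of a running guest cannot be changed in place:

  virsh edit web1       # change the disk's <driver> line, e.g. to
                        #   <driver name='qemu' type='raw' cache='writeback'/>
  virsh shutdown web1   # the new mode only takes effect after a full
  virsh start web1      # power cycle; a guest-initiated reboot keeps the
                        # old qemu process and therefore the old cache mode

One caveat worth stating: with writeback, writes are reported complete once they reach the host page cache, so a host crash or power loss can lose data the guest believes is safely on disk. That matches the tolerance described earlier in the thread, but it is a real risk for guests that do care.]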