Hi Richard,
Thanks a lot for the response - very helpful to go through.
I'm using libguestfs 1.26.5 on Ubuntu 14.04, running on a bare-metal
server. I was originally running plain "virt-df", but after looking into
the "-P" option a little more I have incorporated it. That greatly
improves performance over the original runs (free memory on the box
ranged from 2-3 GB, which wouldn't have allowed many parallel threads
under the automatic estimate). One question about this: is it safe to
force a larger thread count without checking available memory first?
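For now I'm sizing "-P" by hand from MemFree before each run, roughly
like the sketch below. The 500 MB-per-appliance figure is just my own
guess, not something I took from the docs:

  # Rough manual sizing of -P from free memory. 500 MB per appliance
  # is my own estimate, not a documented figure.
  free_mb=$(awk '/^MemFree:/ {print int($2/1024)}' /proc/meminfo)
  threads=$(( free_mb / 500 ))
  [ "$threads" -lt 1 ] && threads=1
  virt-df -h -P "$threads"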
I ran the baselines and got the following:
Starting the appliance = ~3.4s
Performing inspection of a guest = ~5s
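(I measured those with the baseline commands from the manual -
something like the following, with a real disk path substituted in the
second one:)

  # Baseline 1: time to launch the appliance.
  time guestfish -a /dev/null run

  # Baseline 2: time to inspect a single guest.
  time guestfish --ro -a /path/to/guest/disk -i exit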
I also looked at your blog posts - very interesting stuff. I played
with setting "LIBGUESTFS_ATTACH_METHOD=appliance", but didn't notice
much difference here. I'm testing on a QEMU-KVM/OpenStack host with 29
guests running. libvirtd limits client connections to 20 by default, so
I expected to see a little improvement from setting that env var. After
setting it, I also raised "-P" from 20 to 29, but didn't see any
difference.
Any additional suggestions as the scale increases dramatically to
>3,000 guests? (That deployment will likely be on a system with much
more available memory.) Ideally we would like to gather guest disk
used/free statistics at <5 minute intervals for all guests - do you
think this is possible using virt-df?
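To make the question concrete, the collection loop I have in mind is
something like this (the interval, thread count and log path are
placeholders from my setup):

  # Hypothetical 5-minute collection loop; --csv keeps the output
  # easy to parse downstream.
  threads=29
  interval=300
  while true; do
      start=$(date +%s)
      virt-df --csv -P "$threads" >> /var/log/virt-df.csv
      elapsed=$(( $(date +%s) - start ))
      [ "$elapsed" -lt "$interval" ] && sleep $(( interval - elapsed ))
  done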
Thanks!
Dan Ryder
-----Original Message-----
From: Richard W.M. Jones [mailto:rjones@redhat.com]
Sent: Wednesday, September 10, 2014 12:32 PM
To: Dan Ryder (daryder)
Cc: libguestfs@redhat.com
Subject: Re: [Libguestfs] Scaling virt-df performance
On Wed, Sep 10, 2014 at 01:38:16PM +0000, Dan Ryder (daryder) wrote:
> Hello,
>
> I have been looking at the "virt-df" libguestfs tool to get
> guest-level disk used/free statistics - specifically with
> QEMU-KVM/OpenStack. This works great for a few OpenStack instances,
> but when I begin to scale (even to ~30 instances/guests) the
> performance really takes a hit. The time it takes for the command to
> complete seems to scale linearly with the number of guests/domains
> running on the hypervisor (note - I am using "virt-df" for all guests,
> not specifying one at a time; although I've tried that, too).
>
> For ~30 guests, the "virt-df" command takes around 90 seconds to
> complete. We are looking to support a scale of 3,000-30,000 guests
> disk used/free. It looks like this won't be remotely possible using
> "virt-df".
With sufficient memory, non-nested, on hardware built in the last 3
years, you should get performance of about 1 second per guest
(pipelined). So the figure you give is about 3 times higher than it
should be.
Just to get some basic things out of the way:
- What version of virt-df are you using and on what distro?
- Is this nested?
- What exact virt-df command(s) are you running?
- Are you using the -P option?
- How much free memory is on the system? virt-df runs multiple
threads in parallel, but the number to run is computed according to
the amount of free memory[1].
Also have a look at the guestfs-performance manual[2]. I'm especially
interested in the results of the baseline measurements on that page, but the
rest of the page should answer some of your questions too.
There's also an interesting Perl script you might try playing with.
Rich.
[1] https://github.com/libguestfs/libguestfs/blob/master/df/estimate-max-threads.c
[2] http://libguestfs.org/guestfs-performance.1.html
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW