Igor Serebryany
2010-Dec-03 10:58 UTC
[libvirt-users] busy loop in libvirtd (cpu usage 100%)
Hi!

Occasionally of late, I've seen a few cases where libvirtd CPU usage shoots up to 100% and stays there indefinitely. This seems to happen when a QEMU VM is starting up, although on one occasion I *think* I saw it happen after a QEMU VM was p2p-migrated.

Doing strace -f -p <libvirtd pid> reveals a flood of poll() function calls like these:

[pid 1690] poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=12, events=POLLIN|POLLERR|POLLHUP}, {fd=11, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}, {fd=9, events=POLLIN|POLLERR|POLLHUP}, {fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=21, events=POLLOUT}, {fd=14, events=POLLIN}, {fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}, {fd=21, events=POLLIN|POLLERR|POLLHUP}, {fd=20, events=POLLIN|POLLERR|POLLHUP}], 14, -1) = 1 ([{fd=21, revents=POLLOUT}])

It seems that because 1 is returned each time, libvirtd just goes crazy dealing with fd-3, but I have no idea what fd-3 is.

Restarting libvirtd fixes the high load, and then everything just goes back to chugging along as usual.

This is on libvirt 0.8.5 with qemu 0.12.5 on a Debian Squeeze system (the libvirt is compiled by hand).

I'm not sure what's causing it, whether it's a bug in my own code somehow or inside libvirtd. I'd appreciate some help on how to debug this problem further -- restarting libvirtd is kind of a pain for me, because my application, which monitors the health on the node, maintains open connections to 'qemu:///system' and would thus have to restart itself as well...

--Igor
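For readers following along: the poll() line above is easier to read once it is split into the fds that were requested and the fds that came back ready. The following is only a rough sketch, not anything libvirt ships. The helper name parse_poll and the regular expression are made up for illustration, and the input is the single strace line copied from the post. It shows that the return value 1 refers to fd 21 reporting POLLOUT, and that fd 21 appears twice in the poll set.

```python
import re
from collections import defaultdict

# The strace line quoted in the post, copied verbatim.
STRACE_LINE = (
    "[pid 1690] poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN}, "
    "{fd=12, events=POLLIN|POLLERR|POLLHUP}, {fd=11, events=POLLIN|POLLERR|POLLHUP}, "
    "{fd=10, events=POLLIN|POLLERR|POLLHUP}, {fd=9, events=POLLIN|POLLERR|POLLHUP}, "
    "{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=21, events=POLLOUT}, "
    "{fd=14, events=POLLIN}, {fd=15, events=POLLIN}, {fd=16, events=POLLIN}, "
    "{fd=17, events=POLLIN}, {fd=21, events=POLLIN|POLLERR|POLLHUP}, "
    "{fd=20, events=POLLIN|POLLERR|POLLHUP}], 14, -1) = 1 "
    "([{fd=21, revents=POLLOUT}])"
)

# Hypothetical helper: matches the {fd=N, events=...} / {fd=N, revents=...}
# entries exactly as strace printed them in the line above.
ENTRY = re.compile(r"\{fd=(\d+), (events|revents)=([A-Z|]+)\}")


def parse_poll(line):
    """Return (requested, ready): dicts mapping fd -> list of event masks."""
    requested = defaultdict(list)
    ready = defaultdict(list)
    for fd, kind, mask in ENTRY.findall(line):
        target = ready if kind == "revents" else requested
        target[int(fd)].append(mask)
    return dict(requested), dict(ready)


if __name__ == "__main__":
    requested, ready = parse_poll(STRACE_LINE)
    for fd, masks in ready.items():
        print(f"fd {fd} ready with {masks}; requested as {requested[fd]}")
```

Since poll() is called with a timeout of -1 and returns immediately each time, a descriptor that is always writable and is being watched for POLLOUT would be enough to produce this kind of loop. On Linux, /proc/<libvirtd pid>/fd/21 shows what that descriptor points at.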
Daniel P. Berrange
2010-Dec-06 11:11 UTC
[libvirt-users] busy loop in libvirtd (cpu usage 100%)
On Fri, Dec 03, 2010 at 04:58:23AM -0600, Igor Serebryany wrote:
> Hi!
>
> Occasionally of late, I've seen a few cases where libvirtd CPU usage
> shoots up to 100% and stays there indefinitely. This seems to happen
> when a QEMU VM is starting up, although on one occasion I *think* I
> saw it happen after a QEMU VM was p2p-migrated.
>
> Doing strace -f -p <libvirtd pid> reveals a flood of poll() function calls
> like these:
>
> [pid 1690] poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=12, events=POLLIN|POLLERR|POLLHUP}, {fd=11, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}, {fd=9, events=POLLIN|POLLERR|POLLHUP}, {fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=21, events=POLLOUT}, {fd=14, events=POLLIN}, {fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}, {fd=21, events=POLLIN|POLLERR|POLLHUP}, {fd=20, events=POLLIN|POLLERR|POLLHUP}], 14, -1) = 1 ([{fd=21, revents=POLLOUT}])
>
> It seems that because 1 is returned each time, libvirtd just goes
> crazy dealing with fd-3, but I have no idea what fd-3 is.
>
> Restarting libvirtd fixes the high load, and then everything just
> goes back to chugging along as usual.
>
> This is on libvirt 0.8.5 with qemu 0.12.5 on a Debian Squeeze system
> (the libvirt is compiled by hand).
>
> I'm not sure what's causing it, whether it's a bug in my own code
> somehow or inside libvirtd. I'd appreciate some help on how to debug
> this problem further -- restarting libvirtd is kind of a pain for
> me, because my application, which monitors the health on the node,
> maintains open connections to 'qemu:///system' and would thus have
> to restart itself as well...

This is the kind of problem you need to use GDB to diagnose. Install the libvirt-debuginfo (or equivalent for non-Fedora), and attach to the libvirtd process. Then do

  print eventLoop.handleCount
  print eventLoop.handles[0]
  print eventLoop.handles[1]
  print eventLoop.handles[2]
  ...

Until you find the one with fd=21 in it.
We're looking for the name of the callback function associated with this fd, which GDB should print out each time.

Daniel
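If the handle array is long, typing each print by hand gets tedious. As a rough sketch only, the commands above can be generated into a GDB command file. The function name gdb_print_commands and the file name handles.gdb are made up for illustration; the expression eventLoop.handles is taken as written in this message, and the count argument is whatever the first print reports.

```python
def gdb_print_commands(handle_count, array="eventLoop.handles"):
    """Build one GDB 'print' command per slot of the handle array.

    handle_count is the value GDB reports for the handle counter;
    array is the expression named in the message above.
    """
    return "".join(f"print {array}[{i}]\n" for i in range(handle_count))


def write_command_file(path, handle_count):
    """Write the generated commands to a file GDB can read with -x."""
    with open(path, "w") as fh:
        fh.write(gdb_print_commands(handle_count))


if __name__ == "__main__":
    # Example: the strace output in this thread listed 14 poll entries,
    # so 14 is used here purely as a placeholder count.
    write_command_file("handles.gdb", 14)
```

The resulting file can then be fed to GDB along the lines of gdb -batch -p <libvirtd pid> -x handles.gdb, and the output searched for the entry with fd = 21.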