Bogdan Ćulibrk
2012-Jul-09 13:06 UTC
[Pkg-xen-devel] Bug#666135: Multiple "Domain-0", slow libvirt
In our case, this seems to be some kind of bug with xenstored.
For example, consider our servers "vm7" and "vm10", which are
configured identically: vanilla Debian Squeeze, no backports.
As shown below, "vm7" has more Domain-0 entries, yet it executes virsh
commands (e.g. virsh pwd, virsh list etc.) faster (in under a second)
than "vm10".
[vm7 ~]# ./dump-xenstore.sh | grep Domain-0 | wc -l
22
[vm10 ~]# ./dump-xenstore.sh | grep Domain-0 | wc -l
14
If you strace xenstored by attaching to its process, you'll see that
even a basic virsh command, like virsh pwd, doesn't finish until it
"is done" with xenstored. Only when xenstored finishes passing data to
virsh does virsh print its output.
From the strace output, xenstored on vm10 returns ~20x more ENOENT
errors than xenstored on vm7:
[vm7 ~]# grep ENOENT xenstored.strace | wc -l
110
[vm10 ~]# grep ENOENT xenstored.strace | wc -l
2232
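To see which system calls account for those ENOENT results, the same strace file can be grouped by syscall name. This is only a rough sketch: the helper name enoent_by_syscall is made up for illustration, and it assumes the log has the "PID syscall(args) = result" layout shown in the excerpts of this message (one call per line, as strace writes it with -f and -o).

```shell
# Hypothetical helper: count ENOENT failures per syscall in an strace log.
# Assumes each line looks like: 2013 unlink("...") = -1 ENOENT (No such file or directory)
enoent_by_syscall() {
    awk '/ = -1 ENOENT/ {
             name = $2               # second field starts with the syscall name
             sub(/\(.*/, "", name)   # drop everything from the opening parenthesis
             count[name]++
         }
         END { for (n in count) printf "%d %s\n", count[n], n }' "$1" | sort -rn
}
```

Running enoent_by_syscall xenstored.strace on both hosts would show whether the extra errors on vm10 come mostly from the unlink() calls on the temporary tdb files or from something else.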
From what I can deduce, there is something "wrong" with the tdb of
vm10's xenstored: does it point to nonexistent data and spend a lot
of time on that?
[vm10 ~]# grep tdb.0x17a47b0 xenstored.strace
2013 open("/var/lib/xenstored/tdb.0x17a47b0", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 154
2013 open("/var/lib/xenstored/tdb.0x17a47b0", O_RDWR) = 155
2013 rename("/var/lib/xenstored/tdb.0x17a47b0", "/var/lib/xenstored/tdb") = 0
2013 unlink("/var/lib/xenstored/tdb.0x17a47b0") = -1 ENOENT (No such file or directory)
2013 open("/var/lib/xenstored/tdb.0x17a47b0", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 153
2013 open("/var/lib/xenstored/tdb.0x17a47b0", O_RDWR) = 156
2013 rename("/var/lib/xenstored/tdb.0x17a47b0", "/var/lib/xenstored/tdb") = 0
2013 unlink("/var/lib/xenstored/tdb.0x17a47b0") = -1 ENOENT (No such file or directory)
---snip---
(last 4 lines repeated 80 times)
NB: the same slowdown appears on other dom0s as well.
Ian Campbell
2012-Jul-09 15:43 UTC
[Pkg-xen-devel] Bug#666135: Bug#666135: Multiple "Domain-0", slow libvirt
On Mon, 2012-07-09 at 15:06 +0200, Bogdan Ćulibrk wrote:
> 2013 open("/var/lib/xenstored/tdb.0x17a47b0", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 153
> 2013 open("/var/lib/xenstored/tdb.0x17a47b0", O_RDWR) = 156
> 2013 rename("/var/lib/xenstored/tdb.0x17a47b0", "/var/lib/xenstored/tdb") = 0
> 2013 unlink("/var/lib/xenstored/tdb.0x17a47b0") = -1 ENOENT (No such file or directory)
> ---snip---
>
> (last 4 lines repeated 80 times)

I suspect (although I'm not sure) that the creation/deletion of these
tdb.<ID> files is an artefact of the stupid way the C xenstored does
transactions (it basically copies the whole db, and "commit" is really
an atomic rename).

Is /var/lib/xenstored/tdb huge? If so, then as a workaround it might be
helpful to remove it on boot (before xenstored is started); this is
generally a best practice anyway. It is also not uncommon for folks to
redirect it to a ramdisk (i.e. stick a tmpfs on /var/lib/xenstored).
There is no persistent data stored in xenstored, so this is safe to do.

C xenstored is pretty sucky, but in 4.0/Squeeze I don't think the ocaml
xenstored (aka oxenstored) is available yet.

Ian.
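
The two workarounds suggested above could look roughly like the following. This is a sketch, not a tested init script: the function name clear_xenstored_tdb is hypothetical, the default path is the one seen in the strace output, and where exactly to hook this into the boot sequence (it must run before xenstored is started) is left open.

```shell
# Hypothetical helper: clear the xenstored database before the daemon starts.
# The directory defaults to the path seen in the strace output.
clear_xenstored_tdb() {
    dir=${1:-/var/lib/xenstored}
    # Refuse to touch the database while xenstored is running
    # (the check is skipped if pgrep is not installed).
    if command -v pgrep >/dev/null 2>&1 && pgrep -x xenstored >/dev/null 2>&1; then
        echo "xenstored is running; not removing $dir/tdb" >&2
        return 1
    fi
    # Remove the main database and any leftover transaction copies.
    rm -f "$dir"/tdb "$dir"/tdb.0x*
}

# Alternative (as root, also before xenstored starts): keep the database in RAM.
# mount -t tmpfs tmpfs /var/lib/xenstored
```

With the tmpfs variant the database disappears on every reboot anyway, so the explicit cleanup would no longer be needed.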