Hello,
I am working on forking hvm domains and I have been using xenstore to
send commands to qemu. I have noticed that occasionally qemu's watch is
not read. I am using the qemu fd handler methods. I have been trying
to duplicate the missed read outside of qemu and the rest of xend, but I
haven't been able to. However, I think I may have discovered a memory
leak in lowlevel/xs/xs.c.
I am attaching:
- commandee.c : consumer of "commands"
- commander.py : issuer of commands
- xsblockingchannel.py : com channel for the commander
I am working against a changeset from midsummer (11536:041be3f6b38e)
since I'm trying to iron out some of my bugs before moving forward in
the revisions.
% gcc -o commandee commandee.c /usr/lib/libxenstore.a -lpthread
% sudo ./commandee > /dev/null
% time sudo python commander.py > /dev/null #(in separate terminal)
Traceback (most recent call last):
  File "commander.py", line 9, in ?
    xsbc = xsblockingchannel.xsblockingchannel("test")
  File "/net/xen/xsstress/xsblockingchannel.py", line 19, in __init__
    self.xs.watch(self.path, self)
xen.lowlevel.xs.Error: (12, 'Cannot allocate memory')
Exception xen.lowlevel.xs.Error: (2, 'No such file or directory') in <bound method xsblockingchannel.__del__ of <xsblockingchannel.xsblockingchannel instance at 0xb7ce448c>> ignored
1.02s user 5.00s system 8% cpu 1:08.61 total
Given that I am unwatching the watches in xsblockingchannel I don't
think the memory leak would be due to me. As far as the lost watches
are concerned, I would speculate that something is wrong with qemu's
select/callback handling, or that xenstore is racing and dropping
things. If the memory leak has already been fixed, I apologise for the
noise. I am moving on to unix domain sockets for now, as I have been
running into this problem for a while and I don't currently have the
patience to debug it, but I thought I should mention it so that it is on
the general radar.
Once I finish with the domain sockets I should find out whether it is
the qemu select handling.
Regards,
John McCullough
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On Mon, Nov 13, 2006 at 09:23:52PM -0800, John McCullough wrote:

> [...]
>     self.xs.watch(self.path, self)
> xen.lowlevel.xs.Error: (12, 'Cannot allocate memory')
> [...]

I can't remember the details, but I have a vague recollection of a
Xenstore-related problem where the error was ENOMEM, but that error was
misleading. One thing you can try is to run xenstore in trace mode:

    export XENSTORED_TRACE=1

before running xend, which will add "-T /var/log/xen/xenstored-trace.log" to
the xenstore command line. This will give you a log of every operation on
the store, which will tell us whether the error code is coming from xenstore.
Regarding the missing read: that's probably because you are not handling
EAGAIN when doing xs_transaction_end. You need to retry if
xs_transaction_end fails with errno = EAGAIN, being careful not to allocate
memory on each retry. See xenstore_client.c for an example.

Ewan.
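
The retry-on-EAGAIN advice above can be sketched roughly as follows. This is
only an illustration of the loop shape, not code from xenstore_client.c or
from the attached scripts. FakeStore is a hypothetical stand-in for a
xenstore handle so that the example runs on its own; its method names and the
way it reports a conflict (raising OSError with errno EAGAIN) are assumptions
and should be checked against the real bindings before reuse.

```python
import errno


class FakeStore:
    """Hypothetical stand-in for a xenstore handle (NOT the real xs API).

    transaction_end() reports a conflict (the EAGAIN case) a fixed number
    of times before it lets a commit through.
    """

    def __init__(self, conflicts):
        self.conflicts = conflicts  # commits that will still fail with EAGAIN
        self.data = {}              # committed path -> value
        self.pending = {}           # open transaction id -> staged writes
        self.next_id = 1

    def transaction_start(self):
        tid = self.next_id
        self.next_id += 1
        self.pending[tid] = {}
        return tid

    def write(self, tid, path, value):
        self.pending[tid][path] = value

    def transaction_end(self, tid, abort=False):
        staged = self.pending.pop(tid)
        if abort:
            return
        if self.conflicts > 0:
            self.conflicts -= 1
            raise OSError(errno.EAGAIN, "transaction conflict")
        self.data.update(staged)


def write_with_retry(store, path, value, max_attempts=10):
    """Write path=value in a transaction, retrying while commit hits EAGAIN.

    Returns the number of attempts used. The payload is prepared once by the
    caller and reused, so each retry only opens a fresh transaction.
    """
    for attempt in range(1, max_attempts + 1):
        tid = store.transaction_start()
        store.write(tid, path, value)
        try:
            store.transaction_end(tid)
        except OSError as exc:
            if exc.errno != errno.EAGAIN:
                raise          # a real error, not a conflict
            continue           # conflict: redo the whole transaction
        return attempt
    raise RuntimeError("gave up after %d attempts" % max_attempts)
```

The point of the sketch is that a conflicting commit discards the whole
transaction, so the write has to be replayed inside a new transaction rather
than simply calling the commit again. The max_attempts bound is an addition
for the example; an unbounded loop is the other common choice.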
On Tue, Nov 14, 2006 at 10:42:16AM +0000, Ewan Mellor wrote:

> I can't remember the details, but I have vague recollection of a
> Xenstore-related problem where the error was ENOMEM, but that error was
> misleading.

The problem I was thinking of was a misleading ENOSPC, not a misleading
ENOMEM (and we've since fixed the ENOSPC). It looks like your ENOMEM might
be real.

Ewan.