Hello,

I am working on forking hvm domains and I have been using xenstore to send commands to qemu. I have noticed that occasionally qemu's watch is not read. I am using the qemu fd handler methods. I have been trying to duplicate the missed read outside of qemu and the rest of xend, but I haven't been able to. However, I think I may have discovered a memory leak in lowlevel/xs/xs.c.

I am attaching:
- commandee.c : consumer of "commands"
- commander.py : issuer of commands
- xsblockingchannel.py : com channel for the commander

I am working against a changeset from midsummer (11536:041be3f6b38e) since I'm trying to iron out some of my bugs before moving forward in the revisions.

% gcc -o commandee commandee.c /usr/lib/libxenstore.a -lpthread
% sudo ./commandee > /dev/null
% time sudo python commander.py > /dev/null #(in separate terminal)
Traceback (most recent call last):
  File "commander.py", line 9, in ?
    xsbc = xsblockingchannel.xsblockingchannel("test")
  File "/net/xen/xsstress/xsblockingchannel.py", line 19, in __init__
    self.xs.watch(self.path, self)
xen.lowlevel.xs.Error: (12, 'Cannot allocate memory')
Exception xen.lowlevel.xs.Error: (2, 'No such file or directory') in <bound method xsblockingchannel.__del__ of <xsblockingchannel.xsblockingchannel instance at 0xb7ce448c>> ignored

1.02s user 5.00s system 8% cpu 1:08.61 total

Given that I am unwatching the watches in xsblockingchannel, I don't think the memory leak is due to my code. As far as the lost watches are concerned, I would speculate that something is wrong with qemu's select/callback handling, or that xenstore is racing and dropping things.

If the memory leak has already been fixed, I apologise for the noise.

I am moving on to unix domain sockets for now, as I have been running into this problem for a while and I don't currently have the patience to debug it, but I thought I should mention it so that it is on the general radar.
Once I finish with the domain sockets I should find out whether it is the qemu select handling.

Regards,
John McCullough

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On Mon, Nov 13, 2006 at 09:23:52PM -0800, John McCullough wrote:

> [...]
> % time sudo python commander.py > /dev/null #(in separate terminal)
> Traceback (most recent call last):
>   File "commander.py", line 9, in ?
>     xsbc = xsblockingchannel.xsblockingchannel("test")
>   File "/net/xen/xsstress/xsblockingchannel.py", line 19, in __init__
>     self.xs.watch(self.path, self)
> xen.lowlevel.xs.Error: (12, 'Cannot allocate memory')
> Exception xen.lowlevel.xs.Error: (2, 'No such file or
> directory') in <bound method xsblockingchannel.__del__ of
> <xsblockingchannel.xsblockingchannel instance at 0xb7ce448c>> ignored
>
> 1.02s user 5.00s system 8% cpu 1:08.61 total

I can't remember the details, but I have a vague recollection of a Xenstore-related problem where the error was ENOMEM, but that error was misleading. One thing you can try is to run xenstore in trace mode:

  export XENSTORED_TRACE=1

before running xend, which will add "-T /var/log/xen/xenstored-trace.log" to the xenstore command line. This will give you a log of every operation on the store, which will tell us whether the error code is coming from xenstore.

Regarding the missing read -- that's probably because you are not handling EAGAIN when doing xs_transaction_end. You need to retry if xs_transaction_end fails with errno = EAGAIN, being careful not to allocate memory on each retry. See xenstore_client.c for an example.

Ewan.
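
The retry-on-EAGAIN pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: FakeStore is a hypothetical stand-in that simulates conflicting commits, not the real xen.lowlevel.xs binding or libxenstore, and write_with_retry and max_attempts are names invented here. The point is the shape of the loop: restart the whole transaction whenever the commit reports EAGAIN, and build the data to be written once, outside the loop, rather than on every attempt.

```python
import errno


class FakeStore:
    """Hypothetical stand-in for a xenstore handle (NOT the real API).

    The first `conflicts` commits fail with EAGAIN, as a conflicting
    transaction would.
    """

    def __init__(self, conflicts):
        self.conflicts = conflicts
        self.data = {}
        self.pending = None
        self.next_id = 0

    def transaction_start(self):
        self.pending = {}
        self.next_id += 1
        return self.next_id

    def write(self, tid, path, value):
        self.pending[path] = value

    def transaction_end(self, tid):
        if self.conflicts > 0:
            # Simulate a conflict: the transaction is discarded.
            self.conflicts -= 1
            self.pending = None
            raise OSError(errno.EAGAIN, "transaction conflict, try again")
        self.data.update(self.pending)
        self.pending = None


def write_with_retry(store, path, value, max_attempts=10):
    """Write `value` at `path` in a transaction, retrying on EAGAIN.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        # A failed commit invalidates the transaction, so start a new one.
        tid = store.transaction_start()
        store.write(tid, path, value)
        try:
            store.transaction_end(tid)
            return attempt
        except OSError as exc:
            if exc.errno != errno.EAGAIN:
                raise  # a real error, not a conflict
    raise RuntimeError("still conflicting after %d attempts" % max_attempts)
```

With a store that rejects the first two commits, write_with_retry(FakeStore(conflicts=2), "/test/command", "fork") succeeds on the third attempt. The attempt cap is a design choice made for this sketch so that a persistent conflict surfaces as an error instead of looping forever.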

On Tue, Nov 14, 2006 at 10:42:16AM +0000, Ewan Mellor wrote:

> On Mon, Nov 13, 2006 at 09:23:52PM -0800, John McCullough wrote:
>
> > [...]
> > xen.lowlevel.xs.Error: (12, 'Cannot allocate memory')
> > [...]
>
> I can't remember the details, but I have a vague recollection of a
> Xenstore-related problem where the error was ENOMEM, but that error was
> misleading.

The problem I was thinking of was a misleading ENOSPC, not a misleading ENOMEM (and we've since fixed the ENOSPC). It looks like your ENOMEM might be real.

Ewan.