About two weeks ago Xend started falling over during what should be normal operations. It happens so often that nearly every test in our xm-test suite fails, because the suite keeps knocking over xend on relatively simple operations such as create and list.

The most bizarre symptom is a "No such process" error generated by Xend. Here is a segment of our xm-test results that shows it:

    ...
    [dom0] Running `xm create /dev/null ramdisk=/root/xm-test/ramdisk/initrd.img kernel=/boot/vmlinuz-2.6.12-xenU root=/dev/ram0 name=default memory=64'
    Using config file "/dev/null".
    Traceback (most recent call last):
      File "/usr/sbin/xm", line 10, in ?
        main.main(sys.argv)
      File "/root/xen-unstable/dist/install/usr/lib/python/xen/xm/main.py", line 724, in main
        handle_xend_error(argv[1], args[0], ex)
      File "/root/xen-unstable/dist/install/usr/lib/python/xen/xm/main.py", line 162, in handle_xend_error
        raise ex
    xen.xend.XendProtocol.XendError: (3, 'No such process')
    Unable to create domain
    ...

Any thoughts on why this might be the case?

-Sean

--
__________________________________________________________________
Sean Dague                                       Mid-Hudson Valley
sean at dague dot net                            Linux Users Group
http://dague.net                                 http://mhvlug.org

There is no silver bullet. Plus, werewolves make better neighbors
than zombies, and they tend to keep the vampire population down.
__________________________________________________________________

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

SD> Any thoughts on why this might be the case?

As far as I can tell, the problem is caused by state being kept in xend. In my testing, I repeatedly create and destroy a domain with the same name. Sometimes a destroy operation marks the domain in XenDomainDict as "terminated" but doesn't actually remove it. Xend then allows another domain with the same name to be created, which corrupts xend's internal domain list.

Next, the create routines in xend try to unpause the domain by name. The lookup turns up the old domain's record in the list, and therefore the old domid. The unpause routine calls libxc to unpause the old domid, which isn't found in the list, so ESRCH ("No such process") is returned.

It seems to me that there are (at least) two problems here:

1. The domain objects in xend's list sometimes stick around longer than they should after a destroy operation.

2. Xend will create a duplicate domain if asked, and will therefore corrupt its own internal list.

I'm testing a patch right now that makes xend do a quick sanity check before creating a domain, to make sure the list does not already contain a domain object with the same name.

--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com
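
The failure sequence and the proposed sanity check described above can be modeled roughly as follows. This is only a toy sketch in modern Python, not xend's code and not the patch under test: `DomainRegistry`, `DomainError`, `live_domids`, and `demo` are hypothetical names invented for illustration, and the "hypervisor" is just a set of integers. The only real-world fact it relies on is that errno 3 is ESRCH, "No such process".

```python
import errno
import os


class DomainError(Exception):
    """Hypothetical error raised when the sanity check rejects a create."""


class DomainRegistry:
    """Toy name -> record table; an assumption, not xend's real structure."""

    def __init__(self, check_duplicates):
        self.check_duplicates = check_duplicates
        self.by_name = {}          # name -> {"domid": ..., "state": ...}
        self.live_domids = set()   # domids the pretend hypervisor knows about
        self.next_domid = 1

    def create(self, name):
        # Proposed sanity check: refuse to create a second domain
        # under a name that is still present in the list.
        if self.check_duplicates and name in self.by_name:
            raise DomainError("domain %r already exists" % name)
        domid = self.next_domid
        self.next_domid += 1
        self.live_domids.add(domid)
        # Without the check, a stale record survives here, so the name
        # keeps pointing at the old domid.
        self.by_name.setdefault(name, {"domid": domid, "state": "running"})
        self.unpause(name)
        return domid

    def destroy(self, name, leave_stale=False):
        record = self.by_name[name]
        self.live_domids.discard(record["domid"])
        if leave_stale:
            # Problem 1: marked "terminated" but never removed.
            record["state"] = "terminated"
        else:
            del self.by_name[name]

    def unpause(self, name):
        # The lookup is by name, so a stale record yields the old domid.
        domid = self.by_name[name]["domid"]
        if domid not in self.live_domids:
            raise OSError(errno.ESRCH, os.strerror(errno.ESRCH))


def demo(check_duplicates):
    """Create, destroy (leaving a stale record), then create again."""
    reg = DomainRegistry(check_duplicates)
    reg.create("default")
    reg.destroy("default", leave_stale=True)
    try:
        reg.create("default")
    except OSError as exc:
        return ("oserror", exc.errno)
    except DomainError as exc:
        return ("rejected", str(exc))
    return ("ok", None)


if __name__ == "__main__":
    print(demo(check_duplicates=False))
    print(demo(check_duplicates=True))
```

In this model the check only turns the confusing ESRCH into an explicit "already exists" rejection (problem 2). A stale record would still block reuse of the name until problem 1 is fixed as well.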