Starting about two weeks ago, Xend began falling over a lot while doing what
should be normal actions, so much so that our xm-test suite fails more or less
all of its tests, as it apparently keeps knocking over xend on relatively
simple operations like create and list.
What is most bizarre now is this "no such process" error generated by
Xend.
Here is a segment of our xm-test results that shows this:
...
[dom0] Running `xm create /dev/null ramdisk=/root/xm-test/ramdisk/initrd.img kernel=/boot/vmlinuz-2.6.12-xenU root=/dev/ram0 name=default memory=64'
Using config file "/dev/null".
Traceback (most recent call last):
  File "/usr/sbin/xm", line 10, in ?
    main.main(sys.argv)
  File "/root/xen-unstable/dist/install/usr/lib/python/xen/xm/main.py", line 724, in main
    handle_xend_error(argv[1], args[0], ex)
  File "/root/xen-unstable/dist/install/usr/lib/python/xen/xm/main.py", line 162, in handle_xend_error
    raise ex
xen.xend.XendProtocol.XendError: (3, 'No such process')
Unable to create domain
...
Any thoughts on why this might be the case?
-Sean
--
__________________________________________________________________
Sean Dague Mid-Hudson Valley
sean at dague dot net Linux Users Group
http://dague.net http://mhvlug.org
There is no silver bullet. Plus, werewolves make better neighbors
than zombies, and they tend to keep the vampire population down.
__________________________________________________________________
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
SD> Any thoughts on why this might be the case?
So, as far as I can tell, there is some state being kept in xend,
which causes the problem. In my testing, I create and destroy a
domain repeatedly with the same name. Sometimes a destroy operation
marks the domain in XenDomainDict as "terminated", but doesn't
actually remove it. Then, xend allows another domain by the same name
to be created, thus corrupting xend's internal domain list. Next, the
create routines in xend try to unpause the domain referenced by the
name, which turns up the record from the list of the old domain, and
therefore the old domid. The unpause routine makes a call to libxc to
unpause the old domid, which isn't found in the list, so ESRCH
("No such process") is returned.
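
To make the sequence above concrete, here is a toy model of it in
Python. This is only a sketch of the failure as I understand it, not
xend code: ToyHypervisor, ToyXend, and the leave_stale flag are all
made up for illustration, and the real lookup and unpause paths in
xend and libxc may differ.

```python
import errno


class ToyHypervisor:
    """Hypothetical stand-in for libxc: knows which domids are live."""

    def __init__(self):
        self.live = set()
        self.next_domid = 1

    def create(self):
        domid = self.next_domid
        self.next_domid += 1
        self.live.add(domid)
        return domid

    def destroy(self, domid):
        self.live.discard(domid)

    def unpause(self, domid):
        # An unknown domid is reported as ESRCH ("No such process").
        if domid not in self.live:
            raise OSError(errno.ESRCH, "No such process")


class ToyXend:
    """Hypothetical stand-in for xend's name -> domain bookkeeping."""

    def __init__(self, hv):
        self.hv = hv
        self.domains = []  # records, oldest first

    def lookup(self, name):
        # Returns the first record with this name, stale or not.
        for rec in self.domains:
            if rec["name"] == name:
                return rec
        return None

    def create(self, name):
        domid = self.hv.create()
        self.domains.append({"name": name, "domid": domid, "state": "running"})
        # Unpause by name: a stale record with the same name wins the lookup.
        self.hv.unpause(self.lookup(name)["domid"])
        return domid

    def destroy(self, name, leave_stale=False):
        rec = self.lookup(name)
        self.hv.destroy(rec["domid"])
        if leave_stale:
            rec["state"] = "terminated"  # marked, but never removed
        else:
            self.domains.remove(rec)


def reproduce():
    """Create, destroy (leaving a stale record), then create again."""
    xend = ToyXend(ToyHypervisor())
    xend.create("default")
    xend.destroy("default", leave_stale=True)
    try:
        xend.create("default")
    except OSError as ex:
        return ex.errno, ex.strerror
    return None
```

Calling reproduce() returns the ESRCH errno and its message, the same
pair that shows up in the XendError from the xm-test log. With
leave_stale=False the second create goes through normally.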
It seems to me that there are (at least) two problems here:
1. The domain objects in xend's list sometimes seem to stick around
longer than they should after a destroy operation.
2. Xend will create a duplicate domain if asked, and therefore will
corrupt its own internal list.
I'm testing a patch right now that will cause xend to do a quick
sanity check before creating a domain to make sure that the list does
not currently contain a domain object of the same name.
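
Roughly, the check would look something like the following. This is a
standalone Python sketch of the idea, not the patch itself: the
DuplicateDomainError class, the function names, and the dict-based
records are assumptions for illustration and do not correspond to
actual xend identifiers.

```python
class DuplicateDomainError(Exception):
    """Hypothetical error raised when a domain name is already in use."""


def check_name_unused(domains, name):
    """Refuse the name if any record, even a terminated one, still holds it."""
    for rec in domains:
        if rec["name"] == name:
            raise DuplicateDomainError(
                "domain %r already in list (domid %d, state %s)"
                % (name, rec["domid"], rec["state"])
            )


def create_domain(domains, name, domid):
    """Add a record for a new domain, but only after the sanity check."""
    check_name_unused(domains, name)
    rec = {"name": name, "domid": domid, "state": "running"}
    domains.append(rec)
    return rec


# Usage: a stale "terminated" record blocks re-creation under the same name.
domain_list = []
create_domain(domain_list, "default", 1)
domain_list[0]["state"] = "terminated"  # destroyed, but never removed
try:
    create_domain(domain_list, "default", 2)
    second_create_refused = False
except DuplicateDomainError:
    second_create_refused = True
```

Note that this only covers problem 2: the create is refused instead of
corrupting the list. The stale record from problem 1 is still there
and still needs to be cleaned up by the destroy path.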
--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com