Richard W.M. Jones
2014-Jun-07 11:51 UTC
Re: [Libguestfs] [openstack-dev] [Nova] nova-compute deadlock
On Sat, May 31, 2014 at 01:25:04AM +0800, Qin Zhao wrote:
> Hi all,
>
> When I run Icehouse code, I encountered a strange problem. The nova-compute
> service becomes stuck, when I boot instances. I report this bug in
> https://bugs.launchpad.net/nova/+bug/1313477.
>
> After thinking several days, I feel I know its root cause. This bug should
> be a deadlock problem cause by pipe fd leaking. I draw a diagram to
> illustrate this problem.
> https://docs.google.com/drawings/d/1pItX9urLd6fmjws3BVovXQvRg_qMdTHS-0JhYfSkkVc/pub?w=960&h=720
>
> However, I have not find a very good solution to prevent this deadlock.
> This problem is related with Python runtime, libguestfs, and eventlet. The
> situation is a little complicated. Is there any expert who can help me to
> look for a solution? I will appreciate for your help!

Thanks for the useful diagram. libguestfs itself is very careful to
open all file descriptors with O_CLOEXEC (atomically if the OS
supports that), so I'm fairly confident that the bug is in Python 2,
not in libguestfs.

Another thing to say is that g.shutdown() sends a kill 9 signal to the
subprocess. Furthermore you can obtain the qemu PID (g.get_pid()) and
send any signal you want to the process.

I wonder if a simpler way to fix this wouldn't be something like
adding a tiny C extension to the Python code to use pipe2 to open the
Python pipe with O_CLOEXEC atomically? Are we allowed Python
extensions in OpenStack?

BTW do feel free to CC libguestfs@redhat.com on any libguestfs
problems you have. You don't need to subscribe to the list.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
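
To illustrate the pipe2/O_CLOEXEC idea above, here is a minimal Python sketch. It is not the Nova or eventlet code, and the helper names cloexec_pipe and is_cloexec are made up for this example. It assumes a Unix system. Where os.pipe2 is available (Python 3.3 and later on Linux), the close-on-exec flag is set atomically by the kernel. The fallback path sets the flag with fcntl after the pipe is created, which is the only option in plain Python 2 without an extension, and leaves a window in which another thread could fork and exec and so inherit the descriptors.

```python
import fcntl
import os


def cloexec_pipe():
    """Return (read_fd, write_fd) with close-on-exec set on both ends.

    Hypothetical helper for illustration only.
    """
    if hasattr(os, "pipe2"):
        # pipe2(2) applies O_CLOEXEC atomically when the pipe is created.
        return os.pipe2(os.O_CLOEXEC)

    # Fallback: NOT atomic. A fork+exec in another thread between
    # os.pipe() and the fcntl() calls can still leak the descriptors.
    read_fd, write_fd = os.pipe()
    for fd in (read_fd, write_fd):
        flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
    return read_fd, write_fd


def is_cloexec(fd):
    """Report whether FD_CLOEXEC is set on the given descriptor."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
```

Calling is_cloexec() on both ends of the returned pipe is a quick way to check that a spawned subprocess will not inherit them. Note that from Python 3.4 onwards os.pipe() already returns non-inheritable descriptors, so the problem described in this thread is specific to older interpreters.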