I was having trouble restarting xend; oftentimes, after I stopped it, it
wouldn't start again. The problem was that xend was unable to open
/dev/xen/evtchn because the device was already open.

I had xend set up to call a network script whenever it brought up a
network interface; the script configures DHCP for the new interface.
Since dhcpd doesn't support reloading its configuration, the script has
to stop and restart dhcpd. Because this happens in a script called by
xend, the restarted dhcpd process is a descendant of xend.

It turns out that xend doesn't set the close-on-exec flag for most of
its file descriptors, and so those descriptors are passed to all child
processes. This means that my dhcpd process ends up looking like this
(output from lsof):

dhcpd3 766 root cwd DIR 8,1 4096 2 /
dhcpd3 766 root rtd DIR 8,1 4096 2 /
dhcpd3 766 root txt REG 8,1 604248 89841 /usr/sbin/dhcpd3
dhcpd3 766 root mem REG 8,1 90152 29553 /lib/ld-2.3.2.so
dhcpd3 766 root mem REG 8,1 1243856 29446 /lib/libc-2.3.2.so
dhcpd3 766 root mem REG 8,1 34520 29453 /lib/libnss_files-2.3.2.so
dhcpd3 766 root mem REG 8,1 13976 29451 /lib/libnss_dns-2.3.2.so
dhcpd3 766 root mem REG 8,1 64924 29464 /lib/libresolv-2.3.2.so
dhcpd3 766 root 3u CHR 10,200 104968 /dev/xen/evtchn
dhcpd3 766 root 4r FIFO 0,5 36242 pipe
dhcpd3 766 root 5w FIFO 0,5 36242 pipe
dhcpd3 766 root 6u REG 0,2 0 4569 /proc/xen/privcmd
dhcpd3 766 root 7w REG 8,1 156889 44594 /var/log/xend.log
dhcpd3 766 root 8u REG 0,2 0 4569 /proc/xen/privcmd
dhcpd3 766 root 9u CHR 1,1 107604 /dev/mem
dhcpd3 766 root 10u unix 0xc66213e0 37020 socket
dhcpd3 766 root 11u raw 37019 00000000:0001->00000000:0000 st=07
dhcpd3 766 root 12u REG 0,2 0 4569 /proc/xen/privcmd
dhcpd3 766 root 13u REG 0,2 0 4569 /proc/xen/privcmd
dhcpd3 766 root 14u CHR 1,1 107604 /dev/mem
dhcpd3 766 root 15u REG 0,2 0 4569 /proc/xen/privcmd
dhcpd3 766 root 16w REG 8,1 470 61948 /var/lib/dhcp3/dhcpd.leases
dhcpd3 766 root 17u REG 0,2 0 4569 /proc/xen/privcmd
dhcpd3 766 root 18u REG 0,2 0 4569 /proc/xen/privcmd
dhcpd3 766 root 19u REG 0,2 0 4569 /proc/xen/privcmd
dhcpd3 766 root 20u CHR 1,1 107604 /dev/mem
dhcpd3 766 root 21u IPv4 37026 UDP *:bootps
dhcpd3 766 root 22u REG 0,2 0 4569 /proc/xen/privcmd
dhcpd3 766 root 23u CHR 1,1 107604 /dev/mem
dhcpd3 766 root 24u sock 0,0 37023 can't identify protocol
dhcpd3 766 root 25u sock 0,0 37024 can't identify protocol
dhcpd3 766 root 26u sock 0,0 37025 can't identify protocol

A few of those descriptors are opened by dhcpd itself. But xend is
leaking, at the very least, /dev/xen/evtchn, /proc/xen/privcmd,
/var/log/xend.log, /dev/mem, and some pipes. (It looks like just about
everything in xend except sockets.)

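The leak is easy to reproduce outside of xend. The following is only a
minimal sketch, not xend code: it assumes a POSIX system and Python 3.7
or later, and the names child_inherits and demo_inheritance are made up
for illustration. It opens a scratch file, starts a child process once
with the close-on-exec flag cleared and once with it set, and has the
child report whether the descriptor is still open on the same file.

```python
import os
import subprocess
import sys
import tempfile

# Child-side probe: report which file (device, inode) sits behind the given
# descriptor number, or fail if that descriptor is not open in the child.
PROBE = (
    "import os, sys\n"
    "st = os.fstat(int(sys.argv[1]))\n"
    "print(st.st_dev, st.st_ino)\n"
)


def child_inherits(fd):
    """Return True if a newly exec'd child has fd open on the same file."""
    mine = os.fstat(fd)
    result = subprocess.run(
        [sys.executable, "-c", PROBE, str(fd)],
        close_fds=False,            # rely on the close-on-exec flag alone
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the child's traceback if fd is closed
        text=True,
    )
    return result.returncode == 0 and result.stdout.split() == [
        str(mine.st_dev),
        str(mine.st_ino),
    ]


def demo_inheritance():
    """Return (leaked_without_flag, leaked_with_flag) for a scratch file."""
    with tempfile.TemporaryFile() as scratch:
        fd = scratch.fileno()
        os.set_inheritable(fd, True)    # close-on-exec cleared, as in xend
        without_flag = child_inherits(fd)
        os.set_inheritable(fd, False)   # close-on-exec set
        with_flag = child_inherits(fd)
    return without_flag, with_flag


if __name__ == "__main__":
    print(demo_inheritance())
```

Note that current Python versions create descriptors with close-on-exec
already set, which is why the sketch has to clear the flag explicitly to
imitate what xend does.
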
Most of those don't cause any direct problems, but /dev/xen/evtchn can
only be opened once, it seems. So until I kill dhcpd, I can't restart
xend.

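If lsof isn't handy, the process holding the device can also be found by
walking /proc. This is a rough sketch that assumes a Linux-style /proc
and enough privileges to read other processes' fd directories; the
function name find_holders is made up for illustration.

```python
import os


def find_holders(path):
    """Return sorted PIDs that have path open, by scanning /proc/<pid>/fd."""
    target = os.path.realpath(path)
    holders = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            names = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for name in names:
            try:
                if os.readlink(os.path.join(fd_dir, name)) == target:
                    holders.add(int(entry))
            except OSError:
                continue  # descriptor was closed while we were scanning
    return sorted(holders)


if __name__ == "__main__":
    print(find_holders("/dev/xen/evtchn"))
```

In my situation this would be expected to list the dhcpd PID after xend
has been stopped.
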
The attached patch is a partial fix for this problem--it modifies
tools/python/xen/lowlevel/xu/xu.c to set the FD_CLOEXEC flag on
descriptors opened in that file. This only fixes the problem for
/dev/xen/evtchn and /dev/mem, since the other files are opened
elsewhere, but at least this allows xend to be restarted cleanly.

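For the descriptors opened elsewhere, the same flag can be set with a
pair of fcntl calls. The patch itself is C and is not reproduced here;
what follows is only a sketch of the equivalent operation in Python,
and the helper names set_cloexec and has_cloexec are made up for
illustration.

```python
import fcntl
import os


def set_cloexec(fd):
    """Set FD_CLOEXEC on fd, preserving any other descriptor flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def has_cloexec(fd):
    """Return True if fd will be closed automatically across exec."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


if __name__ == "__main__":
    fd = os.open(os.devnull, os.O_RDONLY)
    os.set_inheritable(fd, True)   # start from the leaky state
    set_cloexec(fd)
    print(has_cloexec(fd))
    os.close(fd)
```

The idea would be to call something like set_cloexec right after each
descriptor is opened, before any child process can be started.
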
--Michael Vrable