Hello group,
I am using Gentoo (kernel 3.11.7) together with Xen 4.3.1. I have found
a bug in the vif-bridge script, which I reported to the Gentoo bugzilla.
Ian Delaney, the maintainer of the Gentoo Xen packages (on copy here),
suggested bringing this to the attention of the Xen ML, as the fix
should benefit other distributions as well.
The bug report (together with a suggested fix further below) is also
available at https://bugs.gentoo.org/show_bug.cgi?id=502570, but I have
included the relevant bits and pieces here for convenience and so that
you can comment if and when required.
If this rather needs to go to the xen-devel ML, I am sure Ian Campbell
(or somebody else) will shortly be around and move it or ask me to
resend it to the other list.
====== Start of Bug report and suggested fix ======
Upon shutting down a domU under Xen, the script
"/etc/xen/scripts/vif-bridge" is invoked with an "offline" argument.
This is for the recommended setup of connecting domUs to the dom0
through a bridged device named xenbr0. The relevant snippet of code
reads as follows:
-------------------------------------------
case "$command" in
    online)
        setup_virtual_bridge_port "$dev"
        mtu="`ip link show $bridge | awk '/mtu/ { print $5 }'`"
        if [ -n "$mtu" ] && [ "$mtu" -gt 0 ]
        then
            ip link set $dev mtu $mtu || :
        fi
        add_to_bridge "$bridge" "$dev"
        ;;
    offline)
        do_without_error brctl delif "$bridge" "$dev"
        do_without_error ifconfig "$dev" down
        ;;
    add)
        setup_virtual_bridge_port "$dev"
        add_to_bridge "$bridge" "$dev"
        ;;
esac
-------------------------------------------
The function "do_without_error" called from the "offline)" pattern in
the "case" statement is defined in
/etc/xen/scripts/xen-hotplug-common.sh, which is indirectly sourced
through /etc/xen/scripts/vif-common.sh, and reads as follows:
-------------------------------------------
do_without_error() {
"$@" 2>/dev/null || log debug "$@ failed"
}
-------------------------------------------
The call 'do_without_error brctl delif "$bridge" "$dev"' executes
    brctl delif "$bridge" "$dev"
and the call 'do_without_error ifconfig "$dev" down' executes
    ifconfig "$dev" down
Both calls discard any error output but, in case of an error (i.e. a
non-zero exit code), still log a "failed" message to syslog as follows:
-------------------------------------------
Feb 26 22:14:29 vm-host logger: /etc/xen/scripts/vif-bridge: brctl delif xenbr0 vif1.0 failed
Feb 26 22:14:29 vm-host logger: /etc/xen/scripts/vif-bridge: ifconfig vif1.0 down failed
-------------------------------------------
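The behaviour of "do_without_error" can be reproduced outside of Xen
with a small stand-alone sketch. The "log" function below is only a stub
I made up for illustration (the real helper writes to syslog), and "ls"
on a missing path merely stands in for the failing brctl/ifconfig calls:

```shell
#!/bin/sh
# Stub for the Xen "log" helper. This is an assumption for the demo;
# the real helper sends the message to syslog.
log() {
    level="$1"; shift
    echo "[$level] $*"
}

# Same body as quoted from xen-hotplug-common.sh above.
do_without_error() {
    "$@" 2>/dev/null || log debug "$@ failed"
}

# "ls" on a missing path fails: its own error text is discarded,
# but the failure is still reported through log().
result=$(do_without_error ls /nonexistent/path/for/demo)
echo "$result"   # prints: [debug] ls /nonexistent/path/for/demo failed
```

So the redirection only hides the reason for the failure, not the
failure itself.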
Upon investigation, it seems that the network device (at least for
paravirtualized guests using the netfront/netback device model) has
already been destroyed by the dom0 kernel by the time the script is
run. This is evidenced by the following entries in syslog, which
precede the error messages quoted above:
-------------------------------------------
Feb 26 22:14:29 vm-host kernel: [ 6169.989895] xenbr0: port 1(vif1.0) entered disabled state
Feb 26 22:14:29 vm-host kernel: [ 6170.007496] xenbr0: port 1(vif1.0) entered disabled state
Feb 26 22:14:29 vm-host kernel: [ 6170.007568] device vif1.0 left promiscuous mode
Feb 26 22:14:29 vm-host kernel: [ 6170.007571] xenbr0: port 1(vif1.0) entered disabled state
-------------------------------------------
These findings are further supported by the error messages of the two
commands run through "do_without_error" (captured by redirecting
stderr to a file rather than to /dev/null), which are as follows:
-------------------------------------------
for brctl: "interface vif1.0 does not exist!"
for ifconfig: "vif1.0: ERROR while getting interface flags: No such device"
-------------------------------------------
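For anybody wanting to capture these messages themselves, a debugging
variant along the following lines should do. This is only a sketch: the
file name in ERRFILE is a hypothetical choice of mine, and "log" is again
a stub standing in for the real Xen helper:

```shell
#!/bin/sh
# Hypothetical location for the captured error output (assumption).
ERRFILE="${ERRFILE:-/tmp/vif-bridge-stderr.log}"

# Stub for the Xen "log" helper (assumption; the real one uses syslog).
log() {
    level="$1"; shift
    echo "[$level] $*"
}

# Debug variant of do_without_error: identical to the original except
# that stderr is appended to $ERRFILE instead of going to /dev/null.
do_without_error() {
    "$@" 2>>"$ERRFILE" || log debug "$@ failed"
}
```

This is meant for temporary debugging only; the original redirection to
/dev/null should be restored afterwards.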
Suggested fix:
for brctl: check whether the interface still exists and is still
linked to the bridge before invoking the brctl command;
for ifconfig: check whether the interface still exists and is still
up before invoking the ifconfig command.
The modified snippet reads as follows:
-------------------------------------------
case "$command" in
    online)
        setup_virtual_bridge_port "$dev"
        mtu="`ip link show $bridge | awk '/mtu/ { print $5 }'`"
        if [ -n "$mtu" ] && [ "$mtu" -gt 0 ]
        then
            ip link set $dev mtu $mtu || :
        fi
        add_to_bridge "$bridge" "$dev"
        ;;
    offline)
        if brctl show "$bridge" | grep "$dev" > /dev/null 2>&1 ; then
            do_without_error brctl delif "$bridge" "$dev"
        fi
        if ifconfig -s "$dev" > /dev/null 2>&1 ; then
            do_without_error ifconfig "$dev" down
        fi
        ;;
    add)
        setup_virtual_bridge_port "$dev"
        add_to_bridge "$bridge" "$dev"
        ;;
esac
-------------------------------------------
Functionally, my suggested fix does not change anything: if the
interface is still linked to the bridge (or is still up), which might
be the case for PCI devices passed through from dom0 to a domU, the
removal from the bridge (or bringing the interface down) is performed
exactly as before. It does, however, do away with the nasty error
messages in syslog.
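As a possible alternative to parsing the output of brctl and ifconfig,
the same existence checks could presumably be done via sysfs. The
following is only a sketch and not part of the fix submitted to the
bugzilla: the helper names "iface_exists" and "iface_on_bridge" are made
up by me, and it assumes that sysfs is mounted at /sys and that the
bridge driver lists its ports under <bridge>/brif/ (SYSFS_NET is only
overridable so the helpers can be tried against a scratch directory):

```shell
#!/bin/sh
# Base directory of the per-interface sysfs entries (assumption:
# sysfs is mounted in the usual place).
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Hypothetical helper: succeeds if interface $1 exists.
iface_exists() {
    [ -e "$SYSFS_NET/$1" ]
}

# Hypothetical helper: succeeds if interface $2 is a port of bridge $1.
iface_on_bridge() {
    [ -e "$SYSFS_NET/$1/brif/$2" ]
}

# Possible use in the "offline)" branch (not tested inside Xen):
#   if iface_on_bridge "$bridge" "$dev"; then
#       do_without_error brctl delif "$bridge" "$dev"
#   fi
#   if iface_exists "$dev"; then
#       do_without_error ifconfig "$dev" down
#   fi
```

Unlike the grep on the brctl output, this matches the interface name
exactly, so "vif1.0" would not be mistaken for e.g. "vif1.01".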
====== End of Bug report and suggested fix ======
Thanks and regards,
Atom2