Ryan Harper
2005-Aug-18 21:01 UTC
[Xen-devel] [PATCH] patches: workaround for br_del_if race
This patch provides a workaround for bugzilla #90 which shows up far too often when creating and then destroying lots of domUs and dom0 is SMP. Details are in the [1]bug. With this patch, I now can create/destroy domains in a tight loop for hours where previously every 3 to 10 cycles would blow up. Please apply. 1. http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=90 -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx (512) 838-9253 T/L: 678-9253 ryanh@us.ibm.com diffstat output: workaround_double_br_del_if.patch | 11 +++++++++++ 1 files changed, 11 insertions(+) Signed-off-by: Ryan Harper <ryanh@us.ibm.com> --- diff -r dfbeb7da829f patches/linux-2.6.12/workaround_double_br_del_if.patch --- /dev/null Thu Aug 18 19:51:46 2005 +++ b/patches/linux-2.6.12/workaround_double_br_del_if.patch Thu Aug 18 15:53:37 2005 @@ -0,0 +1,11 @@ +--- linux-2.6.12/net/bridge/br_if.c 2005-06-17 14:48:29.000000000 -0500 ++++ linux-2.6.12-xen0-smp/net/bridge/br_if.c 2005-08-18 15:17:27.302615846 -0500 +@@ -382,7 +382,7 @@ + { + struct net_bridge_port *p = dev->br_port; + +- if (!p || p->br != br) ++ if (!p || p->br != br || p->state == BR_STATE_DISABLED) + return -EINVAL; + + br_sysfs_removeif(p); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2005-Aug-19 08:34 UTC
Re: [Xen-devel] [PATCH] patches: workaround for br_del_if race
On 18 Aug 2005, at 22:01, Ryan Harper wrote:> This patch provides a workaround for bugzilla #90 which shows up far > too > often when creating and then destroying lots of domUs and dom0 is SMP. > Details are in the [1]bug. With this patch, I now can create/destroy > domains in a tight loop for hours where previously every 3 to 10 cycles > would blow up.Please also submit this to the Ethernet bridge maintainer. Details are in the Linux MAINTAINERS file. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ryan Harper
2005-Aug-19 16:46 UTC
Re: [Xen-devel] [PATCH] patches: workaround for br_del_if race
* Keir Fraser <Keir.Fraser@cl.cam.ac.uk> [2005-08-19 07:37]:> > On 18 Aug 2005, at 22:01, Ryan Harper wrote: > > >This patch provides a workaround for bugzilla #90 which shows up far > >too > >often when creating and then destroying lots of domUs and dom0 is SMP. > >Details are in the [1]bug. With this patch, I now can create/destroy > >domains in a tight loop for hours where previously every 3 to 10 cycles > >would blow up. > > Please also submit this to the Ethernet bridge maintainer. Details are > in the Linux MAINTAINERS file.I''ve had a little discussion on [1]netdev about this but since this isn''t the proper fix I''m doing some more digging. The real race is between when a call_rcu() callback runs and when the netif_destroy() calls unregister_netdev(). When we get an oops, the brctl delif IOCTL has run, and br_del_if() has been called setting the port state to DISABLED, and then queues up an rcu callback, destroy_npb(), which will set dev->br_port = NULL. add_del_if() // IOCTL handler from brctl delif // xen-br0 $VIF br_del_if() del_nbp() br_stp_disable_port() // Set port->state = BR_STATE_DISABLED call_rcu(destroy_nbp_rcu) // setup rcu callback to run // destory_nbp() which will set // dev->br_port = NULL After the xen scripts have called the brctl command, xend sends the disconnect and destroy control messages, which trigger: netif_destroy() unregister_netdev() unregister_netdevice() notify_call_chain(NETDEV_UNREGISTER) br_device_event() // The first thing done here is to check // the device''s br_port to see if it is // NULL If the destory_nbp_rcu() callback isn''t fired before br_device_event() checks dev->br_port, then the NULL check fails and a second call to br_del_if() is invoked and we blow up in sysfs/kobject BUG_ON() for ref counts of dentrys. [2] Before I go back to netdev, I wanted to check if there is anything we should be doing to be more defensive or does this seem to be something the bridge code should handle (error out, whatever)? 1. http://oss.sgi.com/archives/netdev/2005-08/msg00097.html 2. Routines are in linux/net/bridge/{br_if.c, br_ioctl.c, br_notify.c} -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx (512) 838-9253 T/L: 678-9253 ryanh@us.ibm.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ryan Harper
2005-Aug-19 18:54 UTC
[Xen-devel] [PATCH] scripts, patches: remove workaround, skip brtcl delif
Attached is a patch that needs more testing, but I''ve not been able to recreate the race-condition with this patch applied. It does the following: 1. Remove workaround patch 2. Update scripts/network-bridge and scripts/vif-bridge to not call brctl delif When a domU is shutdown/destroyed and the netif is destroyed, the notify_call_chain triggered from unregister_netdevice() will trigger the bridge event handler and which will call the proper code to remove the device from the bridge. I can''t see any reason why brtcl delif should be called when taking out a domain if the call chain will delete the interface from the bridge when the vif is destroyed automatically. -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx (512) 838-9253 T/L: 678-9253 ryanh@us.ibm.com diffstat output: a/patches/linux-2.6.12/workaround_double_br_del_if.patch | 11 ----------- tools/examples/network-bridge | 3 --- tools/examples/vif-bridge | 6 ++++-- 3 files changed, 4 insertions(+), 16 deletions(-) Signed-off-by: Ryan Harper <ryanh@us.ibm.com> --- diff -r 188c782fa9bb tools/examples/vif-bridge --- a/tools/examples/vif-bridge Fri Aug 19 13:05:31 2005 +++ b/tools/examples/vif-bridge Fri Aug 19 13:31:04 2005 @@ -74,8 +74,10 @@ exit fi -# Add/remove vif to/from bridge. -brctl ${brcmd} ${bridge} ${vif} +# Add vif to bridge. vifs are auto-removed from bridge +if [ "${brcmd}" == "addif" ] ; then + brctl ${brcmd} ${bridge} ${vif} +fi ifconfig ${vif} $OP if [ ${ip} ] ; then diff -r 188c782fa9bb tools/examples/network-bridge --- a/tools/examples/network-bridge Fri Aug 19 13:05:31 2005 +++ b/tools/examples/network-bridge Fri Aug 19 13:31:04 2005 @@ -222,10 +222,7 @@ return fi - brctl delif ${bridge} ${netdev} - if ifconfig veth0 2>/dev/null | grep -q veth0 ; then - brctl delif ${bridge} vif0.0 ifconfig vif0.0 down mac=`ifconfig veth0 | grep HWadd | sed -e ''s/.*\(..:..:..:..:..:..\).*/\1/''` ifconfig ${netdev} down diff -r 188c782fa9bb patches/linux-2.6.12/workaround_double_br_del_if.patch --- a/patches/linux-2.6.12/workaround_double_br_del_if.patch Fri Aug 19 13:05:31 2005 +++ /dev/null Fri Aug 19 13:31:04 2005 @@ -1,11 +0,0 @@ ---- linux-2.6.12/net/bridge/br_if.c 2005-06-17 14:48:29.000000000 -0500 -+++ linux-2.6.12-xen0-smp/net/bridge/br_if.c 2005-08-18 15:17:27.302615846 -0500 -@@ -382,7 +382,7 @@ - { - struct net_bridge_port *p = dev->br_port; - -- if (!p || p->br != br) -+ if (!p || p->br != br || p->state == BR_STATE_DISABLED) - return -EINVAL; - - br_sysfs_removeif(p); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2005-Aug-20 09:21 UTC
Re: [Xen-devel] [PATCH] patches: workaround for br_del_if race
> If the destory_nbp_rcu() callback isn''t fired before br_device_event() > checks dev->br_port, then the NULL check fails and a second call to > br_del_if() is invoked and we blow up in sysfs/kobject BUG_ON() for ref > counts of dentrys. [2] > > Before I go back to netdev, I wanted to check if there is anything we > should be doing to be more defensive or does this seem to be something > the bridge code should handle (error out, whatever)?I think this is an etherbridge bug. They''ve set up two ways to enter br_del_if() but haven''t implemented proper synchronisation in that function. The fact that br_del_if has been called once already but has only ''half deleted'' the bridge port is an implementation detail of the etherbridge --- network interfaces shouldn''t have to code around that. I expect that all other network drivers have this same problem but it''s just really rare to unregister_netdev() a real NIC so noone''s hit it before. -- Keir> 1. http://oss.sgi.com/archives/netdev/2005-08/msg00097.html > 2. Routines are in linux/net/bridge/{br_if.c, br_ioctl.c, br_notify.c} > > -- > Ryan Harper > Software Engineer; Linux Technology Center > IBM Corp., Austin, Tx > (512) 838-9253 T/L: 678-9253 > ryanh@us.ibm.com >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2005-Aug-20 09:34 UTC
[Xen-devel] Re: [PATCH] scripts, patches: remove workaround, skip brtcl delif
> I can''t see any reason why brtcl delif should be called when taking out > a domain if the call chain will delete the interface from the bridge > when the vif is destroyed automatically.There should be no reason not to either. I think I prefer the etherbridge bug fix. What was wrong with that fix that the maintainers wouldn''t accept it? If they understand it enough to see that the fix is not 100% correct, surely they can suggest a better one? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel