Luis R. Rodriguez
2014-Feb-15 02:59 UTC
[Bridge] [RFC v2 1/4] bridge: enable interfaces to opt out from becoming the root bridge
From: "Luis R. Rodriguez" <mcgrof at suse.com>

It doesn't make sense for some interfaces to become a root bridge
at any point in time. One example is virtual backend interfaces
which rely on other entities on the bridge for actual physical
connectivity. They only provide virtual access.

Device drivers that know they should never become part of the
root bridge have been using a trick of setting their MAC address
to a high broadcast MAC address such as FE:FF:FF:FF:FF:FF. Instead
of using these hacks lets the interfaces annotate its intent and
generalizes a solution for multiple drivers, while letting the
drivers use a random MAC address or one prefixed with a proper OUI.
This sort of hack is used by both qemu and xen for their backend
interfaces.

Cc: Stephen Hemminger <stephen at networkplumber.org>
Cc: bridge at lists.linux-foundation.org
Cc: netdev at vger.kernel.org
Cc: linux-kernel at vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof at suse.com>
---
 include/uapi/linux/if.h | 1 +
 net/bridge/br_if.c      | 2 ++
 net/bridge/br_private.h | 1 +
 net/bridge/br_stp_if.c  | 2 ++
 4 files changed, 6 insertions(+)

diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
index d758163..8d10382 100644
--- a/include/uapi/linux/if.h
+++ b/include/uapi/linux/if.h
@@ -84,6 +84,7 @@
 #define IFF_LIVE_ADDR_CHANGE 0x100000 /* device supports hardware address
 					 * change when it's running */
 #define IFF_MACVLAN 0x200000 /* Macvlan device */
+#define IFF_BRIDGE_NON_ROOT 0x400000 /* Don't consider for root bridge */


 #define IF_GET_IFACE 0x0001 /* for querying only */
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 4bf02ad..a745415 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -228,6 +228,8 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br,
 	br_init_port(p);
 	p->state = BR_STATE_DISABLED;
 	br_stp_port_timer_init(p);
+	if (dev->priv_flags & IFF_BRIDGE_NON_ROOT)
+		p->flags |= BR_DONT_ROOT;
 	br_multicast_add_port(p);

 	return p;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 045d56e..a89e8ad 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -173,6 +173,7 @@ struct net_bridge_port
 #define BR_ADMIN_COST 0x00000010
 #define BR_LEARNING 0x00000020
 #define BR_FLOOD 0x00000040
+#define BR_DONT_ROOT 0x00000080

 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
 	struct bridge_mcast_query ip4_query;
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 656a6f3..12fd848 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -228,6 +228,8 @@ bool br_stp_recalculate_bridge_id(struct net_bridge *br)
 		return false;

 	list_for_each_entry(p, &br->port_list, list) {
+		if (p->flags & BR_DONT_ROOT)
+			continue;
 		if (addr == br_mac_zero ||
 		    memcmp(p->dev->dev_addr, addr, ETH_ALEN) < 0)
 			addr = p->dev->dev_addr;
--
1.8.5.2
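For readers following the br_stp_if.c hunk, the selection loop it touches can be sketched in user space roughly as below. This is an illustration only, not the kernel code: `struct port`, `pick_bridge_addr()` and `PORT_DONT_ROOT` are made-up names, with `PORT_DONT_ROOT` standing in for the proposed BR_DONT_ROOT. It shows why a port with the MAC FE:FF:FF:FF:FF:FF practically never wins the "lowest address" comparison, and how a per-port flag gets the same result without touching the address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6
#define PORT_DONT_ROOT 0x80 /* hypothetical stand-in for BR_DONT_ROOT */

/* Simplified port: just a MAC address and a flags word (assumption). */
struct port {
    uint8_t addr[ETH_ALEN];
    unsigned int flags;
};

/*
 * Return the numerically lowest MAC among the ports that have not opted
 * out, or NULL if no port qualifies. Mirrors the shape of the patched
 * loop: flagged ports are skipped before the memcmp() comparison.
 */
const uint8_t *pick_bridge_addr(const struct port *ports, size_t n)
{
    const uint8_t *best = NULL;

    assert(ports != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ports[i].flags & PORT_DONT_ROOT)
            continue;
        if (best == NULL || memcmp(ports[i].addr, best, ETH_ALEN) < 0)
            best = ports[i].addr;
    }
    return best;
}
```

With three ports, say 00:16:3e:00:00:01 carrying the flag, 52:54:00:12:34:56 and fe:ff:ff:ff:ff:ff, the function returns the second port's address: the first is skipped despite having the lowest MAC, and the third loses the comparison because of its high address.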
Ben Hutchings
2014-Feb-16 18:56 UTC
Re: [RFC v2 1/4] bridge: enable interfaces to opt out from becoming the root bridge
On Fri, 2014-02-14 at 18:59 -0800, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>
> It doesn't make sense for some interfaces to become a root bridge

I think you mean 'root port'.

> at any point in time. One example is virtual backend interfaces
> which rely on other entities on the bridge for actual physical
> connectivity. They only provide virtual access.
>
> Device drivers that know they should never become part of the
> root bridge have been using a trick of setting their MAC address
> to a high broadcast MAC address such as FE:FF:FF:FF:FF:FF. Instead
> of using these hacks lets the interfaces annotate its intent and
> generalizes a solution for multiple drivers, while letting the
> drivers use a random MAC address or one prefixed with a proper OUI.
> This sort of hack is used by both qemu and xen for their backend
> interfaces.
>
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: bridge@lists.linux-foundation.org
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
> ---
>  include/uapi/linux/if.h | 1 +
>  net/bridge/br_if.c      | 2 ++
>  net/bridge/br_private.h | 1 +
>  net/bridge/br_stp_if.c  | 2 ++
>  4 files changed, 6 insertions(+)
>
> diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
> index d758163..8d10382 100644
> --- a/include/uapi/linux/if.h
> +++ b/include/uapi/linux/if.h
> @@ -84,6 +84,7 @@
>  #define IFF_LIVE_ADDR_CHANGE 0x100000 /* device supports hardware address
>  					 * change when it's running */
>  #define IFF_MACVLAN 0x200000 /* Macvlan device */
> +#define IFF_BRIDGE_NON_ROOT 0x400000 /* Don't consider for root bridge */
[...]

Does it really make sense to add a flag that says exactly which special
behaviour you want, or would it be better to define the flag as a
passive property, which other drivers/protocols then use as a condition
for special behaviour?

The fact that you also define the IFF_BRIDGE_SKIP_IP flag, and set it on
exactly the same devices, makes me think that they should actually be a
single flag. I don't know how that flag should be named or described,
though.

Ben.

--
Ben Hutchings
Any sufficiently advanced bug is indistinguishable from a feature.
Stephen Hemminger
2014-Feb-16 18:57 UTC
[Bridge] [RFC v2 1/4] bridge: enable interfaces to opt out from becoming the root bridge
On Fri, 14 Feb 2014 18:59:37 -0800
"Luis R. Rodriguez" <mcgrof at do-not-panic.com> wrote:

> From: "Luis R. Rodriguez" <mcgrof at suse.com>
>
> It doesn't make sense for some interfaces to become a root bridge
> at any point in time. One example is virtual backend interfaces
> which rely on other entities on the bridge for actual physical
> connectivity. They only provide virtual access.
>
> Device drivers that know they should never become part of the
> root bridge have been using a trick of setting their MAC address
> to a high broadcast MAC address such as FE:FF:FF:FF:FF:FF. Instead
> of using these hacks lets the interfaces annotate its intent and
> generalizes a solution for multiple drivers, while letting the
> drivers use a random MAC address or one prefixed with a proper OUI.
> This sort of hack is used by both qemu and xen for their backend
> interfaces.
>
> Cc: Stephen Hemminger <stephen at networkplumber.org>
> Cc: bridge at lists.linux-foundation.org
> Cc: netdev at vger.kernel.org
> Cc: linux-kernel at vger.kernel.org
> Signed-off-by: Luis R. Rodriguez <mcgrof at suse.com>

This is already supported in a more standard way via the root
block flag.
Zoltan Kiss
2014-Feb-17 17:52 UTC
Re: [Xen-devel] [RFC v2 1/4] bridge: enable interfaces to opt out from becoming the root bridge
On 15/02/14 02:59, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>
> It doesn't make sense for some interfaces to become a root bridge
> at any point in time. One example is virtual backend interfaces
> which rely on other entities on the bridge for actual physical
> connectivity. They only provide virtual access.

It is possible for a guest to bridge together two VIFs, either from the
same Dom0 bridge or from different ones. In that case using STP on VIFs
sounds sensible to me.

Zoli
Luis R. Rodriguez
2014-Feb-18 21:02 UTC
Re: [RFC v2 1/4] bridge: enable interfaces to opt out from becoming the root bridge
On Sun, Feb 16, 2014 at 10:57 AM, Stephen Hemminger <stephen@networkplumber.org> wrote:
> On Fri, 14 Feb 2014 18:59:37 -0800
> "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote:
>
>> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>>
>> It doesn't make sense for some interfaces to become a root bridge
>> at any point in time. One example is virtual backend interfaces
>> which rely on other entities on the bridge for actual physical
>> connectivity. They only provide virtual access.
>>
>> Device drivers that know they should never become part of the
>> root bridge have been using a trick of setting their MAC address
>> to a high broadcast MAC address such as FE:FF:FF:FF:FF:FF. Instead
>> of using these hacks lets the interfaces annotate its intent and
>> generalizes a solution for multiple drivers, while letting the
>> drivers use a random MAC address or one prefixed with a proper OUI.
>> This sort of hack is used by both qemu and xen for their backend
>> interfaces.
>>
>> Cc: Stephen Hemminger <stephen@networkplumber.org>
>> Cc: bridge@lists.linux-foundation.org
>> Cc: netdev@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
>
> This is already supported in a more standard way via the root
> block flag.

Great! For documentation purposes: root_block is a sysfs attribute,
added in 3.8 by commit 1007dd1a.

The corresponding netlink attribute is IFLA_BRPORT_PROTECT, and it can
be set via the iproute2 bridge utility or through sysfs:

mcgrof@garbanzo ~/linux (git::master)$ find /sys/ -name root_block
/sys/devices/pci0000:00/0000:00:04.0/0000:02:00.0/net/eth0/brport/root_block
/sys/devices/vif-3-0/net/vif3.0/brport/root_block
/sys/devices/virtual/net/vif3.0-emu/brport/root_block
mcgrof@garbanzo ~/devel/iproute2 (git::master)$ cat /sys/devices/vif-3-0/net/vif3.0/brport/root_block
0
mcgrof@garbanzo ~/devel/iproute2 (git::master)$ sudo bridge link set dev vif3.0 root_block on
mcgrof@garbanzo ~/devel/iproute2 (git::master)$ cat /sys/devices/vif-3-0/net/vif3.0/brport/root_block
1

So if we want to avoid the MAC address hack as the way to skip a root
port, userspace would need to be updated to simply set this attribute
after adding the device to the bridge. Based on Zoltan's feedback there
seem to be use cases where this should not be enabled for all
xen-netback interfaces, so we can just punt this to userspace for the
topologies that require it.

The original motivation for this series was to avoid the duplicate IPv6
address incurred by the MAC address hack for avoiding the root bridge.
Given that Zoltan also noted a use case whereby IPv4 and IPv6 addresses
can be assigned to the backend interfaces, we should be able to avoid
the duplicate address situation for IPv6 by using a proper random MAC
address *once* userspace has also been updated to use
IFLA_BRPORT_PROTECT. New userspace can't and won't need to set this flag
on older kernels (older than 3.8), as root_block is not implemented on
those kernels and the MAC address hack would still be used there.

This strategy does, however, put a requirement on new kernels to use new
userspace, as otherwise the MAC address workaround would not be in place
and root_block would not be set.

Luis
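A program that cannot shell out to the bridge utility could do the sysfs write shown above itself. The following is a minimal sketch under stated assumptions: the helper names `brport_attr_path()` and `set_root_block()` are hypothetical, and the base directory is a parameter so that a caller can pass `/sys/class/net` (assumed here to be the usual per-device symlink to the `/sys/devices/.../net/<dev>` paths listed above). The write only succeeds with sufficient privileges and only once the device is already a bridge port, since the `brport` directory does not exist before that.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<base>/<dev>/brport/<attr>" into buf.
 * Returns 0 on success, -1 if dev is empty or the result does not fit.
 */
int brport_attr_path(char *buf, size_t len, const char *base,
                     const char *dev, const char *attr)
{
    int n;

    assert(buf != NULL && base != NULL && dev != NULL && attr != NULL);
    if (strlen(dev) == 0)
        return -1;
    n = snprintf(buf, len, "%s/%s/brport/%s", base, dev, attr);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write "1" or "0" to the port's root_block attribute.
 * Returns 0 on success, -1 on any failure (path too long, attribute
 * missing because the device is not a bridge port, no permission, ...).
 */
int set_root_block(const char *base, const char *dev, int on)
{
    char path[256];
    FILE *f;
    int rc;

    if (brport_attr_path(path, sizeof(path), base, dev, "root_block") != 0)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    rc = (fputs(on ? "1" : "0", f) == EOF) ? -1 : 0;
    if (fclose(f) == EOF)
        rc = -1;
    return rc;
}
```

A caller would use something like `set_root_block("/sys/class/net", "vif3.0", 1)` right after enslaving the vif, which is the ordering described in the message above.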
Luis R. Rodriguez
2014-Feb-19 16:45 UTC
[Bridge] [Xen-devel] [RFC v2 1/4] bridge: enable interfaces to opt out from becoming the root bridge
On Mon, Feb 17, 2014 at 9:52 AM, Zoltan Kiss <zoltan.kiss at citrix.com> wrote:
> On 15/02/14 02:59, Luis R. Rodriguez wrote:
>>
>> From: "Luis R. Rodriguez" <mcgrof at suse.com>
>>
>> It doesn't make sense for some interfaces to become a root bridge
>> at any point in time. One example is virtual backend interfaces
>> which rely on other entities on the bridge for actual physical
>> connectivity. They only provide virtual access.
>
> It is possible that a guest bridge together to VIF, either from the same
> Dom0 bridge or from different ones. In that case using STP on VIFs sound
> sensible to me.

You seem to describe a case whereby it can make sense for xen-netback
interfaces to end up becoming the root port of a bridge. Can you
elaborate a little more on that, as the use case was unclear?
Additionally, if such cases exist, then under the current upstream
implementation one would simply need to change the MAC address in order
to enable a vif to become the root port.

Stephen noted there is a way to avoid nominating an interface for a
root port through the root block flag. We should use that instead of
the MAC address hacks.

Let's keep in mind that part of the motivation for this series is to
avoid a duplicate IPv6 address left in place by use cases whereby the
MAC address of the backend vif was left static. The use case you are
explaining likely describes the more prevalent use case where address
conflicts can occur, perhaps when administrators forgot to change the
backend MAC address. If we embrace a random MAC address we'd avoid that
issue, but we'd need to update userspace to use the root block on
topologies where desired.

Luis
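The duplicate IPv6 address mentioned above follows directly from how a link-local address is commonly derived from the MAC. The sketch below assumes the interface uses the classic modified EUI-64 scheme (flip the universal/local bit of the first octet and insert ff:fe in the middle); the function name `mac_to_link_local()` is made up for illustration. Under that assumption every backend left at FE:FF:FF:FF:FF:FF ends up with the same address, fe80::fcff:ffff:feff:ffff.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Derive the fe80::/64 link-local address from a 48-bit MAC using the
 * modified EUI-64 construction. addr receives the 16 address bytes in
 * network order.
 */
void mac_to_link_local(const uint8_t mac[6], uint8_t addr[16])
{
    assert(mac != NULL && addr != NULL);

    memset(addr, 0, 16);
    addr[0] = 0xfe;             /* fe80::/64 prefix */
    addr[1] = 0x80;
    addr[8] = mac[0] ^ 0x02;    /* flip the universal/local bit */
    addr[9] = mac[1];
    addr[10] = mac[2];
    addr[11] = 0xff;            /* ff:fe inserted between the MAC halves */
    addr[12] = 0xfe;
    addr[13] = mac[3];
    addr[14] = mac[4];
    addr[15] = mac[5];
}
```

Since the result depends only on the MAC, two vifs sharing the static address necessarily collide, while random MAC addresses make a collision unlikely.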