Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 00/11] Cleanup in brport flags switchdev offload for DSA
From: Vladimir Oltean <vladimir.oltean at nxp.com> The initial goal of this series was to have better support for standalone ports mode and multiple bridges on the DSA drivers like ocelot/felix and sja1105. Proper support for standalone mode requires disabling address learning, which in turn requires interaction with the switchdev notifier, which is actually where most of the patches are. I also noticed that most of the drivers are actually talking either to firmware or SPI/MDIO connected devices from the brport flags switchdev attribute handler, so it makes sense to actually make it sleepable instead of atomic. Vladimir Oltean (11): net: switchdev: propagate extack to port attributes net: bridge: offload all port flags at once in br_setport net: bridge: don't print in br_switchdev_set_port_flag net: dsa: configure proper brport flags when ports leave the bridge net: squash switchdev attributes PRE_BRIDGE_FLAGS and BRIDGE_FLAGS net: dsa: kill .port_egress_floods overengineering net: prep switchdev drivers for concurrent SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS net: bridge: put SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS on the blocking call chain net: mscc: ocelot: use separate flooding PGID for broadcast net: mscc: ocelot: offload bridge port flags to device net: dsa: sja1105: offload bridge port flags to device drivers/net/dsa/b53/b53_common.c | 20 +- drivers/net/dsa/mv88e6xxx/chip.c | 21 +- drivers/net/dsa/ocelot/felix.c | 10 + drivers/net/dsa/sja1105/sja1105.h | 2 + drivers/net/dsa/sja1105/sja1105_main.c | 212 +++++++++++++++++- drivers/net/dsa/sja1105/sja1105_spi.c | 6 + .../marvell/prestera/prestera_switchdev.c | 54 +++-- .../mellanox/mlxsw/spectrum_switchdev.c | 90 ++++---- drivers/net/ethernet/mscc/ocelot.c | 72 +++++- drivers/net/ethernet/mscc/ocelot_net.c | 7 +- drivers/net/ethernet/rocker/rocker.h | 2 +- drivers/net/ethernet/rocker/rocker_main.c | 24 +- drivers/net/ethernet/rocker/rocker_ofdpa.c | 26 ++- drivers/net/ethernet/ti/cpsw_switchdev.c | 35 ++- drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 43 ++-- include/net/dsa.h | 7 +- include/net/switchdev.h | 14 +- include/soc/mscc/ocelot.h | 18 +- net/bridge/br_netlink.c | 162 ++++++------- net/bridge/br_private.h | 6 +- net/bridge/br_switchdev.c | 33 ++- net/bridge/br_sysfs_if.c | 21 +- net/dsa/dsa_priv.h | 8 +- net/dsa/port.c | 76 ++++--- net/dsa/slave.c | 10 +- net/switchdev/switchdev.c | 11 +- 26 files changed, 654 insertions(+), 336 deletions(-) -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 01/11] net: switchdev: propagate extack to port attributes
From: Vladimir Oltean <vladimir.oltean at nxp.com> When a struct switchdev_attr is notified through switchdev, there is no way to report informational messages, unlike for struct switchdev_obj. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> Reviewed-by: Ido Schimmel <idosch at nvidia.com> --- Changes in v3: None. Changes in v2: Patch is new. .../ethernet/marvell/prestera/prestera_switchdev.c | 3 ++- .../net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 3 ++- drivers/net/ethernet/mscc/ocelot_net.c | 3 ++- drivers/net/ethernet/ti/cpsw_switchdev.c | 3 ++- include/net/switchdev.h | 6 ++++-- net/dsa/slave.c | 3 ++- net/switchdev/switchdev.c | 11 ++++++++--- 7 files changed, 22 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c index 8c2b03151736..2c1619715a4b 100644 --- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c +++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c @@ -695,7 +695,8 @@ static int prestera_port_attr_stp_state_set(struct prestera_port *port, } static int prestera_port_obj_attr_set(struct net_device *dev, - const struct switchdev_attr *attr) + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack) { struct prestera_port *port = netdev_priv(dev); int err = 0; diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c index 20c4f3c2cf23..18e4f1cd5587 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c @@ -887,7 +887,8 @@ mlxsw_sp_port_attr_br_mrouter_set(struct mlxsw_sp_port *mlxsw_sp_port, } static int mlxsw_sp_port_attr_set(struct net_device *dev, - const struct switchdev_attr *attr) + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack) { struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev); int err; diff --git a/drivers/net/ethernet/mscc/ocelot_net.c b/drivers/net/ethernet/mscc/ocelot_net.c index 8f12fa45b1b5..f9da4aa39444 100644 --- a/drivers/net/ethernet/mscc/ocelot_net.c +++ b/drivers/net/ethernet/mscc/ocelot_net.c @@ -1005,7 +1005,8 @@ static void ocelot_port_attr_mc_set(struct ocelot *ocelot, int port, bool mc) } static int ocelot_port_attr_set(struct net_device *dev, - const struct switchdev_attr *attr) + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack) { struct ocelot_port_private *priv = netdev_priv(dev); struct ocelot *ocelot = priv->port.ocelot; diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.c b/drivers/net/ethernet/ti/cpsw_switchdev.c index 9967cf985728..13524cbaa8b6 100644 --- a/drivers/net/ethernet/ti/cpsw_switchdev.c +++ b/drivers/net/ethernet/ti/cpsw_switchdev.c @@ -83,7 +83,8 @@ static int cpsw_port_attr_br_flags_pre_set(struct net_device *netdev, } static int cpsw_port_attr_set(struct net_device *ndev, - const struct switchdev_attr *attr) + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack) { struct cpsw_priv *priv = netdev_priv(ndev); int ret; diff --git a/include/net/switchdev.h b/include/net/switchdev.h index 88fcac140966..84c765312001 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -283,7 +283,8 @@ int switchdev_handle_port_attr_set(struct net_device *dev, struct switchdev_notifier_port_attr_info *port_attr_info, bool (*check_cb)(const struct net_device *dev), int (*set_cb)(struct net_device *dev, - const struct switchdev_attr *attr)); + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack)); #else static inline void switchdev_deferred_process(void) @@ -374,7 +375,8 @@ switchdev_handle_port_attr_set(struct net_device *dev, struct switchdev_notifier_port_attr_info *port_attr_info, bool (*check_cb)(const struct net_device *dev), int (*set_cb)(struct net_device *dev, - const struct switchdev_attr *attr)) + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack)) { return 0; } diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 431bdbdd8473..8f4c7c232e2c 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -271,7 +271,8 @@ static int dsa_slave_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) } static int dsa_slave_port_attr_set(struct net_device *dev, - const struct switchdev_attr *attr) + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack) { struct dsa_port *dp = dsa_slave_to_port(dev); int ret; diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c index 94113ca29dcf..0b84f076591e 100644 --- a/net/switchdev/switchdev.c +++ b/net/switchdev/switchdev.c @@ -488,14 +488,18 @@ static int __switchdev_handle_port_attr_set(struct net_device *dev, struct switchdev_notifier_port_attr_info *port_attr_info, bool (*check_cb)(const struct net_device *dev), int (*set_cb)(struct net_device *dev, - const struct switchdev_attr *attr)) + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack)) { + struct netlink_ext_ack *extack; struct net_device *lower_dev; struct list_head *iter; int err = -EOPNOTSUPP; + extack = switchdev_notifier_info_to_extack(&port_attr_info->info); + if (check_cb(dev)) { - err = set_cb(dev, port_attr_info->attr); + err = set_cb(dev, port_attr_info->attr, extack); if (err != -EOPNOTSUPP) port_attr_info->handled = true; return err; @@ -525,7 +529,8 @@ int switchdev_handle_port_attr_set(struct net_device *dev, struct switchdev_notifier_port_attr_info *port_attr_info, bool (*check_cb)(const struct net_device *dev), int (*set_cb)(struct net_device *dev, - const struct switchdev_attr *attr)) + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack)) { int err; -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 02/11] net: bridge: offload all port flags at once in br_setport
From: Vladimir Oltean <vladimir.oltean at nxp.com> If for example this command: ip link set swp0 type bridge_slave flood off mcast_flood off learning off succeeded at configuring BR_FLOOD and BR_MCAST_FLOOD but not at BR_LEARNING, there would be no attempt to revert the partial state in any way. Arguably, if the user changes more than one flag through the same netlink command, this one _should_ be all or nothing, which means it should be passed through switchdev as all or nothing. We also move the br->lock handling inside br_setport in anticipation of a future patch which will temporarily drop the lock around br_switchdev_set_port_flag, since we would like that switchdev notification to be emitted in blocking context. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- Changes in v3: Don't attempt to drop br->lock around br_switchdev_set_port_flag now, move that part to a later patch. Changes in v2: Patch is new. Changes in v2: Patch is new. net/bridge/br_netlink.c | 145 ++++++++++++++------------------------ net/bridge/br_switchdev.c | 7 +- 2 files changed, 58 insertions(+), 94 deletions(-) diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index bd3962da345a..4e64775bd8fb 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -853,103 +853,76 @@ static int br_set_port_state(struct net_bridge_port *p, u8 state) } /* Set/clear or port flags based on attribute */ -static int br_set_port_flag(struct net_bridge_port *p, struct nlattr *tb[], - int attrtype, unsigned long mask) +static void br_set_port_flag(struct net_bridge_port *p, struct nlattr *tb[], + int attrtype, unsigned long mask) { - unsigned long flags; - int err; - if (!tb[attrtype]) - return 0; + return; if (nla_get_u8(tb[attrtype])) - flags = p->flags | mask; + p->flags |= mask; else - flags = p->flags & ~mask; - - err = br_switchdev_set_port_flag(p, flags, mask); - if (err) - return err; - - p->flags = flags; - return 0; + p->flags &= ~mask; } /* Process bridge protocol info on port */ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[]) { - unsigned long old_flags = p->flags; - bool br_vlan_tunnel_old = false; + unsigned long old_flags, changed_mask; + bool br_vlan_tunnel_old; int err; - err = br_set_port_flag(p, tb, IFLA_BRPORT_MODE, BR_HAIRPIN_MODE); - if (err) - return err; + spin_lock_bh(&p->br->lock); - err = br_set_port_flag(p, tb, IFLA_BRPORT_GUARD, BR_BPDU_GUARD); - if (err) - return err; + old_flags = p->flags; + br_vlan_tunnel_old = (old_flags & BR_VLAN_TUNNEL) ? true : false; - err = br_set_port_flag(p, tb, IFLA_BRPORT_FAST_LEAVE, BR_MULTICAST_FAST_LEAVE); - if (err) - return err; + br_set_port_flag(p, tb, IFLA_BRPORT_MODE, BR_HAIRPIN_MODE); + br_set_port_flag(p, tb, IFLA_BRPORT_GUARD, BR_BPDU_GUARD); + br_set_port_flag(p, tb, IFLA_BRPORT_FAST_LEAVE, + BR_MULTICAST_FAST_LEAVE); + br_set_port_flag(p, tb, IFLA_BRPORT_PROTECT, BR_ROOT_BLOCK); + br_set_port_flag(p, tb, IFLA_BRPORT_LEARNING, BR_LEARNING); + br_set_port_flag(p, tb, IFLA_BRPORT_UNICAST_FLOOD, BR_FLOOD); + br_set_port_flag(p, tb, IFLA_BRPORT_MCAST_FLOOD, BR_MCAST_FLOOD); + br_set_port_flag(p, tb, IFLA_BRPORT_MCAST_TO_UCAST, + BR_MULTICAST_TO_UNICAST); + br_set_port_flag(p, tb, IFLA_BRPORT_BCAST_FLOOD, BR_BCAST_FLOOD); + br_set_port_flag(p, tb, IFLA_BRPORT_PROXYARP, BR_PROXYARP); + br_set_port_flag(p, tb, IFLA_BRPORT_PROXYARP_WIFI, BR_PROXYARP_WIFI); + br_set_port_flag(p, tb, IFLA_BRPORT_VLAN_TUNNEL, BR_VLAN_TUNNEL); + br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); + br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); - err = br_set_port_flag(p, tb, IFLA_BRPORT_PROTECT, BR_ROOT_BLOCK); - if (err) - return err; + changed_mask = old_flags ^ p->flags; - err = br_set_port_flag(p, tb, IFLA_BRPORT_LEARNING, BR_LEARNING); - if (err) - return err; - - err = br_set_port_flag(p, tb, IFLA_BRPORT_UNICAST_FLOOD, BR_FLOOD); - if (err) - return err; - - err = br_set_port_flag(p, tb, IFLA_BRPORT_MCAST_FLOOD, BR_MCAST_FLOOD); - if (err) - return err; - - err = br_set_port_flag(p, tb, IFLA_BRPORT_MCAST_TO_UCAST, BR_MULTICAST_TO_UNICAST); - if (err) - return err; - - err = br_set_port_flag(p, tb, IFLA_BRPORT_BCAST_FLOOD, BR_BCAST_FLOOD); - if (err) - return err; - - err = br_set_port_flag(p, tb, IFLA_BRPORT_PROXYARP, BR_PROXYARP); - if (err) - return err; - - err = br_set_port_flag(p, tb, IFLA_BRPORT_PROXYARP_WIFI, BR_PROXYARP_WIFI); - if (err) - return err; - - br_vlan_tunnel_old = (p->flags & BR_VLAN_TUNNEL) ? true : false; - err = br_set_port_flag(p, tb, IFLA_BRPORT_VLAN_TUNNEL, BR_VLAN_TUNNEL); - if (err) - return err; + err = br_switchdev_set_port_flag(p, p->flags, changed_mask); + if (err) { + p->flags = old_flags; + goto out; + } if (br_vlan_tunnel_old && !(p->flags & BR_VLAN_TUNNEL)) nbp_vlan_tunnel_info_flush(p); + br_port_flags_change(p, changed_mask); + if (tb[IFLA_BRPORT_COST]) { err = br_stp_set_path_cost(p, nla_get_u32(tb[IFLA_BRPORT_COST])); if (err) - return err; + goto out; } if (tb[IFLA_BRPORT_PRIORITY]) { err = br_stp_set_port_priority(p, nla_get_u16(tb[IFLA_BRPORT_PRIORITY])); if (err) - return err; + goto out; } if (tb[IFLA_BRPORT_STATE]) { err = br_set_port_state(p, nla_get_u8(tb[IFLA_BRPORT_STATE])); if (err) - return err; + goto out; } if (tb[IFLA_BRPORT_FLUSH]) @@ -961,7 +934,7 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[]) err = br_multicast_set_port_router(p, mcast_router); if (err) - return err; + goto out; } if (tb[IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT]) { @@ -970,27 +943,20 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[]) hlimit = nla_get_u32(tb[IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT]); err = br_multicast_eht_set_hosts_limit(p, hlimit); if (err) - return err; + goto out; } #endif if (tb[IFLA_BRPORT_GROUP_FWD_MASK]) { u16 fwd_mask = nla_get_u16(tb[IFLA_BRPORT_GROUP_FWD_MASK]); - if (fwd_mask & BR_GROUPFWD_MACPAUSE) - return -EINVAL; + if (fwd_mask & BR_GROUPFWD_MACPAUSE) { + err = -EINVAL; + goto out; + } p->group_fwd_mask = fwd_mask; } - err = br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, - BR_NEIGH_SUPPRESS); - if (err) - return err; - - err = br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); - if (err) - return err; - if (tb[IFLA_BRPORT_BACKUP_PORT]) { struct net_device *backup_dev = NULL; u32 backup_ifindex; @@ -999,17 +965,21 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[]) if (backup_ifindex) { backup_dev = __dev_get_by_index(dev_net(p->dev), backup_ifindex); - if (!backup_dev) - return -ENOENT; + if (!backup_dev) { + err = -ENOENT; + goto out; + } } err = nbp_backup_change(p, backup_dev); if (err) - return err; + goto out; } - br_port_flags_change(p, old_flags ^ p->flags); - return 0; +out: + spin_unlock_bh(&p->br->lock); + + return err; } /* Change state and parameters on port. */ @@ -1045,9 +1015,7 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh, u16 flags, if (err) return err; - spin_lock_bh(&p->br->lock); err = br_setport(p, tb); - spin_unlock_bh(&p->br->lock); } else { /* Binary compatibility with old RSTP */ if (nla_len(protinfo) < sizeof(u8)) @@ -1134,17 +1102,10 @@ static int br_port_slave_changelink(struct net_device *brdev, struct nlattr *data[], struct netlink_ext_ack *extack) { - struct net_bridge *br = netdev_priv(brdev); - int ret; - if (!data) return 0; - spin_lock_bh(&br->lock); - ret = br_setport(br_port_get_rtnl(dev), data); - spin_unlock_bh(&br->lock); - - return ret; + return br_setport(br_port_get_rtnl(dev), data); } static int br_port_fill_slave_info(struct sk_buff *skb, diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index a9c23ef83443..c004ade25ac0 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -65,16 +65,19 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, struct switchdev_attr attr = { .orig_dev = p->dev, .id = SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS, - .u.brport_flags = mask, }; struct switchdev_notifier_port_attr_info info = { .attr = &attr, }; int err; - if (mask & ~BR_PORT_FLAGS_HW_OFFLOAD) + flags &= BR_PORT_FLAGS_HW_OFFLOAD; + mask &= BR_PORT_FLAGS_HW_OFFLOAD; + if (!mask) return 0; + attr.u.brport_flags = mask; + /* We run from atomic context here */ err = call_switchdev_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev, &info.info, NULL); -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 03/11] net: bridge: don't print in br_switchdev_set_port_flag
From: Vladimir Oltean <vladimir.oltean at nxp.com> Currently br_switchdev_set_port_flag has two options for error handling and neither is good: - The driver returns -EOPNOTSUPP in PRE_BRIDGE_FLAGS if it doesn't support offloading that flag, and this gets silently ignored and converted to an errno of 0. Nobody does this. - The driver returns some other error code, like -EINVAL, in PRE_BRIDGE_FLAGS, and br_switchdev_set_port_flag shouts loudly. The problem is that we'd like to offload some port flags during bridge join and leave, but also not have the bridge shout at us if those fail. But on the other hand we'd like the user to know that we can't offload something when they set that through netlink. And since we can't have the driver return -EOPNOTSUPP or -EINVAL depending on whether it's called by the user or internally by the bridge, let's just add an extack argument to br_switchdev_set_port_flag and propagate it to its callers. Then, when we need offloading to really fail silently, this can simply be passed a NULL argument. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- Changes in v3: - Deal with the br_switchdev_set_port_flag call from sysfs too. Changes in v2: - br_set_port_flag now returns void, so no extack there. - don't overwrite extack in br_switchdev_set_port_flag if already populated. net/bridge/br_netlink.c | 9 +++++---- net/bridge/br_private.h | 6 ++++-- net/bridge/br_switchdev.c | 13 +++++++------ net/bridge/br_sysfs_if.c | 7 +++++-- 4 files changed, 21 insertions(+), 14 deletions(-) diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 4e64775bd8fb..b7731614c036 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -866,7 +866,8 @@ static void br_set_port_flag(struct net_bridge_port *p, struct nlattr *tb[], } /* Process bridge protocol info on port */ -static int br_setport(struct net_bridge_port *p, struct nlattr *tb[]) +static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], + struct netlink_ext_ack *extack) { unsigned long old_flags, changed_mask; bool br_vlan_tunnel_old; @@ -896,7 +897,7 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[]) changed_mask = old_flags ^ p->flags; - err = br_switchdev_set_port_flag(p, p->flags, changed_mask); + err = br_switchdev_set_port_flag(p, p->flags, changed_mask, extack); if (err) { p->flags = old_flags; goto out; @@ -1015,7 +1016,7 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh, u16 flags, if (err) return err; - err = br_setport(p, tb); + err = br_setport(p, tb, extack); } else { /* Binary compatibility with old RSTP */ if (nla_len(protinfo) < sizeof(u8)) @@ -1105,7 +1106,7 @@ static int br_port_slave_changelink(struct net_device *brdev, if (!data) return 0; - return br_setport(br_port_get_rtnl(dev), data); + return br_setport(br_port_get_rtnl(dev), data, extack); } static int br_port_fill_slave_info(struct sk_buff *skb, diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index d242ba668e47..a1639d41188b 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -1575,7 +1575,8 @@ bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p, const struct sk_buff *skb); int br_switchdev_set_port_flag(struct net_bridge_port *p, unsigned long flags, - unsigned long mask); + unsigned long mask, + struct netlink_ext_ack *extack); void br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type); int br_switchdev_port_vlan_add(struct net_device *dev, u16 vid, u16 flags, @@ -1605,7 +1606,8 @@ static inline bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p, static inline int br_switchdev_set_port_flag(struct net_bridge_port *p, unsigned long flags, - unsigned long mask) + unsigned long mask, + struct netlink_ext_ack *extack) { return 0; } diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index c004ade25ac0..ac8dead86bf2 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -60,7 +60,8 @@ bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p, int br_switchdev_set_port_flag(struct net_bridge_port *p, unsigned long flags, - unsigned long mask) + unsigned long mask, + struct netlink_ext_ack *extack) { struct switchdev_attr attr = { .orig_dev = p->dev, @@ -80,14 +81,15 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, /* We run from atomic context here */ err = call_switchdev_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev, - &info.info, NULL); + &info.info, extack); err = notifier_to_errno(err); if (err == -EOPNOTSUPP) return 0; if (err) { - br_warn(p->br, "bridge flag offload is not supported %u(%s)\n", - (unsigned int)p->port_no, p->dev->name); + if (extack && !extack->_msg) + NL_SET_ERR_MSG_MOD(extack, + "bridge flag offload is not supported"); return -EOPNOTSUPP; } @@ -97,8 +99,7 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, err = switchdev_port_attr_set(p->dev, &attr); if (err) { - br_warn(p->br, "error setting offload flag on port %u(%s)\n", - (unsigned int)p->port_no, p->dev->name); + NL_SET_ERR_MSG_MOD(extack, "error setting offload flag on port"); return err; } diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c index 5aea9427ffe1..72e92376eef1 100644 --- a/net/bridge/br_sysfs_if.c +++ b/net/bridge/br_sysfs_if.c @@ -59,6 +59,7 @@ static BRPORT_ATTR(_name, 0644, \ static int store_flag(struct net_bridge_port *p, unsigned long v, unsigned long mask) { + struct netlink_ext_ack extack = {0}; unsigned long flags = p->flags; int err; @@ -68,9 +69,11 @@ static int store_flag(struct net_bridge_port *p, unsigned long v, flags &= ~mask; if (flags != p->flags) { - err = br_switchdev_set_port_flag(p, flags, mask); - if (err) + err = br_switchdev_set_port_flag(p, flags, mask, &extack); + if (err) { + netdev_err(p->dev, "%s\n", extack._msg); return err; + } p->flags = flags; br_port_flags_change(p, mask); -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 04/11] net: dsa: configure proper brport flags when ports leave the bridge
From: Vladimir Oltean <vladimir.oltean at nxp.com> For a DSA switch port operating in standalone mode, address learning doesn't make much sense since that is a bridge function. In fact, address learning even breaks setups such as this one: +---------------------------------------------+ | | | +-------------------+ | | | br0 | send receive | | +--------+-+--------+ +--------+ +--------+ | | | | | | | | | | | | | swp0 | | swp1 | | swp2 | | swp3 | | | | | | | | | | | | +-+--------+-+--------+-+--------+-+--------+-+ | ^ | ^ | | | | | +-----------+ | | | +--------------------------------+ because if the switch has a single FDB (can offload a single bridge) then source address learning on swp3 can "steal" the source MAC address of swp2 from br0's FDB, because learning frames coming from swp2 will be done twice: first on the swp1 ingress port, second on the swp3 ingress port. So the hardware FDB will become out of sync with the software bridge, and when swp2 tries to send one more packet towards swp1, the ASIC will attempt to short-circuit the forwarding path and send it directly to swp3 (since that's the last port it learned that address on), which it obviously can't, because swp3 operates in standalone mode. So DSA drivers operating in standalone mode should still configure a list of bridge port flags even when they are standalone. Currently DSA attempts to call dsa_port_bridge_flags with 0, which disables egress flooding of unknown unicast and multicast, something which doesn't make much sense. For the switches that implement .port_egress_floods - b53 and mv88e6xxx, it probably doesn't matter too much either, since they can possibly inject traffic from the CPU into a standalone port, regardless of MAC DA, even if egress flooding is turned off for that port, but certainly not all DSA switches can do that - sja1105, for example, can't. So it makes sense to use a better common default there, such as "flood everything". It should also be noted that what DSA calls "dsa_port_bridge_flags()" is a degenerate name for just calling .port_egress_floods(), since nothing else is implemented - not learning, in particular. But disabling address learning, something that this driver is also coding up for, will be supported by individual drivers once .port_egress_floods is replaced with a more generic .port_bridge_flags. Previous attempts to code up this logic have been in the common bridge layer, but as pointed out by Ido Schimmel, there are corner cases that are missed when doing that: https://patchwork.kernel.org/project/netdevbpf/patch/20210209151936.97382-5-olteanv at gmail.com/ So, at least for now, let's leave DSA in charge of setting port flags before and after the bridge join and leave. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- Changes in v3: Patch is new, logically it was moved from the bridge layer to the DSA layer. net/dsa/port.c | 45 ++++++++++++++++++++++++++++++++++++++------- 1 file changed, 38 insertions(+), 7 deletions(-) diff --git a/net/dsa/port.c b/net/dsa/port.c index 5e079a61528e..9f65ba11ad00 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -122,6 +122,27 @@ void dsa_port_disable(struct dsa_port *dp) rtnl_unlock(); } +static void dsa_port_change_brport_flags(struct dsa_port *dp, + bool bridge_offload) +{ + unsigned long mask, flags; + int flag, err; + + mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; + if (bridge_offload) + flags = mask; + else + flags = mask & ~BR_LEARNING; + + for_each_set_bit(flag, &mask, 32) { + err = dsa_port_pre_bridge_flags(dp, BIT(flag)); + if (err) + continue; + + dsa_port_bridge_flags(dp, flags & BIT(flag)); + } +} + int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br) { struct dsa_notifier_bridge_info info = { @@ -132,10 +153,10 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br) }; int err; - /* Set the flooding mode before joining the port in the switch */ - err = dsa_port_bridge_flags(dp, BR_FLOOD | BR_MCAST_FLOOD); - if (err) - return err; + /* Notify the port driver to set its configurable flags in a way that + * matches the initial settings of a bridge port. + */ + dsa_port_change_brport_flags(dp, true); /* Here the interface is already bridged. Reflect the current * configuration so that drivers can program their chips accordingly. @@ -146,7 +167,7 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br) /* The bridging is rolled back on error */ if (err) { - dsa_port_bridge_flags(dp, 0); + dsa_port_change_brport_flags(dp, false); dp->bridge_dev = NULL; } @@ -172,8 +193,18 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br) if (err) pr_err("DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE\n"); - /* Port is leaving the bridge, disable flooding */ - dsa_port_bridge_flags(dp, 0); + /* Configure the port for standalone mode (no address learning, + * flood everything). + * The bridge only emits SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS events + * when the user requests it through netlink or sysfs, but not + * automatically at port join or leave, so we need to handle resetting + * the brport flags ourselves. But we even prefer it that way, because + * otherwise, some setups might never get the notification they need, + * for example, when a port leaves a LAG that offloads the bridge, + * it becomes standalone, but as far as the bridge is concerned, no + * port ever left. + */ + dsa_port_change_brport_flags(dp, false); /* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer, * so allow it to be in BR_STATE_FORWARDING to be kept functional -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 05/11] net: squash switchdev attributes PRE_BRIDGE_FLAGS and BRIDGE_FLAGS
From: Vladimir Oltean <vladimir.oltean at nxp.com> There does not appear to be any strong reason why br_switchdev_set_port_flag issues a separate notification for checking the supported brport flags rather than just attempting to apply them and propagating the error if that fails. However, there is a reason why this switchdev API is counterproductive for a driver writer, and that is because although br_switchdev_set_port_flag gets passed a "flags" and a "mask", those are passed piecemeal to the driver, so while the PRE_BRIDGE_FLAGS listener knows what changed because it has the "mask", the BRIDGE_FLAGS listener doesn't, because it only has the final value. This means that "edge detection" needs to be done by each individual BRIDGE_FLAGS listener by XOR-ing the old and the new flags, which in turn means that copying the flags into a driver private variable is strictly necessary. This can be solved by passing the "flags" and the "mask" together into a single switchdev attribute, and it also reduces some boilerplate in the drivers that offload this. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- Changes in v3: - Reworked mlxsw to check mask before performing any change. - Also adapted the newly introduced dsa_port_change_brport_flags to the new API. Changes in v2: - Renamed "val" to "flags". - Reworked drivers to check mask before performing any change. .../marvell/prestera/prestera_switchdev.c | 29 +++++---- .../mellanox/mlxsw/spectrum_switchdev.c | 59 +++++++++---------- drivers/net/ethernet/rocker/rocker_main.c | 24 ++------ drivers/net/ethernet/ti/cpsw_switchdev.c | 32 ++++------ drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 43 +++++++------- include/net/switchdev.h | 8 ++- net/bridge/br_switchdev.c | 15 +---- net/dsa/dsa_priv.h | 4 +- net/dsa/port.c | 43 ++++++-------- net/dsa/slave.c | 3 - 10 files changed, 109 insertions(+), 151 deletions(-) diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c index 2c1619715a4b..a797a7ff0cfe 100644 --- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c +++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c @@ -581,24 +581,32 @@ int prestera_bridge_port_event(struct net_device *dev, unsigned long event, static int prestera_port_attr_br_flags_set(struct prestera_port *port, struct net_device *dev, - unsigned long flags) + struct switchdev_brport_flags flags) { struct prestera_bridge_port *br_port; int err; + if (flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD)) + err = -EINVAL; + br_port = prestera_bridge_port_by_dev(port->sw->swdev, dev); if (!br_port) return 0; - err = prestera_hw_port_flood_set(port, flags & BR_FLOOD); - if (err) - return err; + if (flags.mask & BR_FLOOD) { + err = prestera_hw_port_flood_set(port, flags.val & BR_FLOOD); + if (err) + return err; + } - err = prestera_hw_port_learning_set(port, flags & BR_LEARNING); - if (err) - return err; + if (flags.mask & BR_LEARNING) { + err = prestera_hw_port_learning_set(port, + flags.val & BR_LEARNING); + if (err) + return err; + } - memcpy(&br_port->flags, &flags, sizeof(flags)); + memcpy(&br_port->flags, &flags.val, sizeof(flags.val)); return 0; } @@ -706,11 +714,6 @@ static int prestera_port_obj_attr_set(struct net_device *dev, err = prestera_port_attr_stp_state_set(port, attr->orig_dev, attr->u.stp_state); break; - case SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS: - if (attr->u.brport_flags & - ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD)) - err = -EINVAL; - break; case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: err = prestera_port_attr_br_flags_set(port, attr->orig_dev, attr->u.brport_flags); diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c index 18e4f1cd5587..7af79e3a3c7a 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c @@ -653,51 +653,52 @@ mlxsw_sp_bridge_port_learning_set(struct mlxsw_sp_port *mlxsw_sp_port, return err; } -static int mlxsw_sp_port_attr_br_pre_flags_set(struct mlxsw_sp_port - *mlxsw_sp_port, - unsigned long brport_flags) -{ - if (brport_flags & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD)) - return -EINVAL; - - return 0; -} - static int mlxsw_sp_port_attr_br_flags_set(struct mlxsw_sp_port *mlxsw_sp_port, struct net_device *orig_dev, - unsigned long brport_flags) + struct switchdev_brport_flags flags) { struct mlxsw_sp_bridge_port *bridge_port; int err; + if (flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD)) + return -EINVAL; + bridge_port = mlxsw_sp_bridge_port_find(mlxsw_sp_port->mlxsw_sp->bridge, orig_dev); if (!bridge_port) return 0; - err = mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, bridge_port, - MLXSW_SP_FLOOD_TYPE_UC, - brport_flags & BR_FLOOD); - if (err) - return err; + if (flags.mask & BR_FLOOD) { + err = mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, + bridge_port, + MLXSW_SP_FLOOD_TYPE_UC, + flags.val & BR_FLOOD); + if (err) + return err; + } - err = mlxsw_sp_bridge_port_learning_set(mlxsw_sp_port, bridge_port, - brport_flags & BR_LEARNING); - if (err) - return err; + if (flags.mask & BR_LEARNING) { + err = mlxsw_sp_bridge_port_learning_set(mlxsw_sp_port, + bridge_port, + flags.val & BR_LEARNING); + if (err) + return err; + } if (bridge_port->bridge_device->multicast_enabled) goto out; - err = mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, bridge_port, - MLXSW_SP_FLOOD_TYPE_MC, - brport_flags & - BR_MCAST_FLOOD); - if (err) - return err; + if (flags.mask & BR_MCAST_FLOOD) { + err = mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, + bridge_port, + MLXSW_SP_FLOOD_TYPE_MC, + flags.val & BR_MCAST_FLOOD); + if (err) + return err; + } out: - memcpy(&bridge_port->flags, &brport_flags, sizeof(brport_flags)); + memcpy(&bridge_port->flags, &flags.val, sizeof(flags.val)); return 0; } @@ -899,10 +900,6 @@ static int mlxsw_sp_port_attr_set(struct net_device *dev, attr->orig_dev, attr->u.stp_state); break; - case SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS: - err = mlxsw_sp_port_attr_br_pre_flags_set(mlxsw_sp_port, - attr->u.brport_flags); - break; case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: err = mlxsw_sp_port_attr_br_flags_set(mlxsw_sp_port, attr->orig_dev, diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c index 740a715c49c6..898abf3d14d0 100644 --- a/drivers/net/ethernet/rocker/rocker_main.c +++ b/drivers/net/ethernet/rocker/rocker_main.c @@ -1575,8 +1575,8 @@ rocker_world_port_attr_bridge_flags_support_get(const struct rocker_port * } static int -rocker_world_port_attr_pre_bridge_flags_set(struct rocker_port *rocker_port, - unsigned long brport_flags) +rocker_world_port_attr_bridge_flags_set(struct rocker_port *rocker_port, + struct switchdev_brport_flags flags) { struct rocker_world_ops *wops = rocker_port->rocker->wops; unsigned long brport_flags_s; @@ -1590,22 +1590,10 @@ rocker_world_port_attr_pre_bridge_flags_set(struct rocker_port *rocker_port, if (err) return err; - if (brport_flags & ~brport_flags_s) + if (flags.mask & ~brport_flags_s) return -EINVAL; - return 0; -} - -static int -rocker_world_port_attr_bridge_flags_set(struct rocker_port *rocker_port, - unsigned long brport_flags) -{ - struct rocker_world_ops *wops = rocker_port->rocker->wops; - - if (!wops->port_attr_bridge_flags_set) - return -EOPNOTSUPP; - - return wops->port_attr_bridge_flags_set(rocker_port, brport_flags); + return wops->port_attr_bridge_flags_set(rocker_port, flags.val); } static int @@ -2056,10 +2044,6 @@ static int rocker_port_attr_set(struct net_device *dev, err = rocker_world_port_attr_stp_state_set(rocker_port, attr->u.stp_state); break; - case SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS: - err = rocker_world_port_attr_pre_bridge_flags_set(rocker_port, - attr->u.brport_flags); - break; case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: err = rocker_world_port_attr_bridge_flags_set(rocker_port, attr->u.brport_flags); diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.c b/drivers/net/ethernet/ti/cpsw_switchdev.c index 13524cbaa8b6..5d8ec34f82ad 100644 --- a/drivers/net/ethernet/ti/cpsw_switchdev.c +++ b/drivers/net/ethernet/ti/cpsw_switchdev.c @@ -57,27 +57,25 @@ static int cpsw_port_stp_state_set(struct cpsw_priv *priv, u8 state) static int cpsw_port_attr_br_flags_set(struct cpsw_priv *priv, struct net_device *orig_dev, - unsigned long brport_flags) + struct switchdev_brport_flags flags) { struct cpsw_common *cpsw = priv->cpsw; - bool unreg_mcast_add = false; - if (brport_flags & BR_MCAST_FLOOD) - unreg_mcast_add = true; - dev_dbg(priv->dev, "BR_MCAST_FLOOD: %d port %u\n", - unreg_mcast_add, priv->emac_port); + if (flags.mask & ~(BR_LEARNING | BR_MCAST_FLOOD)) + return -EINVAL; - cpsw_ale_set_unreg_mcast(cpsw->ale, BIT(priv->emac_port), - unreg_mcast_add); + if (flags.mask & BR_MCAST_FLOOD) { + bool unreg_mcast_add = false; - return 0; -} + if (flags.val & BR_MCAST_FLOOD) + unreg_mcast_add = true; -static int cpsw_port_attr_br_flags_pre_set(struct net_device *netdev, - unsigned long flags) -{ - if (flags & ~(BR_LEARNING | BR_MCAST_FLOOD)) - return -EINVAL; + dev_dbg(priv->dev, "BR_MCAST_FLOOD: %d port %u\n", + unreg_mcast_add, priv->emac_port); + + cpsw_ale_set_unreg_mcast(cpsw->ale, BIT(priv->emac_port), + unreg_mcast_add); + } return 0; } @@ -92,10 +90,6 @@ static int cpsw_port_attr_set(struct net_device *ndev, dev_dbg(priv->dev, "attr: id %u port: %u\n", attr->id, priv->emac_port); switch (attr->id) { - case SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS: - ret = cpsw_port_attr_br_flags_pre_set(ndev, - attr->u.brport_flags); - break; case SWITCHDEV_ATTR_ID_PORT_STP_STATE: ret = cpsw_port_stp_state_set(priv, attr->u.stp_state); dev_dbg(priv->dev, "stp state: %u\n", attr->u.stp_state); diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c index ca3d07fe7f58..f675a2ba4dce 100644 --- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c +++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c @@ -908,31 +908,32 @@ static int dpaa2_switch_port_attr_stp_state_set(struct net_device *netdev, return dpaa2_switch_port_set_stp_state(port_priv, state); } -static int dpaa2_switch_port_attr_br_flags_pre_set(struct net_device *netdev, - unsigned long flags) -{ - if (flags & ~(BR_LEARNING | BR_FLOOD)) - return -EINVAL; - - return 0; -} - -static int dpaa2_switch_port_attr_br_flags_set(struct net_device *netdev, - unsigned long flags) +static int +dpaa2_switch_port_attr_br_flags_set(struct net_device *netdev, + struct switchdev_brport_flags flags) { struct ethsw_port_priv *port_priv = netdev_priv(netdev); int err = 0; - /* Learning is enabled per switch */ - err = dpaa2_switch_set_learning(port_priv->ethsw_data, - !!(flags & BR_LEARNING)); - if (err) - goto exit; + if (flags.mask & ~(BR_LEARNING | BR_FLOOD)) + return -EINVAL; + + if (flags.mask & BR_LEARNING) { + /* Learning is enabled per switch */ + err = dpaa2_switch_set_learning(port_priv->ethsw_data, + !!(flags.val & BR_LEARNING)); + if (err) + return err; + } - err = dpaa2_switch_port_set_flood(port_priv, !!(flags & BR_FLOOD)); + if (flags.mask & BR_FLOOD) { + err = dpaa2_switch_port_set_flood(port_priv, + !!(flags.val & BR_FLOOD)); + if (err) + return err; + } -exit: - return err; + return 0; } static int dpaa2_switch_port_attr_set(struct net_device *netdev, @@ -945,10 +946,6 @@ static int dpaa2_switch_port_attr_set(struct net_device *netdev, err = dpaa2_switch_port_attr_stp_state_set(netdev, attr->u.stp_state); break; - case SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS: - err = dpaa2_switch_port_attr_br_flags_pre_set(netdev, - attr->u.brport_flags); - break; case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: err = dpaa2_switch_port_attr_br_flags_set(netdev, attr->u.brport_flags); diff --git a/include/net/switchdev.h b/include/net/switchdev.h index 84c765312001..aa9cad9bad7d 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -20,7 +20,6 @@ enum switchdev_attr_id { SWITCHDEV_ATTR_ID_UNDEFINED, SWITCHDEV_ATTR_ID_PORT_STP_STATE, SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS, - SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS, SWITCHDEV_ATTR_ID_PORT_MROUTER, SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME, SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING, @@ -33,6 +32,11 @@ enum switchdev_attr_id { #endif }; +struct switchdev_brport_flags { + unsigned long val; + unsigned long mask; +}; + struct switchdev_attr { struct net_device *orig_dev; enum switchdev_attr_id id; @@ -41,7 +45,7 @@ struct switchdev_attr { void (*complete)(struct net_device *dev, int err, void *priv); union { u8 stp_state; /* PORT_STP_STATE */ - unsigned long brport_flags; /* PORT_{PRE}_BRIDGE_FLAGS */ + struct switchdev_brport_flags brport_flags; /* PORT_BRIDGE_FLAGS */ bool mrouter; /* PORT_MROUTER */ clock_t ageing_time; /* BRIDGE_AGEING_TIME */ bool vlan_filtering; /* BRIDGE_VLAN_FILTERING */ diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index ac8dead86bf2..92d207c8a30e 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -65,7 +65,7 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, { struct switchdev_attr attr = { .orig_dev = p->dev, - .id = SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS, + .id = SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS, }; struct switchdev_notifier_port_attr_info info = { .attr = &attr, @@ -77,7 +77,8 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, if (!mask) return 0; - attr.u.brport_flags = mask; + attr.u.brport_flags.val = flags; + attr.u.brport_flags.mask = mask; /* We run from atomic context here */ err = call_switchdev_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev, @@ -93,16 +94,6 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, return -EOPNOTSUPP; } - attr.id = SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS; - attr.flags = SWITCHDEV_F_DEFER; - attr.u.brport_flags = flags; - - err = switchdev_port_attr_set(p->dev, &attr); - if (err) { - NL_SET_ERR_MSG_MOD(extack, "error setting offload flag on port"); - return err; - } - return 0; } diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index 8a1bcb2b4208..63770e421e4d 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -174,8 +174,8 @@ int dsa_port_mdb_add(const struct dsa_port *dp, const struct switchdev_obj_port_mdb *mdb); int dsa_port_mdb_del(const struct dsa_port *dp, const struct switchdev_obj_port_mdb *mdb); -int dsa_port_pre_bridge_flags(const struct dsa_port *dp, unsigned long flags); -int dsa_port_bridge_flags(const struct dsa_port *dp, unsigned long flags); +int dsa_port_bridge_flags(const struct dsa_port *dp, + struct switchdev_brport_flags flags); int dsa_port_mrouter(struct dsa_port *dp, bool mrouter); int dsa_port_vlan_add(struct dsa_port *dp, const struct switchdev_obj_port_vlan *vlan); diff --git a/net/dsa/port.c b/net/dsa/port.c index 9f65ba11ad00..736c5debcb96 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -125,21 +125,22 @@ void dsa_port_disable(struct dsa_port *dp) static void dsa_port_change_brport_flags(struct dsa_port *dp, bool bridge_offload) { - unsigned long mask, flags; - int flag, err; + struct switchdev_brport_flags flags; + int flag; - mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; + flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; if (bridge_offload) - flags = mask; + flags.val = flags.mask; else - flags = mask & ~BR_LEARNING; + flags.val = flags.mask & ~BR_LEARNING; - for_each_set_bit(flag, &mask, 32) { - err = dsa_port_pre_bridge_flags(dp, BIT(flag)); - if (err) - continue; + for_each_set_bit(flag, &flags.mask, 32) { + struct switchdev_brport_flags tmp; - dsa_port_bridge_flags(dp, flags & BIT(flag)); + tmp.val = flags.val & BIT(flag); + tmp.mask = BIT(flag); + + dsa_port_bridge_flags(dp, tmp); } } @@ -423,28 +424,18 @@ int dsa_port_ageing_time(struct dsa_port *dp, clock_t ageing_clock) return 0; } -int dsa_port_pre_bridge_flags(const struct dsa_port *dp, unsigned long flags) +int dsa_port_bridge_flags(const struct dsa_port *dp, + struct switchdev_brport_flags flags) { struct dsa_switch *ds = dp->ds; + int port = dp->index; if (!ds->ops->port_egress_floods || - (flags & ~(BR_FLOOD | BR_MCAST_FLOOD))) + (flags.mask & ~(BR_FLOOD | BR_MCAST_FLOOD))) return -EINVAL; - return 0; -} - -int dsa_port_bridge_flags(const struct dsa_port *dp, unsigned long flags) -{ - struct dsa_switch *ds = dp->ds; - int port = dp->index; - int err = 0; - - if (ds->ops->port_egress_floods) - err = ds->ops->port_egress_floods(ds, port, flags & BR_FLOOD, - flags & BR_MCAST_FLOOD); - - return err; + return ds->ops->port_egress_floods(ds, port, flags.val & BR_FLOOD, + flags.val & BR_MCAST_FLOOD); } int dsa_port_mrouter(struct dsa_port *dp, bool mrouter) diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 8f4c7c232e2c..0e1f8f1d4e2c 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -290,9 +290,6 @@ static int dsa_slave_port_attr_set(struct net_device *dev, case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME: ret = dsa_port_ageing_time(dp, attr->u.ageing_time); break; - case SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS: - ret = dsa_port_pre_bridge_flags(dp, attr->u.brport_flags); - break; case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: ret = dsa_port_bridge_flags(dp, attr->u.brport_flags); break; -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 06/11] net: dsa: kill .port_egress_floods overengineering
From: Vladimir Oltean <vladimir.oltean at nxp.com> The bridge offloads the port flags through a single bit mask using switchdev, which among others, contains learning and flooding settings. The commit 57652796aa97 ("net: dsa: add support for bridge flags") missed one crucial aspect of the SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS API when designing the API one level lower, towards the drivers. This is that the bitmask of passed brport flags never has more than one bit set at a time. On the other hand, the prototype passed to the driver is .port_egress_floods(int port, bool unicast, bool multicast), which configures two flags at a time. DSA currently checks if .port_egress_floods is implemented, and if it is, reports both BR_FLOOD and BR_MCAST_FLOOD as supported. So the driver has no choice if it wants to inform the bridge that, for example, it can't configure unicast flooding independently of multicast flooding - the DSA mid layer is standing in the way. Or the other way around: a new driver wants to start configuring BR_BCAST_FLOOD separately, but what do we do with the rest, which only support unicast and multicast flooding? Do we report broadcast flooding configuration as supported for those too, and silently do nothing? Secondly, currently DSA deems the driver too dumb to deserve knowing that a SWITCHDEV_ATTR_ID_BRIDGE_MROUTER attribute was offloaded, because it just calls .port_egress_floods for the CPU port. When we'll add support for the plain SWITCHDEV_ATTR_ID_PORT_MROUTER, that will become a real problem because the flood settings will need to be held statefully in the DSA middle layer, otherwise changing the mrouter port attribute will impact the flooding attribute. And that's _assuming_ that the underlying hardware doesn't have anything else to do when a multicast router attaches to a port than flood unknown traffic to it. If it does, there will need to be a dedicated .port_set_mrouter anyway. Lastly, we have DSA drivers that have a backlink into a pure switchdev driver (felix -> ocelot). It seems reasonable that the other switchdev drivers should not have to suffer from the oddities of DSA overengineering, so keeping DSA a pass-through layer makes more sense there. To simplify the brport flags situation we just delete .port_egress_floods and we introduce a simple .port_bridge_flags which is passed to the driver. Also, the logic from dsa_port_mrouter is removed and a .port_set_mrouter is created. Functionally speaking, we simply move the calls to .port_egress_floods one step lower, in the two drivers that implement it: mv88e6xxx and b53, so things should work just as before. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- Changes in v3: - Pass extack through the newly introduce dsa_port_change_brport_flags. Changes in v2: - Reordered with previous patch such that we don't need to introduce .port_pre_bridge_flags - Pass extack to drivers. drivers/net/dsa/b53/b53_common.c | 20 +++++++++++++++++++- drivers/net/dsa/mv88e6xxx/chip.c | 21 ++++++++++++++++++++- include/net/dsa.h | 7 +++++-- net/dsa/dsa_priv.h | 6 ++++-- net/dsa/port.c | 20 +++++++++----------- net/dsa/slave.c | 4 ++-- 6 files changed, 59 insertions(+), 19 deletions(-) diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 23fc7225c8d1..d480493cb64d 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1948,6 +1948,23 @@ int b53_br_egress_floods(struct dsa_switch *ds, int port, } EXPORT_SYMBOL(b53_br_egress_floods); +static int b53_br_flags(struct dsa_switch *ds, int port, + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack) +{ + if (flags.mask & ~(BR_FLOOD | BR_MCAST_FLOOD)) + return -EINVAL; + + return b53_br_egress_floods(ds, port, flags.val & BR_FLOOD, + flags.val & BR_MCAST_FLOOD); +} + +static int b53_set_mrouter(struct dsa_switch *ds, int port, bool mrouter, + struct netlink_ext_ack *extack) +{ + return b53_br_egress_floods(ds, port, true, mrouter); +} + static bool b53_possible_cpu_port(struct dsa_switch *ds, int port) { /* Broadcom switches will accept enabling Broadcom tags on the @@ -2187,9 +2204,10 @@ static const struct dsa_switch_ops b53_switch_ops = { .set_mac_eee = b53_set_mac_eee, .port_bridge_join = b53_br_join, .port_bridge_leave = b53_br_leave, + .port_bridge_flags = b53_br_flags, + .port_set_mrouter = b53_set_mrouter, .port_stp_state_set = b53_br_set_stp_state, .port_fast_age = b53_br_fast_age, - .port_egress_floods = b53_br_egress_floods, .port_vlan_filtering = b53_vlan_filtering, .port_vlan_add = b53_vlan_add, .port_vlan_del = b53_vlan_del, diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index ae0b490f00cd..b230bfcc4050 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -5380,6 +5380,24 @@ static int mv88e6xxx_port_egress_floods(struct dsa_switch *ds, int port, return err; } +static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack) +{ + if (flags.mask & ~(BR_FLOOD | BR_MCAST_FLOOD)) + return -EINVAL; + + return mv88e6xxx_port_egress_floods(ds, port, flags.val & BR_FLOOD, + flags.val & BR_MCAST_FLOOD); +} + +static int mv88e6xxx_port_set_mrouter(struct dsa_switch *ds, int port, + bool mrouter, + struct netlink_ext_ack *extack) +{ + return mv88e6xxx_port_egress_floods(ds, port, true, mrouter); +} + static bool mv88e6xxx_lag_can_offload(struct dsa_switch *ds, struct net_device *lag, struct netdev_lag_upper_info *info) @@ -5678,7 +5696,8 @@ static const struct dsa_switch_ops mv88e6xxx_switch_ops = { .set_ageing_time = mv88e6xxx_set_ageing_time, .port_bridge_join = mv88e6xxx_port_bridge_join, .port_bridge_leave = mv88e6xxx_port_bridge_leave, - .port_egress_floods = mv88e6xxx_port_egress_floods, + .port_bridge_flags = mv88e6xxx_port_bridge_flags, + .port_set_mrouter = mv88e6xxx_port_set_mrouter, .port_stp_state_set = mv88e6xxx_port_stp_state_set, .port_fast_age = mv88e6xxx_port_fast_age, .port_vlan_filtering = mv88e6xxx_port_vlan_filtering, diff --git a/include/net/dsa.h b/include/net/dsa.h index 60acb9fca124..09aa28e667c7 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -621,8 +621,11 @@ struct dsa_switch_ops { void (*port_stp_state_set)(struct dsa_switch *ds, int port, u8 state); void (*port_fast_age)(struct dsa_switch *ds, int port); - int (*port_egress_floods)(struct dsa_switch *ds, int port, - bool unicast, bool multicast); + int (*port_bridge_flags)(struct dsa_switch *ds, int port, + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack); + int (*port_set_mrouter)(struct dsa_switch *ds, int port, bool mrouter, + struct netlink_ext_ack *extack); /* * VLAN support diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index 63770e421e4d..8125806ee135 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -175,8 +175,10 @@ int dsa_port_mdb_add(const struct dsa_port *dp, int dsa_port_mdb_del(const struct dsa_port *dp, const struct switchdev_obj_port_mdb *mdb); int dsa_port_bridge_flags(const struct dsa_port *dp, - struct switchdev_brport_flags flags); -int dsa_port_mrouter(struct dsa_port *dp, bool mrouter); + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack); +int dsa_port_mrouter(struct dsa_port *dp, bool mrouter, + struct netlink_ext_ack *extack); int dsa_port_vlan_add(struct dsa_port *dp, const struct switchdev_obj_port_vlan *vlan); int dsa_port_vlan_del(struct dsa_port *dp, diff --git a/net/dsa/port.c b/net/dsa/port.c index 736c5debcb96..545f83c7b193 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -140,7 +140,7 @@ static void dsa_port_change_brport_flags(struct dsa_port *dp, tmp.val = flags.val & BIT(flag); tmp.mask = BIT(flag); - dsa_port_bridge_flags(dp, tmp); + dsa_port_bridge_flags(dp, tmp, NULL); } } @@ -425,28 +425,26 @@ int dsa_port_ageing_time(struct dsa_port *dp, clock_t ageing_clock) } int dsa_port_bridge_flags(const struct dsa_port *dp, - struct switchdev_brport_flags flags) + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack) { struct dsa_switch *ds = dp->ds; - int port = dp->index; - if (!ds->ops->port_egress_floods || - (flags.mask & ~(BR_FLOOD | BR_MCAST_FLOOD))) + if (!ds->ops->port_bridge_flags) return -EINVAL; - return ds->ops->port_egress_floods(ds, port, flags.val & BR_FLOOD, - flags.val & BR_MCAST_FLOOD); + return ds->ops->port_bridge_flags(ds, dp->index, flags, extack); } -int dsa_port_mrouter(struct dsa_port *dp, bool mrouter) +int dsa_port_mrouter(struct dsa_port *dp, bool mrouter, + struct netlink_ext_ack *extack) { struct dsa_switch *ds = dp->ds; - int port = dp->index; - if (!ds->ops->port_egress_floods) + if (!ds->ops->port_set_mrouter) return -EOPNOTSUPP; - return ds->ops->port_egress_floods(ds, port, true, mrouter); + return ds->ops->port_set_mrouter(ds, dp->index, mrouter, extack); } int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu, diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 0e1f8f1d4e2c..4a979245e059 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -291,10 +291,10 @@ static int dsa_slave_port_attr_set(struct net_device *dev, ret = dsa_port_ageing_time(dp, attr->u.ageing_time); break; case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: - ret = dsa_port_bridge_flags(dp, attr->u.brport_flags); + ret = dsa_port_bridge_flags(dp, attr->u.brport_flags, extack); break; case SWITCHDEV_ATTR_ID_BRIDGE_MROUTER: - ret = dsa_port_mrouter(dp->cpu_dp, attr->u.mrouter); + ret = dsa_port_mrouter(dp->cpu_dp, attr->u.mrouter, extack); break; default: ret = -EOPNOTSUPP; -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 07/11] net: prep switchdev drivers for concurrent SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS
From: Vladimir Oltean <vladimir.oltean at nxp.com> Because the bridge will start offloading SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS while not serialized by any lock such as the br->lock spinlock, existing drivers that treat that attribute and cache the brport flags might no longer work correctly. The issue is that the brport flags are a single unsigned long bitmask, and the bridge only guarantees the validity of the changed bits, not the full state. So when offloading two concurrent switchdev attributes, such as one for BR_LEARNING and another for BR_FLOOD, it might happen that the flags having BR_FLOOD are written into the cached value, and this in turn disables the BR_LEARNING bit which was set previously. We can fix this across the board by keeping individual boolean variables for each brport flag. Note that mlxsw and prestera were setting the BR_LEARNING_SYNC flag too, but that appears to be just dead code, so I removed it. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- Changes in v3: Patch is new. .../marvell/prestera/prestera_switchdev.c | 32 ++++++++++------ .../mellanox/mlxsw/spectrum_switchdev.c | 38 ++++++++++++------- drivers/net/ethernet/rocker/rocker.h | 2 +- drivers/net/ethernet/rocker/rocker_main.c | 2 +- drivers/net/ethernet/rocker/rocker_ofdpa.c | 26 +++++++------ net/bridge/br_switchdev.c | 3 +- 6 files changed, 62 insertions(+), 41 deletions(-) diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c index a797a7ff0cfe..76030c4c1546 100644 --- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c +++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c @@ -49,7 +49,9 @@ struct prestera_bridge_port { struct prestera_bridge *bridge; struct list_head vlan_list; refcount_t ref_count; - unsigned long flags; + bool learning; + bool ucast_flood; + bool mcast_flood; u8 stp_state; }; @@ -339,8 +341,9 @@ prestera_bridge_port_create(struct prestera_bridge *bridge, if (!br_port) return NULL; - br_port->flags = BR_LEARNING | BR_FLOOD | BR_LEARNING_SYNC | - BR_MCAST_FLOOD; + br_port->learning = true; + br_port->ucast_flood = true; + br_port->mcast_flood = true; br_port->stp_state = BR_STATE_DISABLED; refcount_set(&br_port->ref_count, 1); br_port->bridge = bridge; @@ -404,11 +407,11 @@ prestera_bridge_1d_port_join(struct prestera_bridge_port *br_port) if (err) return err; - err = prestera_hw_port_flood_set(port, br_port->flags & BR_FLOOD); + err = prestera_hw_port_flood_set(port, br_port->ucast_flood); if (err) goto err_port_flood_set; - err = prestera_hw_port_learning_set(port, br_port->flags & BR_LEARNING); + err = prestera_hw_port_learning_set(port, br_port->learning); if (err) goto err_port_learning_set; @@ -594,19 +597,24 @@ static int prestera_port_attr_br_flags_set(struct prestera_port *port, return 0; if (flags.mask & BR_FLOOD) { - err = prestera_hw_port_flood_set(port, flags.val & BR_FLOOD); + bool ucast_flood = !!(flags.val & BR_FLOOD); + + err = prestera_hw_port_flood_set(port, ucast_flood); if (err) return err; + + br_port->ucast_flood = ucast_flood; } if (flags.mask & BR_LEARNING) { - err = prestera_hw_port_learning_set(port, - flags.val & BR_LEARNING); + bool learning = !!(flags.val & BR_LEARNING); + + err = prestera_hw_port_learning_set(port, learning); if (err) return err; - } - memcpy(&br_port->flags, &flags.val, sizeof(flags.val)); + br_port->learning = learning; + } return 0; } @@ -899,11 +907,11 @@ prestera_port_vlan_bridge_join(struct prestera_port_vlan *port_vlan, if (port_vlan->br_port) return 0; - err = prestera_hw_port_flood_set(port, br_port->flags & BR_FLOOD); + err = prestera_hw_port_flood_set(port, br_port->ucast_flood); if (err) return err; - err = prestera_hw_port_learning_set(port, br_port->flags & BR_LEARNING); + err = prestera_hw_port_learning_set(port, br_port->learning); if (err) goto err_port_learning_set; diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c index 7af79e3a3c7a..0c00754e0cb5 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c @@ -62,9 +62,11 @@ struct mlxsw_sp_bridge_port { struct list_head vlans_list; unsigned int ref_count; u8 stp_state; - unsigned long flags; bool mrouter; bool lagged; + bool ucast_flood; + bool mcast_flood; + bool learning; union { u16 lag_id; u16 system_port; @@ -349,8 +351,9 @@ mlxsw_sp_bridge_port_create(struct mlxsw_sp_bridge_device *bridge_device, bridge_port->dev = brport_dev; bridge_port->bridge_device = bridge_device; bridge_port->stp_state = BR_STATE_DISABLED; - bridge_port->flags = BR_LEARNING | BR_FLOOD | BR_LEARNING_SYNC | - BR_MCAST_FLOOD; + bridge_port->learning = true; + bridge_port->ucast_flood = true; + bridge_port->mcast_flood = true; INIT_LIST_HEAD(&bridge_port->vlans_list); list_add(&bridge_port->list, &bridge_device->ports_list); bridge_port->ref_count = 1; @@ -669,36 +672,45 @@ static int mlxsw_sp_port_attr_br_flags_set(struct mlxsw_sp_port *mlxsw_sp_port, return 0; if (flags.mask & BR_FLOOD) { + bool ucast_flood = !!(flags.val & BR_FLOOD); + err = mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, bridge_port, MLXSW_SP_FLOOD_TYPE_UC, - flags.val & BR_FLOOD); + ucast_flood); if (err) return err; + + bridge_port->ucast_flood = ucast_flood; } if (flags.mask & BR_LEARNING) { + bool learning = !!(flags.val & BR_LEARNING); + err = mlxsw_sp_bridge_port_learning_set(mlxsw_sp_port, - bridge_port, - flags.val & BR_LEARNING); + bridge_port, learning); if (err) return err; + + bridge_port->learning = learning; } if (bridge_port->bridge_device->multicast_enabled) - goto out; + return 0; if (flags.mask & BR_MCAST_FLOOD) { + bool mcast_flood = !!(flags.val & BR_MCAST_FLOOD); + err = mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, bridge_port, MLXSW_SP_FLOOD_TYPE_MC, - flags.val & BR_MCAST_FLOOD); + mcast_flood); if (err) return err; + + bridge_port->mcast_flood = mcast_flood; } -out: - memcpy(&bridge_port->flags, &flags.val, sizeof(flags.val)); return 0; } @@ -796,7 +808,7 @@ static bool mlxsw_sp_mc_flood(const struct mlxsw_sp_bridge_port *bridge_port) bridge_device = bridge_port->bridge_device; return bridge_device->multicast_enabled ? bridge_port->mrouter : - bridge_port->flags & BR_MCAST_FLOOD; + bridge_port->mcast_flood; } static int mlxsw_sp_port_mc_disabled_set(struct mlxsw_sp_port *mlxsw_sp_port, @@ -962,7 +974,7 @@ mlxsw_sp_port_vlan_fid_join(struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan, return PTR_ERR(fid); err = mlxsw_sp_fid_flood_set(fid, MLXSW_SP_FLOOD_TYPE_UC, local_port, - bridge_port->flags & BR_FLOOD); + bridge_port->ucast_flood); if (err) goto err_fid_uc_flood_set; @@ -1043,7 +1055,7 @@ mlxsw_sp_port_vlan_bridge_join(struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan, return err; err = mlxsw_sp_port_vid_learning_set(mlxsw_sp_port, vid, - bridge_port->flags & BR_LEARNING); + bridge_port->learning); if (err) goto err_port_vid_learning_set; diff --git a/drivers/net/ethernet/rocker/rocker.h b/drivers/net/ethernet/rocker/rocker.h index 315a6e5c0f59..2f533c44fe88 100644 --- a/drivers/net/ethernet/rocker/rocker.h +++ b/drivers/net/ethernet/rocker/rocker.h @@ -103,7 +103,7 @@ struct rocker_world_ops { int (*port_attr_stp_state_set)(struct rocker_port *rocker_port, u8 state); int (*port_attr_bridge_flags_set)(struct rocker_port *rocker_port, - unsigned long brport_flags); + struct switchdev_brport_flags flags); int (*port_attr_bridge_flags_support_get)(const struct rocker_port * rocker_port, unsigned long * diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c index 898abf3d14d0..b1a67c6423a8 100644 --- a/drivers/net/ethernet/rocker/rocker_main.c +++ b/drivers/net/ethernet/rocker/rocker_main.c @@ -1593,7 +1593,7 @@ rocker_world_port_attr_bridge_flags_set(struct rocker_port *rocker_port, if (flags.mask & ~brport_flags_s) return -EINVAL; - return wops->port_attr_bridge_flags_set(rocker_port, flags.val); + return wops->port_attr_bridge_flags_set(rocker_port, flags); } static int diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c index 967a634ee9ac..eb36c2a2da7b 100644 --- a/drivers/net/ethernet/rocker/rocker_ofdpa.c +++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c @@ -198,7 +198,7 @@ struct ofdpa_port { struct net_device *bridge_dev; __be16 internal_vlan_id; int stp_state; - u32 brport_flags; + bool learning; unsigned long ageing_time; bool ctrls[OFDPA_CTRL_MAX]; unsigned long vlan_bitmap[OFDPA_VLAN_BITMAP_LEN]; @@ -2423,7 +2423,7 @@ static int ofdpa_port_pre_init(struct rocker_port *rocker_port) ofdpa_port->rocker_port = rocker_port; ofdpa_port->dev = rocker_port->dev; ofdpa_port->pport = rocker_port->pport; - ofdpa_port->brport_flags = BR_LEARNING; + ofdpa_port->learning = true; ofdpa_port->ageing_time = BR_DEFAULT_AGEING_TIME; return 0; } @@ -2433,8 +2433,7 @@ static int ofdpa_port_init(struct rocker_port *rocker_port) struct ofdpa_port *ofdpa_port = rocker_port->wpriv; int err; - rocker_port_set_learning(rocker_port, - !!(ofdpa_port->brport_flags & BR_LEARNING)); + rocker_port_set_learning(rocker_port, ofdpa_port->learning); err = ofdpa_port_ig_tbl(ofdpa_port, 0); if (err) { @@ -2488,20 +2487,23 @@ static int ofdpa_port_attr_stp_state_set(struct rocker_port *rocker_port, } static int ofdpa_port_attr_bridge_flags_set(struct rocker_port *rocker_port, - unsigned long brport_flags) + struct switchdev_brport_flags flags) { struct ofdpa_port *ofdpa_port = rocker_port->wpriv; - unsigned long orig_flags; - int err = 0; + int err; - orig_flags = ofdpa_port->brport_flags; - ofdpa_port->brport_flags = brport_flags; + if (flags.mask & BR_LEARNING) { + bool learning = !!(flags.val & BR_LEARNING); - if ((orig_flags ^ ofdpa_port->brport_flags) & BR_LEARNING) err = rocker_port_set_learning(ofdpa_port->rocker_port, - !!(ofdpa_port->brport_flags & BR_LEARNING)); + learning); + if (err) + return err; - return err; + ofdpa_port->learning = learning; + } + + return 0; } static int diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 92d207c8a30e..dbd94156960f 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -72,12 +72,11 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, }; int err; - flags &= BR_PORT_FLAGS_HW_OFFLOAD; mask &= BR_PORT_FLAGS_HW_OFFLOAD; if (!mask) return 0; - attr.u.brport_flags.val = flags; + attr.u.brport_flags.val = flags & mask; attr.u.brport_flags.mask = mask; /* We run from atomic context here */ -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 08/11] net: bridge: put SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS on the blocking call chain
From: Vladimir Oltean <vladimir.oltean at nxp.com> Since we would like br_switchdev_set_port_flag to not use an atomic notifier, it should be called from outside spinlock context. We can temporarily drop br->lock, but that creates some concurrency complications (example below is given for sysfs): - There might be an "echo 1 > multicast_flood" simultaneous with an "echo 0 > multicast_flood". The result of this is nondeterministic either way, so I'm not too concerned as long as the result is consistent (no other flags have changed). - There might be an "echo 1 > multicast_flood" simultaneous with an "echo 0 > learning". My expectation is that none of the two writes are "eaten", and the final flags contain BR_MCAST_FLOOD=1 and BR_LEARNING=0 regardless of the order of execution. That is actually possible if, on the commit path, we don't do a trivial "p->flags = flags" which might overwrite bits outside of our mask, but instead we just change the flags corresponding to our mask. Now that br_switchdev_set_port_flag is never called from under br->lock, it runs in sleepable context. All switchdev drivers handle SWITCHDEV_PORT_ATTR_SET as both blocking and atomic, so no changes are needed on that front. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- Changes in v3: - Drop the br->lock around br_switchdev_set_port_flag in this patch, for both sysfs and netlink. - Only set/restore the masked bits in p->flags to avoid concurrency issues. Changes in v2: Patch is new. net/bridge/br_netlink.c | 10 +++++++--- net/bridge/br_switchdev.c | 5 ++--- net/bridge/br_sysfs_if.c | 22 ++++++++++++++-------- 3 files changed, 23 insertions(+), 14 deletions(-) diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index b7731614c036..8f09106966c4 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -869,7 +869,7 @@ static void br_set_port_flag(struct net_bridge_port *p, struct nlattr *tb[], static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], struct netlink_ext_ack *extack) { - unsigned long old_flags, changed_mask; + unsigned long flags, old_flags, changed_mask; bool br_vlan_tunnel_old; int err; @@ -896,10 +896,14 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); changed_mask = old_flags ^ p->flags; + flags = p->flags; - err = br_switchdev_set_port_flag(p, p->flags, changed_mask, extack); + spin_unlock_bh(&p->br->lock); + err = br_switchdev_set_port_flag(p, flags, changed_mask, extack); + spin_lock_bh(&p->br->lock); if (err) { - p->flags = old_flags; + p->flags &= ~changed_mask; + p->flags |= (old_flags & changed_mask); goto out; } diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index dbd94156960f..a79164ee65b9 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -79,9 +79,8 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, attr.u.brport_flags.val = flags & mask; attr.u.brport_flags.mask = mask; - /* We run from atomic context here */ - err = call_switchdev_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev, - &info.info, extack); + err = call_switchdev_blocking_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev, + &info.info, extack); err = notifier_to_errno(err); if (err == -EOPNOTSUPP) return 0; diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c index 72e92376eef1..3f21fdd1cdaa 100644 --- a/net/bridge/br_sysfs_if.c +++ b/net/bridge/br_sysfs_if.c @@ -68,16 +68,22 @@ static int store_flag(struct net_bridge_port *p, unsigned long v, else flags &= ~mask; - if (flags != p->flags) { - err = br_switchdev_set_port_flag(p, flags, mask, &extack); - if (err) { - netdev_err(p->dev, "%s\n", extack._msg); - return err; - } + if (flags == p->flags) + return 0; - p->flags = flags; - br_port_flags_change(p, mask); + spin_unlock_bh(&p->br->lock); + err = br_switchdev_set_port_flag(p, flags, mask, &extack); + spin_lock_bh(&p->br->lock); + if (err) { + netdev_err(p->dev, "%s\n", extack._msg); + return err; } + + p->flags &= ~mask; + p->flags |= (flags & mask); + + br_port_flags_change(p, mask); + return 0; } -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 09/11] net: mscc: ocelot: use separate flooding PGID for broadcast
From: Vladimir Oltean <vladimir.oltean at nxp.com> In preparation of offloading the bridge port flags which have independent settings for unknown multicast and for broadcast, we should also start reserving one destination Port Group ID for the flooding of broadcast packets, to allow configuring it individually. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- Changes in v3: None. Changes in v2: None. drivers/net/ethernet/mscc/ocelot.c | 13 ++++++++----- include/soc/mscc/ocelot.h | 15 ++++++++------- 2 files changed, 16 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index f8b85ab8be5d..8c1052346b58 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -1662,7 +1662,7 @@ int ocelot_init(struct ocelot *ocelot) /* Setup flooding PGIDs */ for (i = 0; i < ocelot->num_flooding_pgids; i++) ocelot_write_rix(ocelot, ANA_FLOODING_FLD_MULTICAST(PGID_MC) | - ANA_FLOODING_FLD_BROADCAST(PGID_MC) | + ANA_FLOODING_FLD_BROADCAST(PGID_BC) | ANA_FLOODING_FLD_UNICAST(PGID_UC), ANA_FLOODING, i); ocelot_write(ocelot, ANA_FLOODING_IPMC_FLD_MC6_DATA(PGID_MCIPV6) | @@ -1683,15 +1683,18 @@ int ocelot_init(struct ocelot *ocelot) ocelot_write_rix(ocelot, 0, ANA_PGID_PGID, PGID_SRC + port); } - /* Allow broadcast MAC frames. */ for_each_nonreserved_multicast_dest_pgid(ocelot, i) { u32 val = ANA_PGID_PGID_PGID(GENMASK(ocelot->num_phys_ports - 1, 0)); ocelot_write_rix(ocelot, val, ANA_PGID_PGID, i); } - ocelot_write_rix(ocelot, - ANA_PGID_PGID_PGID(GENMASK(ocelot->num_phys_ports, 0)), - ANA_PGID_PGID, PGID_MC); + /* Allow broadcast and unknown L2 multicast to the CPU. */ + ocelot_rmw_rix(ocelot, ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)), + ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)), + ANA_PGID_PGID, PGID_MC); + ocelot_rmw_rix(ocelot, ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)), + ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)), + ANA_PGID_PGID, PGID_BC); ocelot_write_rix(ocelot, 0, ANA_PGID_PGID, PGID_MCIPV4); ocelot_write_rix(ocelot, 0, ANA_PGID_PGID, PGID_MCIPV6); diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h index d0d48e9620fb..7ee85527cb5f 100644 --- a/include/soc/mscc/ocelot.h +++ b/include/soc/mscc/ocelot.h @@ -54,16 +54,17 @@ * PGID_CPU: used for whitelisting certain MAC addresses, such as the addresses * of the switch port net devices, towards the CPU port module. * PGID_UC: the flooding destinations for unknown unicast traffic. - * PGID_MC: the flooding destinations for broadcast and non-IP multicast - * traffic. + * PGID_MC: the flooding destinations for non-IP multicast traffic. * PGID_MCIPV4: the flooding destinations for IPv4 multicast traffic. * PGID_MCIPV6: the flooding destinations for IPv6 multicast traffic. + * PGID_BC: the flooding destinations for broadcast traffic. */ -#define PGID_CPU 59 -#define PGID_UC 60 -#define PGID_MC 61 -#define PGID_MCIPV4 62 -#define PGID_MCIPV6 63 +#define PGID_CPU 58 +#define PGID_UC 59 +#define PGID_MC 60 +#define PGID_MCIPV4 61 +#define PGID_MCIPV6 62 +#define PGID_BC 63 #define for_each_unicast_dest_pgid(ocelot, pgid) \ for ((pgid) = 0; \ -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 10/11] net: mscc: ocelot: offload bridge port flags to device
From: Vladimir Oltean <vladimir.oltean at nxp.com> We should not be unconditionally enabling address learning, since doing that is actively detrimential when a port is standalone and not offloading a bridge. Namely, if a port in the switch is standalone and others are offloading the bridge, then we could enter a situation where we learn an address towards the standalone port, but the bridged ports could not forward the packet there, because the CPU is the only path between the standalone and the bridged ports. The solution of course is to not enable address learning unless the bridge asks for it. We need to set up the initial port flags for no learning and flooding everything, then the bridge takes over. The flood configuration was already configured ok in ocelot_init, we just need to disable learning in ocelot_init_port. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> Reviewed-by: Alexandre Belloni <alexandre.belloni at bootlin.com> --- Changes in v3: None. Changes in v2: - Disable learning in ocelot_init_port. - Keep a single bool ocelot_port->learn_ena instead of ocelot_port->brport_flags. - Stop touching the brport_flags from ocelot_port_bridge_leave (which was a leftover). drivers/net/dsa/ocelot/felix.c | 10 +++++ drivers/net/ethernet/mscc/ocelot.c | 59 +++++++++++++++++++++++++- drivers/net/ethernet/mscc/ocelot_net.c | 4 ++ include/soc/mscc/ocelot.h | 3 ++ 4 files changed, 75 insertions(+), 1 deletion(-) diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c index 1bd5aea12b25..4ff18415ef95 100644 --- a/drivers/net/dsa/ocelot/felix.c +++ b/drivers/net/dsa/ocelot/felix.c @@ -553,6 +553,15 @@ static void felix_bridge_stp_state_set(struct dsa_switch *ds, int port, return ocelot_bridge_stp_state_set(ocelot, port, state); } +static int felix_bridge_flags(struct dsa_switch *ds, int port, + struct switchdev_brport_flags val, + struct netlink_ext_ack *extack) +{ + struct ocelot *ocelot = ds->priv; + + return ocelot_port_bridge_flags(ocelot, port, val); +} + static int felix_bridge_join(struct dsa_switch *ds, int port, struct net_device *br) { @@ -1358,6 +1367,7 @@ const struct dsa_switch_ops felix_switch_ops = { .port_fdb_del = felix_fdb_del, .port_mdb_add = felix_mdb_add, .port_mdb_del = felix_mdb_del, + .port_bridge_flags = felix_bridge_flags, .port_bridge_join = felix_bridge_join, .port_bridge_leave = felix_bridge_leave, .port_lag_join = felix_lag_join, diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index 8c1052346b58..65bc7d59d2c9 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -984,6 +984,7 @@ EXPORT_SYMBOL(ocelot_apply_bridge_fwd_mask); void ocelot_bridge_stp_state_set(struct ocelot *ocelot, int port, u8 state) { + struct ocelot_port *ocelot_port = ocelot->ports[port]; u32 port_cfg; if (!(BIT(port) & ocelot->bridge_mask)) @@ -996,7 +997,8 @@ void ocelot_bridge_stp_state_set(struct ocelot *ocelot, int port, u8 state) ocelot->bridge_fwd_mask |= BIT(port); fallthrough; case BR_STATE_LEARNING: - port_cfg |= ANA_PORT_PORT_CFG_LEARN_ENA; + if (ocelot_port->learn_ena) + port_cfg |= ANA_PORT_PORT_CFG_LEARN_ENA; break; default: @@ -1480,6 +1482,57 @@ int ocelot_get_max_mtu(struct ocelot *ocelot, int port) } EXPORT_SYMBOL(ocelot_get_max_mtu); +int ocelot_port_bridge_flags(struct ocelot *ocelot, int port, + struct switchdev_brport_flags flags) +{ + struct ocelot_port *ocelot_port = ocelot->ports[port]; + + if (flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | + BR_BCAST_FLOOD)) + return -EINVAL; + + if (flags.mask & BR_LEARNING) { + u32 val = 0; + + ocelot_port->learn_ena = !!(flags.val & BR_LEARNING); + if (ocelot_port->learn_ena) + val = ANA_PORT_PORT_CFG_LEARN_ENA; + + ocelot_rmw_gix(ocelot, val, ANA_PORT_PORT_CFG_LEARN_ENA, + ANA_PORT_PORT_CFG, port); + } + + if (flags.mask & BR_FLOOD) { + u32 val = 0; + + if (flags.val & BR_FLOOD) + val = BIT(port); + + ocelot_rmw_rix(ocelot, val, BIT(port), ANA_PGID_PGID, PGID_UC); + } + + if (flags.mask & BR_MCAST_FLOOD) { + u32 val = 0; + + if (flags.val & BR_MCAST_FLOOD) + val = BIT(port); + + ocelot_rmw_rix(ocelot, val, BIT(port), ANA_PGID_PGID, PGID_MC); + } + + if (flags.mask & BR_BCAST_FLOOD) { + u32 val = 0; + + if (flags.val & BR_BCAST_FLOOD) + val = BIT(port); + + ocelot_rmw_rix(ocelot, val, BIT(port), ANA_PGID_PGID, PGID_BC); + } + + return 0; +} +EXPORT_SYMBOL(ocelot_port_bridge_flags); + void ocelot_init_port(struct ocelot *ocelot, int port) { struct ocelot_port *ocelot_port = ocelot->ports[port]; @@ -1524,6 +1577,10 @@ void ocelot_init_port(struct ocelot *ocelot, int port) ANA_PORT_DROP_CFG_DROP_MC_SMAC_ENA, ANA_PORT_DROP_CFG, port); + /* Disable source address learning for standalone mode */ + ocelot_rmw_gix(ocelot, 0, ANA_PORT_PORT_CFG_LEARN_ENA, + ANA_PORT_PORT_CFG, port); + /* Set default VLAN and tag type to 8021Q. */ ocelot_rmw_gix(ocelot, REW_PORT_VLAN_CFG_PORT_TPID(ETH_P_8021Q), REW_PORT_VLAN_CFG_PORT_TPID_M, diff --git a/drivers/net/ethernet/mscc/ocelot_net.c b/drivers/net/ethernet/mscc/ocelot_net.c index f9da4aa39444..9944d79c6685 100644 --- a/drivers/net/ethernet/mscc/ocelot_net.c +++ b/drivers/net/ethernet/mscc/ocelot_net.c @@ -1026,6 +1026,10 @@ static int ocelot_port_attr_set(struct net_device *dev, case SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED: ocelot_port_attr_mc_set(ocelot, port, !attr->u.mc_disabled); break; + case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: + err = ocelot_port_bridge_flags(ocelot, port, + attr->u.brport_flags); + break; default: err = -EOPNOTSUPP; break; diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h index 7ee85527cb5f..e6aacf65fa1e 100644 --- a/include/soc/mscc/ocelot.h +++ b/include/soc/mscc/ocelot.h @@ -612,6 +612,7 @@ struct ocelot_port { u8 *xmit_template; bool is_dsa_8021q_cpu; + bool learn_ena; struct net_device *bond; bool lag_tx_active; @@ -764,6 +765,8 @@ void ocelot_adjust_link(struct ocelot *ocelot, int port, int ocelot_port_vlan_filtering(struct ocelot *ocelot, int port, bool enabled); void ocelot_bridge_stp_state_set(struct ocelot *ocelot, int port, u8 state); void ocelot_apply_bridge_fwd_mask(struct ocelot *ocelot); +int ocelot_port_bridge_flags(struct ocelot *ocelot, int port, + struct switchdev_brport_flags val); int ocelot_port_bridge_join(struct ocelot *ocelot, int port, struct net_device *bridge); int ocelot_port_bridge_leave(struct ocelot *ocelot, int port, -- 2.25.1
Vladimir Oltean
2021-Feb-10 09:14 UTC
[Bridge] [PATCH v3 net-next 11/11] net: dsa: sja1105: offload bridge port flags to device
From: Vladimir Oltean <vladimir.oltean at nxp.com> The chip can configure unicast flooding, broadcast flooding and learning. Learning is per port, while flooding is per {ingress, egress} port pair and we need to configure the same value for all possible ingress ports towards the requested one. While multicast flooding is not officially supported, we can hack it by using a feature of the second generation (P/Q/R/S) devices, which is that FDB entries are maskable, and multicast addresses always have an odd first octet. So by putting a match-all for 00:01:00:00:00:00 addr and 00:01:00:00:00:00 mask at the end of the FDB, we make sure that it is always checked last, and does not take precedence in front of any other MDB. So it behaves effectively as an unknown multicast entry. For the first generation switches, this feature is not available, so unknown multicast will always be treated the same as unknown unicast. So the only thing we can do is request the user to offload the settings for these 2 flags in tandem, i.e. ip link set swp2 type bridge_slave flood off Error: sja1105: This chip cannot configure multicast flooding independently of unicast. ip link set swp2 type bridge_slave flood off mcast_flood off ip link set swp2 type bridge_slave mcast_flood on Error: sja1105: This chip cannot configure multicast flooding independently of unicast. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- Changes in v3: None. Changes in v2: Patch is new. drivers/net/dsa/sja1105/sja1105.h | 2 + drivers/net/dsa/sja1105/sja1105_main.c | 212 +++++++++++++++++++++++-- drivers/net/dsa/sja1105/sja1105_spi.c | 6 + 3 files changed, 209 insertions(+), 11 deletions(-) diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h index d582308c2401..15a0893d0ff1 100644 --- a/drivers/net/dsa/sja1105/sja1105.h +++ b/drivers/net/dsa/sja1105/sja1105.h @@ -94,6 +94,7 @@ struct sja1105_info { * pop it when it's equal to TPID2. */ u16 qinq_tpid; + bool can_limit_mcast_flood; int (*reset_cmd)(struct dsa_switch *ds); int (*setup_rgmii_delay)(const void *ctx, int port); /* Prototypes from include/net/dsa.h */ @@ -204,6 +205,7 @@ struct sja1105_private { bool rgmii_rx_delay[SJA1105_NUM_PORTS]; bool rgmii_tx_delay[SJA1105_NUM_PORTS]; bool best_effort_vlan_filtering; + unsigned long learn_ena; const struct sja1105_info *info; struct gpio_desc *reset_gpio; struct spi_device *spidev; diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index 282253543f3b..8373cc1f5df1 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -25,6 +25,8 @@ #include "sja1105_sgmii.h" #include "sja1105_tas.h" +#define SJA1105_UNKNOWN_MULTICAST 0x010000000000ull + static const struct dsa_switch_ops sja1105_switch_ops; static void sja1105_hw_reset(struct gpio_desc *gpio, unsigned int pulse_len, @@ -42,15 +44,10 @@ static void sja1105_port_allow_traffic(struct sja1105_l2_forwarding_entry *l2_fwd, int from, int to, bool allow) { - if (allow) { - l2_fwd[from].bc_domain |= BIT(to); + if (allow) l2_fwd[from].reach_port |= BIT(to); - l2_fwd[from].fl_domain |= BIT(to); - } else { - l2_fwd[from].bc_domain &= ~BIT(to); + else l2_fwd[from].reach_port &= ~BIT(to); - l2_fwd[from].fl_domain &= ~BIT(to); - } } /* Structure used to temporarily transport device tree @@ -220,17 +217,43 @@ static int sja1105_init_mii_settings(struct sja1105_private *priv, static int sja1105_init_static_fdb(struct sja1105_private *priv) { + struct sja1105_l2_lookup_entry *l2_lookup; struct sja1105_table *table; + int port; table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP]; - /* We only populate the FDB table through dynamic - * L2 Address Lookup entries + /* We only populate the FDB table through dynamic L2 Address Lookup + * entries, except for a special entry at the end which is a catch-all + * for unknown multicast and will be used to control flooding domain. */ if (table->entry_count) { kfree(table->entries); table->entry_count = 0; } + + if (!priv->info->can_limit_mcast_flood) + return 0; + + table->entries = kcalloc(1, table->ops->unpacked_entry_size, + GFP_KERNEL); + if (!table->entries) + return -ENOMEM; + + table->entry_count = 1; + l2_lookup = table->entries; + + /* All L2 multicast addresses have an odd first octet */ + l2_lookup[0].macaddr = SJA1105_UNKNOWN_MULTICAST; + l2_lookup[0].mask_macaddr = SJA1105_UNKNOWN_MULTICAST; + l2_lookup[0].lockeds = true; + l2_lookup[0].index = SJA1105_MAX_L2_LOOKUP_COUNT - 1; + + /* Flood multicast to every port by default */ + for (port = 0; port < priv->ds->num_ports; port++) + if (!dsa_is_unused_port(priv->ds, port)) + l2_lookup[0].destports |= BIT(port); + return 0; } @@ -390,6 +413,12 @@ static int sja1105_init_l2_forwarding(struct sja1105_private *priv) sja1105_port_allow_traffic(l2fwd, i, upstream, true); sja1105_port_allow_traffic(l2fwd, upstream, i, true); + + l2fwd[i].bc_domain = BIT(upstream); + l2fwd[i].fl_domain = BIT(upstream); + + l2fwd[upstream].bc_domain |= BIT(i); + l2fwd[upstream].fl_domain |= BIT(i); } /* Next 8 entries define VLAN PCP mapping from ingress to egress. * Create a one-to-one mapping. @@ -1514,6 +1543,12 @@ static int sja1105_fdb_dump(struct dsa_switch *ds, int port, */ if (!(l2_lookup.destports & BIT(port))) continue; + + /* We need to hide the FDB entry for unknown multicast */ + if (l2_lookup.macaddr == SJA1105_UNKNOWN_MULTICAST && + l2_lookup.mask_macaddr == SJA1105_UNKNOWN_MULTICAST) + continue; + u64_to_ether_addr(l2_lookup.macaddr, macaddr); /* We need to hide the dsa_8021q VLANs from the user. */ @@ -1605,12 +1640,12 @@ static void sja1105_bridge_stp_state_set(struct dsa_switch *ds, int port, case BR_STATE_LEARNING: mac[port].ingress = true; mac[port].egress = false; - mac[port].dyn_learn = true; + mac[port].dyn_learn = !!(priv->learn_ena & BIT(port)); break; case BR_STATE_FORWARDING: mac[port].ingress = true; mac[port].egress = true; - mac[port].dyn_learn = true; + mac[port].dyn_learn = !!(priv->learn_ena & BIT(port)); break; default: dev_err(ds->dev, "invalid STP state: %d\n", state); @@ -3239,6 +3274,160 @@ static void sja1105_port_policer_del(struct dsa_switch *ds, int port) sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING); } +static int sja1105_port_set_learning(struct sja1105_private *priv, int port, + bool enabled) +{ + struct sja1105_mac_config_entry *mac; + int rc; + + mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries; + + mac[port].dyn_learn = !!(priv->learn_ena & BIT(port)); + + rc = sja1105_dynamic_config_write(priv, BLK_IDX_MAC_CONFIG, port, + &mac[port], true); + if (rc) + return rc; + + if (enabled) + priv->learn_ena |= BIT(port); + else + priv->learn_ena &= ~BIT(port); + + return 0; +} + +/* Common function for unicast and broadcast flood configuration. + * Flooding is configured between each {ingress, egress} port pair, and since + * the bridge's semantics are those of "egress flooding", it means we must + * enable flooding towards this port from all ingress ports that are in the + * same bridge. In practice, we just enable flooding from all possible ingress + * ports regardless of whether they're in the same bridge or not, since the + * reach_port configuration will not allow flooded frames to leak across + * bridging domains anyway. + */ +static int sja1105_port_ucast_bcast_flood(struct sja1105_private *priv, int to, + struct switchdev_brport_flags flags) +{ + struct sja1105_l2_forwarding_entry *l2_fwd; + int from, rc; + + l2_fwd = priv->static_config.tables[BLK_IDX_L2_FORWARDING].entries; + + for (from = 0; from < priv->ds->num_ports; from++) { + if (dsa_is_unused_port(priv->ds, from)) + continue; + if (from == to) + continue; + + /* Unicast */ + if (flags.mask & BR_FLOOD) { + if (flags.val & BR_FLOOD) + l2_fwd[from].fl_domain |= BIT(to); + else + l2_fwd[from].fl_domain &= ~BIT(to); + } + /* Broadcast */ + if (flags.mask & BR_BCAST_FLOOD) { + if (flags.val & BR_BCAST_FLOOD) + l2_fwd[from].bc_domain |= BIT(to); + else + l2_fwd[from].bc_domain &= ~BIT(to); + } + + rc = sja1105_dynamic_config_write(priv, BLK_IDX_L2_FORWARDING, + from, &l2_fwd[from], true); + if (rc < 0) + return rc; + } + + return 0; +} + +static int sja1105_port_mcast_flood(struct sja1105_private *priv, int to, + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack) +{ + struct sja1105_l2_lookup_entry *l2_lookup; + struct sja1105_table *table; + int match; + + table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP]; + l2_lookup = table->entries; + + for (match = 0; match < table->entry_count; match++) + if (l2_lookup[match].macaddr == SJA1105_UNKNOWN_MULTICAST && + l2_lookup[match].mask_macaddr == SJA1105_UNKNOWN_MULTICAST) + break; + + if (match == table->entry_count) { + NL_SET_ERR_MSG_MOD(extack, + "Could not find FDB entry for unknown multicast"); + return -ENOSPC; + } + + if (flags.val & BR_MCAST_FLOOD) + l2_lookup[match].destports |= BIT(to); + else + l2_lookup[match].destports &= ~BIT(to); + + return sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP, + l2_lookup[match].index, + &l2_lookup[match], + true); +} + +static int sja1105_port_bridge_flags(struct dsa_switch *ds, int port, + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack) +{ + struct sja1105_private *priv = ds->priv; + int rc; + + if (flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | + BR_BCAST_FLOOD)) + return -EINVAL; + + if (flags.mask & BR_LEARNING) { + bool learn_ena = !!(flags.val & BR_LEARNING); + + rc = sja1105_port_set_learning(priv, port, learn_ena); + if (rc) + return rc; + } + + if (flags.mask & (BR_FLOOD | BR_BCAST_FLOOD)) { + rc = sja1105_port_ucast_bcast_flood(priv, port, flags); + if (rc) + return rc; + } + + if (flags.mask & (BR_FLOOD | BR_MCAST_FLOOD) && + !priv->info->can_limit_mcast_flood) { + bool multicast = !!(flags.val & BR_MCAST_FLOOD); + bool unicast = !!(flags.val & BR_FLOOD); + + if (unicast != multicast) { + NL_SET_ERR_MSG_MOD(extack, + "This chip cannot configure multicast flooding independently of unicast"); + return -EINVAL; + } + } + + /* For chips that can't offload BR_MCAST_FLOOD independently, there + * is nothing to do here, we ensured the configuration is in sync by + * offloading BR_FLOOD. + */ + if (flags.mask & BR_MCAST_FLOOD && priv->info->can_limit_mcast_flood) { + rc = sja1105_port_mcast_flood(priv, port, flags, + extack); + if (rc) + return rc; + } + + return 0; +} + static const struct dsa_switch_ops sja1105_switch_ops = { .get_tag_protocol = sja1105_get_tag_protocol, .setup = sja1105_setup, @@ -3262,6 +3451,7 @@ static const struct dsa_switch_ops sja1105_switch_ops = { .port_fdb_del = sja1105_fdb_del, .port_bridge_join = sja1105_bridge_join, .port_bridge_leave = sja1105_bridge_leave, + .port_bridge_flags = sja1105_port_bridge_flags, .port_stp_state_set = sja1105_bridge_stp_state_set, .port_vlan_filtering = sja1105_vlan_filtering, .port_vlan_add = sja1105_vlan_add, diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c index 591c5734747d..f7a1514f81e8 100644 --- a/drivers/net/dsa/sja1105/sja1105_spi.c +++ b/drivers/net/dsa/sja1105/sja1105_spi.c @@ -512,6 +512,7 @@ const struct sja1105_info sja1105e_info = { .static_ops = sja1105e_table_ops, .dyn_ops = sja1105et_dyn_ops, .qinq_tpid = ETH_P_8021Q, + .can_limit_mcast_flood = false, .ptp_ts_bits = 24, .ptpegr_ts_bytes = 4, .num_cbs_shapers = SJA1105ET_MAX_CBS_COUNT, @@ -529,6 +530,7 @@ const struct sja1105_info sja1105t_info = { .static_ops = sja1105t_table_ops, .dyn_ops = sja1105et_dyn_ops, .qinq_tpid = ETH_P_8021Q, + .can_limit_mcast_flood = false, .ptp_ts_bits = 24, .ptpegr_ts_bytes = 4, .num_cbs_shapers = SJA1105ET_MAX_CBS_COUNT, @@ -546,6 +548,7 @@ const struct sja1105_info sja1105p_info = { .static_ops = sja1105p_table_ops, .dyn_ops = sja1105pqrs_dyn_ops, .qinq_tpid = ETH_P_8021AD, + .can_limit_mcast_flood = true, .ptp_ts_bits = 32, .ptpegr_ts_bytes = 8, .num_cbs_shapers = SJA1105PQRS_MAX_CBS_COUNT, @@ -564,6 +567,7 @@ const struct sja1105_info sja1105q_info = { .static_ops = sja1105q_table_ops, .dyn_ops = sja1105pqrs_dyn_ops, .qinq_tpid = ETH_P_8021AD, + .can_limit_mcast_flood = true, .ptp_ts_bits = 32, .ptpegr_ts_bytes = 8, .num_cbs_shapers = SJA1105PQRS_MAX_CBS_COUNT, @@ -582,6 +586,7 @@ const struct sja1105_info sja1105r_info = { .static_ops = sja1105r_table_ops, .dyn_ops = sja1105pqrs_dyn_ops, .qinq_tpid = ETH_P_8021AD, + .can_limit_mcast_flood = true, .ptp_ts_bits = 32, .ptpegr_ts_bytes = 8, .num_cbs_shapers = SJA1105PQRS_MAX_CBS_COUNT, @@ -601,6 +606,7 @@ const struct sja1105_info sja1105s_info = { .dyn_ops = sja1105pqrs_dyn_ops, .regs = &sja1105pqrs_regs, .qinq_tpid = ETH_P_8021AD, + .can_limit_mcast_flood = true, .ptp_ts_bits = 32, .ptpegr_ts_bytes = 8, .num_cbs_shapers = SJA1105PQRS_MAX_CBS_COUNT, -- 2.25.1
Nikolay Aleksandrov
2021-Feb-10 10:31 UTC
[Bridge] [PATCH v3 net-next 00/11] Cleanup in brport flags switchdev offload for DSA
On 10/02/2021 11:14, Vladimir Oltean wrote:> From: Vladimir Oltean <vladimir.oltean at nxp.com> > > The initial goal of this series was to have better support for > standalone ports mode and multiple bridges on the DSA drivers like > ocelot/felix and sja1105. Proper support for standalone mode requires > disabling address learning, which in turn requires interaction with the > switchdev notifier, which is actually where most of the patches are. > > I also noticed that most of the drivers are actually talking either to > firmware or SPI/MDIO connected devices from the brport flags switchdev > attribute handler, so it makes sense to actually make it sleepable > instead of atomic. >Hi Vladimir, Let's take a step back for a moment and discuss the bridge unlock/lock sequences that come with this set. I'd really like to avoid those as they're a recipe for future problems. The only good way to achieve that currently is to keep the PRE_FLAGS call and do that in unsleepable context but move the FLAGS call after the flags have been changed (if they have changed obviously). That would make the code read much easier since we'll have all our lock/unlock sequences in the same code blocks and won't play games to get sleepable context. Please let's think and work in that direction, rather than having: + spin_lock_bh(&p->br->lock); + if (err) { + netdev_err(p->dev, "%s\n", extack._msg); + return err; } + which immediately looks like a bug even though after some code checking we can verify it's ok. WDYT? I plan to get rid of most of the br->lock since it's been abused for a very long time because it's essentially STP lock, but people have started using it for other things and I plan to fix that when I get more time. Thanks, Nik