Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 00/15] RX filtering in DSA
From: Vladimir Oltean <vladimir.oltean at nxp.com> This is my 5th stab at creating a list of unicast and multicast addresses that the DSA CPU ports must trap. I am reusing a lot of Tobias's work which he submitted here: https://patchwork.kernel.org/project/netdevbpf/cover/20210116012515.3152-1-tobias at waldekranz.com/ My additions to Tobias' work come in the form of taking some care that additions and removals of host addresses are properly balanced, so that we can do reference counting on them for cross-chip setups and multiple bridges spanning the same switch (I am working on an NXP board where both are real requirements). During the last attempted submission of multiple CPU ports for DSA: https://patchwork.kernel.org/project/netdevbpf/cover/20210410133454.4768-1-ansuelsmth at gmail.com/ it became clear that the concept of multiple CPU ports would not be compatible with the idea of address learning on those CPU ports (when those CPU ports are statically assigned to user ports, not in a LAG) unless the switch supports complete FDB isolation, which most switches do not. So DSA needs to manage in software all addresses that are installed on the CPU port(s), which is what this patch set does. Compared to all earlier attempts, this series does not fiddle with how DSA operates the ports in standalone mode at all, just when bridged. We need to sort that out properly, then any optimization that comes in standalone mode (i.e. IFF_UNICAST_FLT) can come later. Changes in v5: - added READ_ONCE and WRITE_ONCE for fdb->dst - removed a paranoid WARN_ON in DSA - added some documentation regarding how 'bridge fdb' is supposed to be used with DSA Tobias Waldekranz (2): net: bridge: switchdev: send FDB notifications for host addresses net: dsa: include bridge addresses which are local in the host fdb list Vladimir Oltean (13): net: bridge: use READ_ONCE() and WRITE_ONCE() compiler barriers for fdb->dst net: bridge: allow br_fdb_replay to be called for the bridge device net: dsa: delete dsa_legacy_fdb_add and dsa_legacy_fdb_del net: dsa: introduce dsa_is_upstream_port and dsa_switch_is_upstream_of net: dsa: introduce a separate cross-chip notifier type for host MDBs net: dsa: reference count the MDB entries at the cross-chip notifier level net: dsa: introduce a separate cross-chip notifier type for host FDBs net: dsa: reference count the FDB addresses at the cross-chip notifier level net: dsa: install the host MDB and FDB entries in the master's RX filter net: dsa: sync static FDB entries on foreign interfaces to hardware net: dsa: include fdb entries pointing to bridge in the host fdb list net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev net: dsa: replay the local bridge FDB entries pointing to the bridge dev too .../networking/dsa/configuration.rst | 68 +++++ include/net/dsa.h | 39 +++ net/bridge/br_fdb.c | 37 ++- net/bridge/br_private.h | 7 +- net/bridge/br_switchdev.c | 12 +- net/dsa/dsa2.c | 14 + net/dsa/dsa_priv.h | 14 + net/dsa/port.c | 86 ++++++ net/dsa/slave.c | 102 +++---- net/dsa/switch.c | 273 +++++++++++++++++- 10 files changed, 573 insertions(+), 79 deletions(-) -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 01/15] net: bridge: use READ_ONCE() and WRITE_ONCE() compiler barriers for fdb->dst
From: Vladimir Oltean <vladimir.oltean at nxp.com> Annotate the writer side of fdb->dst: - fdb_create() - br_fdb_update() - fdb_add_entry() - br_fdb_external_learn_add() with WRITE_ONCE() and the reader side: - br_fdb_test_addr() - br_fdb_update() - fdb_fill_info() - fdb_add_entry() - fdb_delete_by_addr_and_port() - br_fdb_external_learn_add() - br_switchdev_fdb_notify() with compiler barriers such that the readers do not attempt to reload fdb->dst multiple times, leading to potentially different destination ports when the fdb entry is updated concurrently. This is especially important in read-side sections where fdb->dst is used more than once, but let's convert all accesses for the sake of uniformity. Suggested-by: Nikolay Aleksandrov <nikolay at nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: patch is new net/bridge/br_fdb.c | 28 +++++++++++++++++----------- net/bridge/br_switchdev.c | 7 ++++--- 2 files changed, 21 insertions(+), 14 deletions(-) diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index 16f9434fdb5d..dc3ecf2d5637 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -440,9 +440,14 @@ int br_fdb_test_addr(struct net_device *dev, unsigned char *addr) if (!port) ret = 0; else { + const struct net_bridge_port *dst = NULL; + fdb = br_fdb_find_rcu(port->br, addr, 0); - ret = fdb && fdb->dst && fdb->dst->dev != dev && - fdb->dst->state == BR_STATE_FORWARDING; + if (fdb) + dst = READ_ONCE(fdb->dst); + + ret = dst && dst->dev != dev && + dst->state == BR_STATE_FORWARDING; } rcu_read_unlock(); @@ -509,7 +514,7 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br, fdb = kmem_cache_alloc(br_fdb_cache, GFP_ATOMIC); if (fdb) { memcpy(fdb->key.addr.addr, addr, ETH_ALEN); - fdb->dst = source; + WRITE_ONCE(fdb->dst, source); fdb->key.vlan_id = vid; fdb->flags = flags; fdb->updated = fdb->used = jiffies; @@ -600,10 +605,10 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, } /* fastpath: update of existing entry */ - if (unlikely(source != fdb->dst && + if (unlikely(source != READ_ONCE(fdb->dst) && !test_bit(BR_FDB_STICKY, &fdb->flags))) { br_switchdev_fdb_notify(fdb, RTM_DELNEIGH); - fdb->dst = source; + WRITE_ONCE(fdb->dst, source); fdb_modified = true; /* Take over HW learned entry */ if (unlikely(test_bit(BR_FDB_ADDED_BY_EXT_LEARN, @@ -650,6 +655,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, const struct net_bridge_fdb_entry *fdb, u32 portid, u32 seq, int type, unsigned int flags) { + const struct net_bridge_port *dst = READ_ONCE(fdb->dst); unsigned long now = jiffies; struct nda_cacheinfo ci; struct nlmsghdr *nlh; @@ -665,7 +671,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, ndm->ndm_pad2 = 0; ndm->ndm_flags = 0; ndm->ndm_type = 0; - ndm->ndm_ifindex = fdb->dst ? fdb->dst->dev->ifindex : br->dev->ifindex; + ndm->ndm_ifindex = dst ? dst->dev->ifindex : br->dev->ifindex; ndm->ndm_state = fdb_to_nud(br, fdb); if (test_bit(BR_FDB_OFFLOADED, &fdb->flags)) @@ -964,8 +970,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, if (flags & NLM_F_EXCL) return -EEXIST; - if (fdb->dst != source) { - fdb->dst = source; + if (READ_ONCE(fdb->dst) != source) { + WRITE_ONCE(fdb->dst, source); modified = true; } } @@ -1132,7 +1138,7 @@ static int fdb_delete_by_addr_and_port(struct net_bridge *br, struct net_bridge_fdb_entry *fdb; fdb = br_fdb_find(br, addr, vlan); - if (!fdb || fdb->dst != p) + if (!fdb || READ_ONCE(fdb->dst) != p) return -ENOENT; fdb_delete(br, fdb, true); @@ -1281,8 +1287,8 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, } else { fdb->updated = jiffies; - if (fdb->dst != p) { - fdb->dst = p; + if (READ_ONCE(fdb->dst) != p) { + WRITE_ONCE(fdb->dst, p); modified = true; } diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index a5e601e41cb9..192293fe37fd 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -110,6 +110,7 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, void br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type) { + const struct net_bridge_port *dst = READ_ONCE(fdb->dst); struct switchdev_notifier_fdb_info info = { .addr = fdb->key.addr.addr, .vid = fdb->key.vlan_id, @@ -118,17 +119,17 @@ br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type) .offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags), }; - if (!fdb->dst) + if (!dst) return; switch (type) { case RTM_DELNEIGH: call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_DEVICE, - fdb->dst->dev, &info.info, NULL); + dst->dev, &info.info, NULL); break; case RTM_NEWNEIGH: call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_DEVICE, - fdb->dst->dev, &info.info, NULL); + dst->dev, &info.info, NULL); break; } } -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 02/15] net: bridge: switchdev: send FDB notifications for host addresses
From: Tobias Waldekranz <tobias at waldekranz.com> Treat addresses added to the bridge itself in the same way as regular ports and send out a notification so that drivers may sync it down to the hardware FDB. Signed-off-by: Tobias Waldekranz <tobias at waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: rebased on top of the READ_ONCE change net/bridge/br_fdb.c | 4 ++-- net/bridge/br_private.h | 7 ++++--- net/bridge/br_switchdev.c | 11 +++++------ 3 files changed, 11 insertions(+), 11 deletions(-) diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index dc3ecf2d5637..bad7e84d76af 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -607,7 +607,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, /* fastpath: update of existing entry */ if (unlikely(source != READ_ONCE(fdb->dst) && !test_bit(BR_FDB_STICKY, &fdb->flags))) { - br_switchdev_fdb_notify(fdb, RTM_DELNEIGH); + br_switchdev_fdb_notify(br, fdb, RTM_DELNEIGH); WRITE_ONCE(fdb->dst, source); fdb_modified = true; /* Take over HW learned entry */ @@ -800,7 +800,7 @@ static void fdb_notify(struct net_bridge *br, int err = -ENOBUFS; if (swdev_notify) - br_switchdev_fdb_notify(fdb, type); + br_switchdev_fdb_notify(br, fdb, type); skb = nlmsg_new(fdb_nlmsg_size(), GFP_ATOMIC); if (skb == NULL) diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index a684d0cfc58c..2b48b204205e 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -1654,8 +1654,8 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, unsigned long flags, unsigned long mask, struct netlink_ext_ack *extack); -void br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, - int type); +void br_switchdev_fdb_notify(struct net_bridge *br, + const struct net_bridge_fdb_entry *fdb, int type); int br_switchdev_port_vlan_add(struct net_device *dev, u16 vid, u16 flags, struct netlink_ext_ack *extack); int br_switchdev_port_vlan_del(struct net_device *dev, u16 vid); @@ -1702,7 +1702,8 @@ static inline int br_switchdev_port_vlan_del(struct net_device *dev, u16 vid) } static inline void -br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type) +br_switchdev_fdb_notify(struct net_bridge *br, + const struct net_bridge_fdb_entry *fdb, int type) { } diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 192293fe37fd..d3adee0f91f9 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -108,9 +108,11 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p, } void -br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type) +br_switchdev_fdb_notify(struct net_bridge *br, + const struct net_bridge_fdb_entry *fdb, int type) { const struct net_bridge_port *dst = READ_ONCE(fdb->dst); + struct net_device *dev = dst ? dst->dev : br->dev; struct switchdev_notifier_fdb_info info = { .addr = fdb->key.addr.addr, .vid = fdb->key.vlan_id, @@ -119,17 +121,14 @@ br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type) .offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags), }; - if (!dst) - return; - switch (type) { case RTM_DELNEIGH: call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_DEVICE, - dst->dev, &info.info, NULL); + dev, &info.info, NULL); break; case RTM_NEWNEIGH: call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_DEVICE, - dst->dev, &info.info, NULL); + dev, &info.info, NULL); break; } } -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 03/15] net: bridge: allow br_fdb_replay to be called for the bridge device
From: Vladimir Oltean <vladimir.oltean at nxp.com> When a port joins a bridge which already has local FDB entries pointing to the bridge device itself, we would like to offload those, so allow the "dev" argument to be equal to the bridge too. The code already does what we need in that case. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: split in 2 separate checks, one for br_dev and one for dev, because it looks better net/bridge/br_fdb.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index bad7e84d76af..2b862cffc03a 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -760,7 +760,10 @@ int br_fdb_replay(const struct net_device *br_dev, const struct net_device *dev, unsigned long action; int err = 0; - if (!netif_is_bridge_master(br_dev) || !netif_is_bridge_port(dev)) + if (!netif_is_bridge_master(br_dev)) + return -EINVAL; + + if (!netif_is_bridge_port(dev) && !netif_is_bridge_master(dev)) return -EINVAL; br = netdev_priv(br_dev); -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 04/15] net: dsa: delete dsa_legacy_fdb_add and dsa_legacy_fdb_del
From: Vladimir Oltean <vladimir.oltean at nxp.com> We want to add reference counting for FDB entries in cross-chip topologies, and in order for that to have any chance of working and not be unbalanced (leading to entries which are never deleted), we need to ensure that higher layers are sane, because if they aren't, it's garbage in, garbage out. For example, if we add a bridge FDB entry twice, the bridge properly errors out: $ bridge fdb add dev swp0 00:01:02:03:04:07 master static $ bridge fdb add dev swp0 00:01:02:03:04:07 master static RTNETLINK answers: File exists However, the same thing cannot be said about the bridge bypass operations: $ bridge fdb add dev swp0 00:01:02:03:04:07 $ bridge fdb add dev swp0 00:01:02:03:04:07 $ bridge fdb add dev swp0 00:01:02:03:04:07 $ bridge fdb add dev swp0 00:01:02:03:04:07 $ echo $? 0 But one 'bridge fdb del' is enough to remove the entry, no matter how many times it was added. The bridge bypass operations are impossible to maintain in these circumstances and lack of support for reference counting the cross-chip notifiers is holding us back from making further progress, so just drop support for them. The only way left for users to install static bridge FDB entries is the proper one, using the "master static" flags. With this change, rtnl_fdb_add() falls back to calling ndo_dflt_fdb_add() which uses the duplicate-exclusive variant of dev_uc_add(): dev_uc_add_excl(). Because DSA does not (yet) declare IFF_UNICAST_FLT, this results in us going to promiscuous mode: $ bridge fdb add dev swp0 00:01:02:03:04:05 [ 28.206743] device swp0 entered promiscuous mode $ bridge fdb add dev swp0 00:01:02:03:04:05 RTNETLINK answers: File exists So even if it does not completely fail, there is at least some indication that it is behaving differently from before, and closer to user space expectations, I would argue (the lack of a "local|static" specifier defaults to "local", or "host-only", so dev_uc_add() is a reasonable default implementation). If the generic implementation of .ndo_fdb_add provided by Vlad Yasevich is a proof of anything, it only proves that the implementation provided by DSA was always wrong, by not looking at "ndm->ndm_state & NUD_NOARP" (the "static" flag which means that the FDB entry points outwards) and "ndm->ndm_state & NUD_PERMANENT" (the "local" flag which means that the FDB entry points towards the host). It all used to mean the same thing to DSA. Update the documentation so that the users are not confused about what's going on. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: updated the documentation .../networking/dsa/configuration.rst | 68 +++++++++++++++++++ net/dsa/slave.c | 23 ------- 2 files changed, 68 insertions(+), 23 deletions(-) diff --git a/Documentation/networking/dsa/configuration.rst b/Documentation/networking/dsa/configuration.rst index 774f0e76c746..2b08f1a772d3 100644 --- a/Documentation/networking/dsa/configuration.rst +++ b/Documentation/networking/dsa/configuration.rst @@ -292,3 +292,71 @@ configuration. # bring up the bridge devices ip link set br0 up + +Forwarding database (FDB) management +------------------------------------ + +The existing DSA switches do not have the necessary hardware support to keep +the software FDB of the bridge in sync with the hardware tables, so the two +tables are managed separately (``bridge fdb show`` queries both, and depending +on whether the ``self`` or ``master`` flags are being used, a ``bridge fdb +add`` or ``bridge fdb del`` command acts upon entries from one or both tables). + +Up until kernel v4.14, DSA only supported user space management of bridge FDB +entries using the bridge bypass operations (which do not update the software +FDB, just the hardware one) using the ``self`` flag (which is optional and can +be omitted). + + .. code-block:: sh + + bridge fdb add dev swp0 00:01:02:03:04:05 self static + # or shorthand + bridge fdb add dev swp0 00:01:02:03:04:05 static + +Due to a bug, the bridge bypass FDB implementation provided by DSA did not +distinguish between ``static`` and ``local`` FDB entries (``static`` are meant +to be forwarded, while ``local`` are meant to be locally terminated, i.e. sent +to the host port). Instead, all FDB entries with the ``self`` flag (implicit or +explicit) are treated by DSA as ``static`` even if they are ``local``. + + .. code-block:: sh + + # This command: + bridge fdb add dev swp0 00:01:02:03:04:05 static + # behaves the same for DSA as this command: + bridge fdb add dev swp0 00:01:02:03:04:05 local + # or shorthand, because the 'local' flag is implicit if 'static' is not + # specified, it also behaves the same as: + bridge fdb add dev swp0 00:01:02:03:04:05 + +The last command is an incorrect way of adding a static bridge FDB entry to a +DSA switch using the bridge bypass operations, and works by mistake. Other +drivers will treat an FDB entry added by the same command as ``local`` and as +such, will not forward it, as opposed to DSA. + +Between kernel v4.14 and v5.14, DSA has supported in parallel two modes of +adding a bridge FDB entry to the switch: the bridge bypass discussed above, as +well as a new mode using the ``master`` flag which installs FDB entries in the +software bridge too. + + .. code-block:: sh + + bridge fdb add dev swp0 00:01:02:03:04:05 master static + +Since kernel v5.14, DSA has gained stronger integration with the bridge's +software FDB, and the support for its bridge bypass FDB implementation (using +the ``self`` flag) has been removed. This results in the following changes: + + .. code-block:: sh + + # This is the only valid way of adding an FDB entry that is supported, + # compatible with v4.14 kernels and later: + bridge fdb add dev swp0 00:01:02:03:04:05 master static + # This command is no longer buggy and the entry is properly treated as + # 'local' instead of being forwarded: + bridge fdb add dev swp0 00:01:02:03:04:05 + # This command no longer installs a static FDB entry to hardware: + bridge fdb add dev swp0 00:01:02:03:04:05 static + +Script writers are therefore encouraged to use the ``master static`` set of +flags when working with bridge FDB entries on DSA switch interfaces. diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 898ed9cf756f..64acb1e11cd7 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -1651,27 +1651,6 @@ static const struct ethtool_ops dsa_slave_ethtool_ops = { .self_test = dsa_slave_net_selftest, }; -/* legacy way, bypassing the bridge *****************************************/ -static int dsa_legacy_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], - struct net_device *dev, - const unsigned char *addr, u16 vid, - u16 flags, - struct netlink_ext_ack *extack) -{ - struct dsa_port *dp = dsa_slave_to_port(dev); - - return dsa_port_fdb_add(dp, addr, vid); -} - -static int dsa_legacy_fdb_del(struct ndmsg *ndm, struct nlattr *tb[], - struct net_device *dev, - const unsigned char *addr, u16 vid) -{ - struct dsa_port *dp = dsa_slave_to_port(dev); - - return dsa_port_fdb_del(dp, addr, vid); -} - static struct devlink_port *dsa_slave_get_devlink_port(struct net_device *dev) { struct dsa_port *dp = dsa_slave_to_port(dev); @@ -1713,8 +1692,6 @@ static const struct net_device_ops dsa_slave_netdev_ops = { .ndo_change_rx_flags = dsa_slave_change_rx_flags, .ndo_set_rx_mode = dsa_slave_set_rx_mode, .ndo_set_mac_address = dsa_slave_set_mac_address, - .ndo_fdb_add = dsa_legacy_fdb_add, - .ndo_fdb_del = dsa_legacy_fdb_del, .ndo_fdb_dump = dsa_slave_fdb_dump, .ndo_do_ioctl = dsa_slave_ioctl, .ndo_get_iflink = dsa_slave_get_iflink, -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 05/15] net: dsa: introduce dsa_is_upstream_port and dsa_switch_is_upstream_of
From: Vladimir Oltean <vladimir.oltean at nxp.com> In preparation for the new cross-chip notifiers for host addresses, let's introduce some more topology helpers which we are going to use to discern switches that are in our path towards the dedicated CPU port from switches that aren't. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: none include/net/dsa.h | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/include/net/dsa.h b/include/net/dsa.h index ea47783d5695..5f632cfd33c7 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -491,6 +491,32 @@ static inline unsigned int dsa_upstream_port(struct dsa_switch *ds, int port) return dsa_towards_port(ds, cpu_dp->ds->index, cpu_dp->index); } +/* Return true if this is the local port used to reach the CPU port */ +static inline bool dsa_is_upstream_port(struct dsa_switch *ds, int port) +{ + if (dsa_is_unused_port(ds, port)) + return false; + + return port == dsa_upstream_port(ds, port); +} + +/* Return true if @upstream_ds is an upstream switch of @downstream_ds, meaning + * that the routing port from @downstream_ds to @upstream_ds is also the port + * which @downstream_ds uses to reach its dedicated CPU. + */ +static inline bool dsa_switch_is_upstream_of(struct dsa_switch *upstream_ds, + struct dsa_switch *downstream_ds) +{ + int routing_port; + + if (upstream_ds == downstream_ds) + return true; + + routing_port = dsa_routing_port(downstream_ds, upstream_ds->index); + + return dsa_is_upstream_port(downstream_ds, routing_port); +} + static inline bool dsa_port_is_vlan_filtering(const struct dsa_port *dp) { const struct dsa_switch *ds = dp->ds; -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 06/15] net: dsa: introduce a separate cross-chip notifier type for host MDBs
From: Vladimir Oltean <vladimir.oltean at nxp.com> Commit abd49535c380 ("net: dsa: execute dsa_switch_mdb_add only for routing port in cross-chip topologies") does a surprisingly good job even for the SWITCHDEV_OBJ_ID_HOST_MDB use case, where DSA simply translates a switchdev object received on dp into a cross-chip notifier for dp->cpu_dp. To visualize how that works, imagine the daisy chain topology below and consider a SWITCHDEV_OBJ_ID_HOST_MDB object emitted on sw2p0. How does the cross-chip notifier know to match on all the right ports (sw0p4, the dedicated CPU port, sw1p4, an upstream DSA link, and sw2p4, another upstream DSA link)? | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] The answer is simple: the dedicated CPU port of sw2p0 is sw0p4, and dsa_routing_port returns the upstream port for all switches. That is fine, but there are other topologies where this does not work as well. There are trees with "H" topologies in the wild, where there are 2 or more switches with DSA links between them, but every switch has its dedicated CPU port. For these topologies, it seems stupid for the neighbor switches to install an MDB entry on the routing port, since these multicast addresses are fundamentally different than the usual ones we support (and that is the justification for this patch, to introduce the concept of a termination plane multicast MAC address, as opposed to a forwarding plane multicast MAC address). For example, when a SWITCHDEV_OBJ_ID_HOST_MDB would get added to sw0p0, without this patch, it would get treated as a regular port MDB on sw0p2 and it would match on the ports below (including the sw1p3 routing port). | | sw0p0 sw0p1 sw0p2 sw0p3 sw1p3 sw1p2 sw1p1 sw1p0 [ user ] [ user ] [ cpu ] [ dsa ] [ dsa ] [ cpu ] [ user ] [ user ] [ ] [ ] [ x ] [ ] ---- [ x ] [ ] [ ] [ ] With the patch, the host MDB notifier on sw0p0 matches only on the local switch, which is what we want for a termination plane address. | | sw0p0 sw0p1 sw0p2 sw0p3 sw1p3 sw1p2 sw1p1 sw1p0 [ user ] [ user ] [ cpu ] [ dsa ] [ dsa ] [ cpu ] [ user ] [ user ] [ ] [ ] [ x ] [ ] ---- [ ] [ ] [ ] [ ] Name this new matching function "dsa_switch_host_address_match" since we will be reusing it soon for host FDB entries as well. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: removed a paranoid NULL pointer check and WARN_ON for the output of dsa_switch_find() net/dsa/dsa_priv.h | 6 +++++ net/dsa/port.c | 24 +++++++++++++++++++ net/dsa/slave.c | 10 ++------ net/dsa/switch.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 92 insertions(+), 8 deletions(-) diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index c8712942002f..cd65933d269b 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -27,6 +27,8 @@ enum { DSA_NOTIFIER_LAG_LEAVE, DSA_NOTIFIER_MDB_ADD, DSA_NOTIFIER_MDB_DEL, + DSA_NOTIFIER_HOST_MDB_ADD, + DSA_NOTIFIER_HOST_MDB_DEL, DSA_NOTIFIER_VLAN_ADD, DSA_NOTIFIER_VLAN_DEL, DSA_NOTIFIER_MTU, @@ -214,6 +216,10 @@ int dsa_port_mdb_add(const struct dsa_port *dp, const struct switchdev_obj_port_mdb *mdb); int dsa_port_mdb_del(const struct dsa_port *dp, const struct switchdev_obj_port_mdb *mdb); +int dsa_port_host_mdb_add(const struct dsa_port *dp, + const struct switchdev_obj_port_mdb *mdb); +int dsa_port_host_mdb_del(const struct dsa_port *dp, + const struct switchdev_obj_port_mdb *mdb); int dsa_port_pre_bridge_flags(const struct dsa_port *dp, struct switchdev_brport_flags flags, struct netlink_ext_ack *extack); diff --git a/net/dsa/port.c b/net/dsa/port.c index 46089dd2b2ec..47f45f795f44 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -681,6 +681,30 @@ int dsa_port_mdb_del(const struct dsa_port *dp, return dsa_port_notify(dp, DSA_NOTIFIER_MDB_DEL, &info); } +int dsa_port_host_mdb_add(const struct dsa_port *dp, + const struct switchdev_obj_port_mdb *mdb) +{ + struct dsa_notifier_mdb_info info = { + .sw_index = dp->ds->index, + .port = dp->index, + .mdb = mdb, + }; + + return dsa_port_notify(dp, DSA_NOTIFIER_HOST_MDB_ADD, &info); +} + +int dsa_port_host_mdb_del(const struct dsa_port *dp, + const struct switchdev_obj_port_mdb *mdb) +{ + struct dsa_notifier_mdb_info info = { + .sw_index = dp->ds->index, + .port = dp->index, + .mdb = mdb, + }; + + return dsa_port_notify(dp, DSA_NOTIFIER_HOST_MDB_DEL, &info); +} + int dsa_port_vlan_add(struct dsa_port *dp, const struct switchdev_obj_port_vlan *vlan, struct netlink_ext_ack *extack) diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 64acb1e11cd7..4b1d738bc3bc 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -418,10 +418,7 @@ static int dsa_slave_port_obj_add(struct net_device *dev, const void *ctx, if (!dsa_port_offloads_bridge(dp, obj->orig_dev)) return -EOPNOTSUPP; - /* DSA can directly translate this to a normal MDB add, - * but on the CPU port. - */ - err = dsa_port_mdb_add(dp->cpu_dp, SWITCHDEV_OBJ_PORT_MDB(obj)); + err = dsa_port_host_mdb_add(dp, SWITCHDEV_OBJ_PORT_MDB(obj)); break; case SWITCHDEV_OBJ_ID_PORT_VLAN: if (!dsa_port_offloads_bridge_port(dp, obj->orig_dev)) @@ -495,10 +492,7 @@ static int dsa_slave_port_obj_del(struct net_device *dev, const void *ctx, if (!dsa_port_offloads_bridge(dp, obj->orig_dev)) return -EOPNOTSUPP; - /* DSA can directly translate this to a normal MDB add, - * but on the CPU port. - */ - err = dsa_port_mdb_del(dp->cpu_dp, SWITCHDEV_OBJ_PORT_MDB(obj)); + err = dsa_port_host_mdb_del(dp, SWITCHDEV_OBJ_PORT_MDB(obj)); break; case SWITCHDEV_OBJ_ID_PORT_VLAN: if (!dsa_port_offloads_bridge_port(dp, obj->orig_dev)) diff --git a/net/dsa/switch.c b/net/dsa/switch.c index c1e5afafe633..c40afd622331 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -154,6 +154,27 @@ static int dsa_switch_bridge_leave(struct dsa_switch *ds, return 0; } +/* Matches for all upstream-facing ports (the CPU port and all upstream-facing + * DSA links) that sit between the targeted port on which the notifier was + * emitted and its dedicated CPU port. + */ +static bool dsa_switch_host_address_match(struct dsa_switch *ds, int port, + int info_sw_index, int info_port) +{ + struct dsa_port *targeted_dp, *cpu_dp; + struct dsa_switch *targeted_ds; + + targeted_ds = dsa_switch_find(ds->dst->index, info_sw_index); + targeted_dp = dsa_to_port(targeted_ds, info_port); + cpu_dp = targeted_dp->cpu_dp; + + if (dsa_switch_is_upstream_of(ds, targeted_ds)) + return port == dsa_towards_port(ds, cpu_dp->ds->index, + cpu_dp->index); + + return false; +} + static int dsa_switch_fdb_add(struct dsa_switch *ds, struct dsa_notifier_fdb_info *info) { @@ -258,6 +279,39 @@ static int dsa_switch_mdb_del(struct dsa_switch *ds, return 0; } +static int dsa_switch_host_mdb_add(struct dsa_switch *ds, + struct dsa_notifier_mdb_info *info) +{ + int err = 0; + int port; + + if (!ds->ops->port_mdb_add) + return -EOPNOTSUPP; + + for (port = 0; port < ds->num_ports; port++) { + if (dsa_switch_host_address_match(ds, port, info->sw_index, + info->port)) { + err = ds->ops->port_mdb_add(ds, port, info->mdb); + if (err) + break; + } + } + + return err; +} + +static int dsa_switch_host_mdb_del(struct dsa_switch *ds, + struct dsa_notifier_mdb_info *info) +{ + if (!ds->ops->port_mdb_del) + return -EOPNOTSUPP; + + if (ds->index == info->sw_index) + return ds->ops->port_mdb_del(ds, info->port, info->mdb); + + return 0; +} + static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port, struct dsa_notifier_vlan_info *info) { @@ -441,6 +495,12 @@ static int dsa_switch_event(struct notifier_block *nb, case DSA_NOTIFIER_MDB_DEL: err = dsa_switch_mdb_del(ds, info); break; + case DSA_NOTIFIER_HOST_MDB_ADD: + err = dsa_switch_host_mdb_add(ds, info); + break; + case DSA_NOTIFIER_HOST_MDB_DEL: + err = dsa_switch_host_mdb_del(ds, info); + break; case DSA_NOTIFIER_VLAN_ADD: err = dsa_switch_vlan_add(ds, info); break; -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 07/15] net: dsa: reference count the MDB entries at the cross-chip notifier level
From: Vladimir Oltean <vladimir.oltean at nxp.com> Ever since the cross-chip notifiers were introduced, the design was meant to be simplistic and just get the job done without worrying too much about dangling resources left behind. For example, somebody installs an MDB entry on sw0p0 in this daisy chain topology. It gets installed using ds->ops->port_mdb_add() on sw0p0, sw1p4 and sw2p4. | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ x ] [ ] [ ] [ ] [ ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] Then the same person deletes that MDB entry. The cross-chip notifier for deletion only matches sw0p0: | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ x ] [ ] [ ] [ ] [ ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ ] Why? Because the DSA links are 'trunk' ports, if we just go ahead and delete the MDB from sw1p4 and sw2p4 directly, we might delete those multicast entries when they are still needed. Just consider the fact that somebody does: - add a multicast MAC address towards sw0p0 [ via the cross-chip notifiers it gets installed on the DSA links too ] - add the same multicast MAC address towards sw0p1 (another port of that same switch) - delete the same multicast MAC address from sw0p0. At this point, if we deleted the MAC address from the DSA links, it would be flooded, even though there is still an entry on switch 0 which needs it not to. So that is why deletions only match the targeted source port and nothing on DSA links. Of course, dangling resources means that the hardware tables will eventually run out given enough additions/removals, but hey, at least it's simple. But there is a bigger concern which needs to be addressed, and that is our support for SWITCHDEV_OBJ_ID_HOST_MDB. DSA simply translates such an object into a dsa_port_host_mdb_add() which ends up as ds->ops->port_mdb_add() on the upstream port, and a similar thing happens on deletion: dsa_port_host_mdb_del() will trigger ds->ops->port_mdb_del() on the upstream port. When there are 2 VLAN-unaware bridges spanning the same switch (which is a use case DSA proudly supports), each bridge will install its own SWITCHDEV_OBJ_ID_HOST_MDB entries. But upon deletion, DSA goes ahead and emits a DSA_NOTIFIER_MDB_DEL for dp->cpu_dp, which is shared between the user ports enslaved to br0 and the user ports enslaved to br1. Not good. The host-trapped multicast addresses installed by br1 will be deleted when any state changes in br0 (IGMP timers expire, or ports leave, etc). To avoid this, we could of course go the route of the zero-sum game and delete the DSA_NOTIFIER_MDB_DEL call for dp->cpu_dp. But the better design is to just admit that on shared ports like DSA links and CPU ports, we should be reference counting calls, even if this consumes some dynamic memory which DSA has traditionally avoided. On the flip side, the hardware tables of switches are limited in size, so it would be good if the OS managed them properly instead of having them eventually overflow. To address the memory usage concern, we only apply the refcounting of MDB entries on ports that are really shared (CPU ports and DSA links) and not on user ports. In a typical single-switch setup, this means only the CPU port (and the host MDB entries are not that many, really). The name of the newly introduced data structures (dsa_mac_addr) is chosen in such a way that will be reusable for host FDB entries (next patch). With this change, we can finally have the same matching logic for the MDB additions and deletions, as well as for their host-trapped variants. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: none include/net/dsa.h | 12 ++++++ net/dsa/dsa2.c | 8 ++++ net/dsa/switch.c | 104 ++++++++++++++++++++++++++++++++++++++++++---- 3 files changed, 115 insertions(+), 9 deletions(-) diff --git a/include/net/dsa.h b/include/net/dsa.h index 5f632cfd33c7..2c50546f9667 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -285,6 +285,11 @@ struct dsa_port { */ const struct dsa_netdevice_ops *netdev_ops; + /* List of MAC addresses that must be forwarded on this port. + * These are only valid on CPU ports and DSA links. + */ + struct list_head mdbs; + bool setup; }; @@ -299,6 +304,13 @@ struct dsa_link { struct list_head list; }; +struct dsa_mac_addr { + unsigned char addr[ETH_ALEN]; + u16 vid; + refcount_t refcount; + struct list_head list; +}; + struct dsa_switch { bool setup; diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index 9000a8c84baf..2035d132682f 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -348,6 +348,8 @@ static int dsa_port_setup(struct dsa_port *dp) if (dp->setup) return 0; + INIT_LIST_HEAD(&dp->mdbs); + switch (dp->type) { case DSA_PORT_TYPE_UNUSED: dsa_port_disable(dp); @@ -443,6 +445,7 @@ static int dsa_port_devlink_setup(struct dsa_port *dp) static void dsa_port_teardown(struct dsa_port *dp) { struct devlink_port *dlp = &dp->devlink_port; + struct dsa_mac_addr *a, *tmp; if (!dp->setup) return; @@ -468,6 +471,11 @@ static void dsa_port_teardown(struct dsa_port *dp) break; } + list_for_each_entry_safe(a, tmp, &dp->mdbs, list) { + list_del(&a->list); + kfree(a); + } + dp->setup = false; } diff --git a/net/dsa/switch.c b/net/dsa/switch.c index c40afd622331..5439de029485 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -175,6 +175,84 @@ static bool dsa_switch_host_address_match(struct dsa_switch *ds, int port, return false; } +static struct dsa_mac_addr *dsa_mac_addr_find(struct list_head *addr_list, + const unsigned char *addr, + u16 vid) +{ + struct dsa_mac_addr *a; + + list_for_each_entry(a, addr_list, list) + if (ether_addr_equal(a->addr, addr) && a->vid == vid) + return a; + + return NULL; +} + +static int dsa_switch_do_mdb_add(struct dsa_switch *ds, int port, + const struct switchdev_obj_port_mdb *mdb) +{ + struct dsa_port *dp = dsa_to_port(ds, port); + struct dsa_mac_addr *a; + int err; + + /* No need to bother with refcounting for user ports */ + if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) + return ds->ops->port_mdb_add(ds, port, mdb); + + a = dsa_mac_addr_find(&dp->mdbs, mdb->addr, mdb->vid); + if (a) { + refcount_inc(&a->refcount); + return 0; + } + + a = kzalloc(sizeof(*a), GFP_KERNEL); + if (!a) + return -ENOMEM; + + err = ds->ops->port_mdb_add(ds, port, mdb); + if (err) { + kfree(a); + return err; + } + + ether_addr_copy(a->addr, mdb->addr); + a->vid = mdb->vid; + refcount_set(&a->refcount, 1); + list_add_tail(&a->list, &dp->mdbs); + + return 0; +} + +static int dsa_switch_do_mdb_del(struct dsa_switch *ds, int port, + const struct switchdev_obj_port_mdb *mdb) +{ + struct dsa_port *dp = dsa_to_port(ds, port); + struct dsa_mac_addr *a; + int err; + + /* No need to bother with refcounting for user ports */ + if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) + return ds->ops->port_mdb_del(ds, port, mdb); + + a = dsa_mac_addr_find(&dp->mdbs, mdb->addr, mdb->vid); + if (!a) + return -ENOENT; + + if (!refcount_dec_and_test(&a->refcount)) + return 0; + + err = ds->ops->port_mdb_del(ds, port, mdb); + if (err) { + refcount_inc(&a->refcount); + return err; + } + + list_del(&a->list); + kfree(a); + + return 0; +} + static int dsa_switch_fdb_add(struct dsa_switch *ds, struct dsa_notifier_fdb_info *info) { @@ -264,19 +342,18 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds, if (!ds->ops->port_mdb_add) return -EOPNOTSUPP; - return ds->ops->port_mdb_add(ds, port, info->mdb); + return dsa_switch_do_mdb_add(ds, port, info->mdb); } static int dsa_switch_mdb_del(struct dsa_switch *ds, struct dsa_notifier_mdb_info *info) { + int port = dsa_towards_port(ds, info->sw_index, info->port); + if (!ds->ops->port_mdb_del) return -EOPNOTSUPP; - if (ds->index == info->sw_index) - return ds->ops->port_mdb_del(ds, info->port, info->mdb); - - return 0; + return dsa_switch_do_mdb_del(ds, port, info->mdb); } static int dsa_switch_host_mdb_add(struct dsa_switch *ds, @@ -291,7 +368,7 @@ static int dsa_switch_host_mdb_add(struct dsa_switch *ds, for (port = 0; port < ds->num_ports; port++) { if (dsa_switch_host_address_match(ds, port, info->sw_index, info->port)) { - err = ds->ops->port_mdb_add(ds, port, info->mdb); + err = dsa_switch_do_mdb_add(ds, port, info->mdb); if (err) break; } @@ -303,13 +380,22 @@ static int dsa_switch_host_mdb_add(struct dsa_switch *ds, static int dsa_switch_host_mdb_del(struct dsa_switch *ds, struct dsa_notifier_mdb_info *info) { + int err = 0; + int port; + if (!ds->ops->port_mdb_del) return -EOPNOTSUPP; - if (ds->index == info->sw_index) - return ds->ops->port_mdb_del(ds, info->port, info->mdb); + for (port = 0; port < ds->num_ports; port++) { + if (dsa_switch_host_address_match(ds, port, info->sw_index, + info->port)) { + err = dsa_switch_do_mdb_del(ds, port, info->mdb); + if (err) + break; + } + } - return 0; + return err; } static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port, -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 08/15] net: dsa: introduce a separate cross-chip notifier type for host FDBs
From: Vladimir Oltean <vladimir.oltean at nxp.com> DSA treats some bridge FDB entries by trapping them to the CPU port. Currently, the only class of such entries are FDB addresses learnt by the software bridge on a foreign interface. However there are many more to be added: - FDB entries with the is_local flag (for termination) added by the bridge on the user ports (typically containing the MAC address of the bridge port) - FDB entries pointing towards the bridge net device (for termination). Typically these contain the MAC address of the bridge net device. - Static FDB entries installed on a foreign interface that is in the same bridge with a DSA user port. The reason why a separate cross-chip notifier for host FDBs is justified compared to normal FDBs is the same as in the case of host MDBs: the cross-chip notifier matching function in switch.c should avoid installing these entries on routing ports that route towards the targeted switch, but not towards the CPU. This is required in order to have proper support for H-like multi-chip topologies. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: none net/dsa/dsa_priv.h | 7 +++++++ net/dsa/port.c | 26 ++++++++++++++++++++++++++ net/dsa/slave.c | 21 ++++++++++++++++----- net/dsa/switch.c | 41 +++++++++++++++++++++++++++++++++++++++++ 4 files changed, 90 insertions(+), 5 deletions(-) diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index cd65933d269b..36e667ea94db 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -20,6 +20,8 @@ enum { DSA_NOTIFIER_BRIDGE_LEAVE, DSA_NOTIFIER_FDB_ADD, DSA_NOTIFIER_FDB_DEL, + DSA_NOTIFIER_HOST_FDB_ADD, + DSA_NOTIFIER_HOST_FDB_DEL, DSA_NOTIFIER_HSR_JOIN, DSA_NOTIFIER_HSR_LEAVE, DSA_NOTIFIER_LAG_CHANGE, @@ -121,6 +123,7 @@ struct dsa_switchdev_event_work { */ unsigned char addr[ETH_ALEN]; u16 vid; + bool host_addr; }; /* DSA_NOTIFIER_HSR_* */ @@ -211,6 +214,10 @@ int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr, u16 vid); int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr, u16 vid); +int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr, + u16 vid); +int dsa_port_host_fdb_del(struct dsa_port *dp, const unsigned char *addr, + u16 vid); int dsa_port_fdb_dump(struct dsa_port *dp, dsa_fdb_dump_cb_t *cb, void *data); int dsa_port_mdb_add(const struct dsa_port *dp, const struct switchdev_obj_port_mdb *mdb); diff --git a/net/dsa/port.c b/net/dsa/port.c index 47f45f795f44..1b80e0fbdfaa 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -646,6 +646,32 @@ int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr, return dsa_port_notify(dp, DSA_NOTIFIER_FDB_DEL, &info); } +int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr, + u16 vid) +{ + struct dsa_notifier_fdb_info info = { + .sw_index = dp->ds->index, + .port = dp->index, + .addr = addr, + .vid = vid, + }; + + return dsa_port_notify(dp, DSA_NOTIFIER_HOST_FDB_ADD, &info); +} + +int dsa_port_host_fdb_del(struct dsa_port *dp, const unsigned char *addr, + u16 vid) +{ + struct dsa_notifier_fdb_info info = { + .sw_index = dp->ds->index, + .port = dp->index, + .addr = addr, + .vid = vid, + }; + + return dsa_port_notify(dp, DSA_NOTIFIER_HOST_FDB_DEL, &info); +} + int dsa_port_fdb_dump(struct dsa_port *dp, dsa_fdb_dump_cb_t *cb, void *data) { struct dsa_switch *ds = dp->ds; diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 4b1d738bc3bc..ac7f4f200ab1 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2315,8 +2315,12 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) rtnl_lock(); switch (switchdev_work->event) { case SWITCHDEV_FDB_ADD_TO_DEVICE: - err = dsa_port_fdb_add(dp, switchdev_work->addr, - switchdev_work->vid); + if (switchdev_work->host_addr) + err = dsa_port_host_fdb_add(dp, switchdev_work->addr, + switchdev_work->vid); + else + err = dsa_port_fdb_add(dp, switchdev_work->addr, + switchdev_work->vid); if (err) { dev_err(ds->dev, "port %d failed to add %pM vid %d to fdb: %d\n", @@ -2328,8 +2332,12 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) break; case SWITCHDEV_FDB_DEL_TO_DEVICE: - err = dsa_port_fdb_del(dp, switchdev_work->addr, - switchdev_work->vid); + if (switchdev_work->host_addr) + err = dsa_port_host_fdb_del(dp, switchdev_work->addr, + switchdev_work->vid); + else + err = dsa_port_fdb_del(dp, switchdev_work->addr, + switchdev_work->vid); if (err) { dev_err(ds->dev, "port %d failed to delete %pM vid %d from fdb: %d\n", @@ -2375,6 +2383,7 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, struct net_device *dev = switchdev_notifier_info_to_dev(ptr); const struct switchdev_notifier_fdb_info *fdb_info; struct dsa_switchdev_event_work *switchdev_work; + bool host_addr = false; struct dsa_port *dp; int err; @@ -2412,7 +2421,8 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, if (!p) return NOTIFY_DONE; - dp = p->dp->cpu_dp; + dp = p->dp; + host_addr = true; if (!dp->ds->assisted_learning_on_cpu_port) return NOTIFY_DONE; @@ -2442,6 +2452,7 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, ether_addr_copy(switchdev_work->addr, fdb_info->addr); switchdev_work->vid = fdb_info->vid; + switchdev_work->host_addr = host_addr; /* Hold a reference on the slave for dsa_fdb_offload_notify */ if (dsa_is_user_port(dp->ds, dp->index)) diff --git a/net/dsa/switch.c b/net/dsa/switch.c index 5439de029485..219fc9baaa1c 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -253,6 +253,41 @@ static int dsa_switch_do_mdb_del(struct dsa_switch *ds, int port, return 0; } +static int dsa_switch_host_fdb_add(struct dsa_switch *ds, + struct dsa_notifier_fdb_info *info) +{ + int err = 0; + int port; + + if (!ds->ops->port_fdb_add) + return -EOPNOTSUPP; + + for (port = 0; port < ds->num_ports; port++) { + if (dsa_switch_host_address_match(ds, port, info->sw_index, + info->port)) { + err = ds->ops->port_fdb_add(ds, port, info->addr, + info->vid); + if (err) + break; + } + } + + return err; +} + +static int dsa_switch_host_fdb_del(struct dsa_switch *ds, + struct dsa_notifier_fdb_info *info) +{ + if (!ds->ops->port_fdb_del) + return -EOPNOTSUPP; + + if (ds->index == info->sw_index) + return ds->ops->port_fdb_del(ds, info->port, info->addr, + info->vid); + + return 0; +} + static int dsa_switch_fdb_add(struct dsa_switch *ds, struct dsa_notifier_fdb_info *info) { @@ -560,6 +595,12 @@ static int dsa_switch_event(struct notifier_block *nb, case DSA_NOTIFIER_FDB_DEL: err = dsa_switch_fdb_del(ds, info); break; + case DSA_NOTIFIER_HOST_FDB_ADD: + err = dsa_switch_host_fdb_add(ds, info); + break; + case DSA_NOTIFIER_HOST_FDB_DEL: + err = dsa_switch_host_fdb_del(ds, info); + break; case DSA_NOTIFIER_HSR_JOIN: err = dsa_switch_hsr_join(ds, info); break; -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 09/15] net: dsa: reference count the FDB addresses at the cross-chip notifier level
From: Vladimir Oltean <vladimir.oltean at nxp.com> The same concerns expressed for host MDB entries are valid for host FDBs just as well: - in the case of multiple bridges spanning the same switch chip, deleting a host FDB entry that belongs to one bridge will result in breakage to the other bridge - not deleting FDB entries across DSA links means that the switch's hardware tables will eventually run out, given enough wear&tear So do the same thing and introduce reference counting for CPU ports and DSA links using the same data structures as we have for MDB entries. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: none include/net/dsa.h | 1 + net/dsa/dsa2.c | 6 ++++ net/dsa/switch.c | 88 +++++++++++++++++++++++++++++++++++++++++++---- 3 files changed, 88 insertions(+), 7 deletions(-) diff --git a/include/net/dsa.h b/include/net/dsa.h index 2c50546f9667..33f40c1ec379 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -288,6 +288,7 @@ struct dsa_port { /* List of MAC addresses that must be forwarded on this port. * These are only valid on CPU ports and DSA links. */ + struct list_head fdbs; struct list_head mdbs; bool setup; diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index 2035d132682f..185629f27f80 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -348,6 +348,7 @@ static int dsa_port_setup(struct dsa_port *dp) if (dp->setup) return 0; + INIT_LIST_HEAD(&dp->fdbs); INIT_LIST_HEAD(&dp->mdbs); switch (dp->type) { @@ -471,6 +472,11 @@ static void dsa_port_teardown(struct dsa_port *dp) break; } + list_for_each_entry_safe(a, tmp, &dp->fdbs, list) { + list_del(&a->list); + kfree(a); + } + list_for_each_entry_safe(a, tmp, &dp->mdbs, list) { list_del(&a->list); kfree(a); diff --git a/net/dsa/switch.c b/net/dsa/switch.c index 219fc9baaa1c..af71b8638098 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -253,6 +253,71 @@ static int dsa_switch_do_mdb_del(struct dsa_switch *ds, int port, return 0; } +static int dsa_switch_do_fdb_add(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid) +{ + struct dsa_port *dp = dsa_to_port(ds, port); + struct dsa_mac_addr *a; + int err; + + /* No need to bother with refcounting for user ports */ + if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) + return ds->ops->port_fdb_add(ds, port, addr, vid); + + a = dsa_mac_addr_find(&dp->fdbs, addr, vid); + if (a) { + refcount_inc(&a->refcount); + return 0; + } + + a = kzalloc(sizeof(*a), GFP_KERNEL); + if (!a) + return -ENOMEM; + + err = ds->ops->port_fdb_add(ds, port, addr, vid); + if (err) { + kfree(a); + return err; + } + + ether_addr_copy(a->addr, addr); + a->vid = vid; + refcount_set(&a->refcount, 1); + list_add_tail(&a->list, &dp->fdbs); + + return 0; +} + +static int dsa_switch_do_fdb_del(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid) +{ + struct dsa_port *dp = dsa_to_port(ds, port); + struct dsa_mac_addr *a; + int err; + + /* No need to bother with refcounting for user ports */ + if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) + return ds->ops->port_fdb_del(ds, port, addr, vid); + + a = dsa_mac_addr_find(&dp->fdbs, addr, vid); + if (!a) + return -ENOENT; + + if (!refcount_dec_and_test(&a->refcount)) + return 0; + + err = ds->ops->port_fdb_del(ds, port, addr, vid); + if (err) { + refcount_inc(&a->refcount); + return err; + } + + list_del(&a->list); + kfree(a); + + return 0; +} + static int dsa_switch_host_fdb_add(struct dsa_switch *ds, struct dsa_notifier_fdb_info *info) { @@ -265,7 +330,7 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds, for (port = 0; port < ds->num_ports; port++) { if (dsa_switch_host_address_match(ds, port, info->sw_index, info->port)) { - err = ds->ops->port_fdb_add(ds, port, info->addr, + err = dsa_switch_do_fdb_add(ds, port, info->addr, info->vid); if (err) break; @@ -278,14 +343,23 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds, static int dsa_switch_host_fdb_del(struct dsa_switch *ds, struct dsa_notifier_fdb_info *info) { + int err = 0; + int port; + if (!ds->ops->port_fdb_del) return -EOPNOTSUPP; - if (ds->index == info->sw_index) - return ds->ops->port_fdb_del(ds, info->port, info->addr, - info->vid); + for (port = 0; port < ds->num_ports; port++) { + if (dsa_switch_host_address_match(ds, port, info->sw_index, + info->port)) { + err = dsa_switch_do_fdb_del(ds, port, info->addr, + info->vid); + if (err) + break; + } + } - return 0; + return err; } static int dsa_switch_fdb_add(struct dsa_switch *ds, @@ -296,7 +370,7 @@ static int dsa_switch_fdb_add(struct dsa_switch *ds, if (!ds->ops->port_fdb_add) return -EOPNOTSUPP; - return ds->ops->port_fdb_add(ds, port, info->addr, info->vid); + return dsa_switch_do_fdb_add(ds, port, info->addr, info->vid); } static int dsa_switch_fdb_del(struct dsa_switch *ds, @@ -307,7 +381,7 @@ static int dsa_switch_fdb_del(struct dsa_switch *ds, if (!ds->ops->port_fdb_del) return -EOPNOTSUPP; - return ds->ops->port_fdb_del(ds, port, info->addr, info->vid); + return dsa_switch_do_fdb_del(ds, port, info->addr, info->vid); } static int dsa_switch_hsr_join(struct dsa_switch *ds, -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 10/15] net: dsa: install the host MDB and FDB entries in the master's RX filter
From: Vladimir Oltean <vladimir.oltean at nxp.com> If the DSA master implements strict address filtering, then the unicast and multicast addresses kept by the DSA CPU ports should be synchronized with the address lists of the DSA master. Note that we want the synchronization of the master's address lists even if the DSA switch doesn't support unicast/multicast database operations, on the premises that the packets will be flooded to the CPU in that case, and we should still instruct the master to receive them. This is why we do the dev_uc_add() etc first, even if dsa_port_notify() returns -EOPNOTSUPP. In turn, dev_uc_add() and friends return error only if memory allocation fails, so it is probably ok to check and propagate that error code and not just ignore it. Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: do the dev_uc_add before the dsa_port_notify, as was correctly hinted by the commit message but not implemented this way previously net/dsa/port.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/net/dsa/port.c b/net/dsa/port.c index 1b80e0fbdfaa..778b0dc2bb39 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -655,6 +655,12 @@ int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr, .addr = addr, .vid = vid, }; + struct dsa_port *cpu_dp = dp->cpu_dp; + int err; + + err = dev_uc_add(cpu_dp->master, addr); + if (err) + return err; return dsa_port_notify(dp, DSA_NOTIFIER_HOST_FDB_ADD, &info); } @@ -668,6 +674,12 @@ int dsa_port_host_fdb_del(struct dsa_port *dp, const unsigned char *addr, .addr = addr, .vid = vid, }; + struct dsa_port *cpu_dp = dp->cpu_dp; + int err; + + err = dev_uc_del(cpu_dp->master, addr); + if (err) + return err; return dsa_port_notify(dp, DSA_NOTIFIER_HOST_FDB_DEL, &info); } @@ -715,6 +727,12 @@ int dsa_port_host_mdb_add(const struct dsa_port *dp, .port = dp->index, .mdb = mdb, }; + struct dsa_port *cpu_dp = dp->cpu_dp; + int err; + + err = dev_mc_add(cpu_dp->master, mdb->addr); + if (err) + return err; return dsa_port_notify(dp, DSA_NOTIFIER_HOST_MDB_ADD, &info); } @@ -727,6 +745,12 @@ int dsa_port_host_mdb_del(const struct dsa_port *dp, .port = dp->index, .mdb = mdb, }; + struct dsa_port *cpu_dp = dp->cpu_dp; + int err; + + err = dev_mc_del(cpu_dp->master, mdb->addr); + if (err) + return err; return dsa_port_notify(dp, DSA_NOTIFIER_HOST_MDB_DEL, &info); } -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 11/15] net: dsa: sync static FDB entries on foreign interfaces to hardware
From: Vladimir Oltean <vladimir.oltean at nxp.com> DSA is able to install FDB entries towards the CPU port for addresses which were dynamically learnt by the software bridge on foreign interfaces that are in the same bridge with a DSA switch interface. Since this behavior is opportunistic, it is guarded by the "assisted_learning_on_cpu_port" property which can be enabled by drivers and is not done automatically (since certain switches may support address learning of packets coming from the CPU port). But if those FDB entries added on the foreign interfaces are static (added by the user) instead of dynamically learnt, currently DSA does not do anything (and arguably it should). Because static FDB entries are not supposed to move on their own, there is no downside in reusing the "assisted_learning_on_cpu_port" logic to sync static FDB entries to the DSA CPU port unconditionally, even if assisted_learning_on_cpu_port is not requested by the driver. For example, this situation: br0 / \ swp0 dummy0 $ bridge fdb add 02:00:de:ad:00:01 dev dummy0 vlan 1 master static Results in DSA adding an entry in the hardware FDB, pointing this address towards the CPU port. The same is true for entries added to the bridge itself, e.g: $ bridge fdb add 02:00:de:ad:00:01 dev br0 vlan 1 self local (except that right now, DSA still ignores 'local' FDB entries, this will be changed in a later patch) Signed-off-by: Tobias Waldekranz <tobias at waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: small fixups to commit message net/dsa/slave.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/net/dsa/slave.c b/net/dsa/slave.c index ac7f4f200ab1..ea9a7c1ce83e 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2403,9 +2403,12 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, dp = dsa_slave_to_port(dev); } else { - /* Snoop addresses learnt on foreign interfaces - * bridged with us, for switches that don't - * automatically learn SA from CPU-injected traffic + /* Snoop addresses added to foreign interfaces + * bridged with us, or the bridge + * itself. Dynamically learned addresses can + * also be added for switches that don't + * automatically learn SA from CPU-injected + * traffic. */ struct net_device *br_dev; struct dsa_slave_priv *p; @@ -2424,7 +2427,8 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, dp = p->dp; host_addr = true; - if (!dp->ds->assisted_learning_on_cpu_port) + if (!fdb_info->added_by_user && + !dp->ds->assisted_learning_on_cpu_port) return NOTIFY_DONE; /* When the bridge learns an address on an offloaded -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 12/15] net: dsa: include bridge addresses which are local in the host fdb list
From: Tobias Waldekranz <tobias at waldekranz.com> The bridge automatically creates local (not forwarded) fdb entries pointing towards physical ports with their interface MAC addresses. For switchdev, the significance of these fdb entries is the exact opposite of that of non-local entries: instead of sending these frame outwards, we must send them inwards (towards the host). NOTE: The bridge's own MAC address is also "local". If that address is not shared with any port, the bridge's MAC is not be added by this functionality - but the following commit takes care of that case. NOTE 2: We mark these addresses as host-filtered regardless of the value of ds->assisted_learning_on_cpu_port. This is because, as opposed to the speculative logic done for dynamic address learning on foreign interfaces, the local FDB entries are rather fixed, so there isn't any risk of them migrating from one bridge port to another. Signed-off-by: Tobias Waldekranz <tobias at waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: none net/dsa/slave.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/net/dsa/slave.c b/net/dsa/slave.c index ea9a7c1ce83e..d006bd04f84a 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2398,10 +2398,12 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, fdb_info = ptr; if (dsa_slave_dev_check(dev)) { - if (!fdb_info->added_by_user || fdb_info->is_local) - return NOTIFY_OK; - dp = dsa_slave_to_port(dev); + + if (fdb_info->is_local) + host_addr = true; + else if (!fdb_info->added_by_user) + return NOTIFY_OK; } else { /* Snoop addresses added to foreign interfaces * bridged with us, or the bridge @@ -2425,9 +2427,15 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, return NOTIFY_DONE; dp = p->dp; - host_addr = true; + host_addr = fdb_info->is_local; - if (!fdb_info->added_by_user && + /* FDB entries learned by the software bridge should + * be installed as host addresses only if the driver + * requests assisted learning. + * On the other hand, FDB entries for local termination + * should always be installed. + */ + if (!fdb_info->added_by_user && !fdb_info->is_local && !dp->ds->assisted_learning_on_cpu_port) return NOTIFY_DONE; -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 13/15] net: dsa: include fdb entries pointing to bridge in the host fdb list
From: Vladimir Oltean <vladimir.oltean at nxp.com> The bridge supports a legacy way of adding local (non-forwarded) FDB entries, which works on an individual port basis: bridge fdb add dev swp0 00:01:02:03:04:05 master local As well as a new way, added by Roopa Prabhu in commit 3741873b4f73 ("bridge: allow adding of fdb entries pointing to the bridge device"): bridge fdb add dev br0 00:01:02:03:04:05 self local The two commands are functionally equivalent, except that the first one produces an entry with fdb->dst == swp0, and the other an entry with fdb->dst == NULL. The confusing part, though, is that even if fdb->dst is swp0 for the 'local on port' entry, that destination is not used. Nonetheless, the idea is that the bridge has reference counting for local entries, and local entries pointing towards the bridge are still 'as local' as local entries for a port. The bridge adds the MAC addresses of the interfaces automatically as FDB entries with is_local=1. For the MAC address of the ports, fdb->dst will be equal to the port, and for the MAC address of the bridge, fdb->dst will point towards the bridge (i.e. be NULL). Therefore, if the MAC address of the bridge is not inherited from either of the physical ports, then we must explicitly catch local FDB entries emitted towards the br0, otherwise we'll miss the MAC address of the bridge (and, of course, any entry with 'bridge add dev br0 ... self local'). Co-developed-by: Tobias Waldekranz <tobias at waldekranz.com> Signed-off-by: Tobias Waldekranz <tobias at waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: none net/dsa/slave.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/net/dsa/slave.c b/net/dsa/slave.c index d006bd04f84a..a7b5d2a41472 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2415,7 +2415,11 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, struct net_device *br_dev; struct dsa_slave_priv *p; - br_dev = netdev_master_upper_dev_get_rcu(dev); + if (netif_is_bridge_master(dev)) + br_dev = dev; + else + br_dev = netdev_master_upper_dev_get_rcu(dev); + if (!br_dev) return NOTIFY_DONE; @@ -2443,8 +2447,13 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, * LAG we don't want to send traffic to the CPU, the * other ports bridged with the LAG should be able to * autonomously forward towards it. + * On the other hand, if the address is local + * (therefore not learned) then we want to trap it to + * the CPU regardless of whether the interface it + * belongs to is offloaded or not. */ - if (dsa_tree_offloads_bridge_port(dp->ds->dst, dev)) + if (dsa_tree_offloads_bridge_port(dp->ds->dst, dev) && + !fdb_info->is_local) return NOTIFY_DONE; } -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 14/15] net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
From: Vladimir Oltean <vladimir.oltean at nxp.com> When (a) "dev" is a bridge port which the DSA switch tree offloads, but is otherwise not a dsa slave (such as a LAG netdev), or (b) "dev" is the bridge net device itself then strange things happen to the dev_hold/dev_put pair: dsa_schedule_work() will still be called with a DSA port that offloads that netdev, but dev_hold() will be called on the non-DSA netdev. Then the "if" condition in dsa_slave_switchdev_event_work() does not pass, because "dev" is not a DSA netdev, so dev_put() is not called. This results in the simple fact that we have a reference counting mismatch on the "dev" net device. This can be seen when we add support for host addresses installed on the bridge net device. ip link add br1 type bridge ip link set br1 address 00:01:02:03:04:05 ip link set swp0 master br1 ip link del br1 [ 968.512278] unregister_netdevice: waiting for br1 to become free. Usage count = 5 It seems foolish to do penny pinching and not add the net_device pointer in the dsa_switchdev_event_work structure, so let's finally do that. As an added bonus, when we start offloading local entries pointing towards the bridge, these will now properly appear as 'offloaded' in 'bridge fdb' (this was not possible before, because 'dev' was assumed to only be a DSA net device): 00:01:02:03:04:05 dev br0 vlan 1 offload master br0 permanent 00:01:02:03:04:05 dev br0 offload master br0 permanent Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: none net/dsa/dsa_priv.h | 1 + net/dsa/slave.c | 9 ++++----- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index 36e667ea94db..f201c33980bf 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -116,6 +116,7 @@ struct dsa_notifier_mrp_ring_role_info { struct dsa_switchdev_event_work { struct dsa_switch *ds; int port; + struct net_device *dev; struct work_struct work; unsigned long event; /* Specific for SWITCHDEV_FDB_ADD_TO_DEVICE and diff --git a/net/dsa/slave.c b/net/dsa/slave.c index a7b5d2a41472..ffbba1e71551 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2349,9 +2349,8 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) } rtnl_unlock(); + dev_put(switchdev_work->dev); kfree(switchdev_work); - if (dsa_is_user_port(ds, dp->index)) - dev_put(dp->slave); } static int dsa_lower_dev_walk(struct net_device *lower_dev, @@ -2469,15 +2468,15 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, switchdev_work->ds = dp->ds; switchdev_work->port = dp->index; switchdev_work->event = event; + switchdev_work->dev = dev; ether_addr_copy(switchdev_work->addr, fdb_info->addr); switchdev_work->vid = fdb_info->vid; switchdev_work->host_addr = host_addr; - /* Hold a reference on the slave for dsa_fdb_offload_notify */ - if (dsa_is_user_port(dp->ds, dp->index)) - dev_hold(dev); + /* Hold a reference for dsa_fdb_offload_notify */ + dev_hold(dev); dsa_schedule_work(&switchdev_work->work); break; default: -- 2.25.1
Vladimir Oltean
2021-Jun-29 14:06 UTC
[Bridge] [PATCH v5 net-next 15/15] net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
From: Vladimir Oltean <vladimir.oltean at nxp.com> When we join a bridge that already has some local addresses pointing to itself, we do not get those notifications. Similarly, when we leave that bridge, we do not get notifications for the deletion of those entries. The only switchdev notifications we get are those of entries added while the DSA port is enslaved to the bridge. This makes use cases such as the following work properly (with the number of additions and removals properly balanced): ip link add br0 type bridge ip link add br1 type bridge ip link set br0 address 00:01:02:03:04:05 ip link set br1 address 00:01:02:03:04:05 ip link set swp0 up ip link set swp1 up ip link set swp0 master br0 ip link set swp1 master br1 ip link set br0 up ip link set br1 up ip link del br1 # 00:01:02:03:04:05 still installed on the CPU port ip link del br0 # 00:01:02:03:04:05 finally removed from the CPU port Signed-off-by: Vladimir Oltean <vladimir.oltean at nxp.com> --- v4->v5: none net/dsa/port.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/net/dsa/port.c b/net/dsa/port.c index 778b0dc2bb39..28b45b7e66df 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -199,11 +199,17 @@ static int dsa_port_switchdev_sync(struct dsa_port *dp, if (err && err != -EOPNOTSUPP) return err; + /* Forwarding and termination FDB entries on the port */ err = br_fdb_replay(br, brport_dev, dp, true, &dsa_slave_switchdev_notifier); if (err && err != -EOPNOTSUPP) return err; + /* Termination FDB entries on the bridge itself */ + err = br_fdb_replay(br, br, dp, true, &dsa_slave_switchdev_notifier); + if (err && err != -EOPNOTSUPP) + return err; + err = br_vlan_replay(br, brport_dev, dp, true, &dsa_slave_switchdev_blocking_notifier, extack); if (err && err != -EOPNOTSUPP) @@ -225,11 +231,17 @@ static int dsa_port_switchdev_unsync_objs(struct dsa_port *dp, if (err && err != -EOPNOTSUPP) return err; + /* Forwarding and termination FDB entries on the port */ err = br_fdb_replay(br, brport_dev, dp, false, &dsa_slave_switchdev_notifier); if (err && err != -EOPNOTSUPP) return err; + /* Termination FDB entries on the bridge itself */ + err = br_fdb_replay(br, br, dp, false, &dsa_slave_switchdev_notifier); + if (err && err != -EOPNOTSUPP) + return err; + err = br_vlan_replay(br, brport_dev, dp, false, &dsa_slave_switchdev_blocking_notifier, extack); if (err && err != -EOPNOTSUPP) -- 2.25.1
patchwork-bot+netdevbpf at kernel.org
2021-Jun-29 17:50 UTC
[Bridge] [PATCH v5 net-next 00/15] RX filtering in DSA
Hello: This series was applied to netdev/net-next.git (refs/heads/master): On Tue, 29 Jun 2021 17:06:43 +0300 you wrote:> From: Vladimir Oltean <vladimir.oltean at nxp.com> > > This is my 5th stab at creating a list of unicast and multicast > addresses that the DSA CPU ports must trap. I am reusing a lot of > Tobias's work which he submitted here: > https://patchwork.kernel.org/project/netdevbpf/cover/20210116012515.3152-1-tobias at waldekranz.com/ > > [...]Here is the summary with links: - [v5,net-next,01/15] net: bridge: use READ_ONCE() and WRITE_ONCE() compiler barriers for fdb->dst https://git.kernel.org/netdev/net-next/c/3e19ae7c6fd6 - [v5,net-next,02/15] net: bridge: switchdev: send FDB notifications for host addresses https://git.kernel.org/netdev/net-next/c/6eb38bf8eb90 - [v5,net-next,03/15] net: bridge: allow br_fdb_replay to be called for the bridge device https://git.kernel.org/netdev/net-next/c/f851a721a638 - [v5,net-next,04/15] net: dsa: delete dsa_legacy_fdb_add and dsa_legacy_fdb_del https://git.kernel.org/netdev/net-next/c/b117e1e8a86d - [v5,net-next,05/15] net: dsa: introduce dsa_is_upstream_port and dsa_switch_is_upstream_of https://git.kernel.org/netdev/net-next/c/63609c8fac40 - [v5,net-next,06/15] net: dsa: introduce a separate cross-chip notifier type for host MDBs https://git.kernel.org/netdev/net-next/c/b8e997c49003 - [v5,net-next,07/15] net: dsa: reference count the MDB entries at the cross-chip notifier level https://git.kernel.org/netdev/net-next/c/161ca59d39e9 - [v5,net-next,08/15] net: dsa: introduce a separate cross-chip notifier type for host FDBs https://git.kernel.org/netdev/net-next/c/3dc80afc5098 - [v5,net-next,09/15] net: dsa: reference count the FDB addresses at the cross-chip notifier level https://git.kernel.org/netdev/net-next/c/3f6e32f92a02 - [v5,net-next,10/15] net: dsa: install the host MDB and FDB entries in the master's RX filter https://git.kernel.org/netdev/net-next/c/26ee7b06a4d3 - [v5,net-next,11/15] net: dsa: sync static FDB entries on foreign interfaces to hardware https://git.kernel.org/netdev/net-next/c/3068d466a67e - [v5,net-next,12/15] net: dsa: include bridge addresses which are local in the host fdb list https://git.kernel.org/netdev/net-next/c/10fae4ac89ce - [v5,net-next,13/15] net: dsa: include fdb entries pointing to bridge in the host fdb list https://git.kernel.org/netdev/net-next/c/81a619f78759 - [v5,net-next,14/15] net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev https://git.kernel.org/netdev/net-next/c/4bed397c3e65 - [v5,net-next,15/15] net: dsa: replay the local bridge FDB entries pointing to the bridge dev too https://git.kernel.org/netdev/net-next/c/63c51453c82c You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
From: Vladimir Oltean <olteanv at gmail.com> Date: Tue, 29 Jun 2021 17:06:43 +0300> Changes in v5: > - added READ_ONCE and WRITE_ONCE for fdb->dst > - removed a paranoid WARN_ON in DSA > - added some documentation regarding how 'bridge fdb' is supposed to be > used with DSAVlad, I applied v4, could you please send me relative fixups to v5? Thank you.