Nikolay Aleksandrov
2022-Apr-12 13:22 UTC
[Bridge] [PATCH net-next v3 0/8] net: bridge: add flush filtering support
Hi,
This patch-set adds support to specify filtering conditions for a bulk
delete (flush) operation. This version uses a new nlmsghdr delete flag
called NLM_F_BULK in combination with a new ndo_fdb_del_bulk op which is
used to signal that the driver supports bulk deletes (that avoids
pushing common mac address checks to ndo_fdb_del implementations and
also has a different prototype and parsed attribute expectations, more
info in patch 03). The new delete flag can be used for any RTM_DEL*
type, implementations just need to be careful with older kernels which
are doing non-strict attribute parses. Here I use the fact that the mac
address attribute (lladdr) is mandatory in the classic fdb del case, but
it's not allowed if bulk deleting so older kernels will error out.
Patch 01 adds the new NLM_F_BULK delete request modifier, patch 02 then
adds the new ndo_fdb_del_bulk call. Patch 03 adds NLM_F_BULK support to
rtnl_fdb_del, on such request strict parsing is used only for the
supported attributes, and if the ndo is implemented it's called, the
NTF_SELF/MASTER rules are the same as for the standard rtnl_fdb_del.
Patch 04 implements bridge-specific minimal ndo_fdb_del_bulk call which
uses the current br_fdb_flush to delete all entries. Patch 05 adds
filtering support to the new bridge flush op which supports target
ifindex (port or bridge), vlan id and flags/state mask. Patch 06 adds
ndm state and flags mask attributes which will be used for filtering.
Patch 07 converts ndm state/flags and their masks to bridge-private flags
and fills them in the filter descriptor for matching. Finally patch 08
fills in the target ifindex (after validating it) and vlan id (already
validated by rtnl_fdb_del) for matching. Flush filtering is needed
because user-space applications need a quick way to delete only a
specific set of entries, e.g. mlag implementations need a way to flush only
dynamic entries excluding externally learned ones or only externally
learned ones without static entries etc. Also apps usually want to target
only a specific vlan or port/vlan combination. The current 2 flush
operations (per port and bridge-wide) are not extensible and cannot
provide such filtering.

I decided against embedding new attrs into the old flush attributes for
multiple reasons - proper error handling on unsupported attributes,
older kernels silently flushing all, need for a second mechanism to
signal that the attribute should be parsed (e.g. using boolopts),
special treatment for permanent entries.

Examples:
$ bridge fdb flush dev bridge vlan 100 static
< flush all static entries on vlan 100 >
$ bridge fdb flush dev bridge vlan 1 dynamic
< flush all dynamic entries on vlan 1 >
$ bridge fdb flush dev bridge port ens16 vlan 1 dynamic
< flush all dynamic entries on port ens16 and vlan 1 >
$ bridge fdb flush dev ens16 vlan 1 dynamic master
< as above: flush all dynamic entries on port ens16 and vlan 1 >
$ bridge fdb flush dev bridge nooffloaded nopermanent self
< flush all non-offloaded and non-permanent entries >
$ bridge fdb flush dev bridge static noextern_learn
< flush all static entries which are not externally learned >
$ bridge fdb flush dev bridge permanent
< flush all permanent entries >
$ bridge fdb flush dev bridge port bridge permanent
< flush all permanent entries pointing to the bridge itself >

Example of a flush call with unsupported netlink attribute (NDA_DST):
$ bridge fdb flush dev bridge vlan 100 dynamic dst
Error: Unsupported attribute.

Example of a flush call on an older kernel:
$ bridge fdb flush dev bridge dynamic
Error: invalid address.

Note that all flags have their negated version (static vs nostatic etc)
and there are some tricky cases to handle like "static" which in flag
terms means fdbs that have NUD_NOARP but *not* NUD_PERMANENT, so the
mask matches on both but we need only NUD_NOARP to be set. That's
because permanent entries have both set so we can't just match on
NUD_NOARP. Also note that this flush operation doesn't treat permanent
entries in a special way (fdb_delete vs fdb_delete_local), it will
delete them regardless of whether any port is using them. We can extend
the api with a flag to do that if needed in the future.

Patch-sets (in order):
 - Initial flush infra and fdb flush filtering (this set)
 - iproute2 support
 - selftests

Future work:
 - mdb flush support (RTM_FLUSHMDB type)

Thanks to Ido for the great discussion and feedback while working on this.

v3: Add NLM_F_BULK delete modifier and ndo_fdb_del_bulk callback,
    patches 01 - 03 and 06 are new. Patch 04 is changed to implement
    bulk_del instead of flush, patches 05, 07 and 08 are adjusted to
    use NDA_ attributes

Thanks,
 Nik

Nikolay Aleksandrov (8):
  net: netlink: add NLM_F_BULK delete request modifier
  net: add ndo_fdb_del_bulk
  net: rtnetlink: add NLM_F_BULK support to rtnl_fdb_del
  net: bridge: fdb: add ndo_fdb_del_bulk
  net: bridge: fdb: add support for fine-grained flushing
  net: rtnetlink: add ndm flags and state mask attributes
  net: bridge: fdb: add support for flush filtering based on ndm flags
    and state
  net: bridge: fdb: add support for flush filtering based on ifindex and
    vlan

 include/linux/netdevice.h      |   9 ++
 include/uapi/linux/neighbour.h |   2 +
 include/uapi/linux/netlink.h   |   1 +
 net/bridge/br_device.c         |   1 +
 net/bridge/br_fdb.c            | 154 +++++++++++++++++++++++++++++++--
 net/bridge/br_netlink.c        |   9 +-
 net/bridge/br_private.h        |  19 +++-
 net/bridge/br_sysfs_br.c       |   6 +-
 net/core/rtnetlink.c           |  66 ++++++++++----
 9 files changed, 238 insertions(+), 29 deletions(-)

--
2.35.1
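
The "static" case described in the cover letter can be sketched in user space as a mask/value comparison. This is only an illustration of the matching rule, not code from the set: the two constants mirror NUD_NOARP and NUD_PERMANENT from include/uapi/linux/neighbour.h, while `struct state_filter`, `state_matches` and `filter_static` are hypothetical names introduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as NUD_NOARP / NUD_PERMANENT in linux/neighbour.h */
#define EX_NUD_NOARP     0x40
#define EX_NUD_PERMANENT 0x80

/* Hypothetical filter type, used only for this sketch */
struct state_filter {
	uint16_t state;      /* bits that must be set, within the mask */
	uint16_t state_mask; /* bits that take part in the comparison */
};

/* An empty mask matches everything, otherwise the masked state must
 * be exactly equal to the requested state.
 */
static bool state_matches(uint16_t entry_state, const struct state_filter *f)
{
	if (!f->state_mask)
		return true;
	return (entry_state & f->state_mask) == f->state;
}

/* "static": the mask covers both bits, but only NUD_NOARP may be set */
static const struct state_filter filter_static = {
	.state      = EX_NUD_NOARP,
	.state_mask = EX_NUD_NOARP | EX_NUD_PERMANENT,
};
```

With this filter an entry carrying only the NOARP bit matches, while an entry carrying both bits (the permanent case described above) or neither bit does not, which is why matching on NUD_NOARP alone would not be enough.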
Nikolay Aleksandrov
2022-Apr-12 13:22 UTC
[Bridge] [PATCH net-next v3 1/8] net: netlink: add NLM_F_BULK delete request modifier
Add a new delete request modifier called NLM_F_BULK which, when
supported, would cause the request to delete multiple objects. The flag
is a convenient way to signal that a multiple delete operation is
requested which can be gradually added to different delete requests. In
order to make sure older kernels will error out if the operation is not
supported instead of doing something unintended we have to break a
required condition when implementing support for this flag, f.e. for
neighbors we will omit the mandatory mac address attribute.
Initially it will be used to add flush with filtering support for bridge
fdbs, but it also opens the door to add similar support to others.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 include/uapi/linux/netlink.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
index 4c0cde075c27..855dffb4c1c3 100644
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -72,6 +72,7 @@ struct nlmsghdr {

 /* Modifiers to DELETE request */
 #define NLM_F_NONREC	0x100	/* Do not delete recursively	*/
+#define NLM_F_BULK	0x200	/* Delete multiple objects	*/

 /* Flags for ACK message */
 #define NLM_F_CAPPED	0x100	/* request was capped */
--
2.35.1
Nikolay Aleksandrov
2022-Apr-12 13:22 UTC
[Bridge] [PATCH net-next v3 2/8] net: add ndo_fdb_del_bulk
Add a new netdev op called ndo_fdb_del_bulk, it will be later used for
driver-specific bulk delete implementation dispatched from rtnetlink. The
first user will be the bridge, we need it to signal to rtnetlink from
the driver that we support bulk delete operation (NLM_F_BULK).

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 include/linux/netdevice.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 28ea4f8269d4..a602f29365b0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1260,6 +1260,10 @@ struct netdev_net_notifier {
  *		      struct net_device *dev,
  *		      const unsigned char *addr, u16 vid)
  *	Deletes the FDB entry from dev coresponding to addr.
+ * int (*ndo_fdb_del_bulk)(struct ndmsg *ndm, struct nlattr *tb[],
+ *			   struct net_device *dev,
+ *			   u16 vid,
+ *			   struct netlink_ext_ack *extack);
  * int (*ndo_fdb_dump)(struct sk_buff *skb, struct netlink_callback *cb,
  *		       struct net_device *dev, struct net_device *filter_dev,
  *		       int *idx)
@@ -1510,6 +1514,11 @@ struct net_device_ops {
 					       struct net_device *dev,
 					       const unsigned char *addr,
 					       u16 vid);
+	int			(*ndo_fdb_del_bulk)(struct ndmsg *ndm,
+						    struct nlattr *tb[],
+						    struct net_device *dev,
+						    u16 vid,
+						    struct netlink_ext_ack *extack);
 	int			(*ndo_fdb_dump)(struct sk_buff *skb,
 						struct netlink_callback *cb,
 						struct net_device *dev,
--
2.35.1
Nikolay Aleksandrov
2022-Apr-12 13:22 UTC
[Bridge] [PATCH net-next v3 3/8] net: rtnetlink: add NLM_F_BULK support to rtnl_fdb_del
When NLM_F_BULK is specified in a fdb del message we need to handle it
differently. First since this is a new call we can strictly validate the
passed attributes, at first only ifindex and vlan are allowed as these
will be the initially supported filter attributes, any other attribute
is rejected. The mac address is no longer mandatory, but we use it
to error out in older kernels because it cannot be specified with bulk
request (the attribute is not allowed) and then we have to dispatch
the call to ndo_fdb_del_bulk if the device supports it. The del bulk
callback can do further validation of the attributes if necessary.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 net/core/rtnetlink.c | 64 +++++++++++++++++++++++++++++++-------------
 1 file changed, 46 insertions(+), 18 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 4041b3e2e8ec..824963aa57b1 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4167,22 +4167,34 @@ int ndo_dflt_fdb_del(struct ndmsg *ndm,
 }
 EXPORT_SYMBOL(ndo_dflt_fdb_del);

+static const struct nla_policy fdb_del_bulk_policy[NDA_MAX + 1] = {
+	[NDA_VLAN]	= { .type = NLA_U16 },
+	[NDA_IFINDEX]	= NLA_POLICY_MIN(NLA_S32, 1),
+};
+
 static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
 			struct netlink_ext_ack *extack)
 {
+	bool del_bulk = !!(nlh->nlmsg_flags & NLM_F_BULK);
 	struct net *net = sock_net(skb->sk);
+	const struct net_device_ops *ops;
 	struct ndmsg *ndm;
 	struct nlattr *tb[NDA_MAX+1];
 	struct net_device *dev;
-	__u8 *addr;
+	__u8 *addr = NULL;
 	int err;
 	u16 vid;

 	if (!netlink_capable(skb, CAP_NET_ADMIN))
 		return -EPERM;

-	err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, NULL,
-				     extack);
+	if (!del_bulk) {
+		err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX,
+					     NULL, extack);
+	} else {
+		err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX,
+				  fdb_del_bulk_policy, extack);
+	}
 	if (err < 0)
 		return err;

@@ -4198,9 +4210,12 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
 		return -ENODEV;
 	}

-	if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) {
-		NL_SET_ERR_MSG(extack, "invalid address");
-		return -EINVAL;
+	if (!del_bulk) {
+		if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) {
+			NL_SET_ERR_MSG(extack, "invalid address");
+			return -EINVAL;
+		}
+		addr = nla_data(tb[NDA_LLADDR]);
 	}

 	if (dev->type != ARPHRD_ETHER) {
@@ -4208,8 +4223,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
 		return -EINVAL;
 	}

-	addr = nla_data(tb[NDA_LLADDR]);
-
 	err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack);
 	if (err)
 		return err;
@@ -4220,10 +4233,16 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if ((!ndm->ndm_flags || ndm->ndm_flags & NTF_MASTER) &&
 	    netif_is_bridge_port(dev)) {
 		struct net_device *br_dev = netdev_master_upper_dev_get(dev);
-		const struct net_device_ops *ops = br_dev->netdev_ops;

-		if (ops->ndo_fdb_del)
-			err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid);
+		ops = br_dev->netdev_ops;
+		if (!del_bulk) {
+			if (ops->ndo_fdb_del)
+				err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid);
+		} else {
+			if (ops->ndo_fdb_del_bulk)
+				err = ops->ndo_fdb_del_bulk(ndm, tb, dev, vid,
+							    extack);
+		}

 		if (err)
 			goto out;
@@ -4233,15 +4252,24 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,

 	/* Embedded bridge, macvlan, and any other device support */
 	if (ndm->ndm_flags & NTF_SELF) {
-		if (dev->netdev_ops->ndo_fdb_del)
-			err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr,
-							   vid);
-		else
-			err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid);
+		ops = dev->netdev_ops;
+		if (!del_bulk) {
+			if (ops->ndo_fdb_del)
+				err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid);
+			else
+				err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid);
+		} else {
+			/* in case err was cleared by NTF_MASTER call */
+			err = -EOPNOTSUPP;
+			if (ops->ndo_fdb_del_bulk)
+				err = ops->ndo_fdb_del_bulk(ndm, tb, dev, vid,
+							    extack);
+		}

 		if (!err) {
-			rtnl_fdb_notify(dev, addr, vid, RTM_DELNEIGH,
-					ndm->ndm_state);
+			if (!del_bulk)
+				rtnl_fdb_notify(dev, addr, vid, RTM_DELNEIGH,
+						ndm->ndm_state);
 			ndm->ndm_flags &= ~NTF_SELF;
 		}
 	}
--
2.35.1
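
From the user-space side, a bulk delete request as parsed by the code above might be assembled roughly as follows. This is a sketch under stated assumptions, not the iproute2 implementation: `struct bulk_del_req` and `bulk_del_init` are hypothetical names, NLM_F_BULK is defined locally (with the value from patch 01) in case the installed headers predate it, and the message is only built in memory, never sent. The point it illustrates is that no NDA_LLADDR attribute is added, since the bulk policy rejects it.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/neighbour.h>

#ifndef NLM_F_BULK
#define NLM_F_BULK 0x200 /* value from patch 01, for older headers */
#endif

/* Hypothetical request layout used only for this sketch */
struct bulk_del_req {
	struct nlmsghdr nlh;
	struct ndmsg ndm;
	char attrs[64];
};

/* Fill in an RTM_DELNEIGH request with NLM_F_BULK set.
 * ifindex:   device the request targets (bridge or bridge port)
 * vid:       optional vlan filter, 0 means "no NDA_VLAN attribute"
 * ndm_flags: e.g. NTF_SELF or NTF_MASTER, same rules as a normal delete
 */
static void bulk_del_init(struct bulk_del_req *req, int ifindex,
			  uint16_t vid, uint8_t ndm_flags)
{
	struct rtattr *rta;

	memset(req, 0, sizeof(*req));
	req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ndm));
	req->nlh.nlmsg_type = RTM_DELNEIGH;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_BULK;
	req->ndm.ndm_family = AF_BRIDGE;
	req->ndm.ndm_ifindex = ifindex;
	req->ndm.ndm_flags = ndm_flags;

	if (vid) {
		/* Append NDA_VLAN (u16) after the aligned fixed header */
		rta = (struct rtattr *)((char *)req +
					NLMSG_ALIGN(req->nlh.nlmsg_len));
		rta->rta_type = NDA_VLAN;
		rta->rta_len = RTA_LENGTH(sizeof(vid));
		memcpy(RTA_DATA(rta), &vid, sizeof(vid));
		req->nlh.nlmsg_len = NLMSG_ALIGN(req->nlh.nlmsg_len) +
				     RTA_ALIGN(rta->rta_len);
	}
}
```

A caller would pass the filled buffer to a NETLINK_ROUTE socket. On a kernel without this set the same message is expected to fail with "invalid address", because the mandatory lladdr attribute is missing.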
Nikolay Aleksandrov
2022-Apr-12 13:22 UTC
[Bridge] [PATCH net-next v3 4/8] net: bridge: fdb: add ndo_fdb_del_bulk
Add a minimal ndo_fdb_del_bulk implementation which flushes all entries.
Support for more fine-grained filtering will be added in the following
patches.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 net/bridge/br_device.c   |  1 +
 net/bridge/br_fdb.c      | 25 ++++++++++++++++++++++++-
 net/bridge/br_netlink.c  |  2 +-
 net/bridge/br_private.h  |  6 +++++-
 net/bridge/br_sysfs_br.c |  2 +-
 5 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 8d6bab244c4a..58a4f70e01e3 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -465,6 +465,7 @@ static const struct net_device_ops br_netdev_ops = {
 	.ndo_fix_features        = br_fix_features,
 	.ndo_fdb_add		 = br_fdb_add,
 	.ndo_fdb_del		 = br_fdb_delete,
+	.ndo_fdb_del_bulk	 = br_fdb_delete_bulk,
 	.ndo_fdb_dump		 = br_fdb_dump,
 	.ndo_fdb_get		 = br_fdb_get,
 	.ndo_bridge_getlink	 = br_getlink,
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 6ccda68bd473..fd7012c32cd5 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -559,7 +559,7 @@ void br_fdb_cleanup(struct work_struct *work)
 }

 /* Completely flush all dynamic entries in forwarding database.*/
-void br_fdb_flush(struct net_bridge *br)
+void __br_fdb_flush(struct net_bridge *br)
 {
 	struct net_bridge_fdb_entry *f;
 	struct hlist_node *tmp;
@@ -572,6 +572,29 @@ void br_fdb_flush(struct net_bridge *br)
 	spin_unlock_bh(&br->hash_lock);
 }

+int br_fdb_delete_bulk(struct ndmsg *ndm, struct nlattr *tb[],
+		       struct net_device *dev, u16 vid,
+		       struct netlink_ext_ack *extack)
+{
+	struct net_bridge *br;
+
+	if (netif_is_bridge_master(dev)) {
+		br = netdev_priv(dev);
+	} else {
+		struct net_bridge_port *p = br_port_get_rtnl(dev);
+
+		if (!p) {
+			NL_SET_ERR_MSG_MOD(extack, "Device is not a bridge port");
+			return -EINVAL;
+		}
+		br = p->br;
+	}
+
+	__br_fdb_flush(br);
+
+	return 0;
+}
+
 /* Flush all entries referring to a specific port.
  * if do_all is set also flush static entries
  * if vid is set delete all entries that match the vlan_id
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 200ad05b296f..c59c775730bb 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1327,7 +1327,7 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
 	}

 	if (data[IFLA_BR_FDB_FLUSH])
-		br_fdb_flush(br);
+		__br_fdb_flush(br);

 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
 	if (data[IFLA_BR_MCAST_ROUTER]) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 6e62af2e07e9..3ba50e41aa4f 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -759,7 +759,8 @@ int br_fdb_init(void);
 void br_fdb_fini(void);
 int br_fdb_hash_init(struct net_bridge *br);
 void br_fdb_hash_fini(struct net_bridge *br);
-void br_fdb_flush(struct net_bridge *br);
+void __br_fdb_flush(struct net_bridge *br);
+
 void br_fdb_find_delete_local(struct net_bridge *br,
 			      const struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid);
@@ -781,6 +782,9 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,

 int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
 		  struct net_device *dev, const unsigned char *addr, u16 vid);
+int br_fdb_delete_bulk(struct ndmsg *ndm, struct nlattr *tb[],
+		       struct net_device *dev, u16 vid,
+		       struct netlink_ext_ack *extack);
 int br_fdb_add(struct ndmsg *nlh, struct nlattr *tb[], struct net_device *dev,
 	       const unsigned char *addr, u16 vid, u16 nlh_flags,
 	       struct netlink_ext_ack *extack);
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 3f7ca88c2aa3..7a2cf3aebc84 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -344,7 +344,7 @@ static DEVICE_ATTR_RW(group_addr);
 static int set_flush(struct net_bridge *br, unsigned long val,
 		     struct netlink_ext_ack *extack)
 {
-	br_fdb_flush(br);
+	__br_fdb_flush(br);
 	return 0;
 }

--
2.35.1
Nikolay Aleksandrov
2022-Apr-12 13:22 UTC
[Bridge] [PATCH net-next v3 5/8] net: bridge: fdb: add support for fine-grained flushing
Add the ability to specify exactly which fdbs to be flushed. They are
described by a new structure - net_bridge_fdb_flush_desc. Currently it
can match on port/bridge ifindex, vlan id and fdb flags. It is used to
describe the existing dynamic fdb flush operation. Note that this flush
operation doesn't treat permanent entries in a special way (fdb_delete vs
fdb_delete_local), it will delete them regardless if any port is using
them, so currently it can't directly replace deletes which need to handle
that case, although we can extend it later for that too.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
v2: changed the flush matches func for better readability (Ido)
v3: no change

 net/bridge/br_fdb.c      | 41 ++++++++++++++++++++++++++++++++--------
 net/bridge/br_netlink.c  |  9 +++++++--
 net/bridge/br_private.h  | 10 +++++++++-
 net/bridge/br_sysfs_br.c |  6 +++++-
 4 files changed, 54 insertions(+), 12 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index fd7012c32cd5..f1deac42bc0d 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -558,24 +558,49 @@ void br_fdb_cleanup(struct work_struct *work)
 	mod_delayed_work(system_long_wq, &br->gc_work, work_delay);
 }

-/* Completely flush all dynamic entries in forwarding database.*/
-void __br_fdb_flush(struct net_bridge *br)
+static bool __fdb_flush_matches(const struct net_bridge *br,
+				const struct net_bridge_fdb_entry *f,
+				const struct net_bridge_fdb_flush_desc *desc)
+{
+	const struct net_bridge_port *dst = READ_ONCE(f->dst);
+	int port_ifidx = dst ? dst->dev->ifindex : br->dev->ifindex;
+
+	if (desc->vlan_id && desc->vlan_id != f->key.vlan_id)
+		return false;
+	if (desc->port_ifindex && desc->port_ifindex != port_ifidx)
+		return false;
+	if (desc->flags_mask && (f->flags & desc->flags_mask) != desc->flags)
+		return false;
+
+	return true;
+}
+
+/* Flush forwarding database entries matching the description */
+void __br_fdb_flush(struct net_bridge *br,
+		    const struct net_bridge_fdb_flush_desc *desc)
 {
 	struct net_bridge_fdb_entry *f;
-	struct hlist_node *tmp;

-	spin_lock_bh(&br->hash_lock);
-	hlist_for_each_entry_safe(f, tmp, &br->fdb_list, fdb_node) {
-		if (!test_bit(BR_FDB_STATIC, &f->flags))
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(f, &br->fdb_list, fdb_node) {
+		if (!__fdb_flush_matches(br, f, desc))
+			continue;
+
+		spin_lock_bh(&br->hash_lock);
+		if (!hlist_unhashed(&f->fdb_node))
 			fdb_delete(br, f, true);
+		spin_unlock_bh(&br->hash_lock);
 	}
-	spin_unlock_bh(&br->hash_lock);
+	rcu_read_unlock();
 }

 int br_fdb_delete_bulk(struct ndmsg *ndm, struct nlattr *tb[],
 		       struct net_device *dev, u16 vid,
 		       struct netlink_ext_ack *extack)
 {
+	struct net_bridge_fdb_flush_desc desc = {
+		.flags_mask = BR_FDB_STATIC
+	};
 	struct net_bridge *br;

 	if (netif_is_bridge_master(dev)) {
@@ -590,7 +615,7 @@ int br_fdb_delete_bulk(struct ndmsg *ndm, struct nlattr *tb[],
 		br = p->br;
 	}

-	__br_fdb_flush(br);
+	__br_fdb_flush(br, &desc);

 	return 0;
 }
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index c59c775730bb..accab38b0b6a 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1326,8 +1326,13 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
 		br_recalculate_fwd_mask(br);
 	}

-	if (data[IFLA_BR_FDB_FLUSH])
-		__br_fdb_flush(br);
+	if (data[IFLA_BR_FDB_FLUSH]) {
+		struct net_bridge_fdb_flush_desc desc = {
+			.flags_mask = BR_FDB_STATIC
+		};
+
+		__br_fdb_flush(br, &desc);
+	}

 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
 	if (data[IFLA_BR_MCAST_ROUTER]) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 3ba50e41aa4f..dd186ac29737 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -274,6 +274,13 @@ struct net_bridge_fdb_entry {
 	struct rcu_head			rcu;
 };

+struct net_bridge_fdb_flush_desc {
+	unsigned long			flags;
+	unsigned long			flags_mask;
+	int				port_ifindex;
+	u16				vlan_id;
+};
+
 #define MDB_PG_FLAGS_PERMANENT	BIT(0)
 #define MDB_PG_FLAGS_OFFLOAD	BIT(1)
 #define MDB_PG_FLAGS_FAST_LEAVE	BIT(2)
@@ -759,7 +766,8 @@ int br_fdb_init(void);
 void br_fdb_fini(void);
 int br_fdb_hash_init(struct net_bridge *br);
 void br_fdb_hash_fini(struct net_bridge *br);
-void __br_fdb_flush(struct net_bridge *br);
+void __br_fdb_flush(struct net_bridge *br,
+		    const struct net_bridge_fdb_flush_desc *desc);

 void br_fdb_find_delete_local(struct net_bridge *br,
 			      const struct net_bridge_port *p,
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 7a2cf3aebc84..c863151f1cde 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -344,7 +344,11 @@ static DEVICE_ATTR_RW(group_addr);
 static int set_flush(struct net_bridge *br, unsigned long val,
 		     struct netlink_ext_ack *extack)
 {
-	__br_fdb_flush(br);
+	struct net_bridge_fdb_flush_desc desc = {
+		.flags_mask = BR_FDB_STATIC
+	};
+
+	__br_fdb_flush(br, &desc);
 	return 0;
 }

--
2.35.1
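
The descriptor semantics of this patch (a zero field acts as a wildcard, and flags are compared only under the mask) can be tried out in user space with a simplified model. The field names follow the patch, but everything prefixed with `ex_`/`EX_` is a stand-in made up for this sketch, and the entry type is flattened: the real code derives the port ifindex from f->dst and falls back to the bridge's own ifindex.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_BIT(nr) (1UL << (nr))

/* Bit numbers only; a stand-in for the bridge's private flag enum */
enum { EX_FDB_LOCAL, EX_FDB_STATIC };

/* Flattened stand-in for an fdb entry */
struct ex_fdb_entry {
	unsigned long flags;
	int port_ifindex;
	uint16_t vlan_id;
};

/* Same fields as net_bridge_fdb_flush_desc in the patch */
struct ex_flush_desc {
	unsigned long flags;
	unsigned long flags_mask;
	int port_ifindex;
	uint16_t vlan_id;
};

/* Mirrors the three checks of __fdb_flush_matches(): each filter is
 * skipped when its descriptor field is zero.
 */
static bool ex_flush_matches(const struct ex_fdb_entry *f,
			     const struct ex_flush_desc *desc)
{
	if (desc->vlan_id && desc->vlan_id != f->vlan_id)
		return false;
	if (desc->port_ifindex && desc->port_ifindex != f->port_ifindex)
		return false;
	if (desc->flags_mask && (f->flags & desc->flags_mask) != desc->flags)
		return false;

	return true;
}
```

In this model a descriptor with `.vlan_id = 100` and `.flags_mask = EX_BIT(EX_FDB_STATIC)` selects non-static entries on vlan 100 regardless of port, and an all-zero descriptor matches every entry.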
Nikolay Aleksandrov
2022-Apr-12 13:22 UTC
[Bridge] [PATCH net-next v3 6/8] net: rtnetlink: add ndm flags and state mask attributes
Add ndm flags/state masks which will be used for bulk delete filtering.
All of these are used by the bridge and vxlan drivers. Also minimal attr
policy validation is added, it is up to ndo_fdb_del_bulk implementers to
further validate them.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 include/uapi/linux/neighbour.h | 2 ++
 net/core/rtnetlink.c           | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index db05fb55055e..39c565e460c7 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -32,6 +32,8 @@ enum {
 	NDA_NH_ID,
 	NDA_FDB_EXT_ATTRS,
 	NDA_FLAGS_EXT,
+	NDA_NDM_STATE_MASK,
+	NDA_NDM_FLAGS_MASK,
 	__NDA_MAX
 };

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 824963aa57b1..9118523b328f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4170,6 +4170,8 @@ EXPORT_SYMBOL(ndo_dflt_fdb_del);
 static const struct nla_policy fdb_del_bulk_policy[NDA_MAX + 1] = {
 	[NDA_VLAN]	= { .type = NLA_U16 },
 	[NDA_IFINDEX]	= NLA_POLICY_MIN(NLA_S32, 1),
+	[NDA_NDM_STATE_MASK]	= { .type = NLA_U16 },
+	[NDA_NDM_FLAGS_MASK]	= { .type = NLA_U8 },
 };

 static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
--
2.35.1
Nikolay Aleksandrov
2022-Apr-12 13:22 UTC
[Bridge] [PATCH net-next v3 7/8] net: bridge: fdb: add support for flush filtering based on ndm flags and state
Add support for fdb flush filtering based on ndm flags and state. NDM
state and flags are mapped to bridge-specific flags and matched
according to the specified masks. NTF_USE is used to represent
added_by_user flag since it sets it on fdb add and we don't have a 1:1
mapping for it. Only allowed bits can be set, NTF_SELF and NTF_MASTER are
ignored.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
v2: ignore NTF_SELF/NTF_MASTER and reject unknown flags
v3: NDFA -> NDA attributes

 net/bridge/br_fdb.c     | 58 ++++++++++++++++++++++++++++++++++++++---
 net/bridge/br_private.h |  5 ++++
 2 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index f1deac42bc0d..bbb00a75ef0a 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -594,13 +594,40 @@ void __br_fdb_flush(struct net_bridge *br,
 	rcu_read_unlock();
 }

+static unsigned long __ndm_state_to_fdb_flags(u16 ndm_state)
+{
+	unsigned long flags = 0;
+
+	if (ndm_state & NUD_PERMANENT)
+		__set_bit(BR_FDB_LOCAL, &flags);
+	if (ndm_state & NUD_NOARP)
+		__set_bit(BR_FDB_STATIC, &flags);
+
+	return flags;
+}
+
+static unsigned long __ndm_flags_to_fdb_flags(u8 ndm_flags)
+{
+	unsigned long flags = 0;
+
+	if (ndm_flags & NTF_USE)
+		__set_bit(BR_FDB_ADDED_BY_USER, &flags);
+	if (ndm_flags & NTF_EXT_LEARNED)
+		__set_bit(BR_FDB_ADDED_BY_EXT_LEARN, &flags);
+	if (ndm_flags & NTF_OFFLOADED)
+		__set_bit(BR_FDB_OFFLOADED, &flags);
+	if (ndm_flags & NTF_STICKY)
+		__set_bit(BR_FDB_STICKY, &flags);
+
+	return flags;
+}
+
 int br_fdb_delete_bulk(struct ndmsg *ndm, struct nlattr *tb[],
 		       struct net_device *dev, u16 vid,
 		       struct netlink_ext_ack *extack)
 {
-	struct net_bridge_fdb_flush_desc desc = {
-		.flags_mask = BR_FDB_STATIC
-	};
+	u8 ndm_flags = ndm->ndm_flags & ~FDB_FLUSH_IGNORED_NDM_FLAGS;
+	struct net_bridge_fdb_flush_desc desc = {};
 	struct net_bridge *br;

 	if (netif_is_bridge_master(dev)) {
@@ -615,6 +642,31 @@ int br_fdb_delete_bulk(struct ndmsg *ndm, struct nlattr *tb[],
 		br = p->br;
 	}

+	if (ndm_flags & ~FDB_FLUSH_ALLOWED_NDM_FLAGS) {
+		NL_SET_ERR_MSG(extack, "Unsupported fdb flush ndm flag bits set");
+		return -EINVAL;
+	}
+	if (ndm->ndm_state & ~FDB_FLUSH_ALLOWED_NDM_STATES) {
+		NL_SET_ERR_MSG(extack, "Unsupported fdb flush ndm state bits set");
+		return -EINVAL;
+	}
+
+	desc.flags |= __ndm_state_to_fdb_flags(ndm->ndm_state);
+	desc.flags |= __ndm_flags_to_fdb_flags(ndm_flags);
+	if (tb[NDA_NDM_STATE_MASK]) {
+		u16 ndm_state_mask = nla_get_u16(tb[NDA_NDM_STATE_MASK]);
+
+		desc.flags_mask |= __ndm_state_to_fdb_flags(ndm_state_mask);
+	}
+	if (tb[NDA_NDM_FLAGS_MASK]) {
+		u8 ndm_flags_mask = nla_get_u8(tb[NDA_NDM_FLAGS_MASK]);
+
+		desc.flags_mask |= __ndm_flags_to_fdb_flags(ndm_flags_mask);
+	}
+
+	br_debug(br, "flushing port ifindex: %d vlan id: %u flags: 0x%lx flags mask: 0x%lx\n",
+		 desc.port_ifindex, desc.vlan_id, desc.flags, desc.flags_mask);
+
 	__br_fdb_flush(br, &desc);

 	return 0;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index dd186ac29737..72b934d1edce 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -762,6 +762,11 @@ static inline void br_netpoll_disable(struct net_bridge_port *p)
 #endif

 /* br_fdb.c */
+#define FDB_FLUSH_IGNORED_NDM_FLAGS (NTF_MASTER | NTF_SELF)
+#define FDB_FLUSH_ALLOWED_NDM_STATES (NUD_PERMANENT | NUD_NOARP)
+#define FDB_FLUSH_ALLOWED_NDM_FLAGS (NTF_USE | NTF_EXT_LEARNED | \
+				     NTF_STICKY | NTF_OFFLOADED)
+
 int br_fdb_init(void);
 void br_fdb_fini(void);
 int br_fdb_hash_init(struct net_bridge *br);
--
2.35.1
Nikolay Aleksandrov
2022-Apr-12 13:22 UTC
[Bridge] [PATCH net-next v3 8/8] net: bridge: fdb: add support for flush filtering based on ifindex and vlan
Add support for fdb flush filtering based on destination ifindex and
vlan id. The ifindex must either match a port's device ifindex or the
bridge's. The vlan support is trivial since it's already validated by
rtnl_fdb_del, we just need to fill it in.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
v2: validate ifindex and fill in vlan id
v3: NDFA -> NDA attributes

 net/bridge/br_fdb.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index bbb00a75ef0a..c44ea83ac3d9 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -622,12 +622,44 @@ static unsigned long __ndm_flags_to_fdb_flags(u8 ndm_flags)
 	return flags;
 }

+static int __fdb_flush_validate_ifindex(const struct net_bridge *br,
+					int ifindex,
+					struct netlink_ext_ack *extack)
+{
+	const struct net_device *dev;
+
+	dev = __dev_get_by_index(dev_net(br->dev), ifindex);
+	if (!dev) {
+		NL_SET_ERR_MSG_MOD(extack, "Unknown flush device ifindex");
+		return -ENODEV;
+	}
+	if (!netif_is_bridge_master(dev) && !netif_is_bridge_port(dev)) {
+		NL_SET_ERR_MSG_MOD(extack, "Flush device is not a bridge or bridge port");
+		return -EINVAL;
+	}
+	if (netif_is_bridge_master(dev) && dev != br->dev) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Flush bridge device does not match target bridge device");
+		return -EINVAL;
+	}
+	if (netif_is_bridge_port(dev)) {
+		struct net_bridge_port *p = br_port_get_rtnl(dev);
+
+		if (p->br != br) {
+			NL_SET_ERR_MSG_MOD(extack, "Port belongs to a different bridge device");
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
 int br_fdb_delete_bulk(struct ndmsg *ndm, struct nlattr *tb[],
 		       struct net_device *dev, u16 vid,
 		       struct netlink_ext_ack *extack)
 {
 	u8 ndm_flags = ndm->ndm_flags & ~FDB_FLUSH_IGNORED_NDM_FLAGS;
-	struct net_bridge_fdb_flush_desc desc = {};
+	struct net_bridge_fdb_flush_desc desc = { .vlan_id = vid };
 	struct net_bridge *br;

 	if (netif_is_bridge_master(dev)) {
@@ -663,6 +695,14 @@ int br_fdb_delete_bulk(struct ndmsg *ndm, struct nlattr *tb[],

 		desc.flags_mask |= __ndm_flags_to_fdb_flags(ndm_flags_mask);
 	}
+	if (tb[NDA_IFINDEX]) {
+		int err, ifidx = nla_get_s32(tb[NDA_IFINDEX]);
+
+		err = __fdb_flush_validate_ifindex(br, ifidx, extack);
+		if (err)
+			return err;
+		desc.port_ifindex = ifidx;
+	}

 	br_debug(br, "flushing port ifindex: %d vlan id: %u flags: 0x%lx flags mask: 0x%lx\n",
 		 desc.port_ifindex, desc.vlan_id, desc.flags, desc.flags_mask);
--
2.35.1
Nikolay Aleksandrov
2022-Apr-12 22:50 UTC
[Bridge] [PATCH net-next v3 0/8] net: bridge: add flush filtering support
On 4/12/22 16:22, Nikolay Aleksandrov wrote:
> Hi,
> This patch-set adds support to specify filtering conditions for a bulk
> delete (flush) operation.
[...]
> Finally patch 08
> fills in the target ifindex (after validating it) and vlan id (already
> validated by rtnl_fdb_flush) for matching.
[...]

I realized I missed an improvement in patch 08: use the port's ifindex when
doing a bridge flush through a port and NDA_IFINDEX is not specified. I'll
leave this set for comments and will prepare v4 with it and anything else
that comes up in the meantime.

Thanks,
 Nik
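For illustration only, the patch 08 improvement described above could be modelled in userspace roughly as follows. This is a sketch under assumptions, not code from the series: the struct, its fields and the function name are hypothetical, and 0 is assumed to mean "NDA_IFINDEX not specified". How a specified NDA_IFINDEX that differs from the port should be handled is not covered here.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a flush request (not a kernel struct). */
struct sketch_flush_req {
	bool dev_is_port;   /* request was sent through a bridge port ("master") */
	int  dev_ifindex;   /* ifindex of the device the request was sent to */
	int  nda_ifindex;   /* value of NDA_IFINDEX, 0 if the attribute is absent */
};

/*
 * Pick the ifindex to filter on:
 *  - an explicit NDA_IFINDEX is used as given,
 *  - otherwise a flush issued through a port targets that port,
 *  - otherwise (bridge device, no attribute) there is no port filter.
 */
static int sketch_target_ifindex(const struct sketch_flush_req *req)
{
	if (req->nda_ifindex)
		return req->nda_ifindex;
	if (req->dev_is_port)
		return req->dev_ifindex;
	return 0; /* 0 = match entries on any port */
}
```

With this default, a flush sent through a port without NDA_IFINDEX would only touch that port's entries rather than the whole bridge.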
David Ahern
2022-Apr-13 02:04 UTC
[Bridge] [PATCH net-next v3 0/8] net: bridge: add flush filtering support
On 4/12/22 7:22 AM, Nikolay Aleksandrov wrote:
> Hi,
> This patch-set adds support to specify filtering conditions for a bulk
> delete (flush) operation. This version uses a new nlmsghdr delete flag
> called NLM_F_BULK in combination with a new ndo_fdb_del_bulk op which is
> used to signal that the driver supports bulk deletes (that avoids
> pushing common mac address checks to ndo_fdb_del implementations and
> also has a different prototype and parsed attribute expectations, more
> info in patch 03). The new delete flag can be used for any RTM_DEL*
> type, implementations just need to be careful with older kernels which
> are doing non-strict attribute parses. Here I use the fact that mac

overall it looks fine to me. The rollout of BULK delete for other
commands will be slow so we need a way to reject the BULK flag if the
handler does not support it. One thought is to add another flag to
rtnl_link_flags (e.g., RTNL_FLAG_BULK_DEL_SUPPORTED) and pass that flag
in for handlers that handle bulk delete and reject it for others in core
rtnetlink code.
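A rough userspace sketch of the suggestion above, to make the intended check concrete. It is not kernel code: every `sketch_`/`SKETCH_` name and both numeric values are stand-ins chosen for illustration, and the choice of -EOPNOTSUPP as the error is an assumption. The idea is only that the core dispatcher rejects a bulk delete before calling a handler that did not register with the bulk-delete flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values for illustration; not taken from the kernel headers. */
#define SKETCH_NLM_F_BULK                   0x200u
#define SKETCH_RTNL_FLAG_BULK_DEL_SUPPORTED 0x2u

/* Hypothetical registration record for one RTM_DEL* handler. */
struct sketch_handler {
	uint32_t flags;                    /* flags passed at registration time */
	int (*doit)(uint16_t nlmsg_flags); /* the delete handler itself */
};

/* Trivial handler used to exercise the dispatcher: always succeeds. */
static int sketch_del_ok(uint16_t nlmsg_flags)
{
	(void)nlmsg_flags;
	return 0;
}

/*
 * Core-side check: a request carrying the bulk flag reaches the handler
 * only if the handler opted in; all other requests pass through unchanged.
 */
static int sketch_dispatch_del(const struct sketch_handler *h,
			       uint16_t nlmsg_flags)
{
	bool bulk = (nlmsg_flags & SKETCH_NLM_F_BULK) != 0;

	if (bulk && !(h->flags & SKETCH_RTNL_FLAG_BULK_DEL_SUPPORTED))
		return -EOPNOTSUPP;

	return h->doit(nlmsg_flags);
}
```

Doing the check in one place means handlers that never opted in do not each have to recognise and reject the flag themselves.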
Nikolay Aleksandrov
2022-Apr-13 07:27 UTC
[Bridge] [PATCH net-next v3 0/8] net: bridge: add flush filtering support
On 13/04/2022 05:04, David Ahern wrote:
> On 4/12/22 7:22 AM, Nikolay Aleksandrov wrote:
>> Hi,
>> This patch-set adds support to specify filtering conditions for a bulk
>> delete (flush) operation. This version uses a new nlmsghdr delete flag
>> called NLM_F_BULK in combination with a new ndo_fdb_del_bulk op which is
>> used to signal that the driver supports bulk deletes (that avoids
>> pushing common mac address checks to ndo_fdb_del implementations and
>> also has a different prototype and parsed attribute expectations, more
>> info in patch 03). The new delete flag can be used for any RTM_DEL*
>> type, implementations just need to be careful with older kernels which
>> are doing non-strict attribute parses. Here I use the fact that mac
>
> overall it looks fine to me. The rollout of BULK delete for other
> commands will be slow so we need a way to reject the BULK flag if the
> handler does not support it. One thought is to add another flag to
> rtnl_link_flags (e.g., RTNL_FLAG_BULK_DEL_SUPPORTED) and pass that flag
> in for handlers that handle bulk delete and reject it for others in core
> rtnetlink code.

Good point, it will be nice to error out with something meaningful if bulk
delete isn't supported. I'll look into it.

Thanks,
 Nik