Nikolay Aleksandrov
2022-Apr-11 17:29 UTC
[Bridge] [PATCH net-next v2 0/8] net: bridge: add flush filtering support
Hi,
This patch-set adds support to specify filtering conditions for a flush
operation. This version has entirely different entry point (v1 had
bridge-specific IFLA attribute, here I add new RTM_FLUSHNEIGH msg and
netdev ndo_fdb_flush op) so I'll give a new overview altogether.
After me and Ido discussed the feature offlist, we agreed that it would
be best to add a new generic RTM_FLUSHNEIGH with a new ndo_fdb_flush
callback which can be re-used for other drivers (e.g. vxlan).
Patch 01 adds the new RTM_FLUSHNEIGH type, patch 02 then adds the
new ndo_fdb_flush call. With this structure we need to add a generic
rtnl_fdb_flush which will be used to do basic attribute validation and
dispatch the call to the appropriate device based on the NTF_USE/MASTER
flags (patch 03). Patch 04 then adds some common flush attributes which
are used by the bridge and vxlan drivers (target ifindex, vlan id, ndm
flags/state masks) with basic attribute validation, further validation
can be done by the implementers of the ndo callback. Patch 05 adds a
minimal ndo_fdb_flush to the bridge driver, it uses the current
br_fdb_flush implementation to flush all entries similar to existing
calls. Patch 06 adds filtering support to the new bridge flush op which
supports target ifindex (port or bridge), vlan id and flags/state mask.
Patch 07 converts ndm state/flags and their masks to bridge-private flags
and fills them in the filter descriptor for matching. Finally patch 08
fills in the target ifindex (after validating it) and vlan id (already
validated by rtnl_fdb_flush) for matching. Flush filtering is needed
because user-space applications need a quick way to delete only a
specific set of entries, e.g. mlag implementations need a way to flush only
dynamic entries excluding externally learned ones or only externally
learned ones without static entries etc. Also apps usually want to target
only a specific vlan or port/vlan combination. The current 2 flush
operations (per port and bridge-wide) are not extensible and cannot
provide such filtering.

I decided against embedding new attrs into the old flush attributes for
multiple reasons - proper error handling on unsupported attributes,
older kernels silently flushing all, need for a second mechanism to
signal that the attribute should be parsed (e.g. using boolopts),
special treatment for permanent entries.

Examples:
$ bridge fdb flush dev bridge vlan 100 static
< flush all static entries on vlan 100 >
$ bridge fdb flush dev bridge vlan 1 dynamic
< flush all dynamic entries on vlan 1 >
$ bridge fdb flush dev bridge port ens16 vlan 1 dynamic
< flush all dynamic entries on port ens16 and vlan 1 >
$ bridge fdb flush dev ens16 vlan 1 dynamic master
< as above: flush all dynamic entries on port ens16 and vlan 1 >
$ bridge fdb flush dev bridge nooffloaded nopermanent self
< flush all non-offloaded and non-permanent entries >
$ bridge fdb flush dev bridge static noextern_learn
< flush all static entries which are not externally learned >
$ bridge fdb flush dev bridge permanent
< flush all permanent entries >
$ bridge fdb flush dev bridge port bridge permanent
< flush all permanent entries pointing to the bridge itself >

Note that all flags have their negated version (static vs nostatic etc)
and there are some tricky cases to handle like "static" which in flag
terms means fdbs that have NUD_NOARP but *not* NUD_PERMANENT, so the
mask matches on both but we need only NUD_NOARP to be set. That's
because permanent entries have both set so we can't just match on
NUD_NOARP. Also note that this flush operation doesn't treat permanent
entries in a special way (fdb_delete vs fdb_delete_local), it will
delete them regardless if any port is using them. We can extend the api
with a flag to do that if needed in the future.

Patch-sets (in order):
 - Initial flush infra and fdb flush filtering (this set)
 - iproute2 support
 - selftests

Future work:
 - mdb flush support (RTM_FLUSHMDB type)

Thanks to Ido for the great discussion and feedback while working on this.

Thanks,
 Nik

Nikolay Aleksandrov (8):
  net: rtnetlink: add RTM_FLUSHNEIGH
  net: add ndo_fdb_flush op
  net: bridge: fdb: add ndo_fdb_flush op
  net: rtnetlink: register a generic rtnl_fdb_flush call
  net: rtnetlink: add common flush attributes
  net: bridge: fdb: add support for fine-grained flushing
  net: bridge: fdb: add support for flush filtering based on ndm flags and state
  net: bridge: fdb: add support for flush filtering based on ifindex and vlan

 include/linux/netdevice.h      |  11 +++
 include/uapi/linux/neighbour.h |  10 +++
 include/uapi/linux/rtnetlink.h |   3 +
 net/bridge/br_device.c         |   1 +
 net/bridge/br_fdb.c            | 154 +++++++++++++++++++++++++++++++--
 net/bridge/br_netlink.c        |   9 +-
 net/bridge/br_private.h        |  19 +++-
 net/bridge/br_sysfs_br.c       |   6 +-
 net/core/rtnetlink.c           |  62 +++++++++++++
 security/selinux/nlmsgtab.c    |   3 +-
 10 files changed, 266 insertions(+), 12 deletions(-)

--
2.35.1
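The "static" case described in the cover letter (NUD_NOARP set, NUD_PERMANENT clear) can be sketched in plain userspace C. This is only an illustration of the mask/value idea, not code from the series: the NUD_* values are the ones from include/uapi/linux/neighbour.h, while state_matches() and is_static_only() are hypothetical helper names introduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/neighbour.h. */
#define NUD_NOARP	0x40
#define NUD_PERMANENT	0x80

/* Hypothetical helper: true when the bits selected by mask equal want. */
static bool state_matches(uint16_t entry_state, uint16_t want, uint16_t mask)
{
	return (entry_state & mask) == want;
}

/* "static" in the cover letter's sense: the mask covers both bits,
 * but only NUD_NOARP is expected to be set.
 */
static bool is_static_only(uint16_t entry_state)
{
	return state_matches(entry_state, NUD_NOARP,
			     NUD_NOARP | NUD_PERMANENT);
}
```

An entry whose state is NUD_NOARP alone matches, while one carrying both NUD_NOARP and NUD_PERMANENT does not, which is why the mask has to cover both bits instead of just NUD_NOARP.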
Nikolay Aleksandrov
2022-Apr-11 17:29 UTC
[Bridge] [PATCH net-next v2 1/8] net: rtnetlink: add RTM_FLUSHNEIGH
Add a new rtnetlink type used to flush neigh objects. It will be
initially used to add flush with filtering support for bridge fdbs, but
it also opens the door to add similar support to others (e.g. vxlan).

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 include/uapi/linux/rtnetlink.h | 3 +++
 security/selinux/nlmsgtab.c    | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 83849a37db5b..06001cfd404b 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -194,6 +194,9 @@ enum {
 	RTM_GETTUNNEL,
 #define RTM_GETTUNNEL	RTM_GETTUNNEL

+	RTM_FLUSHNEIGH = 124,
+#define RTM_FLUSHNEIGH	RTM_FLUSHNEIGH
+
 	__RTM_MAX,
 #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
 };
diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c
index d8ceee9e0d6f..ff53aea8790f 100644
--- a/security/selinux/nlmsgtab.c
+++ b/security/selinux/nlmsgtab.c
@@ -95,6 +95,7 @@ static const struct nlmsg_perm nlmsg_route_perms[] =
 	{ RTM_NEWTUNNEL,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_DELTUNNEL,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_GETTUNNEL,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
+	{ RTM_FLUSHNEIGH,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 };

 static const struct nlmsg_perm nlmsg_tcpdiag_perms[] =
@@ -180,7 +181,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm)
 		 * structures at the top of this file with the new mappings
 		 * before updating the BUILD_BUG_ON() macro!
 		 */
-		BUILD_BUG_ON(RTM_MAX != (RTM_NEWTUNNEL + 3));
+		BUILD_BUG_ON(RTM_MAX != (RTM_FLUSHNEIGH + 3));
 		err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms,
 				 sizeof(nlmsg_route_perms));
 		break;
--
2.35.1
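The BUILD_BUG_ON() update in this patch follows from how RTM_MAX is derived from the enum's end marker. A small worked example, assuming only the macro body and the value 124 shown in the diff; RTM_MAX_FOR() and the DEMO_* names are made up for this illustration:

```c
/* Same arithmetic as the RTM_MAX macro in include/uapi/linux/rtnetlink.h,
 * parameterised on the enum's end marker: round up to a multiple of 4,
 * then subtract 1.
 */
#define RTM_MAX_FOR(end_marker)	((((end_marker) + 3) & ~3) - 1)

enum {
	DEMO_RTM_FLUSHNEIGH = 124,	/* value assigned in the patch */
	DEMO_RTM_END			/* stands in for __RTM_MAX, i.e. 125 */
};

/* Returns the maximum message type for the demo enum. */
static int demo_rtm_max(void)
{
	return RTM_MAX_FOR(DEMO_RTM_END);
}
```

Here (125 + 3) & ~3 is 128, so the result is 127, i.e. the new type plus 3, which is the relation the updated check in selinux_nlmsg_lookup() asserts.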
Nikolay Aleksandrov
2022-Apr-11 17:29 UTC
[Bridge] [PATCH net-next v2 2/8] net: add ndo_fdb_flush op
Add a new netdev op called ndo_fdb_flush, it will be later used for
driver-specific flush implementation dispatched from rtnetlink. The first
user will be the bridge.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 include/linux/netdevice.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 28ea4f8269d4..16d67e40053c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1265,6 +1265,12 @@ struct netdev_net_notifier {
  *		       int *idx)
  *	Used to add FDB entries to dump requests. Implementers should add
  *	entries to skb and update idx with the number of entries.
+ * int (*ndo_fdb_flush)(struct ndmsg *ndm, struct nlattr *tb[],
+ *			struct net_device *dev,
+ *			u16 vid,
+ *			struct netlink_ext_ack *extack);
+ *	Used to flush FDB entries. Filter attributes can be specified to delete
+ *	only matching FDB entries if implementers support it.
  *
  * int (*ndo_bridge_setlink)(struct net_device *dev, struct nlmsghdr *nlh,
  *			     u16 flags, struct netlink_ext_ack *extack)
@@ -1515,6 +1521,11 @@ struct net_device_ops {
 						struct net_device *dev,
 						struct net_device *filter_dev,
 						int *idx);
+	int			(*ndo_fdb_flush)(struct ndmsg *ndm,
+						 struct nlattr *tb[],
+						 struct net_device *dev,
+						 u16 vid,
+						 struct netlink_ext_ack *extack);
 	int			(*ndo_fdb_get)(struct sk_buff *skb,
 					       struct nlattr *tb[],
 					       struct net_device *dev,
--
2.35.1
Nikolay Aleksandrov
2022-Apr-11 17:29 UTC
[Bridge] [PATCH net-next v2 3/8] net: bridge: fdb: add ndo_fdb_flush op
Add a minimal ndo_fdb_flush implementation which flushes all entries.
Support for more fine-grained filtering will be added in the following
patches.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 net/bridge/br_device.c   |  1 +
 net/bridge/br_fdb.c      | 25 ++++++++++++++++++++++++-
 net/bridge/br_netlink.c  |  2 +-
 net/bridge/br_private.h  |  6 +++++-
 net/bridge/br_sysfs_br.c |  2 +-
 5 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 8d6bab244c4a..76ee2675457a 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -466,6 +466,7 @@ static const struct net_device_ops br_netdev_ops = {
 	.ndo_fdb_add		 = br_fdb_add,
 	.ndo_fdb_del		 = br_fdb_delete,
 	.ndo_fdb_dump		 = br_fdb_dump,
+	.ndo_fdb_flush		 = br_fdb_flush,
 	.ndo_fdb_get		 = br_fdb_get,
 	.ndo_bridge_getlink	 = br_getlink,
 	.ndo_bridge_setlink	 = br_setlink,
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 6ccda68bd473..64a549acdac8 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -559,7 +559,7 @@ void br_fdb_cleanup(struct work_struct *work)
 }

 /* Completely flush all dynamic entries in forwarding database.*/
-void br_fdb_flush(struct net_bridge *br)
+void __br_fdb_flush(struct net_bridge *br)
 {
 	struct net_bridge_fdb_entry *f;
 	struct hlist_node *tmp;
@@ -572,6 +572,29 @@ void br_fdb_flush(struct net_bridge *br)
 	spin_unlock_bh(&br->hash_lock);
 }

+int br_fdb_flush(struct ndmsg *ndm, struct nlattr *tb[],
+		 struct net_device *dev, u16 vid,
+		 struct netlink_ext_ack *extack)
+{
+	struct net_bridge *br;
+
+	if (netif_is_bridge_master(dev)) {
+		br = netdev_priv(dev);
+	} else {
+		struct net_bridge_port *p = br_port_get_rtnl(dev);
+
+		if (!p) {
+			NL_SET_ERR_MSG_MOD(extack, "Device is not a bridge port");
+			return -EINVAL;
+		}
+		br = p->br;
+	}
+
+	__br_fdb_flush(br);
+
+	return 0;
+}
+
 /* Flush all entries referring to a specific port.
  * if do_all is set also flush static entries
  * if vid is set delete all entries that match the vlan_id
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 200ad05b296f..c59c775730bb 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1327,7 +1327,7 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
 	}

 	if (data[IFLA_BR_FDB_FLUSH])
-		br_fdb_flush(br);
+		__br_fdb_flush(br);

 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
 	if (data[IFLA_BR_MCAST_ROUTER]) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 6e62af2e07e9..23ef2982d1bc 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -759,7 +759,11 @@ int br_fdb_init(void);
 void br_fdb_fini(void);
 int br_fdb_hash_init(struct net_bridge *br);
 void br_fdb_hash_fini(struct net_bridge *br);
-void br_fdb_flush(struct net_bridge *br);
+void __br_fdb_flush(struct net_bridge *br);
+int br_fdb_flush(struct ndmsg *ndm, struct nlattr *tb[],
+		 struct net_device *dev, u16 vid,
+		 struct netlink_ext_ack *extack);
+
 void br_fdb_find_delete_local(struct net_bridge *br,
 			      const struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid);
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 3f7ca88c2aa3..7a2cf3aebc84 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -344,7 +344,7 @@ static DEVICE_ATTR_RW(group_addr);
 static int set_flush(struct net_bridge *br, unsigned long val,
 		     struct netlink_ext_ack *extack)
 {
-	br_fdb_flush(br);
+	__br_fdb_flush(br);
 	return 0;
 }

--
2.35.1
Nikolay Aleksandrov
2022-Apr-11 17:29 UTC
[Bridge] [PATCH net-next v2 4/8] net: rtnetlink: register a generic rtnl_fdb_flush call
Register a generic PF_BRIDGE rtnl_fdb_flush call which does basic
validation and dispatches the call to the appropriate device based on
ndm flags (NTF_MASTER and NTF_SELF). The flags are interpreted in a
similar way to the already existing fdb add and del.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 include/uapi/linux/neighbour.h |  6 ++++
 net/core/rtnetlink.c           | 52 ++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index db05fb55055e..60e728319a50 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -212,4 +212,10 @@ enum {
 };
 #define NFEA_MAX (__NFEA_MAX - 1)

+enum {
+	NDFA_UNSPEC,
+	__NDFA_MAX
+};
+#define NDFA_MAX (__NDFA_MAX - 1)
+
 #endif
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 4041b3e2e8ec..7325b60d1aa2 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4659,6 +4659,56 @@ static int rtnl_fdb_get(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 	return err;
 }

+static const struct nla_policy fdb_flush_policy[NDFA_MAX + 1] = {
+	[NDFA_UNSPEC] = { .type = NLA_REJECT },
+};
+
+static int rtnl_fdb_flush(struct sk_buff *skb, struct nlmsghdr *nlh,
+			  struct netlink_ext_ack *extack)
+{
+	struct net *net = sock_net(skb->sk);
+	struct nlattr *tb[NDFA_MAX + 1];
+	struct net_device *dev;
+	struct ndmsg *ndm;
+	int err;
+
+	err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDFA_MAX, fdb_flush_policy,
+			  extack);
+	if (err < 0)
+		return err;
+
+	ndm = nlmsg_data(nlh);
+	if (ndm->ndm_ifindex == 0) {
+		NL_SET_ERR_MSG(extack, "Invalid ifindex");
+		return -EINVAL;
+	}
+
+	dev = __dev_get_by_index(net, ndm->ndm_ifindex);
+	if (!dev) {
+		NL_SET_ERR_MSG(extack, "Unknown ifindex");
+		return -ENODEV;
+	}
+
+	err = -EOPNOTSUPP;
+	if ((!ndm->ndm_flags || ndm->ndm_flags & NTF_MASTER) &&
+	    netif_is_bridge_port(dev)) {
+		struct net_device *br_dev = netdev_master_upper_dev_get(dev);
+
+		err = br_dev->netdev_ops->ndo_fdb_flush(ndm, tb, dev, 0, extack);
+		if (err)
+			goto out;
+		else
+			ndm->ndm_flags &= ~NTF_MASTER;
+	}
+	if ((ndm->ndm_flags & NTF_SELF) && dev->netdev_ops->ndo_fdb_flush) {
+		err = dev->netdev_ops->ndo_fdb_flush(ndm, tb, dev, 0, extack);
+		if (!err)
+			ndm->ndm_flags &= ~NTF_SELF;
+	}
+out:
+	return err;
+}
+
 static int brport_nla_put_flag(struct sk_buff *skb, u32 flags, u32 mask,
 			       unsigned int attrnum, unsigned int flag)
 {
@@ -6144,6 +6194,8 @@ void __init rtnetlink_init(void)
 	rtnl_register(PF_BRIDGE, RTM_DELLINK, rtnl_bridge_dellink, NULL, 0);
 	rtnl_register(PF_BRIDGE, RTM_SETLINK, rtnl_bridge_setlink, NULL, 0);

+	rtnl_register(PF_BRIDGE, RTM_FLUSHNEIGH, rtnl_fdb_flush, NULL, 0);
+
 	rtnl_register(PF_UNSPEC, RTM_GETSTATS, rtnl_stats_get, rtnl_stats_dump,
 		      0);
 	rtnl_register(PF_UNSPEC, RTM_SETSTATS, rtnl_stats_set, NULL, 0);
--
2.35.1
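The NTF_MASTER/NTF_SELF dispatch in rtnl_fdb_flush() can be modelled as a small userspace function, which may help when reasoning about which ndo_fdb_flush gets called. This is a sketch under assumptions: it presumes the master call succeeds (the patch bails out on error before the NTF_SELF branch), the NTF_* values are the ones from include/uapi/linux/neighbour.h, and demo_flush_targets() and the DEMO_CALL_* bits are hypothetical names that do not appear in the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in include/uapi/linux/neighbour.h. */
#define NTF_SELF	0x02
#define NTF_MASTER	0x04

/* Hypothetical result bits, used only by this sketch. */
#define DEMO_CALL_MASTER	0x1	/* master (bridge) device's ndo_fdb_flush */
#define DEMO_CALL_SELF		0x2	/* the device's own ndo_fdb_flush */

static int demo_flush_targets(uint8_t ndm_flags, bool is_bridge_port,
			      bool dev_has_flush_op)
{
	int targets = 0;

	/* No flags at all is handled like NTF_MASTER. */
	if ((!ndm_flags || (ndm_flags & NTF_MASTER)) && is_bridge_port)
		targets |= DEMO_CALL_MASTER;
	if ((ndm_flags & NTF_SELF) && dev_has_flush_op)
		targets |= DEMO_CALL_SELF;

	/* 0 corresponds to the -EOPNOTSUPP result in the patch. */
	return targets;
}
```

One consequence visible in the model: a request with no flags that targets a device which is not a bridge port selects nothing, so the patch would return -EOPNOTSUPP for it.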
Nikolay Aleksandrov
2022-Apr-11 17:29 UTC
[Bridge] [PATCH net-next v2 5/8] net: rtnetlink: add common flush attributes
Add common fdb flush attributes - ifindex, vlan id, ndm flags/state masks.
All of these are used by the bridge and vxlan drivers. Also minimal attr
policy validation is added, it is up to ndo_fdb_flush implementers to
further validate them.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
 include/uapi/linux/neighbour.h |  4 ++++
 net/core/rtnetlink.c           | 16 +++++++++++++---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index 60e728319a50..5ab4e9b5edc8 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -214,6 +214,10 @@ enum {

 enum {
 	NDFA_UNSPEC,
+	NDFA_IFINDEX,
+	NDFA_VLAN,
+	NDFA_NDM_STATE_MASK,
+	NDFA_NDM_FLAGS_MASK,
 	__NDFA_MAX
 };
 #define NDFA_MAX (__NDFA_MAX - 1)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 7325b60d1aa2..379b6a066fbd 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4660,7 +4660,11 @@ static int rtnl_fdb_get(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 }

 static const struct nla_policy fdb_flush_policy[NDFA_MAX + 1] = {
-	[NDFA_UNSPEC] = { .type = NLA_REJECT },
+	[NDFA_UNSPEC]		= { .type = NLA_REJECT },
+	[NDFA_IFINDEX]		= NLA_POLICY_MIN(NLA_S32, 1),
+	[NDFA_VLAN]		= { .type = NLA_U16 },
+	[NDFA_NDM_STATE_MASK]	= { .type = NLA_U16 },
+	[NDFA_NDM_FLAGS_MASK]	= { .type = NLA_U8 },
 };

 static int rtnl_fdb_flush(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -4670,6 +4674,7 @@ static int rtnl_fdb_flush(struct sk_buff *skb, struct nlmsghdr *nlh,
 	struct nlattr *tb[NDFA_MAX + 1];
 	struct net_device *dev;
 	struct ndmsg *ndm;
+	u16 vid;
 	int err;

 	err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDFA_MAX, fdb_flush_policy,
@@ -4689,19 +4694,24 @@ static int rtnl_fdb_flush(struct sk_buff *skb, struct nlmsghdr *nlh,
 		return -ENODEV;
 	}

+	err = fdb_vid_parse(tb[NDFA_VLAN], &vid, extack);
+	if (err)
+		return err;
+
 	err = -EOPNOTSUPP;
 	if ((!ndm->ndm_flags || ndm->ndm_flags & NTF_MASTER) &&
 	    netif_is_bridge_port(dev)) {
 		struct net_device *br_dev = netdev_master_upper_dev_get(dev);

-		err = br_dev->netdev_ops->ndo_fdb_flush(ndm, tb, dev, 0, extack);
+		err = br_dev->netdev_ops->ndo_fdb_flush(ndm, tb, dev, vid,
+							extack);
 		if (err)
 			goto out;
 		else
 			ndm->ndm_flags &= ~NTF_MASTER;
 	}
 	if ((ndm->ndm_flags & NTF_SELF) && dev->netdev_ops->ndo_fdb_flush) {
-		err = dev->netdev_ops->ndo_fdb_flush(ndm, tb, dev, 0, extack);
+		err = dev->netdev_ops->ndo_fdb_flush(ndm, tb, dev, vid, extack);
 		if (!err)
 			ndm->ndm_flags &= ~NTF_SELF;
 	}
--
2.35.1
Nikolay Aleksandrov
2022-Apr-11 17:29 UTC
[Bridge] [PATCH net-next v2 6/8] net: bridge: fdb: add support for fine-grained flushing
Add the ability to specify exactly which fdbs to be flushed. They are
described by a new structure - net_bridge_fdb_flush_desc. Currently it
can match on port/bridge ifindex, vlan id and fdb flags. It is used to
describe the existing dynamic fdb flush operation. Note that this flush
operation doesn't treat permanent entries in a special way (fdb_delete vs
fdb_delete_local), it will delete them regardless if any port is using
them, so currently it can't directly replace deletes which need to handle
that case, although we can extend it later for that too.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
v2: changed the flush matches func for better readability (Ido)

 net/bridge/br_fdb.c      | 41 ++++++++++++++++++++++++++++++++--------
 net/bridge/br_netlink.c  |  9 +++++++--
 net/bridge/br_private.h  | 10 +++++++++-
 net/bridge/br_sysfs_br.c |  6 +++++-
 4 files changed, 54 insertions(+), 12 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 64a549acdac8..045eb61e833e 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -558,24 +558,49 @@ void br_fdb_cleanup(struct work_struct *work)
 	mod_delayed_work(system_long_wq, &br->gc_work, work_delay);
 }

-/* Completely flush all dynamic entries in forwarding database.*/
-void __br_fdb_flush(struct net_bridge *br)
+static bool __fdb_flush_matches(const struct net_bridge *br,
+				const struct net_bridge_fdb_entry *f,
+				const struct net_bridge_fdb_flush_desc *desc)
+{
+	const struct net_bridge_port *dst = READ_ONCE(f->dst);
+	int port_ifidx = dst ? dst->dev->ifindex : br->dev->ifindex;
+
+	if (desc->vlan_id && desc->vlan_id != f->key.vlan_id)
+		return false;
+	if (desc->port_ifindex && desc->port_ifindex != port_ifidx)
+		return false;
+	if (desc->flags_mask && (f->flags & desc->flags_mask) != desc->flags)
+		return false;
+
+	return true;
+}
+
+/* Flush forwarding database entries matching the description */
+void __br_fdb_flush(struct net_bridge *br,
+		    const struct net_bridge_fdb_flush_desc *desc)
 {
 	struct net_bridge_fdb_entry *f;
-	struct hlist_node *tmp;

-	spin_lock_bh(&br->hash_lock);
-	hlist_for_each_entry_safe(f, tmp, &br->fdb_list, fdb_node) {
-		if (!test_bit(BR_FDB_STATIC, &f->flags))
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(f, &br->fdb_list, fdb_node) {
+		if (!__fdb_flush_matches(br, f, desc))
+			continue;
+
+		spin_lock_bh(&br->hash_lock);
+		if (!hlist_unhashed(&f->fdb_node))
 			fdb_delete(br, f, true);
+		spin_unlock_bh(&br->hash_lock);
 	}
-	spin_unlock_bh(&br->hash_lock);
+	rcu_read_unlock();
 }

 int br_fdb_flush(struct ndmsg *ndm, struct nlattr *tb[],
 		 struct net_device *dev, u16 vid,
 		 struct netlink_ext_ack *extack)
 {
+	struct net_bridge_fdb_flush_desc desc = {
+		.flags_mask = BR_FDB_STATIC
+	};
 	struct net_bridge *br;

 	if (netif_is_bridge_master(dev)) {
@@ -590,7 +615,7 @@ int br_fdb_flush(struct ndmsg *ndm, struct nlattr *tb[],
 		br = p->br;
 	}

-	__br_fdb_flush(br);
+	__br_fdb_flush(br, &desc);

 	return 0;
 }
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index c59c775730bb..accab38b0b6a 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1326,8 +1326,13 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
 		br_recalculate_fwd_mask(br);
 	}

-	if (data[IFLA_BR_FDB_FLUSH])
-		__br_fdb_flush(br);
+	if (data[IFLA_BR_FDB_FLUSH]) {
+		struct net_bridge_fdb_flush_desc desc = {
+			.flags_mask = BR_FDB_STATIC
+		};
+
+		__br_fdb_flush(br, &desc);
+	}

 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
 	if (data[IFLA_BR_MCAST_ROUTER]) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 23ef2982d1bc..9fb9abdbd3f4 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -274,6 +274,13 @@ struct net_bridge_fdb_entry {
 	struct rcu_head			rcu;
 };

+struct net_bridge_fdb_flush_desc {
+	unsigned long			flags;
+	unsigned long			flags_mask;
+	int				port_ifindex;
+	u16				vlan_id;
+};
+
 #define MDB_PG_FLAGS_PERMANENT	BIT(0)
 #define MDB_PG_FLAGS_OFFLOAD	BIT(1)
 #define MDB_PG_FLAGS_FAST_LEAVE	BIT(2)
@@ -759,7 +766,8 @@ int br_fdb_init(void);
 void br_fdb_fini(void);
 int br_fdb_hash_init(struct net_bridge *br);
 void br_fdb_hash_fini(struct net_bridge *br);
-void __br_fdb_flush(struct net_bridge *br);
+void __br_fdb_flush(struct net_bridge *br,
+		    const struct net_bridge_fdb_flush_desc *desc);
 int br_fdb_flush(struct ndmsg *ndm, struct nlattr *tb[],
 		 struct net_device *dev, u16 vid,
 		 struct netlink_ext_ack *extack);
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 7a2cf3aebc84..c863151f1cde 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -344,7 +344,11 @@ static DEVICE_ATTR_RW(group_addr);
 static int set_flush(struct net_bridge *br, unsigned long val,
 		     struct netlink_ext_ack *extack)
 {
-	struct net_bridge_fdb_flush_desc desc = {
+	__br_fdb_flush(br);
+	struct net_bridge_fdb_flush_desc desc = {
+		.flags_mask = BR_FDB_STATIC
+	};
+
+	__br_fdb_flush(br, &desc);
 	return 0;
 }

--
2.35.1
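The matching rule introduced by this patch (a zero field in the descriptor acts as a wildcard, and flags are compared under a mask) can be exercised outside the kernel with a simplified model. The sketch below is an assumption-laden illustration: struct demo_fdb_entry and struct demo_flush_desc are cut-down stand-ins invented here, and the entry carries its port ifindex directly instead of going through f->dst as the kernel code does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cut-down stand-in for an fdb entry; not the kernel structure. */
struct demo_fdb_entry {
	unsigned long flags;
	int port_ifindex;
	uint16_t vlan_id;
};

/* Mirrors the fields of net_bridge_fdb_flush_desc from the patch. */
struct demo_flush_desc {
	unsigned long flags;
	unsigned long flags_mask;
	int port_ifindex;
	uint16_t vlan_id;
};

/* Same three checks as __fdb_flush_matches(): each filter is skipped
 * when its descriptor field is zero.
 */
static bool demo_flush_matches(const struct demo_fdb_entry *f,
			       const struct demo_flush_desc *desc)
{
	if (desc->vlan_id && desc->vlan_id != f->vlan_id)
		return false;
	if (desc->port_ifindex && desc->port_ifindex != f->port_ifindex)
		return false;
	if (desc->flags_mask && (f->flags & desc->flags_mask) != desc->flags)
		return false;

	return true;
}
```

With an all-zero descriptor every entry matches, and a descriptor with a non-zero flags_mask and flags of 0 selects entries that have none of the masked bits set. In this model a vlan_id of 0 cannot be used to select entries on vlan 0 specifically, since 0 means "any vlan".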
Nikolay Aleksandrov
2022-Apr-11 17:29 UTC
[Bridge] [PATCH net-next v2 7/8] net: bridge: fdb: add support for flush filtering based on ndm flags and state
Add support for fdb flush filtering based on ndm flags and state. NDM
state and flags are mapped to bridge-specific flags and matched
according to the specified masks. NTF_USE is used to represent
added_by_user flag since it sets it on fdb add and we don't have a 1:1
mapping for it. Only allowed bits can be set, NTF_USE and NTF_MASTER are
ignored.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
v2: ignore NTF_USE/NTF_MASTER and reject unknown flags

 net/bridge/br_fdb.c     | 58 ++++++++++++++++++++++++++++++++++++++---
 net/bridge/br_private.h |  5 ++++
 2 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 045eb61e833e..2cea03cbc55f 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -594,13 +594,40 @@ void __br_fdb_flush(struct net_bridge *br,
 	rcu_read_unlock();
 }

+static unsigned long __ndm_state_to_fdb_flags(u16 ndm_state)
+{
+	unsigned long flags = 0;
+
+	if (ndm_state & NUD_PERMANENT)
+		__set_bit(BR_FDB_LOCAL, &flags);
+	if (ndm_state & NUD_NOARP)
+		__set_bit(BR_FDB_STATIC, &flags);
+
+	return flags;
+}
+
+static unsigned long __ndm_flags_to_fdb_flags(u8 ndm_flags)
+{
+	unsigned long flags = 0;
+
+	if (ndm_flags & NTF_USE)
+		__set_bit(BR_FDB_ADDED_BY_USER, &flags);
+	if (ndm_flags & NTF_EXT_LEARNED)
+		__set_bit(BR_FDB_ADDED_BY_EXT_LEARN, &flags);
+	if (ndm_flags & NTF_OFFLOADED)
+		__set_bit(BR_FDB_OFFLOADED, &flags);
+	if (ndm_flags & NTF_STICKY)
+		__set_bit(BR_FDB_STICKY, &flags);
+
+	return flags;
+}
+
 int br_fdb_flush(struct ndmsg *ndm, struct nlattr *tb[],
 		 struct net_device *dev, u16 vid,
 		 struct netlink_ext_ack *extack)
 {
-	struct net_bridge_fdb_flush_desc desc = {
-		.flags_mask = BR_FDB_STATIC
-	};
+	u8 ndm_flags = ndm->ndm_flags & ~FDB_FLUSH_IGNORED_NDM_FLAGS;
+	struct net_bridge_fdb_flush_desc desc = {};
 	struct net_bridge *br;

 	if (netif_is_bridge_master(dev)) {
@@ -615,6 +642,31 @@ int br_fdb_flush(struct ndmsg *ndm, struct nlattr *tb[],
 		br = p->br;
 	}

+	if (ndm_flags & ~FDB_FLUSH_ALLOWED_NDM_FLAGS) {
+		NL_SET_ERR_MSG(extack, "Unsupported fdb flush ndm flag bits set");
+		return -EINVAL;
+	}
+	if (ndm->ndm_state & ~FDB_FLUSH_ALLOWED_NDM_STATES) {
+		NL_SET_ERR_MSG(extack, "Unsupported fdb flush ndm state bits set");
+		return -EINVAL;
+	}
+
+	desc.flags |= __ndm_state_to_fdb_flags(ndm->ndm_state);
+	desc.flags |= __ndm_flags_to_fdb_flags(ndm_flags);
+	if (tb[NDFA_NDM_STATE_MASK]) {
+		u16 ndm_state_mask = nla_get_u16(tb[NDFA_NDM_STATE_MASK]);
+
+		desc.flags_mask |= __ndm_state_to_fdb_flags(ndm_state_mask);
+	}
+	if (tb[NDFA_NDM_FLAGS_MASK]) {
+		u8 ndm_flags_mask = nla_get_u8(tb[NDFA_NDM_FLAGS_MASK]);
+
+		desc.flags_mask |= __ndm_flags_to_fdb_flags(ndm_flags_mask);
+	}
+
+	br_debug(br, "flushing port ifindex: %d vlan id: %u flags: 0x%lx flags mask: 0x%lx\n",
+		 desc.port_ifindex, desc.vlan_id, desc.flags, desc.flags_mask);
+
 	__br_fdb_flush(br, &desc);

 	return 0;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 9fb9abdbd3f4..fd5cbd00e12d 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -762,6 +762,11 @@ static inline void br_netpoll_disable(struct net_bridge_port *p)
 #endif

 /* br_fdb.c */
+#define FDB_FLUSH_IGNORED_NDM_FLAGS (NTF_MASTER | NTF_SELF)
+#define FDB_FLUSH_ALLOWED_NDM_STATES (NUD_PERMANENT | NUD_NOARP)
+#define FDB_FLUSH_ALLOWED_NDM_FLAGS (NTF_USE | NTF_EXT_LEARNED | \
+				     NTF_STICKY | NTF_OFFLOADED)
+
 int br_fdb_init(void);
 void br_fdb_fini(void);
 int br_fdb_hash_init(struct net_bridge *br);
--
2.35.1
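The ndm flag validation in this patch works in two steps: the bits in FDB_FLUSH_IGNORED_NDM_FLAGS are stripped first, then anything outside FDB_FLUSH_ALLOWED_NDM_FLAGS is rejected. A minimal userspace sketch of that check follows, assuming the NTF_* values from include/uapi/linux/neighbour.h and the macro definitions shown in the diff; demo_check_ndm_flags() is a hypothetical name, and it returns -1 where the kernel code returns -EINVAL.

```c
#include <stdint.h>

/* Flag values as defined in include/uapi/linux/neighbour.h. */
#define NTF_USE		0x01
#define NTF_SELF	0x02
#define NTF_MASTER	0x04
#define NTF_PROXY	0x08
#define NTF_EXT_LEARNED	0x10
#define NTF_OFFLOADED	0x20
#define NTF_STICKY	0x40

/* Same composition as the macros added to br_private.h by the patch. */
#define FDB_FLUSH_IGNORED_NDM_FLAGS	(NTF_MASTER | NTF_SELF)
#define FDB_FLUSH_ALLOWED_NDM_FLAGS	(NTF_USE | NTF_EXT_LEARNED | \
					 NTF_STICKY | NTF_OFFLOADED)

/* Hypothetical helper: 0 when the flags are acceptable, -1 otherwise. */
static int demo_check_ndm_flags(uint8_t ndm_flags)
{
	/* Dispatch-only bits are dropped before the allow-list check. */
	uint8_t filtered = ndm_flags & (uint8_t)~FDB_FLUSH_IGNORED_NDM_FLAGS;

	return (filtered & ~FDB_FLUSH_ALLOWED_NDM_FLAGS) ? -1 : 0;
}
```

For example, NTF_SELF | NTF_EXT_LEARNED passes because NTF_SELF is stripped before the check, while NTF_PROXY is rejected because it is in neither set.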
Nikolay Aleksandrov
2022-Apr-11 17:29 UTC
[Bridge] [PATCH net-next v2 8/8] net: bridge: fdb: add support for flush filtering based on ifindex and vlan
Add support for fdb flush filtering based on destination ifindex and
vlan id. The ifindex must either match a port's device ifindex or the
bridge's. The vlan support is trivial since it's already validated by
rtnl_fdb_flush, we just need to fill it in.

Signed-off-by: Nikolay Aleksandrov <razor at blackwall.org>
---
v2: validate ifindex and fill in vlan id

 net/bridge/br_fdb.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 2cea03cbc55f..b078a656776a 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -622,12 +622,44 @@ static unsigned long __ndm_flags_to_fdb_flags(u8 ndm_flags)
 	return flags;
 }

+static int __fdb_flush_validate_ifindex(const struct net_bridge *br,
+					int ifindex,
+					struct netlink_ext_ack *extack)
+{
+	const struct net_device *dev;
+
+	dev = __dev_get_by_index(dev_net(br->dev), ifindex);
+	if (!dev) {
+		NL_SET_ERR_MSG_MOD(extack, "Unknown flush device ifindex");
+		return -ENODEV;
+	}
+	if (!netif_is_bridge_master(dev) && !netif_is_bridge_port(dev)) {
+		NL_SET_ERR_MSG_MOD(extack, "Flush device is not a bridge or bridge port");
+		return -EINVAL;
+	}
+	if (netif_is_bridge_master(dev) && dev != br->dev) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Flush bridge device does not match target bridge device");
+		return -EINVAL;
+	}
+	if (netif_is_bridge_port(dev)) {
+		struct net_bridge_port *p = br_port_get_rtnl(dev);
+
+		if (p->br != br) {
+			NL_SET_ERR_MSG_MOD(extack, "Port belongs to a different bridge device");
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
 int br_fdb_flush(struct ndmsg *ndm, struct nlattr *tb[],
 		 struct net_device *dev, u16 vid,
 		 struct netlink_ext_ack *extack)
 {
 	u8 ndm_flags = ndm->ndm_flags & ~FDB_FLUSH_IGNORED_NDM_FLAGS;
-	struct net_bridge_fdb_flush_desc desc = {};
+	struct net_bridge_fdb_flush_desc desc = { .vlan_id = vid };
 	struct net_bridge *br;

 	if (netif_is_bridge_master(dev)) {
@@ -663,6 +695,14 @@ int br_fdb_flush(struct ndmsg *ndm, struct nlattr *tb[],

 		desc.flags_mask |= __ndm_flags_to_fdb_flags(ndm_flags_mask);
 	}
+	if (tb[NDFA_IFINDEX]) {
+		int err, ifidx = nla_get_s32(tb[NDFA_IFINDEX]);
+
+		err = __fdb_flush_validate_ifindex(br, ifidx, extack);
+		if (err)
+			return err;
+		desc.port_ifindex = ifidx;
+	}

 	br_debug(br, "flushing port ifindex: %d vlan id: %u flags: 0x%lx flags mask: 0x%lx\n",
 		 desc.port_ifindex, desc.vlan_id, desc.flags, desc.flags_mask);
--
2.35.1
Nikolay Aleksandrov
2022-Apr-11 17:42 UTC
[Bridge] [PATCH net-next v2 0/8] net: bridge: add flush filtering support
On 11/04/2022 20:29, Nikolay Aleksandrov wrote:
> Hi,
> This patch-set adds support to specify filtering conditions for a flush
> operation. This version has entirely different entry point (v1 had
> bridge-specific IFLA attribute, here I add new RTM_FLUSHNEIGH msg and
> netdev ndo_fdb_flush op) so I'll give a new overview altogether.
> After me and Ido discussed the feature offlist, we agreed that it would
> be best to add a new generic RTM_FLUSHNEIGH with a new ndo_fdb_flush
> callback which can be re-used for other drivers (e.g. vxlan).
> Patch 01 adds the new RTM_FLUSHNEIGH type, patch 02 then adds the
> new ndo_fdb_flush call. With this structure we need to add a generic
> rtnl_fdb_flush which will be used to do basic attribute validation and
> dispatch the call to the appropriate device based on the NTF_USE/MASTER
> flags (patch 03). Patch 04 then adds some common flush attributes which
> are used by the bridge and vxlan drivers (target ifindex, vlan id, ndm
> flags/state masks) with basic attribute validation, further validation
> can be done by the implementers of the ndo callback. Patch 05 adds a
> minimal ndo_fdb_flush to the bridge driver, it uses the current
> br_fdb_flush implementation to flush all entries similar to existing
> calls. Patch 06 adds filtering support to the new bridge flush op which
> supports target ifindex (port or bridge), vlan id and flags/state mask.
> Patch 07 converts ndm state/flags and their masks to bridge-private flags
> and fills them in the filter descriptor for matching. Finally patch 08

Aargh.. I mixed up the patch numbers above. Patch 03 adds the minimal
ndo_fdb_flush to the bridge driver (not patch 05), patch 04 adds the
generic rtnl_fdb_flush (not patch 03) and patch 05 adds the common
attributes (not patch 04).
Let me know if you'd like me to repost it with fixed numbers. I'll wait
for feedback anyway.

> fills in the target ifindex (after validating it) and vlan id (already
> validated by rtnl_fdb_flush) for matching. Flush filtering is needed
> because user-space applications need a quick way to delete only a
> specific set of entries, e.g. mlag implementations need a way to flush only
> dynamic entries excluding externally learned ones or only externally
> learned ones without static entries etc. Also apps usually want to target
> only a specific vlan or port/vlan combination. The current 2 flush
> operations (per port and bridge-wide) are not extensible and cannot
> provide such filtering.
Roopa Prabhu
2022-Apr-11 18:08 UTC
[Bridge] [PATCH net-next v2 0/8] net: bridge: add flush filtering support
On 4/11/22 10:29, Nikolay Aleksandrov wrote:> Hi, > This patch-set adds support to specify filtering conditions for a flush > operation. This version has entirely different entry point (v1 had > bridge-specific IFLA attribute, here I add new RTM_FLUSHNEIGH msg and > netdev ndo_fdb_flush op) so I'll give a new overview altogether. > After me and Ido discussed the feature offlist, we agreed that it would > be best to add a new generic RTM_FLUSHNEIGH with a new ndo_fdb_flush > callback which can be re-used for other drivers (e.g. vxlan). > Patch 01 adds the new RTM_FLUSHNEIGH type, patch 02 then adds the > new ndo_fdb_flush call. With this structure we need to add a generic > rtnl_fdb_flush which will be used to do basic attribute validation and > dispatch the call to the appropriate device based on the NTF_USE/MASTER > flags (patch 03). Patch 04 then adds some common flush attributes which > are used by the bridge and vxlan drivers (target ifindex, vlan id, ndm > flags/state masks) with basic attribute validation, further validation > can be done by the implementers of the ndo callback. Patch 05 adds a > minimal ndo_fdb_flush to the bridge driver, it uses the current > br_fdb_flush implementation to flush all entries similar to existing > calls. Patch 06 adds filtering support to the new bridge flush op which > supports target ifindex (port or bridge), vlan id and flags/state mask. > Patch 07 converts ndm state/flags and their masks to bridge-private flags > and fills them in the filter descriptor for matching. Finally patch 08 > fills in the target ifindex (after validating it) and vlan id (already > validated by rtnl_fdb_flush) for matching. Flush filtering is needed > because user-space applications need a quick way to delete only a > specific set of entries, e.g. mlag implementations need a way to flush only > dynamic entries excluding externally learned ones or only externally > learned ones without static entries etc. 
> Also apps usually want to target
> only a specific vlan or port/vlan combination. The current 2 flush
> operations (per port and bridge-wide) are not extensible and cannot
> provide such filtering.
>
> I decided against embedding new attrs into the old flush attributes for
> multiple reasons - proper error handling on unsupported attributes,
> older kernels silently flushing all, need for a second mechanism to
> signal that the attribute should be parsed (e.g. using boolopts),
> special treatment for permanent entries.
>
> Examples:
> $ bridge fdb flush dev bridge vlan 100 static
> < flush all static entries on vlan 100 >
> $ bridge fdb flush dev bridge vlan 1 dynamic
> < flush all dynamic entries on vlan 1 >
> $ bridge fdb flush dev bridge port ens16 vlan 1 dynamic
> < flush all dynamic entries on port ens16 and vlan 1 >
> $ bridge fdb flush dev ens16 vlan 1 dynamic master
> < as above: flush all dynamic entries on port ens16 and vlan 1 >
> $ bridge fdb flush dev bridge nooffloaded nopermanent self
> < flush all non-offloaded and non-permanent entries >
> $ bridge fdb flush dev bridge static noextern_learn
> < flush all static entries which are not externally learned >
> $ bridge fdb flush dev bridge permanent
> < flush all permanent entries >
> $ bridge fdb flush dev bridge port bridge permanent
> < flush all permanent entries pointing to the bridge itself >
>
> Note that all flags have their negated version (static vs nostatic etc)
> and there are some tricky cases to handle like "static" which in flag
> terms means fdbs that have NUD_NOARP but *not* NUD_PERMANENT, so the
> mask matches on both but we need only NUD_NOARP to be set. That's
> because permanent entries have both set so we can't just match on
> NUD_NOARP. Also note that this flush operation doesn't treat permanent
> entries in a special way (fdb_delete vs fdb_delete_local), it will
> delete them regardless if any port is using them.
> We can extend the api
> with a flag to do that if needed in the future.
>
> Patch-sets (in order):
> - Initial flush infra and fdb flush filtering (this set)
> - iproute2 support
> - selftests
>
> Future work:
> - mdb flush support (RTM_FLUSHMDB type)
>
> Thanks to Ido for the great discussion and feedback while working on this.
>

Can't we pile this onto RTM_DELNEIGH with a flush flag? It is a bulk delete, and it sounds similar to the bulk dev del discussion on netdev a few months ago (I don't remember how that API ended up, unless I am misremembering). The neigh subsystem also needs this; I'm curious how this API will work there.

(Apologies if you guys already discussed this; I did not have time to look through all the comments.)
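
For readers following the "static" note in the quoted cover letter: the rule described there (mask covers both NUD_NOARP and NUD_PERMANENT, but only NUD_NOARP may be set) is an ordinary mask/value comparison. Below is a minimal sketch of that idea, not the code from the patch set. The names struct flush_filter, state_matches, filter_static and filter_permanent are hypothetical and chosen only for illustration; the two NUD_* values are copied from the uapi header linux/neighbour.h so the snippet builds standalone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the uapi header <linux/neighbour.h>,
 * repeated here so the sketch compiles without kernel headers. */
#define NUD_NOARP     0x40
#define NUD_PERMANENT 0x80

/* Hypothetical filter descriptor; NOT the struct used by the patches. */
struct flush_filter {
	uint16_t state;      /* required values of the selected bits */
	uint16_t state_mask; /* which state bits take part in the match */
};

/* An entry matches when the masked state bits equal the wanted values. */
static bool state_matches(uint16_t ndm_state, const struct flush_filter *f)
{
	return (ndm_state & f->state_mask) == f->state;
}

/* "static": NUD_NOARP must be set and NUD_PERMANENT must be clear,
 * so the mask covers both bits but only NUD_NOARP is expected. */
static const struct flush_filter filter_static = {
	.state      = NUD_NOARP,
	.state_mask = NUD_NOARP | NUD_PERMANENT,
};

/* "permanent": only the NUD_PERMANENT bit is inspected. */
static const struct flush_filter filter_permanent = {
	.state      = NUD_PERMANENT,
	.state_mask = NUD_PERMANENT,
};
```

With these filters, an entry whose state is NUD_NOARP | NUD_PERMANENT is rejected by filter_static but accepted by filter_permanent, which is the distinction the cover letter calls out. A negated flag such as "nopermanent" would, under the same assumptions, keep the bit in state_mask and leave it clear in state.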