Ido Schimmel
2023-May-29 11:48 UTC
[Bridge] [PATCH net-next v2 0/8] Add layer 2 miss indication and filtering
tl;dr ==== This patchset adds a single bit to the tc skb extension to indicate that a packet encountered a layer 2 miss in the bridge and extends flower to match on this metadata. This is required for non-DF (Designated Forwarder) filtering in EVPN multi-homing which prevents decapsulated BUM packets from being forwarded multiple times to the same multi-homed host. Background ========= In a typical EVPN multi-homing setup each host is multi-homed using a set of links called ES (Ethernet Segment, i.e., LAG) to multiple leaf switches in a rack. These switches act as VTEPs and are not directly connected (as opposed to MLAG), but can communicate with each other (as well as with VTEPs in remote racks) via spine switches over L3. When a host sends a BUM packet over ES1 to VTEP1, the VTEP will flood it to other VTEPs in the network, including those connected to the host over ES1. The receiving VTEPs must drop the packet and not forward it back to the host. This is called "split-horizon filtering" (SPH) [1]. FRR configures SPH filtering using two tc filters. The first, an ingress filter that matches on packets received from VTEP1 and marks them using a fwmark (firewall mark). The second, an egress filter configured on the LAG interface connected to the host that matches on the fwmark and drops the packets. Example: # tc filter add dev vxlan0 ingress pref 1 proto all flower enc_src_ip $VTEP1_IP action skbedit mark 101 # tc filter add dev bond0 egress pref 1 handle 101 fw action drop Motivation ========= For each ES, only one VTEP is elected by the control plane as the DF. The DF is responsible for forwarding decapsulated BUM traffic to the host over the ES. The non-DF VTEPs must drop such traffic as otherwise the host will receive multiple copies of BUM traffic. This is called "non-DF filtering" [2]. Filtering of multicast and broadcast traffic can be achieved using the following flower filter: # tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop Unlike broadcast and multicast traffic, it is not currently possible to filter unknown unicast traffic. The classification into unknown unicast is performed by the bridge driver, but is not visible to other layers. Implementation ============= The proposed solution is to add a single bit to the tc skb extension that is set by the bridge for packets that encountered an FDB or MDB miss. The flower classifier is extended to be able to match on this new metadata bit in a similar fashion to existing metadata options such as 'indev'. A bit that is set for every flooded packet would also work, but it does not allow us to differentiate between registered and unregistered multicast traffic which might be useful in the future. A relatively generic name is chosen for this bit - 'l2_miss' - to allow its use to be extended to other layer 2 devices such as VXLAN, should a use case arise. With the above, the control plane can implement a non-DF filter using the following tc filters: # tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop # tc filter add dev bond0 egress pref 2 proto all flower indev vxlan0 l2_miss true action drop The first drops broadcast and multicast traffic and the second drops unknown unicast traffic. Testing ====== A test exercising the different permutations of the 'l2_miss' bit is added in patch #8. Patchset overview ================ Patch #1 adds the new bit to the tc skb extension and sets it in the bridge driver for packets that encountered a miss. The marking of the packets and the use of this extension is protected by the 'tc_skb_ext_tc' static key in order to keep performance impact to a minimum when the feature is not in use. Patch #2 extends the flow dissector to dissect this information from the tc skb extension into the 'FLOW_DISSECTOR_KEY_META' key. Patch #3 extends the flower classifier to be able to match on the new layer 2 miss metadata. The classifier enables the 'tc_skb_ext_tc' static key upon the installation of the first filter that matches on 'l2_miss' and disables the key upon the removal of the last filter that matches on it. Patch #4 rejects matching on the new metadata in drivers that already support the 'FLOW_DISSECTOR_KEY_META' key. Patches #5-#6 are small preparations in mlxsw. Patch #7 extends mlxsw to be able to match on layer 2 miss. Patch #8 adds a selftest. iproute2 patches can be found here [3]. Changelog ======== Since v1 [4]: * Patch #1: Use tc skb extension instead of adding a bit to the skb. Do not mark broadcast packets as they never perform a lookup and therefore never incur a miss. * Patch #2: Split from flower patch. Use tc skb extension instead of 'skb->l2_miss'. * Patch #3: Split flow_dissector changes to a previous patch. Use tc skb extension instead of 'skb->l2_miss'. * Patch #4: Expand commit message to explain why some users were not patched. * Patch #5: New patch. * Patch #6: New patch. * Patch #7: Use 'fdb_miss' key element instead of 'dmac_type'. * Patch #8: Test that broadcast does not hit miss filter. Since RFC [5]: No changes. [1] https://datatracker.ietf.org/doc/html/rfc7432#section-8.3 [2] https://datatracker.ietf.org/doc/html/rfc7432#section-8.5 [3] https://github.com/idosch/iproute2/tree/submit/non_df_filter_v1 [4] https://lore.kernel.org/netdev/20230518113328.1952135-1-idosch at nvidia.com/ [5] https://lore.kernel.org/netdev/20230509070446.246088-1-idosch at nvidia.com/ Ido Schimmel (8): skbuff: bridge: Add layer 2 miss indication flow_dissector: Dissect layer 2 miss from tc skb extension net/sched: flower: Allow matching on layer 2 miss flow_offload: Reject matching on layer 2 miss mlxsw: spectrum_flower: Split iif parsing to a separate function mlxsw: spectrum_flower: Do not force matching on iif mlxsw: spectrum_flower: Add ability to match on layer 2 miss selftests: forwarding: Add layer 2 miss test cases .../marvell/prestera/prestera_flower.c | 6 + .../net/ethernet/mellanox/mlx5/core/en_tc.c | 6 + .../mellanox/mlxsw/core_acl_flex_keys.c | 1 + .../mellanox/mlxsw/core_acl_flex_keys.h | 3 +- .../mellanox/mlxsw/spectrum_acl_flex_keys.c | 2 + .../ethernet/mellanox/mlxsw/spectrum_flower.c | 45 ++- drivers/net/ethernet/mscc/ocelot_flower.c | 10 + include/linux/skbuff.h | 1 + include/net/flow_dissector.h | 2 + include/uapi/linux/pkt_cls.h | 2 + net/bridge/br_device.c | 1 + net/bridge/br_forward.c | 3 + net/bridge/br_input.c | 1 + net/bridge/br_private.h | 27 ++ net/core/flow_dissector.c | 10 + net/sched/cls_flower.c | 30 +- .../testing/selftests/net/forwarding/Makefile | 1 + .../net/forwarding/tc_flower_l2_miss.sh | 350 ++++++++++++++++++ 18 files changed, 485 insertions(+), 16 deletions(-) create mode 100755 tools/testing/selftests/net/forwarding/tc_flower_l2_miss.sh -- 2.40.1
Ido Schimmel
2023-May-29 11:48 UTC
[Bridge] [PATCH net-next v2 1/8] skbuff: bridge: Add layer 2 miss indication
For EVPN non-DF (Designated Forwarder) filtering we need to be able to prevent decapsulated traffic from being flooded to a multi-homed host. Filtering of multicast and broadcast traffic can be achieved using the following flower filter: # tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop Unlike broadcast and multicast traffic, it is not currently possible to filter unknown unicast traffic. The classification into unknown unicast is performed by the bridge driver, but is not visible to other layers such as tc. Solve this by adding a new 'l2_miss' bit to the tc skb extension. Clear the bit whenever a packet enters the bridge (received from a bridge port or transmitted via the bridge) and set it if the packet did not match an FDB or MDB entry. If there is no skb extension and the bit needs to be cleared, then do not allocate one as no extension is equivalent to the bit being cleared. The bit is not set for broadcast packets as they never perform a lookup and therefore never incur a miss. A bit that is set for every flooded packet would also work for the current use case, but it does not allow us to differentiate between registered and unregistered multicast traffic, which might be useful in the future. To keep the performance impact to a minimum, the marking of packets is guarded by the 'tc_skb_ext_tc' static key. When 'false', the skb is not touched and an skb extension is not allocated. Instead, only a 5 bytes nop is executed, as demonstrated below for the call site in br_handle_frame(). Before the patch: ``` memset(skb->cb, 0, sizeof(struct br_input_skb_cb)); c37b09: 49 c7 44 24 28 00 00 movq $0x0,0x28(%r12) c37b10: 00 00 p = br_port_get_rcu(skb->dev); c37b12: 49 8b 44 24 10 mov 0x10(%r12),%rax memset(skb->cb, 0, sizeof(struct br_input_skb_cb)); c37b17: 49 c7 44 24 30 00 00 movq $0x0,0x30(%r12) c37b1e: 00 00 c37b20: 49 c7 44 24 38 00 00 movq $0x0,0x38(%r12) c37b27: 00 00 ``` After the patch (when static key is disabled): ``` memset(skb->cb, 0, sizeof(struct br_input_skb_cb)); c37c29: 49 c7 44 24 28 00 00 movq $0x0,0x28(%r12) c37c30: 00 00 c37c32: 49 8d 44 24 28 lea 0x28(%r12),%rax c37c37: 48 c7 40 08 00 00 00 movq $0x0,0x8(%rax) c37c3e: 00 c37c3f: 48 c7 40 10 00 00 00 movq $0x0,0x10(%rax) c37c46: 00 #ifdef CONFIG_HAVE_JUMP_LABEL_HACK static __always_inline bool arch_static_branch(struct static_key *key, bool branch) { asm_volatile_goto("1:" c37c47: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) br_tc_skb_miss_set(skb, false); p = br_port_get_rcu(skb->dev); c37c4c: 49 8b 44 24 10 mov 0x10(%r12),%rax ``` Subsequent patches will extend the flower classifier to be able to match on the new 'l2_miss' bit and enable / disable the static key when filters that match on it are added / deleted. Signed-off-by: Ido Schimmel <idosch at nvidia.com> --- Notes: v2: * Use tc skb extension instead of adding a bit to the skb. * Do not mark broadcast packets as they never perform a lookup and therefore never incur a miss. include/linux/skbuff.h | 1 + net/bridge/br_device.c | 1 + net/bridge/br_forward.c | 3 +++ net/bridge/br_input.c | 1 + net/bridge/br_private.h | 27 +++++++++++++++++++++++++++ 5 files changed, 33 insertions(+) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 5951904413ab..e2f48ddb2f7c 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -330,6 +330,7 @@ struct tc_skb_ext { u8 post_ct_snat:1; u8 post_ct_dnat:1; u8 act_miss:1; /* Set if act_miss_cookie is used */ + u8 l2_miss:1; /* Set by bridge upon FDB or MDB miss */ }; #endif diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index 8eca8a5c80c6..9a5ea06236bd 100644 --- a/net/bridge/br_device.c +++ b/net/bridge/br_device.c @@ -39,6 +39,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev) u16 vid = 0; memset(skb->cb, 0, sizeof(struct br_input_skb_cb)); + br_tc_skb_miss_set(skb, false); rcu_read_lock(); nf_ops = rcu_dereference(nf_br_ops); diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c index 84d6dd5e5b1a..6116eba1bd89 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -203,6 +203,8 @@ void br_flood(struct net_bridge *br, struct sk_buff *skb, struct net_bridge_port *prev = NULL; struct net_bridge_port *p; + br_tc_skb_miss_set(skb, pkt_type != BR_PKT_BROADCAST); + list_for_each_entry_rcu(p, &br->port_list, list) { /* Do not flood unicast traffic to ports that turn it off, nor * other traffic if flood off, except for traffic we originate @@ -295,6 +297,7 @@ void br_multicast_flood(struct net_bridge_mdb_entry *mdst, allow_mode_include = false; } else { p = NULL; + br_tc_skb_miss_set(skb, true); } while (p || rp) { diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index fc17b9fd93e6..c34a0b0901b0 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -334,6 +334,7 @@ static rx_handler_result_t br_handle_frame(struct sk_buff **pskb) return RX_HANDLER_CONSUMED; memset(skb->cb, 0, sizeof(struct br_input_skb_cb)); + br_tc_skb_miss_set(skb, false); p = br_port_get_rcu(skb->dev); if (p->flags & BR_VLAN_TUNNEL) diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 2119729ded2b..a63b32c1638e 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -15,6 +15,7 @@ #include <linux/u64_stats_sync.h> #include <net/route.h> #include <net/ip6_fib.h> +#include <net/pkt_cls.h> #include <linux/if_vlan.h> #include <linux/rhashtable.h> #include <linux/refcount.h> @@ -754,6 +755,32 @@ void br_boolopt_multi_get(const struct net_bridge *br, struct br_boolopt_multi *bm); void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on); +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT) +static inline void br_tc_skb_miss_set(struct sk_buff *skb, bool miss) +{ + struct tc_skb_ext *ext; + + if (!tc_skb_ext_tc_enabled()) + return; + + ext = skb_ext_find(skb, TC_SKB_EXT); + if (ext) { + ext->l2_miss = miss; + return; + } + if (!miss) + return; + ext = tc_skb_ext_alloc(skb); + if (!ext) + return; + ext->l2_miss = true; +} +#else +static inline void br_tc_skb_miss_set(struct sk_buff *skb, bool miss) +{ +} +#endif + /* br_device.c */ void br_dev_setup(struct net_device *dev); void br_dev_delete(struct net_device *dev, struct list_head *list); -- 2.40.1
Ido Schimmel
2023-May-29 11:48 UTC
[Bridge] [PATCH net-next v2 2/8] flow_dissector: Dissect layer 2 miss from tc skb extension
Extend the 'FLOW_DISSECTOR_KEY_META' key with a new 'l2_miss' field and populate it from a field with the same name in the tc skb extension. This field is set by the bridge driver for packets that incur an FDB or MDB miss. The next patch will extend the flower classifier to be able to match on layer 2 misses. Signed-off-by: Ido Schimmel <idosch at nvidia.com> --- Notes: v2: * Split from flower patch. * Use tc skb extension instead of 'skb->l2_miss'. include/net/flow_dissector.h | 2 ++ net/core/flow_dissector.c | 10 ++++++++++ 2 files changed, 12 insertions(+) diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h index 85b2281576ed..8b41668c77fc 100644 --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -243,10 +243,12 @@ struct flow_dissector_key_ip { * struct flow_dissector_key_meta: * @ingress_ifindex: ingress ifindex * @ingress_iftype: ingress interface type + * @l2_miss: packet did not match an L2 entry during forwarding */ struct flow_dissector_key_meta { int ingress_ifindex; u16 ingress_iftype; + u8 l2_miss; }; /** diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 25fb0bbc310f..481ca4080cbd 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -27,6 +27,7 @@ #include <linux/tcp.h> #include <linux/ptp_classify.h> #include <net/flow_dissector.h> +#include <net/pkt_cls.h> #include <scsi/fc/fc_fcoe.h> #include <uapi/linux/batadv_packet.h> #include <linux/bpf.h> @@ -241,6 +242,15 @@ void skb_flow_dissect_meta(const struct sk_buff *skb, FLOW_DISSECTOR_KEY_META, target_container); meta->ingress_ifindex = skb->skb_iif; +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT) + if (tc_skb_ext_tc_enabled()) { + struct tc_skb_ext *ext; + + ext = skb_ext_find(skb, TC_SKB_EXT); + if (ext) + meta->l2_miss = ext->l2_miss; + } +#endif } EXPORT_SYMBOL(skb_flow_dissect_meta); -- 2.40.1
Ido Schimmel
2023-May-29 11:48 UTC
[Bridge] [PATCH net-next v2 3/8] net/sched: flower: Allow matching on layer 2 miss
Add the 'TCA_FLOWER_L2_MISS' netlink attribute that allows user space to match on packets that encountered a layer 2 miss. The miss indication is set as metadata in the tc skb extension by the bridge driver upon FDB or MDB lookup miss and dissected by the flow dissector to the 'FLOW_DISSECTOR_KEY_META' key. The use of this skb extension is guarded by the 'tc_skb_ext_tc' static key. As such, enable / disable this key when filters that match on layer 2 miss are added / deleted. Tested: # cat tc_skb_ext_tc.py #!/usr/bin/env -S drgn -s vmlinux refcount = prog["tc_skb_ext_tc"].key.enabled.counter.value_() print(f"tc_skb_ext_tc reference count is {refcount}") # ./tc_skb_ext_tc.py tc_skb_ext_tc reference count is 0 # tc filter add dev swp1 egress proto all handle 101 pref 1 flower src_mac 00:11:22:33:44:55 action drop # tc filter add dev swp1 egress proto all handle 102 pref 2 flower src_mac 00:11:22:33:44:55 l2_miss true action drop # tc filter add dev swp1 egress proto all handle 103 pref 3 flower src_mac 00:11:22:33:44:55 l2_miss false action drop # ./tc_skb_ext_tc.py tc_skb_ext_tc reference count is 2 # tc filter replace dev swp1 egress proto all handle 102 pref 2 flower src_mac 00:01:02:03:04:05 l2_miss false action drop # ./tc_skb_ext_tc.py tc_skb_ext_tc reference count is 2 # tc filter del dev swp1 egress proto all handle 103 pref 3 flower # tc filter del dev swp1 egress proto all handle 102 pref 2 flower # tc filter del dev swp1 egress proto all handle 101 pref 1 flower # ./tc_skb_ext_tc.py tc_skb_ext_tc reference count is 0 Signed-off-by: Ido Schimmel <idosch at nvidia.com> --- Notes: v2: * Split flow_dissector changes to a previous patch. * Use tc skb extension instead of 'skb->l2_miss'. include/uapi/linux/pkt_cls.h | 2 ++ net/sched/cls_flower.c | 30 ++++++++++++++++++++++++++++-- 2 files changed, 30 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index 648a82f32666..00933dda7b10 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -594,6 +594,8 @@ enum { TCA_FLOWER_KEY_L2TPV3_SID, /* be32 */ + TCA_FLOWER_L2_MISS, /* u8 */ + __TCA_FLOWER_MAX, }; diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 9dbc43388e57..04adcde9eb81 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -120,6 +120,7 @@ struct cls_fl_filter { u32 handle; u32 flags; u32 in_hw_count; + u8 needs_tc_skb_ext:1; struct rcu_work rwork; struct net_device *hw_dev; /* Flower classifier is unlocked, which means that its reference counter @@ -415,6 +416,8 @@ static struct cls_fl_head *fl_head_dereference(struct tcf_proto *tp) static void __fl_destroy_filter(struct cls_fl_filter *f) { + if (f->needs_tc_skb_ext) + tc_skb_ext_tc_disable(); tcf_exts_destroy(&f->exts); tcf_exts_put_net(&f->exts); kfree(f); @@ -615,7 +618,8 @@ static void *fl_get(struct tcf_proto *tp, u32 handle) } static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = { - [TCA_FLOWER_UNSPEC] = { .type = NLA_UNSPEC }, + [TCA_FLOWER_UNSPEC] = { .strict_start_type + TCA_FLOWER_L2_MISS }, [TCA_FLOWER_CLASSID] = { .type = NLA_U32 }, [TCA_FLOWER_INDEV] = { .type = NLA_STRING, .len = IFNAMSIZ }, @@ -720,7 +724,7 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = { [TCA_FLOWER_KEY_PPPOE_SID] = { .type = NLA_U16 }, [TCA_FLOWER_KEY_PPP_PROTO] = { .type = NLA_U16 }, [TCA_FLOWER_KEY_L2TPV3_SID] = { .type = NLA_U32 }, - + [TCA_FLOWER_L2_MISS] = NLA_POLICY_MAX(NLA_U8, 1), }; static const struct nla_policy @@ -1668,6 +1672,10 @@ static int fl_set_key(struct net *net, struct nlattr **tb, mask->meta.ingress_ifindex = 0xffffffff; } + fl_set_key_val(tb, &key->meta.l2_miss, TCA_FLOWER_L2_MISS, + &mask->meta.l2_miss, TCA_FLOWER_UNSPEC, + sizeof(key->meta.l2_miss)); + fl_set_key_val(tb, key->eth.dst, TCA_FLOWER_KEY_ETH_DST, mask->eth.dst, TCA_FLOWER_KEY_ETH_DST_MASK, sizeof(key->eth.dst)); @@ -2085,6 +2093,11 @@ static int fl_check_assign_mask(struct cls_fl_head *head, return ret; } +static bool fl_needs_tc_skb_ext(const struct fl_flow_key *mask) +{ + return mask->meta.l2_miss; +} + static int fl_set_parms(struct net *net, struct tcf_proto *tp, struct cls_fl_filter *f, struct fl_flow_mask *mask, unsigned long base, struct nlattr **tb, @@ -2121,6 +2134,14 @@ static int fl_set_parms(struct net *net, struct tcf_proto *tp, return -EINVAL; } + /* Enable tc skb extension if filter matches on data extracted from + * this extension. + */ + if (fl_needs_tc_skb_ext(&mask->key)) { + f->needs_tc_skb_ext = 1; + tc_skb_ext_tc_enable(); + } + return 0; } @@ -3074,6 +3095,11 @@ static int fl_dump_key(struct sk_buff *skb, struct net *net, goto nla_put_failure; } + if (fl_dump_key_val(skb, &key->meta.l2_miss, + TCA_FLOWER_L2_MISS, &mask->meta.l2_miss, + TCA_FLOWER_UNSPEC, sizeof(key->meta.l2_miss))) + goto nla_put_failure; + if (fl_dump_key_val(skb, key->eth.dst, TCA_FLOWER_KEY_ETH_DST, mask->eth.dst, TCA_FLOWER_KEY_ETH_DST_MASK, sizeof(key->eth.dst)) || -- 2.40.1
Ido Schimmel
2023-May-29 11:48 UTC
[Bridge] [PATCH net-next v2 4/8] flow_offload: Reject matching on layer 2 miss
Adjust drivers that support the 'FLOW_DISSECTOR_KEY_META' key to reject filters that try to match on the newly added layer 2 miss field. Add an extack message to clearly communicate the failure reason to user space. The following users were not patched: 1. mtk_flow_offload_replace(): Only checks that the key is present, but does not do anything with it. 2. mlx5_tc_ct_set_tuple_match(): Used as part of netfilter offload, which does not make use of the new field, unlike tc. 3. get_netdev_from_rule() in nfp: Likewise. Example: # tc filter add dev swp1 egress pref 1 proto all flower skip_sw l2_miss true action drop Error: mlxsw_spectrum: Can't match on "l2_miss". We have an error talking to the kernel Acked-by: Elad Nachman <enachman at marvell.com> Signed-off-by: Ido Schimmel <idosch at nvidia.com> --- Notes: v2: * Expand commit message to explain why some users were not patched. .../net/ethernet/marvell/prestera/prestera_flower.c | 6 ++++++ drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 6 ++++++ drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c | 6 ++++++ drivers/net/ethernet/mscc/ocelot_flower.c | 10 ++++++++++ 4 files changed, 28 insertions(+) diff --git a/drivers/net/ethernet/marvell/prestera/prestera_flower.c b/drivers/net/ethernet/marvell/prestera/prestera_flower.c index 91a478b75cbf..3e20e71b0f81 100644 --- a/drivers/net/ethernet/marvell/prestera/prestera_flower.c +++ b/drivers/net/ethernet/marvell/prestera/prestera_flower.c @@ -148,6 +148,12 @@ static int prestera_flower_parse_meta(struct prestera_acl_rule *rule, __be16 key, mask; flow_rule_match_meta(f_rule, &match); + + if (match.mask->l2_miss) { + NL_SET_ERR_MSG_MOD(f->common.extack, "Can't match on \"l2_miss\""); + return -EOPNOTSUPP; + } + if (match.mask->ingress_ifindex != 0xFFFFFFFF) { NL_SET_ERR_MSG_MOD(f->common.extack, "Unsupported ingress ifindex mask"); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index e95414ef1f04..1b0906cb57ef 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -2587,6 +2587,12 @@ static int mlx5e_flower_parse_meta(struct net_device *filter_dev, return 0; flow_rule_match_meta(rule, &match); + + if (match.mask->l2_miss) { + NL_SET_ERR_MSG_MOD(f->common.extack, "Can't match on \"l2_miss\""); + return -EOPNOTSUPP; + } + if (!match.mask->ingress_ifindex) return 0; diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c index 594cdcb90b3d..6fec9223250b 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c @@ -294,6 +294,12 @@ static int mlxsw_sp_flower_parse_meta(struct mlxsw_sp_acl_rule_info *rulei, return 0; flow_rule_match_meta(rule, &match); + + if (match.mask->l2_miss) { + NL_SET_ERR_MSG_MOD(f->common.extack, "Can't match on \"l2_miss\""); + return -EOPNOTSUPP; + } + if (match.mask->ingress_ifindex != 0xFFFFFFFF) { NL_SET_ERR_MSG_MOD(f->common.extack, "Unsupported ingress ifindex mask"); return -EINVAL; diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c index ee052404eb55..e0916afcddfb 100644 --- a/drivers/net/ethernet/mscc/ocelot_flower.c +++ b/drivers/net/ethernet/mscc/ocelot_flower.c @@ -592,6 +592,16 @@ ocelot_flower_parse_key(struct ocelot *ocelot, int port, bool ingress, return -EOPNOTSUPP; } + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_META)) { + struct flow_match_meta match; + + flow_rule_match_meta(rule, &match); + if (match.mask->l2_miss) { + NL_SET_ERR_MSG_MOD(extack, "Can't match on \"l2_miss\""); + return -EOPNOTSUPP; + } + } + /* For VCAP ES0 (egress rewriter) we can match on the ingress port */ if (!ingress) { ret = ocelot_flower_parse_indev(ocelot, port, f, filter); -- 2.40.1
Ido Schimmel
2023-May-29 11:48 UTC
[Bridge] [PATCH net-next v2 5/8] mlxsw: spectrum_flower: Split iif parsing to a separate function
Currently, mlxsw only supports the 'ingress_ifindex' field in the 'FLOW_DISSECTOR_KEY_META' key, but subsequent patches are going to add support for the 'l2_miss' field as well. Split the parsing of the 'ingress_ifindex' field to a separate function to avoid nesting. No functional changes intended. Signed-off-by: Ido Schimmel <idosch at nvidia.com> --- Notes: v2: * New patch. .../ethernet/mellanox/mlxsw/spectrum_flower.c | 54 +++++++++++-------- 1 file changed, 33 insertions(+), 21 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c index 6fec9223250b..2b0bae847eb9 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c @@ -281,45 +281,35 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp, return 0; } -static int mlxsw_sp_flower_parse_meta(struct mlxsw_sp_acl_rule_info *rulei, - struct flow_cls_offload *f, - struct mlxsw_sp_flow_block *block) +static int +mlxsw_sp_flower_parse_meta_iif(struct mlxsw_sp_acl_rule_info *rulei, + const struct mlxsw_sp_flow_block *block, + const struct flow_match_meta *match, + struct netlink_ext_ack *extack) { - struct flow_rule *rule = flow_cls_offload_flow_rule(f); struct mlxsw_sp_port *mlxsw_sp_port; struct net_device *ingress_dev; - struct flow_match_meta match; - - if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_META)) - return 0; - - flow_rule_match_meta(rule, &match); - if (match.mask->l2_miss) { - NL_SET_ERR_MSG_MOD(f->common.extack, "Can't match on \"l2_miss\""); - return -EOPNOTSUPP; - } - - if (match.mask->ingress_ifindex != 0xFFFFFFFF) { - NL_SET_ERR_MSG_MOD(f->common.extack, "Unsupported ingress ifindex mask"); + if (match->mask->ingress_ifindex != 0xFFFFFFFF) { + NL_SET_ERR_MSG_MOD(extack, "Unsupported ingress ifindex mask"); return -EINVAL; } ingress_dev = __dev_get_by_index(block->net, - match.key->ingress_ifindex); + match->key->ingress_ifindex); if (!ingress_dev) { - NL_SET_ERR_MSG_MOD(f->common.extack, "Can't find specified ingress port to match on"); + NL_SET_ERR_MSG_MOD(extack, "Can't find specified ingress port to match on"); return -EINVAL; } if (!mlxsw_sp_port_dev_check(ingress_dev)) { - NL_SET_ERR_MSG_MOD(f->common.extack, "Can't match on non-mlxsw ingress port"); + NL_SET_ERR_MSG_MOD(extack, "Can't match on non-mlxsw ingress port"); return -EINVAL; } mlxsw_sp_port = netdev_priv(ingress_dev); if (mlxsw_sp_port->mlxsw_sp != block->mlxsw_sp) { - NL_SET_ERR_MSG_MOD(f->common.extack, "Can't match on a port from different device"); + NL_SET_ERR_MSG_MOD(extack, "Can't match on a port from different device"); return -EINVAL; } @@ -327,9 +317,31 @@ static int mlxsw_sp_flower_parse_meta(struct mlxsw_sp_acl_rule_info *rulei, MLXSW_AFK_ELEMENT_SRC_SYS_PORT, mlxsw_sp_port->local_port, 0xFFFFFFFF); + return 0; } +static int mlxsw_sp_flower_parse_meta(struct mlxsw_sp_acl_rule_info *rulei, + struct flow_cls_offload *f, + struct mlxsw_sp_flow_block *block) +{ + struct flow_rule *rule = flow_cls_offload_flow_rule(f); + struct flow_match_meta match; + + if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_META)) + return 0; + + flow_rule_match_meta(rule, &match); + + if (match.mask->l2_miss) { + NL_SET_ERR_MSG_MOD(f->common.extack, "Can't match on \"l2_miss\""); + return -EOPNOTSUPP; + } + + return mlxsw_sp_flower_parse_meta_iif(rulei, block, &match, + f->common.extack); +} + static void mlxsw_sp_flower_parse_ipv4(struct mlxsw_sp_acl_rule_info *rulei, struct flow_cls_offload *f) { -- 2.40.1
Ido Schimmel
2023-May-29 11:48 UTC
[Bridge] [PATCH net-next v2 6/8] mlxsw: spectrum_flower: Do not force matching on iif
Currently, mlxsw only supports the 'ingress_ifindex' field in the 'FLOW_DISSECTOR_KEY_META' key, but subsequent patches are going to add support for the 'l2_miss' field as well. It is valid to only match on 'l2_miss' without 'ingress_ifindex', so do not force matching on it. Signed-off-by: Ido Schimmel <idosch at nvidia.com> --- Notes: v2: * New patch. drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c index 2b0bae847eb9..9c62c12e410b 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c @@ -290,6 +290,9 @@ mlxsw_sp_flower_parse_meta_iif(struct mlxsw_sp_acl_rule_info *rulei, struct mlxsw_sp_port *mlxsw_sp_port; struct net_device *ingress_dev; + if (!match->mask->ingress_ifindex) + return 0; + if (match->mask->ingress_ifindex != 0xFFFFFFFF) { NL_SET_ERR_MSG_MOD(extack, "Unsupported ingress ifindex mask"); return -EINVAL; -- 2.40.1
Ido Schimmel
2023-May-29 11:48 UTC
[Bridge] [PATCH net-next v2 7/8] mlxsw: spectrum_flower: Add ability to match on layer 2 miss
Add the 'fdb_miss' key element to supported key blocks and make use of it to match on layer 2 miss. The key is only supported on Spectrum-{2,3,4}. An error is returned for Spectrum-1 since the key element is not present in any of its key blocks. Signed-off-by: Ido Schimmel <idosch at nvidia.com> --- Notes: v2: * Use 'fdb_miss' key element instead of 'dmac_type'. drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c | 1 + drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.h | 3 ++- .../net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.c | 2 ++ drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c | 6 ++---- 4 files changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c b/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c index bd1a51a0a540..f0b2963ebac3 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c +++ b/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c @@ -42,6 +42,7 @@ static const struct mlxsw_afk_element_info mlxsw_afk_element_infos[] = { MLXSW_AFK_ELEMENT_INFO_BUF(DST_IP_64_95, 0x34, 4), MLXSW_AFK_ELEMENT_INFO_BUF(DST_IP_32_63, 0x38, 4), MLXSW_AFK_ELEMENT_INFO_BUF(DST_IP_0_31, 0x3C, 4), + MLXSW_AFK_ELEMENT_INFO_U32(FDB_MISS, 0x40, 0, 1), }; struct mlxsw_afk { diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.h b/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.h index 3a037fe47211..65a4abadc7db 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.h +++ b/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.h @@ -35,6 +35,7 @@ enum mlxsw_afk_element { MLXSW_AFK_ELEMENT_IP_DSCP, MLXSW_AFK_ELEMENT_VIRT_ROUTER_MSB, MLXSW_AFK_ELEMENT_VIRT_ROUTER_LSB, + MLXSW_AFK_ELEMENT_FDB_MISS, MLXSW_AFK_ELEMENT_MAX, }; @@ -69,7 +70,7 @@ struct mlxsw_afk_element_info { MLXSW_AFK_ELEMENT_INFO(MLXSW_AFK_ELEMENT_TYPE_BUF, \ _element, _offset, 0, _size) -#define MLXSW_AFK_ELEMENT_STORAGE_SIZE 0x40 +#define MLXSW_AFK_ELEMENT_STORAGE_SIZE 0x44 struct mlxsw_afk_element_inst { /* element instance in actual block */ enum mlxsw_afk_element element; diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.c index 00c32320f891..4dea39f2b304 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.c @@ -123,10 +123,12 @@ const struct mlxsw_afk_ops mlxsw_sp1_afk_ops = { }; static struct mlxsw_afk_element_inst mlxsw_sp_afk_element_info_mac_0[] = { + MLXSW_AFK_ELEMENT_INST_U32(FDB_MISS, 0x00, 3, 1), MLXSW_AFK_ELEMENT_INST_BUF(DMAC_0_31, 0x04, 4), }; static struct mlxsw_afk_element_inst mlxsw_sp_afk_element_info_mac_1[] = { + MLXSW_AFK_ELEMENT_INST_U32(FDB_MISS, 0x00, 3, 1), MLXSW_AFK_ELEMENT_INST_BUF(SMAC_0_31, 0x04, 4), }; diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c index 9c62c12e410b..72917f09e806 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c @@ -336,10 +336,8 @@ static int mlxsw_sp_flower_parse_meta(struct mlxsw_sp_acl_rule_info *rulei, flow_rule_match_meta(rule, &match); - if (match.mask->l2_miss) { - NL_SET_ERR_MSG_MOD(f->common.extack, "Can't match on \"l2_miss\""); - return -EOPNOTSUPP; - } + mlxsw_sp_acl_rulei_keymask_u32(rulei, MLXSW_AFK_ELEMENT_FDB_MISS, + match.key->l2_miss, match.mask->l2_miss); return mlxsw_sp_flower_parse_meta_iif(rulei, block, &match, f->common.extack); -- 2.40.1
Ido Schimmel
2023-May-29 11:48 UTC
[Bridge] [PATCH net-next v2 8/8] selftests: forwarding: Add layer 2 miss test cases
Add test cases to verify that the bridge driver correctly marks layer 2 misses only when it should and that the flower classifier can match on this metadata. Example output: # ./tc_flower_l2_miss.sh TEST: L2 miss - Unicast [ OK ] TEST: L2 miss - Multicast (IPv4) [ OK ] TEST: L2 miss - Multicast (IPv6) [ OK ] TEST: L2 miss - Link-local multicast (IPv4) [ OK ] TEST: L2 miss - Link-local multicast (IPv6) [ OK ] TEST: L2 miss - Broadcast [ OK ] Signed-off-by: Ido Schimmel <idosch at nvidia.com> --- Notes: v2: * Test that broadcast does not hit miss filter. .../testing/selftests/net/forwarding/Makefile | 1 + .../net/forwarding/tc_flower_l2_miss.sh | 350 ++++++++++++++++++ 2 files changed, 351 insertions(+) create mode 100755 tools/testing/selftests/net/forwarding/tc_flower_l2_miss.sh diff --git a/tools/testing/selftests/net/forwarding/Makefile b/tools/testing/selftests/net/forwarding/Makefile index a474c60fe348..9d0062b542e5 100644 --- a/tools/testing/selftests/net/forwarding/Makefile +++ b/tools/testing/selftests/net/forwarding/Makefile @@ -83,6 +83,7 @@ TEST_PROGS = bridge_igmp.sh \ tc_chains.sh \ tc_flower_router.sh \ tc_flower.sh \ + tc_flower_l2_miss.sh \ tc_mpls_l2vpn.sh \ tc_police.sh \ tc_shblocks.sh \ diff --git a/tools/testing/selftests/net/forwarding/tc_flower_l2_miss.sh b/tools/testing/selftests/net/forwarding/tc_flower_l2_miss.sh new file mode 100755 index 000000000000..37b0369b5246 --- /dev/null +++ b/tools/testing/selftests/net/forwarding/tc_flower_l2_miss.sh @@ -0,0 +1,350 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +# +-----------------------+ +----------------------+ +# | H1 (vrf) | | H2 (vrf) | +# | + $h1 | | $h2 + | +# | | 192.0.2.1/28 | | 192.0.2.2/28 | | +# | | 2001:db8:1::1/64 | | 2001:db8:1::2/64 | | +# +----|------------------+ +------------------|---+ +# | | +# +----|-------------------------------------------------------------------|---+ +# | SW | | | +# | +-|-------------------------------------------------------------------|-+ | +# | | + $swp1 BR $swp2 + | | +# | +-----------------------------------------------------------------------+ | +# +----------------------------------------------------------------------------+ + +ALL_TESTS=" + test_l2_miss_unicast + test_l2_miss_multicast + test_l2_miss_ll_multicast + test_l2_miss_broadcast +" + +NUM_NETIFS=4 +source lib.sh +source tc_common.sh + +h1_create() +{ + simple_if_init $h1 192.0.2.1/28 2001:db8:1::1/64 +} + +h1_destroy() +{ + simple_if_fini $h1 192.0.2.1/28 2001:db8:1::1/64 +} + +h2_create() +{ + simple_if_init $h2 192.0.2.2/28 2001:db8:1::2/64 +} + +h2_destroy() +{ + simple_if_fini $h2 192.0.2.2/28 2001:db8:1::2/64 +} + +switch_create() +{ + ip link add name br1 up type bridge + ip link set dev $swp1 master br1 + ip link set dev $swp1 up + ip link set dev $swp2 master br1 + ip link set dev $swp2 up + + tc qdisc add dev $swp2 clsact +} + +switch_destroy() +{ + tc qdisc del dev $swp2 clsact + + ip link set dev $swp2 down + ip link set dev $swp2 nomaster + ip link set dev $swp1 down + ip link set dev $swp1 nomaster + ip link del dev br1 +} + +test_l2_miss_unicast() +{ + local dmac=00:01:02:03:04:05 + local dip=192.0.2.2 + local sip=192.0.2.1 + + RET=0 + + # Unknown unicast. + tc filter add dev $swp2 egress protocol ipv4 handle 101 pref 1 \ + flower indev $swp1 l2_miss true dst_mac $dmac src_ip $sip \ + dst_ip $dip action pass + # Known unicast. + tc filter add dev $swp2 egress protocol ipv4 handle 102 pref 1 \ + flower indev $swp1 l2_miss false dst_mac $dmac src_ip $sip \ + dst_ip $dip action pass + + # Before adding FDB entry. + $MZ $h1 -a own -b $dmac -t ip -A $sip -B $dip -c 1 -p 100 -q + + tc_check_packets "dev $swp2 egress" 101 1 + check_err $? "Unknown unicast filter was not hit before adding FDB entry" + + tc_check_packets "dev $swp2 egress" 102 0 + check_err $? "Known unicast filter was hit before adding FDB entry" + + # Adding FDB entry. + bridge fdb replace $dmac dev $swp2 master static + + $MZ $h1 -a own -b $dmac -t ip -A $sip -B $dip -c 1 -p 100 -q + + tc_check_packets "dev $swp2 egress" 101 1 + check_err $? "Unknown unicast filter was hit after adding FDB entry" + + tc_check_packets "dev $swp2 egress" 102 1 + check_err $? "Known unicast filter was not hit after adding FDB entry" + + # Deleting FDB entry. + bridge fdb del $dmac dev $swp2 master static + + $MZ $h1 -a own -b $dmac -t ip -A $sip -B $dip -c 1 -p 100 -q + + tc_check_packets "dev $swp2 egress" 101 2 + check_err $? "Unknown unicast filter was not hit after deleting FDB entry" + + tc_check_packets "dev $swp2 egress" 102 1 + check_err $? "Known unicast filter was hit after deleting FDB entry" + + tc filter del dev $swp2 egress protocol ipv4 pref 1 handle 102 flower + tc filter del dev $swp2 egress protocol ipv4 pref 1 handle 101 flower + + log_test "L2 miss - Unicast" +} + +test_l2_miss_multicast_common() +{ + local proto=$1; shift + local sip=$1; shift + local dip=$1; shift + local mode=$1; shift + local name=$1; shift + + RET=0 + + # Unregistered multicast. + tc filter add dev $swp2 egress protocol $proto handle 101 pref 1 \ + flower indev $swp1 l2_miss true src_ip $sip dst_ip $dip \ + action pass + # Registered multicast. + tc filter add dev $swp2 egress protocol $proto handle 102 pref 1 \ + flower indev $swp1 l2_miss false src_ip $sip dst_ip $dip \ + action pass + + # Before adding MDB entry. + $MZ $mode $h1 -t ip -A $sip -B $dip -c 1 -p 100 -q + + tc_check_packets "dev $swp2 egress" 101 1 + check_err $? "Unregistered multicast filter was not hit before adding MDB entry" + + tc_check_packets "dev $swp2 egress" 102 0 + check_err $? "Registered multicast filter was hit before adding MDB entry" + + # Adding MDB entry. + bridge mdb replace dev br1 port $swp2 grp $dip permanent + + $MZ $mode $h1 -t ip -A $sip -B $dip -c 1 -p 100 -q + + tc_check_packets "dev $swp2 egress" 101 1 + check_err $? "Unregistered multicast filter was hit after adding MDB entry" + + tc_check_packets "dev $swp2 egress" 102 1 + check_err $? "Registered multicast filter was not hit after adding MDB entry" + + # Deleting MDB entry. + bridge mdb del dev br1 port $swp2 grp $dip + + $MZ $mode $h1 -t ip -A $sip -B $dip -c 1 -p 100 -q + + tc_check_packets "dev $swp2 egress" 101 2 + check_err $? "Unregistered multicast filter was not hit after deleting MDB entry" + + tc_check_packets "dev $swp2 egress" 102 1 + check_err $? "Registered multicast filter was hit after deleting MDB entry" + + tc filter del dev $swp2 egress protocol $proto pref 1 handle 102 flower + tc filter del dev $swp2 egress protocol $proto pref 1 handle 101 flower + + log_test "L2 miss - Multicast ($name)" +} + +test_l2_miss_multicast_ipv4() +{ + local proto="ipv4" + local sip=192.0.2.1 + local dip=239.1.1.1 + local mode="-4" + local name="IPv4" + + test_l2_miss_multicast_common $proto $sip $dip $mode $name +} + +test_l2_miss_multicast_ipv6() +{ + local proto="ipv6" + local sip=2001:db8:1::1 + local dip=ff0e::1 + local mode="-6" + local name="IPv6" + + test_l2_miss_multicast_common $proto $sip $dip $mode $name +} + +test_l2_miss_multicast() +{ + # Configure $swp2 as a multicast router port so that it will forward + # both registered and unregistered multicast traffic. + bridge link set dev $swp2 mcast_router 2 + + # Forwarding according to MDB entries only takes place when the bridge + # detects that there is a valid querier in the network. Set the bridge + # as the querier and assign it a valid IPv6 link-local address to be + # used as the source address for MLD queries. + ip link set dev br1 type bridge mcast_querier 1 + ip -6 address add fe80::1/64 nodad dev br1 + # Wait the default Query Response Interval (10 seconds) for the bridge + # to determine that there are no other queriers in the network. + sleep 10 + + test_l2_miss_multicast_ipv4 + test_l2_miss_multicast_ipv6 + + ip -6 address del fe80::1/64 dev br1 + ip link set dev br1 type bridge mcast_querier 0 + bridge link set dev $swp2 mcast_router 1 +} + +test_l2_miss_multicast_common2() +{ + local name=$1; shift + local dmac=$1; shift + local dip=224.0.0.1 + local sip=192.0.2.1 + +} + +test_l2_miss_ll_multicast_common() +{ + local proto=$1; shift + local dmac=$1; shift + local sip=$1; shift + local dip=$1; shift + local mode=$1; shift + local name=$1; shift + + RET=0 + + tc filter add dev $swp2 egress protocol $proto handle 101 pref 1 \ + flower indev $swp1 l2_miss true dst_mac $dmac src_ip $sip \ + dst_ip $dip action pass + + $MZ $mode $h1 -a own -b $dmac -t ip -A $sip -B $dip -c 1 -p 100 -q + + tc_check_packets "dev $swp2 egress" 101 1 + check_err $? "Filter was not hit" + + tc filter del dev $swp2 egress protocol $proto pref 1 handle 101 flower + + log_test "L2 miss - Link-local multicast ($name)" +} + +test_l2_miss_ll_multicast_ipv4() +{ + local proto=ipv4 + local dmac=01:00:5e:00:00:01 + local sip=192.0.2.1 + local dip=224.0.0.1 + local mode="-4" + local name="IPv4" + + test_l2_miss_ll_multicast_common $proto $dmac $sip $dip $mode $name +} + +test_l2_miss_ll_multicast_ipv6() +{ + local proto=ipv6 + local dmac=33:33:00:00:00:01 + local sip=2001:db8:1::1 + local dip=ff02::1 + local mode="-6" + local name="IPv6" + + test_l2_miss_ll_multicast_common $proto $dmac $sip $dip $mode $name +} + +test_l2_miss_ll_multicast() +{ + test_l2_miss_ll_multicast_ipv4 + test_l2_miss_ll_multicast_ipv6 +} + +test_l2_miss_broadcast() +{ + local dmac=ff:ff:ff:ff:ff:ff + local smac=00:01:02:03:04:05 + + RET=0 + + tc filter add dev $swp2 egress protocol all handle 101 pref 1 \ + flower l2_miss true dst_mac $dmac src_mac $smac \ + action pass + tc filter add dev $swp2 egress protocol all handle 102 pref 1 \ + flower l2_miss false dst_mac $dmac src_mac $smac \ + action pass + + $MZ $h1 -a $smac -b $dmac -c 1 -p 100 -q + + tc_check_packets "dev $swp2 egress" 101 0 + check_err $? "L2 miss filter was hit when should not" + + tc_check_packets "dev $swp2 egress" 102 1 + check_err $? "L2 no miss filter was not hit when should" + + tc filter del dev $swp2 egress protocol all pref 1 handle 102 flower + tc filter del dev $swp2 egress protocol all pref 1 handle 101 flower + + log_test "L2 miss - Broadcast" +} + +setup_prepare() +{ + h1=${NETIFS[p1]} + swp1=${NETIFS[p2]} + + swp2=${NETIFS[p3]} + h2=${NETIFS[p4]} + + vrf_prepare + h1_create + h2_create + switch_create +} + +cleanup() +{ + pre_cleanup + + switch_destroy + h2_destroy + h1_destroy + vrf_cleanup +} + +trap cleanup EXIT + +setup_prepare +setup_wait + +tests_run + +exit $EXIT_STATUS -- 2.40.1
patchwork-bot+netdevbpf at kernel.org
2023-May-31 07:00 UTC
[Bridge] [PATCH net-next v2 0/8] Add layer 2 miss indication and filtering
Hello: This series was applied to netdev/net-next.git (main) by Jakub Kicinski <kuba at kernel.org>: On Mon, 29 May 2023 14:48:27 +0300 you wrote:> tl;dr > ====> > This patchset adds a single bit to the tc skb extension to indicate that > a packet encountered a layer 2 miss in the bridge and extends flower to > match on this metadata. This is required for non-DF (Designated > Forwarder) filtering in EVPN multi-homing which prevents decapsulated > BUM packets from being forwarded multiple times to the same multi-homed > host. > > [...]Here is the summary with links: - [net-next,v2,1/8] skbuff: bridge: Add layer 2 miss indication https://git.kernel.org/netdev/net-next/c/7b4858df3bf7 - [net-next,v2,2/8] flow_dissector: Dissect layer 2 miss from tc skb extension https://git.kernel.org/netdev/net-next/c/d5ccfd90df7f - [net-next,v2,3/8] net/sched: flower: Allow matching on layer 2 miss https://git.kernel.org/netdev/net-next/c/1a432018c0cd - [net-next,v2,4/8] flow_offload: Reject matching on layer 2 miss https://git.kernel.org/netdev/net-next/c/f4356947f029 - [net-next,v2,5/8] mlxsw: spectrum_flower: Split iif parsing to a separate function https://git.kernel.org/netdev/net-next/c/d04e26509678 - [net-next,v2,6/8] mlxsw: spectrum_flower: Do not force matching on iif https://git.kernel.org/netdev/net-next/c/0b9cd74b8d1e - [net-next,v2,7/8] mlxsw: spectrum_flower: Add ability to match on layer 2 miss https://git.kernel.org/netdev/net-next/c/caa4c58ab5d9 - [net-next,v2,8/8] selftests: forwarding: Add layer 2 miss test cases https://git.kernel.org/netdev/net-next/c/8c33266ae26a You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html