Jesper Dangaard Brouer
2020-Jul-31 14:15 UTC
[Bridge] [RFC PATCH bpf-next 3/3] samples/bpf: Add a simple bridge example accelerated with XDP
I really appreciate that you are working on adding this helper.
Some comments below.

On Fri, 31 Jul 2020 13:44:20 +0900
Yoshiki Komachi <komachi.yoshiki at gmail.com> wrote:

> diff --git a/samples/bpf/xdp_bridge_kern.c b/samples/bpf/xdp_bridge_kern.c
> new file mode 100644
> index 000000000000..00f802503199
> --- /dev/null
> +++ b/samples/bpf/xdp_bridge_kern.c
> @@ -0,0 +1,129 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 NTT Corp. All Rights Reserved.
> + *
[...]
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
> +	__uint(key_size, sizeof(int));
> +	__uint(value_size, sizeof(int));
> +	__uint(max_entries, 64);
> +} xdp_tx_ports SEC(".maps");
> +
> +static __always_inline int xdp_bridge_proto(struct xdp_md *ctx, u16 br_vlan_proto)
> +{
> +	void *data_end = (void *)(long)ctx->data_end;
> +	void *data = (void *)(long)ctx->data;
> +	struct bpf_fdb_lookup fdb_lookup_params;
> +	struct vlan_hdr *vlan_hdr = NULL;
> +	struct ethhdr *eth = data;
> +	u16 h_proto;
> +	u64 nh_off;
> +	int rc;
> +
> +	nh_off = sizeof(*eth);
> +	if (data + nh_off > data_end)
> +		return XDP_DROP;
> +
> +	__builtin_memset(&fdb_lookup_params, 0, sizeof(fdb_lookup_params));
> +
> +	h_proto = eth->h_proto;
> +
> +	if (unlikely(ntohs(h_proto) < ETH_P_802_3_MIN))
> +		return XDP_PASS;
> +
> +	/* Handle VLAN tagged packet */
> +	if (h_proto == br_vlan_proto) {
> +		vlan_hdr = (void *)eth + nh_off;
> +		nh_off += sizeof(*vlan_hdr);
> +		if ((void *)eth + nh_off > data_end)
> +			return XDP_PASS;
> +
> +		fdb_lookup_params.vlan_id = ntohs(vlan_hdr->h_vlan_TCI) &
> +					    VLAN_VID_MASK;
> +	}
> +
> +	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
> +	 * PVID) at ingress, the feature is currently unsupported in this XDP program.
> +	 *
> +	 * Two ideas to realize the vlan filtering are below:
> +	 *   1. usespace daemon monitors bridge vlan events and notifies XDP programs
              ^^
Typo: usespace -> userspace

> +	 *      of them through BPF maps
> +	 *   2. introduce another bpf helper to retrieve bridge vlan information

The comment appears two times in this file.

> +	 *
> +	 *
> +	 * FIXME: After the vlan filtering, learning feature is required here, but
> +	 * it is currently unsupported as well. If another bpf helper for learning
> +	 * is accepted, the processing could be implemented in the future.
> +	 */
> +
> +	memcpy(&fdb_lookup_params.addr, eth->h_dest, ETH_ALEN);
> +
> +	/* Note: This program definitely takes ifindex of ingress interface as
> +	 * a bridge port. Linux networking devices can be stacked and physical
> +	 * interfaces are not necessarily slaves of bridges (e.g., bonding or
> +	 * vlan devices can be slaves of bridges), but stacked bridge ports are
> +	 * currently unsupported in this program. In such cases, XDP programs
> +	 * should be attached to a lower device in order to process packets with
> +	 * higher speed. Then, a new bpf helper to find upper devices will be
> +	 * required here in the future because they will be registered on FDB
> +	 * in the kernel.
> +	 */
> +	fdb_lookup_params.ifindex = ctx->ingress_ifindex;
> +
> +	rc = bpf_fdb_lookup(ctx, &fdb_lookup_params, sizeof(fdb_lookup_params), 0);
> +	if (rc != BPF_FDB_LKUP_RET_SUCCESS) {
> +		/* In cases of flooding, XDP_PASS will be returned here */
> +		return XDP_PASS;
> +	}
> +
> +	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
> +	 * untagged policy) at egress as well, the feature is currently unsupported
> +	 * in this XDP program.
> +	 *
> +	 * Two ideas to realize the vlan filtering are below:
> +	 *   1. usespace daemon monitors bridge vlan events and notifies XDP programs
> +	 *      of them through BPF maps
> +	 *   2. introduce another bpf helper to retrieve bridge vlan information
> +	 */

(2nd time the comment appears)

> +

A comment about the bpf_redirect_map() call below would be good,
explaining that we depend on the fallback behavior to let the normal
bridge code handle other cases (e.g. flood/broadcast), and also that
XDP_PASS/fallback happens if the lookup fails.

> +	return bpf_redirect_map(&xdp_tx_ports, fdb_lookup_params.ifindex, XDP_PASS);
> +}
> +
> +SEC("xdp_bridge")
> +int xdp_bridge_prog(struct xdp_md *ctx)
> +{
> +	return xdp_bridge_proto(ctx, 0);
> +}
> +
> +SEC("xdp_8021q_bridge")
> +int xdp_8021q_bridge_prog(struct xdp_md *ctx)
> +{
> +	return xdp_bridge_proto(ctx, htons(ETH_P_8021Q));
> +}
> +
> +SEC("xdp_8021ad_bridge")
> +int xdp_8021ad_bridge_prog(struct xdp_md *ctx)
> +{
> +	return xdp_bridge_proto(ctx, htons(ETH_P_8021AD));
> +}
> +
> +char _license[] SEC("license") = "GPL";

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
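
For readers following the VLAN handling in the quoted patch, the tag-parsing step can be illustrated outside the kernel. The following is a minimal user-space sketch, not code from the patch: `parse_vlan_id`, the `sketch_*` struct and macro names, and the return convention (0 on success, -1 on a truncated frame) are all hypothetical. It assumes a single VLAN tag directly after the Ethernet header and that the VLAN ID occupies the low 12 bits of the TCI, which is what the `VLAN_VID_MASK` mask in the patch expresses.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for VLAN_VID_MASK: the low 12 bits of the TCI. */
#define SKETCH_VLAN_VID_MASK 0x0fff

/* Hypothetical stand-ins for the kernel's struct ethhdr / struct vlan_hdr. */
struct sketch_ethhdr {
	uint8_t  h_dest[6];
	uint8_t  h_source[6];
	uint16_t h_proto;			/* network byte order */
};

struct sketch_vlan_hdr {
	uint16_t h_vlan_TCI;			/* network byte order */
	uint16_t h_vlan_encapsulated_proto;	/* network byte order */
};

/*
 * Extract the VLAN ID from a raw frame.
 * vlan_proto_be is the expected tag protocol in network byte order
 * (e.g. htons(0x8100)); 0 disables tag parsing in this sketch.
 * Returns 0 and stores the VLAN ID (0 if untagged) in *vid,
 * or -1 if the frame is too short for the headers it claims to carry.
 */
static int parse_vlan_id(const uint8_t *data, size_t len,
			 uint16_t vlan_proto_be, uint16_t *vid)
{
	struct sketch_ethhdr eth;
	struct sketch_vlan_hdr vlan;

	assert(data != NULL && vid != NULL);

	/* Bounds check before touching the Ethernet header. */
	if (len < sizeof(eth))
		return -1;
	memcpy(&eth, data, sizeof(eth));

	*vid = 0;
	if (vlan_proto_be == 0 || eth.h_proto != vlan_proto_be)
		return 0;			/* untagged for our purposes */

	/* Second bounds check before touching the VLAN header. */
	if (len < sizeof(eth) + sizeof(vlan))
		return -1;
	memcpy(&vlan, data + sizeof(eth), sizeof(vlan));

	*vid = ntohs(vlan.h_vlan_TCI) & SKETCH_VLAN_VID_MASK;
	return 0;
}
```

Calling `parse_vlan_id(frame, frame_len, htons(0x8100), &vid)` on an 802.1Q frame fills `vid` with the 12-bit VLAN ID. The two length checks play the role of the `data_end` comparisons that the BPF verifier requires in the XDP program.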
Yoshiki Komachi
2020-Aug-04 10:08 UTC
[Bridge] [RFC PATCH bpf-next 3/3] samples/bpf: Add a simple bridge example accelerated with XDP
> On 2020/07/31 23:15, Jesper Dangaard Brouer <brouer at redhat.com> wrote:
>
>
> I really appreciate that you are working on adding this helper.
> Some comments below.

Thanks! Please find my responses below.

> On Fri, 31 Jul 2020 13:44:20 +0900
> Yoshiki Komachi <komachi.yoshiki at gmail.com> wrote:
>
>> diff --git a/samples/bpf/xdp_bridge_kern.c b/samples/bpf/xdp_bridge_kern.c
>> new file mode 100644
>> index 000000000000..00f802503199
>> --- /dev/null
>> +++ b/samples/bpf/xdp_bridge_kern.c
>> @@ -0,0 +1,129 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/* Copyright (c) 2020 NTT Corp. All Rights Reserved.
>> + *
> [...]
>> +
>> +struct {
>> +	__uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
>> +	__uint(key_size, sizeof(int));
>> +	__uint(value_size, sizeof(int));
>> +	__uint(max_entries, 64);
>> +} xdp_tx_ports SEC(".maps");
>> +
>> +static __always_inline int xdp_bridge_proto(struct xdp_md *ctx, u16 br_vlan_proto)
>> +{
>> +	void *data_end = (void *)(long)ctx->data_end;
>> +	void *data = (void *)(long)ctx->data;
>> +	struct bpf_fdb_lookup fdb_lookup_params;
>> +	struct vlan_hdr *vlan_hdr = NULL;
>> +	struct ethhdr *eth = data;
>> +	u16 h_proto;
>> +	u64 nh_off;
>> +	int rc;
>> +
>> +	nh_off = sizeof(*eth);
>> +	if (data + nh_off > data_end)
>> +		return XDP_DROP;
>> +
>> +	__builtin_memset(&fdb_lookup_params, 0, sizeof(fdb_lookup_params));
>> +
>> +	h_proto = eth->h_proto;
>> +
>> +	if (unlikely(ntohs(h_proto) < ETH_P_802_3_MIN))
>> +		return XDP_PASS;
>> +
>> +	/* Handle VLAN tagged packet */
>> +	if (h_proto == br_vlan_proto) {
>> +		vlan_hdr = (void *)eth + nh_off;
>> +		nh_off += sizeof(*vlan_hdr);
>> +		if ((void *)eth + nh_off > data_end)
>> +			return XDP_PASS;
>> +
>> +		fdb_lookup_params.vlan_id = ntohs(vlan_hdr->h_vlan_TCI) &
>> +					    VLAN_VID_MASK;
>> +	}
>> +
>> +	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
>> +	 * PVID) at ingress, the feature is currently unsupported in this XDP program.
>> +	 *
>> +	 * Two ideas to realize the vlan filtering are below:
>> +	 *   1. usespace daemon monitors bridge vlan events and notifies XDP programs
>               ^^
> Typo: usespace -> userspace

I will fix this in the next version.

>> +	 *      of them through BPF maps
>> +	 *   2. introduce another bpf helper to retrieve bridge vlan information
>
> The comment appears two times in this file.

With the above comment, I meant to mark where VLAN filtering at ingress
(not egress) will need to be implemented in the future.

>> +	 *
>> +	 *
>> +	 * FIXME: After the vlan filtering, learning feature is required here, but
>> +	 * it is currently unsupported as well. If another bpf helper for learning
>> +	 * is accepted, the processing could be implemented in the future.
>> +	 */
>> +
>> +	memcpy(&fdb_lookup_params.addr, eth->h_dest, ETH_ALEN);
>> +
>> +	/* Note: This program definitely takes ifindex of ingress interface as
>> +	 * a bridge port. Linux networking devices can be stacked and physical
>> +	 * interfaces are not necessarily slaves of bridges (e.g., bonding or
>> +	 * vlan devices can be slaves of bridges), but stacked bridge ports are
>> +	 * currently unsupported in this program. In such cases, XDP programs
>> +	 * should be attached to a lower device in order to process packets with
>> +	 * higher speed. Then, a new bpf helper to find upper devices will be
>> +	 * required here in the future because they will be registered on FDB
>> +	 * in the kernel.
>> +	 */
>> +	fdb_lookup_params.ifindex = ctx->ingress_ifindex;
>> +
>> +	rc = bpf_fdb_lookup(ctx, &fdb_lookup_params, sizeof(fdb_lookup_params), 0);
>> +	if (rc != BPF_FDB_LKUP_RET_SUCCESS) {
>> +		/* In cases of flooding, XDP_PASS will be returned here */
>> +		return XDP_PASS;
>> +	}
>> +
>> +	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
>> +	 * untagged policy) at egress as well, the feature is currently unsupported
>> +	 * in this XDP program.
>> +	 *
>> +	 * Two ideas to realize the vlan filtering are below:
>> +	 *   1. usespace daemon monitors bridge vlan events and notifies XDP programs
>> +	 *      of them through BPF maps
>> +	 *   2. introduce another bpf helper to retrieve bridge vlan information
>> +	 */
>
> (2nd time the comment appears)

The second one marks where egress filtering will be implemented in the
future. Sorry for the confusion. I will try to remove the redundancy.

>> +
>
> A comment about the bpf_redirect_map() call below would be good,
> explaining that we depend on the fallback behavior to let the normal
> bridge code handle other cases (e.g. flood/broadcast), and also that
> XDP_PASS/fallback happens if the lookup fails.

In this example, flooded packets are handed to the normal bridge above
not by the bpf_redirect_map() call but by the XDP_PASS action, as shown
here:

+	rc = bpf_fdb_lookup(ctx, &fdb_lookup_params, sizeof(fdb_lookup_params), 0);
+	if (rc != BPF_FDB_LKUP_RET_SUCCESS) {
+		/* In cases of flooding, XDP_PASS will be returned here */
+		return XDP_PASS;
+	}

Thus, IMO, such a comment belongs at the lookup shown above.

Thanks & Best regards,

>> +	return bpf_redirect_map(&xdp_tx_ports, fdb_lookup_params.ifindex, XDP_PASS);
>> +}
>> +
>> +SEC("xdp_bridge")
>> +int xdp_bridge_prog(struct xdp_md *ctx)
>> +{
>> +	return xdp_bridge_proto(ctx, 0);
>> +}
>> +
>> +SEC("xdp_8021q_bridge")
>> +int xdp_8021q_bridge_prog(struct xdp_md *ctx)
>> +{
>> +	return xdp_bridge_proto(ctx, htons(ETH_P_8021Q));
>> +}
>> +
>> +SEC("xdp_8021ad_bridge")
>> +int xdp_8021ad_bridge_prog(struct xdp_md *ctx)
>> +{
>> +	return xdp_bridge_proto(ctx, htons(ETH_P_8021AD));
>> +}
>> +
>> +char _license[] SEC("license") = "GPL";
>
>
> -- 
> Best regards,
>  Jesper Dangaard Brouer
>  MSc.CS, Principal Kernel Engineer at Red Hat
>  LinkedIn: http://www.linkedin.com/in/brouer
>

-- 
Yoshiki Komachi
komachi.yoshiki at gmail.com
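
The thread distinguishes two places where a packet can fall back to the normal bridge: an FDB lookup miss (the flooding case) and a miss in the redirect map. A small user-space model may make the distinction easier to see. This is only a sketch under stated assumptions: `bridge_decision`, `sketch_port_table` and the `SKETCH_XDP_*` names are hypothetical, the FDB lookup result is passed in as a boolean rather than computed, and the redirect-map miss is modeled as returning the fallback action passed in the flags argument, as the discussion of `XDP_PASS` in the thread suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the XDP action codes, for illustration only. */
enum sketch_xdp_action {
	SKETCH_XDP_ABORTED = 0,
	SKETCH_XDP_DROP,
	SKETCH_XDP_PASS,
	SKETCH_XDP_TX,
	SKETCH_XDP_REDIRECT,
};

/* Hypothetical stand-in for the xdp_tx_ports devmap: a list of ifindexes. */
struct sketch_port_table {
	const int *ifindexes;
	size_t count;
};

static bool table_has(const struct sketch_port_table *t, int ifindex)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->count; i++) {
		if (t->ifindexes[i] == ifindex)
			return true;
	}
	return false;
}

/*
 * Model of the forwarding decision discussed in the thread.
 * fdb_hit:        whether the FDB lookup found a unicast entry
 * egress_ifindex: port returned by the FDB lookup (ignored on a miss)
 * tx_ports:       ports registered for XDP redirect
 */
static enum sketch_xdp_action
bridge_decision(bool fdb_hit, int egress_ifindex,
		const struct sketch_port_table *tx_ports)
{
	/* Path 1: FDB miss (flood/broadcast/unknown unicast) goes to the
	 * normal bridge via an explicit XDP_PASS.
	 */
	if (!fdb_hit)
		return SKETCH_XDP_PASS;

	/* Path 2: FDB hit, but the egress port is not in the redirect map;
	 * modeled as the fallback action given to the redirect call.
	 */
	if (!table_has(tx_ports, egress_ifindex))
		return SKETCH_XDP_PASS;

	/* Fast path: known destination on a registered port. */
	return SKETCH_XDP_REDIRECT;
}
```

In this model both miss paths end in the same action, which may be why the placement of the explanatory comment was debated: the flooding case is decided at the lookup, while the redirect call only covers ports missing from the map.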