Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 00/15] net: bridge: mcast: initial IGMPv3 support (part 1)
Hi all, This patch-set implements the control plane for initial IGMPv3 support which takes care of include/exclude sets and state transitions based on the different report types. Patch 01 arranges the structure better by moving the frequently used fields together, patches 02 and 03 add support for source lists and group modes per port group which are dumped. Patch 04 adds support for group-and-source specific queries required for IGMPv3. Patch 05 factors out the port group deletion code which is used in a few places, then patch 06 adds support for group and group-and-source query retransmissions via a new rexmit timer. Patches 07 and 08 make use of the already present mdb fill functions when sending notifications so we can have the full mdb entries' state filled in (with sources, mode etc). Patch 09 takes care of port group expiration, it switches the group mode to include and deletes it if there are no sources with active timers. Patches 10-13 are the core changes which add support for IGMPv3 reports and handle the source list set operations as per RFC 3376, all IGMPv3 report types with their transitions should be supported after these patches. I've used RFC 3376 and FRR as a reference implementation. The source lists are capped at 32 entries, we can remove that limitation at a later point which would require a better data structure to hold them. IGMPv3 processing is hidden behind the bridge's multicast_igmp_version option which must be set to 3 in order to enable it. Patch 14 improves other querier processing a bit (more about this below). And finally patch 15 transform the src gc so it can be used with all mcast objects since now we have multiple timers that can be running and we need to make sure they have all finished before freeing the objects. This is part 1, it only adds control plane support and doesn't change the fast path. A following patch-set will take care of that. Here're the sets that will come next (in order): - Fast path patch-set which adds support for (S, G) mdb entries needed for IGMPv3 forwarding, entry add source (kernel, user-space etc) needed for IGMPv3 entry management, entry block mode needed for IGMPv3 exclude mode. This set will also add iproute2 support for manipulating and showing all the new state. - Selftests patch-set which will verify all IGMPv3 state transitions and forwarding - Explicit host tracking patch-set, needed for proper fast leave and with it fast leave will be enabled for IGMPv3 Not implemented yet: - Host IGMPv3 support (currently we handle only join/leave as before) - Proper other querier source timer and value updates - IGMPv3/v2 compat (I have a few rough patches for this one) v2: patches 02-03: make src lists RCU friendly so they can be traversed when dumping, reduce limit to a more conservative 32 src group entries for a start patches 11-13: remove helper and directly do bitops patch 15: force mcast gc on bridge port del to make sure port group timers have finished before freeing the port Thanks, Nik Nikolay Aleksandrov (15): net: bridge: mdb: arrange internal structs so fast-path fields are close net: bridge: mcast: add support for group source list net: bridge: mcast: add support for src list and filter mode dumping net: bridge: mcast: add support for group-and-source specific queries net: bridge: mcast: factor out port group del net: bridge: mcast: add support for group query retransmit net: bridge: mdb: push notifications in __br_mdb_add/del net: bridge: mdb: use mdb and port entries in notifications net: bridge: mcast: delete expired port groups without srcs net: bridge: mcast: support for IGMPv3 IGMPV3_ALLOW_NEW_SOURCES report net: bridge: mcast: support for IGMPV3_MODE_IS_INCLUDE/EXCLUDE report net: bridge: mcast: support for IGMPV3_CHANGE_TO_INCLUDE/EXCLUDE report net: bridge: mcast: support for IGMPV3_BLOCK_OLD_SOURCES report net: bridge: mcast: improve v3 query processing net: bridge: mcast: destroy all entries via gc include/uapi/linux/if_bridge.h | 21 + net/bridge/br_mdb.c | 223 ++++--- net/bridge/br_multicast.c | 1040 +++++++++++++++++++++++++++++--- net/bridge/br_private.h | 67 +- 4 files changed, 1170 insertions(+), 181 deletions(-) -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 01/15] net: bridge: mdb: arrange internal structs so fast-path fields are close
Before this patch we'd need 2 cache lines for fast-path, now all used fields are in the first cache line. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_private.h | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index baa1500f384f..357b6905ecef 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -217,23 +217,27 @@ struct net_bridge_fdb_entry { struct net_bridge_port_group { struct net_bridge_port *port; struct net_bridge_port_group __rcu *next; - struct hlist_node mglist; - struct rcu_head rcu; - struct timer_list timer; struct br_ip addr; unsigned char eth_addr[ETH_ALEN] __aligned(2); unsigned char flags; + + struct timer_list timer; + struct hlist_node mglist; + + struct rcu_head rcu; }; struct net_bridge_mdb_entry { struct rhash_head rhnode; struct net_bridge *br; struct net_bridge_port_group __rcu *ports; - struct rcu_head rcu; - struct timer_list timer; struct br_ip addr; bool host_joined; + + struct timer_list timer; struct hlist_node mdb_node; + + struct rcu_head rcu; }; struct net_bridge_port { -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 02/15] net: bridge: mcast: add support for group source list
Initial functions for group source lists which are needed for IGMPv3 include/exclude lists. Currently only IPv4 sources are supported. User-added mdb entries are created with exclude filter mode, we can extend that later to allow user-supplied mode. When group src entries are deleted, they're freed from a workqueue to make sure their timers are not still running. Source entries are protected by the multicast_lock and rcu. The number of src groups per port group is limited to 32. v2: allow src groups to be traversed under rcu Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_mdb.c | 6 ++ net/bridge/br_multicast.c | 129 ++++++++++++++++++++++++++++++++++++++ net/bridge/br_private.h | 23 +++++++ 3 files changed, 158 insertions(+) diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c index da5ed4cf9233..025a5aff2b2f 100644 --- a/net/bridge/br_mdb.c +++ b/net/bridge/br_mdb.c @@ -641,6 +641,7 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port, p = br_multicast_new_port_group(port, group, *pp, state, NULL); if (unlikely(!p)) return -ENOMEM; + p->filter_mode = MCAST_EXCLUDE; rcu_assign_pointer(*pp, p); if (state == MDB_TEMPORARY) mod_timer(&p->timer, now + br->multicast_membership_interval); @@ -761,6 +762,11 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry) if (!p->port || p->port->dev->ifindex != entry->ifindex) continue; + if (!hlist_empty(&p->src_list)) { + err = -EINVAL; + goto unlock; + } + if (p->port->state == BR_STATE_DISABLED) goto unlock; diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 4c4a93abde68..9269d62884e5 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -163,12 +163,24 @@ static void br_multicast_group_expired(struct timer_list *t) spin_unlock(&br->multicast_lock); } +static void br_multicast_del_group_src(struct net_bridge_group_src *src) +{ + struct net_bridge *br = src->pg->port->br; + + hlist_del_init_rcu(&src->node); + src->pg->src_ents--; + hlist_add_head(&src->del_node, &br->src_gc_list); + queue_work(system_long_wq, &br->src_gc_work); +} + static void br_multicast_del_pg(struct net_bridge *br, struct net_bridge_port_group *pg) { struct net_bridge_mdb_entry *mp; struct net_bridge_port_group *p; struct net_bridge_port_group __rcu **pp; + struct net_bridge_group_src *ent; + struct hlist_node *tmp; mp = br_mdb_ip_get(br, &pg->addr); if (WARN_ON(!mp)) @@ -183,6 +195,8 @@ static void br_multicast_del_pg(struct net_bridge *br, rcu_assign_pointer(*pp, p->next); hlist_del_init(&p->mglist); del_timer(&p->timer); + hlist_for_each_entry_safe(ent, tmp, &pg->src_list, node) + br_multicast_del_group_src(ent); br_mdb_notify(br->dev, p->port, &pg->addr, RTM_DELMDB, p->flags); kfree_rcu(p, rcu); @@ -470,6 +484,83 @@ struct net_bridge_mdb_entry *br_multicast_new_group(struct net_bridge *br, return mp; } +static void br_multicast_group_src_expired(struct timer_list *t) +{ + struct net_bridge_group_src *src = from_timer(src, t, timer); + struct net_bridge_port_group *pg; + struct net_bridge *br = src->br; + + spin_lock(&br->multicast_lock); + if (hlist_unhashed(&src->node) || !netif_running(br->dev) || + timer_pending(&src->timer)) + goto out; + + pg = src->pg; + br_debug(br, "port %s: src %pI4 group %pI4 filter mode %u timer expired\n", + pg->port->dev->name, &src->addr.u.ip4, &src->pg->addr.u.ip4, + src->pg->filter_mode); + + if (pg->filter_mode == MCAST_INCLUDE) { + br_multicast_del_group_src(src); + if (!hlist_empty(&pg->src_list)) + goto out; + br_multicast_del_pg(br, pg); + } +out: + spin_unlock(&br->multicast_lock); +} + +static struct net_bridge_group_src * +br_multicast_find_group_src(struct net_bridge_port_group *pg, struct br_ip *ip) +{ + struct net_bridge_group_src *ent; + + hlist_for_each_entry(ent, &pg->src_list, node) { + if (ent->addr.proto != ip->proto) + continue; + + switch (ip->proto) { + case htons(ETH_P_IP): + if (ip->u.ip4 == ent->addr.u.ip4) + return ent; + break; + } + } + + return NULL; +} + +static struct net_bridge_group_src * +br_multicast_new_group_src(struct net_bridge_port_group *pg, struct br_ip *src_ip) +{ + struct net_bridge_group_src *grp_src; + + if (unlikely(pg->src_ents >= PG_SRC_ENT_LIMIT)) + return NULL; + + switch (src_ip->proto) { + case htons(ETH_P_IP): + if (ipv4_is_zeronet(src_ip->u.ip4) || + ipv4_is_multicast(src_ip->u.ip4)) + return NULL; + break; + } + + grp_src = kzalloc(sizeof(*grp_src), GFP_ATOMIC); + if (unlikely(!grp_src)) + return NULL; + + grp_src->pg = pg; + grp_src->br = pg->port->br; + grp_src->addr = *src_ip; + timer_setup(&grp_src->timer, br_multicast_group_src_expired, 0); + + hlist_add_head_rcu(&grp_src->node, &pg->src_list); + pg->src_ents++; + + return grp_src; +} + struct net_bridge_port_group *br_multicast_new_port_group( struct net_bridge_port *port, struct br_ip *group, @@ -486,6 +577,12 @@ struct net_bridge_port_group *br_multicast_new_port_group( p->addr = *group; p->port = port; p->flags = flags; + if (group->proto == htons(ETH_P_IP) && + port->br->multicast_igmp_version == 3) + p->filter_mode = MCAST_INCLUDE; + else + p->filter_mode = MCAST_EXCLUDE; + INIT_HLIST_HEAD(&p->src_list); rcu_assign_pointer(p->next, next); hlist_add_head(&p->mglist, &port->mglist); timer_setup(&p->timer, br_multicast_port_group_expired, 0); @@ -1781,6 +1878,31 @@ static void br_ip6_multicast_query_expired(struct timer_list *t) } #endif +static void __grp_src_gc(struct hlist_head *head) +{ + struct net_bridge_group_src *ent; + struct hlist_node *tmp; + + hlist_for_each_entry_safe(ent, tmp, head, del_node) { + hlist_del_init(&ent->del_node); + del_timer_sync(&ent->timer); + kfree_rcu(ent, rcu); + } +} + +static void br_multicast_src_gc(struct work_struct *work) +{ + struct net_bridge *br = container_of(work, struct net_bridge, + src_gc_work); + HLIST_HEAD(deleted_head); + + spin_lock_bh(&br->multicast_lock); + hlist_move_list(&br->src_gc_list, &deleted_head); + spin_unlock_bh(&br->multicast_lock); + + __grp_src_gc(&deleted_head); +} + void br_multicast_init(struct net_bridge *br) { br->hash_max = BR_MULTICAST_DEFAULT_HASH_MAX; @@ -1821,6 +1943,8 @@ void br_multicast_init(struct net_bridge *br) br_ip6_multicast_query_expired, 0); #endif INIT_HLIST_HEAD(&br->mdb_list); + INIT_HLIST_HEAD(&br->src_gc_list); + INIT_WORK(&br->src_gc_work, br_multicast_src_gc); } static void br_ip4_multicast_join_snoopers(struct net_bridge *br) @@ -1924,6 +2048,7 @@ void br_multicast_stop(struct net_bridge *br) void br_multicast_dev_del(struct net_bridge *br) { struct net_bridge_mdb_entry *mp; + HLIST_HEAD(deleted_head); struct hlist_node *tmp; spin_lock_bh(&br->multicast_lock); @@ -1934,8 +2059,12 @@ void br_multicast_dev_del(struct net_bridge *br) hlist_del_rcu(&mp->mdb_node); kfree_rcu(mp, rcu); } + hlist_move_list(&br->src_gc_list, &deleted_head); spin_unlock_bh(&br->multicast_lock); + __grp_src_gc(&deleted_head); + cancel_work_sync(&br->src_gc_work); + rcu_barrier(); } diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 357b6905ecef..311ad0e402dc 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -214,13 +214,34 @@ struct net_bridge_fdb_entry { #define MDB_PG_FLAGS_OFFLOAD BIT(1) #define MDB_PG_FLAGS_FAST_LEAVE BIT(2) +#define PG_SRC_ENT_LIMIT 32 + +#define BR_SGRP_F_DELETE BIT(0) +#define BR_SGRP_F_SEND BIT(1) + +struct net_bridge_group_src { + struct hlist_node node; + + struct br_ip addr; + struct net_bridge_port_group *pg; + u8 flags; + struct timer_list timer; + + struct net_bridge *br; + struct hlist_node del_node; + struct rcu_head rcu; +}; + struct net_bridge_port_group { struct net_bridge_port *port; struct net_bridge_port_group __rcu *next; struct br_ip addr; unsigned char eth_addr[ETH_ALEN] __aligned(2); unsigned char flags; + unsigned char filter_mode; + struct hlist_head src_list; + unsigned int src_ents; struct timer_list timer; struct hlist_node mglist; @@ -410,6 +431,7 @@ struct net_bridge { struct rhashtable mdb_hash_tbl; + struct hlist_head src_gc_list; struct hlist_head mdb_list; struct hlist_head router_list; @@ -423,6 +445,7 @@ struct net_bridge { struct bridge_mcast_own_query ip6_own_query; struct bridge_mcast_querier ip6_querier; #endif /* IS_ENABLED(CONFIG_IPV6) */ + struct work_struct src_gc_work; #endif struct timer_list hello_timer; -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 03/15] net: bridge: mcast: add support for src list and filter mode dumping
Support per port group src list (address and timer) and filter mode dumping. Protected by either multicast_lock or rcu and currently limited only to IPv4. v2: require RCU or multicast_lock to traverse src groups Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- include/uapi/linux/if_bridge.h | 21 +++++++++++ net/bridge/br_mdb.c | 67 +++++++++++++++++++++++++++++++++- 2 files changed, 86 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h index c1227aecd38f..75a2ac479247 100644 --- a/include/uapi/linux/if_bridge.h +++ b/include/uapi/linux/if_bridge.h @@ -455,10 +455,31 @@ enum { enum { MDBA_MDB_EATTR_UNSPEC, MDBA_MDB_EATTR_TIMER, + MDBA_MDB_EATTR_SRC_LIST, + MDBA_MDB_EATTR_GROUP_MODE, __MDBA_MDB_EATTR_MAX }; #define MDBA_MDB_EATTR_MAX (__MDBA_MDB_EATTR_MAX - 1) +/* per mdb entry source */ +enum { + MDBA_MDB_SRCLIST_UNSPEC, + MDBA_MDB_SRCLIST_ENTRY, + __MDBA_MDB_SRCLIST_MAX +}; +#define MDBA_MDB_SRCLIST_MAX (__MDBA_MDB_SRCLIST_MAX - 1) + +/* per mdb entry per source attributes + * these are embedded in MDBA_MDB_SRCLIST_ENTRY + */ +enum { + MDBA_MDB_SRCATTR_UNSPEC, + MDBA_MDB_SRCATTR_ADDRESS, + MDBA_MDB_SRCATTR_TIMER, + __MDBA_MDB_SRCATTR_MAX +}; +#define MDBA_MDB_SRCATTR_MAX (__MDBA_MDB_SRCATTR_MAX - 1) + /* multicast router types */ enum { MDB_RTR_TYPE_DISABLED, diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c index 025a5aff2b2f..7d4d064a49e3 100644 --- a/net/bridge/br_mdb.c +++ b/net/bridge/br_mdb.c @@ -77,6 +77,53 @@ static void __mdb_entry_to_br_ip(struct br_mdb_entry *entry, struct br_ip *ip) #endif } +static int __mdb_fill_srcs(struct sk_buff *skb, + struct net_bridge_port_group *p) +{ + struct net_bridge_group_src *ent; + struct nlattr *nest, *nest_ent; + + if (hlist_empty(&p->src_list)) + return 0; + + nest = nla_nest_start(skb, MDBA_MDB_EATTR_SRC_LIST); + if (!nest) + return -EMSGSIZE; + + hlist_for_each_entry_rcu(ent, &p->src_list, node, + lockdep_is_held(&p->port->br->multicast_lock)) { + nest_ent = nla_nest_start(skb, MDBA_MDB_SRCLIST_ENTRY); + if (!nest_ent) + goto out_cancel_err; + switch (ent->addr.proto) { + case htons(ETH_P_IP): + if (nla_put_in_addr(skb, MDBA_MDB_SRCATTR_ADDRESS, + ent->addr.u.ip4)) { + nla_nest_cancel(skb, nest_ent); + goto out_cancel_err; + } + break; + default: + nla_nest_cancel(skb, nest_ent); + continue; + } + if (nla_put_u32(skb, MDBA_MDB_SRCATTR_TIMER, + br_timer_value(&ent->timer))) { + nla_nest_cancel(skb, nest_ent); + goto out_cancel_err; + } + nla_nest_end(skb, nest_ent); + } + + nla_nest_end(skb, nest); + + return 0; + +out_cancel_err: + nla_nest_cancel(skb, nest); + return -EMSGSIZE; +} + static int __mdb_fill_info(struct sk_buff *skb, struct net_bridge_mdb_entry *mp, struct net_bridge_port_group *p) @@ -119,6 +166,15 @@ static int __mdb_fill_info(struct sk_buff *skb, nla_nest_cancel(skb, nest_ent); return -EMSGSIZE; } + if (p && + mp->br->multicast_igmp_version == 3 && + mp->addr.proto == htons(ETH_P_IP) && + (__mdb_fill_srcs(skb, p) || + nla_put_u8(skb, MDBA_MDB_EATTR_GROUP_MODE, p->filter_mode))) { + nla_nest_cancel(skb, nest_ent); + return -EMSGSIZE; + } + nla_nest_end(skb, nest_ent); return 0; @@ -127,7 +183,7 @@ static int __mdb_fill_info(struct sk_buff *skb, static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb, struct net_device *dev) { - int idx = 0, s_idx = cb->args[1], err = 0; + int idx = 0, s_idx = cb->args[1], err = 0, pidx = 0, s_pidx = cb->args[2]; struct net_bridge *br = netdev_priv(dev); struct net_bridge_mdb_entry *mp; struct nlattr *nest, *nest2; @@ -152,7 +208,7 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb, break; } - if (mp->host_joined) { + if (!s_pidx && mp->host_joined) { err = __mdb_fill_info(skb, mp, NULL); if (err) { nla_nest_cancel(skb, nest2); @@ -164,13 +220,19 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb, pp = &p->next) { if (!p->port) continue; + if (pidx < s_pidx) + goto skip_pg; err = __mdb_fill_info(skb, mp, p); if (err) { nla_nest_cancel(skb, nest2); goto out; } +skip_pg: + pidx++; } + pidx = 0; + s_pidx = 0; nla_nest_end(skb, nest2); skip: idx++; @@ -178,6 +240,7 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb, out: cb->args[1] = idx; + cb->args[2] = pidx; nla_nest_end(skb, nest); return err; } -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 04/15] net: bridge: mcast: add support for group-and-source specific queries
Allows br_ip4_multicast_alloc_query to build queries with the port group's source lists and sends a query for sources over and under lmqt when necessary as per RFC 3376 with the suppress flag set appropriately. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_multicast.c | 126 +++++++++++++++++++++++++++++--------- net/bridge/br_private.h | 1 + 2 files changed, 99 insertions(+), 28 deletions(-) diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 9269d62884e5..8a415672764a 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -228,21 +228,50 @@ static void br_multicast_port_group_expired(struct timer_list *t) } static struct sk_buff *br_ip4_multicast_alloc_query(struct net_bridge *br, - __be32 group, - u8 *igmp_type) + struct net_bridge_port_group *pg, + __be32 ip_dst, __be32 group, + bool with_srcs, bool over_lmqt, + u8 sflag, u8 *igmp_type) { + struct net_bridge_port *p = pg ? pg->port : NULL; + struct net_bridge_group_src *ent; + size_t pkt_size, igmp_hdr_size; + unsigned long now = jiffies; struct igmpv3_query *ihv3; - size_t igmp_hdr_size; + void *csum_start = NULL; + __sum16 *csum = NULL; struct sk_buff *skb; struct igmphdr *ih; struct ethhdr *eth; + unsigned long lmqt; struct iphdr *iph; + u16 lmqt_srcs = 0; igmp_hdr_size = sizeof(*ih); - if (br->multicast_igmp_version == 3) + if (br->multicast_igmp_version == 3) { igmp_hdr_size = sizeof(*ihv3); - skb = netdev_alloc_skb_ip_align(br->dev, sizeof(*eth) + sizeof(*iph) + - igmp_hdr_size + 4); + if (pg && with_srcs) { + lmqt = now + (br->multicast_last_member_interval * + br->multicast_last_member_count); + hlist_for_each_entry(ent, &pg->src_list, node) { + if (over_lmqt == time_after(ent->timer.expires, + lmqt) && + ent->src_query_rexmit_cnt > 0) + lmqt_srcs++; + } + + if (!lmqt_srcs) + return NULL; + igmp_hdr_size += lmqt_srcs * sizeof(__be32); + } + } + + pkt_size = sizeof(*eth) + sizeof(*iph) + 4 + igmp_hdr_size; + if ((p && pkt_size > p->dev->mtu) || + pkt_size > br->dev->mtu) + return NULL; + + skb = netdev_alloc_skb_ip_align(br->dev, pkt_size); if (!skb) goto out; @@ -252,29 +281,24 @@ static struct sk_buff *br_ip4_multicast_alloc_query(struct net_bridge *br, eth = eth_hdr(skb); ether_addr_copy(eth->h_source, br->dev->dev_addr); - eth->h_dest[0] = 1; - eth->h_dest[1] = 0; - eth->h_dest[2] = 0x5e; - eth->h_dest[3] = 0; - eth->h_dest[4] = 0; - eth->h_dest[5] = 1; + ip_eth_mc_map(ip_dst, eth->h_dest); eth->h_proto = htons(ETH_P_IP); skb_put(skb, sizeof(*eth)); skb_set_network_header(skb, skb->len); iph = ip_hdr(skb); + iph->tot_len = htons(pkt_size - sizeof(*eth)); iph->version = 4; iph->ihl = 6; iph->tos = 0xc0; - iph->tot_len = htons(sizeof(*iph) + igmp_hdr_size + 4); iph->id = 0; iph->frag_off = htons(IP_DF); iph->ttl = 1; iph->protocol = IPPROTO_IGMP; iph->saddr = br_opt_get(br, BROPT_MULTICAST_QUERY_USE_IFADDR) ? inet_select_addr(br->dev, 0, RT_SCOPE_LINK) : 0; - iph->daddr = htonl(INADDR_ALLHOSTS_GROUP); + iph->daddr = ip_dst; ((u8 *)&iph[1])[0] = IPOPT_RA; ((u8 *)&iph[1])[1] = 4; ((u8 *)&iph[1])[2] = 0; @@ -294,7 +318,8 @@ static struct sk_buff *br_ip4_multicast_alloc_query(struct net_bridge *br, (HZ / IGMP_TIMER_SCALE); ih->group = group; ih->csum = 0; - ih->csum = ip_compute_csum((void *)ih, sizeof(*ih)); + csum = &ih->csum; + csum_start = (void *)ih; break; case 3: ihv3 = igmpv3_query_hdr(skb); @@ -304,15 +329,38 @@ static struct sk_buff *br_ip4_multicast_alloc_query(struct net_bridge *br, (HZ / IGMP_TIMER_SCALE); ihv3->group = group; ihv3->qqic = br->multicast_query_interval / HZ; - ihv3->nsrcs = 0; + ihv3->nsrcs = htons(lmqt_srcs); ihv3->resv = 0; - ihv3->suppress = 0; + ihv3->suppress = sflag; ihv3->qrv = 2; ihv3->csum = 0; - ihv3->csum = ip_compute_csum((void *)ihv3, sizeof(*ihv3)); + csum = &ihv3->csum; + csum_start = (void *)ihv3; + if (!pg || !with_srcs) + break; + + lmqt_srcs = 0; + hlist_for_each_entry(ent, &pg->src_list, node) { + if (over_lmqt == time_after(ent->timer.expires, + lmqt) && + ent->src_query_rexmit_cnt > 0) { + ihv3->srcs[lmqt_srcs++] = ent->addr.u.ip4; + ent->src_query_rexmit_cnt--; + } + } + if (WARN_ON(lmqt_srcs != ntohs(ihv3->nsrcs))) { + kfree_skb(skb); + return NULL; + } break; } + if (WARN_ON(!csum || !csum_start)) { + kfree(skb); + return NULL; + } + + *csum = ip_compute_csum(csum_start, igmp_hdr_size); skb_put(skb, igmp_hdr_size); __skb_pull(skb, sizeof(*eth)); @@ -435,15 +483,24 @@ static struct sk_buff *br_ip6_multicast_alloc_query(struct net_bridge *br, #endif static struct sk_buff *br_multicast_alloc_query(struct net_bridge *br, - struct br_ip *addr, - u8 *igmp_type) + struct net_bridge_port_group *pg, + struct br_ip *ip_dst, + struct br_ip *group, + bool with_srcs, bool over_lmqt, + u8 sflag, u8 *igmp_type) { - switch (addr->proto) { + __be32 ip4_dst; + + switch (group->proto) { case htons(ETH_P_IP): - return br_ip4_multicast_alloc_query(br, addr->u.ip4, igmp_type); + ip4_dst = ip_dst ? ip_dst->u.ip4 : htonl(INADDR_ALLHOSTS_GROUP); + return br_ip4_multicast_alloc_query(br, pg, + ip4_dst, group->u.ip4, + with_srcs, over_lmqt, + sflag, igmp_type); #if IS_ENABLED(CONFIG_IPV6) case htons(ETH_P_IPV6): - return br_ip6_multicast_alloc_query(br, &addr->u.ip6, + return br_ip6_multicast_alloc_query(br, &group->u.ip6, igmp_type); #endif } @@ -808,12 +865,19 @@ static void br_multicast_select_own_querier(struct net_bridge *br, static void __br_multicast_send_query(struct net_bridge *br, struct net_bridge_port *port, - struct br_ip *ip) + struct net_bridge_port_group *pg, + struct br_ip *ip_dst, + struct br_ip *group, + bool with_srcs, + u8 sflag) { + bool over_lmqt = !!sflag; struct sk_buff *skb; u8 igmp_type; - skb = br_multicast_alloc_query(br, ip, &igmp_type); +again_under_lmqt: + skb = br_multicast_alloc_query(br, pg, ip_dst, group, with_srcs, + over_lmqt, sflag, &igmp_type); if (!skb) return; @@ -824,8 +888,13 @@ static void __br_multicast_send_query(struct net_bridge *br, NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_OUT, dev_net(port->dev), NULL, skb, NULL, skb->dev, br_dev_queue_push_xmit); + + if (over_lmqt && with_srcs && sflag) { + over_lmqt = false; + goto again_under_lmqt; + } } else { - br_multicast_select_own_querier(br, ip, skb); + br_multicast_select_own_querier(br, group, skb); br_multicast_count(br, port, skb, igmp_type, BR_MCAST_DIR_RX); netif_rx(skb); @@ -861,7 +930,7 @@ static void br_multicast_send_query(struct net_bridge *br, if (!other_query || timer_pending(&other_query->timer)) return; - __br_multicast_send_query(br, port, &br_group); + __br_multicast_send_query(br, port, NULL, NULL, &br_group, false, 0); time = jiffies; time += own_query->startup_sent < br->multicast_startup_query_count ? @@ -1522,7 +1591,8 @@ br_multicast_leave_group(struct net_bridge *br, goto out; if (br_opt_get(br, BROPT_MULTICAST_QUERIER)) { - __br_multicast_send_query(br, port, &mp->addr); + __br_multicast_send_query(br, port, NULL, NULL, &mp->addr, + false, 0); time = jiffies + br->multicast_last_member_count * br->multicast_last_member_interval; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 311ad0e402dc..87610d6c0b3f 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -225,6 +225,7 @@ struct net_bridge_group_src { struct br_ip addr; struct net_bridge_port_group *pg; u8 flags; + u8 src_query_rexmit_cnt; struct timer_list timer; struct net_bridge *br; -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 05/15] net: bridge: mcast: factor out port group del
In order to avoid future errors and reduce code duplication we should factor out the port group del sequence. This allows us to have one function which takes care of all details when removing a port group. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_mdb.c | 15 +--------- net/bridge/br_multicast.c | 59 +++++++++++++++++++-------------------- net/bridge/br_private.h | 3 ++ 3 files changed, 32 insertions(+), 45 deletions(-) diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c index 7d4d064a49e3..7625db4b7fb9 100644 --- a/net/bridge/br_mdb.c +++ b/net/bridge/br_mdb.c @@ -825,24 +825,11 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry) if (!p->port || p->port->dev->ifindex != entry->ifindex) continue; - if (!hlist_empty(&p->src_list)) { - err = -EINVAL; - goto unlock; - } - if (p->port->state == BR_STATE_DISABLED) goto unlock; - __mdb_entry_fill_flags(entry, p->flags); - rcu_assign_pointer(*pp, p->next); - hlist_del_init(&p->mglist); - del_timer(&p->timer); - kfree_rcu(p, rcu); + br_multicast_del_pg(mp, p, pp); err = 0; - - if (!mp->ports && !mp->host_joined && - netif_running(br->dev)) - mod_timer(&mp->timer, jiffies); break; } diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 8a415672764a..ff73d1760b2f 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -173,14 +173,32 @@ static void br_multicast_del_group_src(struct net_bridge_group_src *src) queue_work(system_long_wq, &br->src_gc_work); } -static void br_multicast_del_pg(struct net_bridge *br, - struct net_bridge_port_group *pg) +void br_multicast_del_pg(struct net_bridge_mdb_entry *mp, + struct net_bridge_port_group *pg, + struct net_bridge_port_group __rcu **pp) +{ + struct net_bridge *br = pg->port->br; + struct net_bridge_group_src *ent; + struct hlist_node *tmp; + + rcu_assign_pointer(*pp, pg->next); + hlist_del_init(&pg->mglist); + del_timer(&pg->timer); + hlist_for_each_entry_safe(ent, tmp, &pg->src_list, node) + br_multicast_del_group_src(ent); + br_mdb_notify(br->dev, pg->port, &pg->addr, RTM_DELMDB, pg->flags); + kfree_rcu(pg, rcu); + + if (!mp->ports && !mp->host_joined && netif_running(br->dev)) + mod_timer(&mp->timer, jiffies); +} + +static void br_multicast_find_del_pg(struct net_bridge *br, + struct net_bridge_port_group *pg) { struct net_bridge_mdb_entry *mp; struct net_bridge_port_group *p; struct net_bridge_port_group __rcu **pp; - struct net_bridge_group_src *ent; - struct hlist_node *tmp; mp = br_mdb_ip_get(br, &pg->addr); if (WARN_ON(!mp)) @@ -192,19 +210,7 @@ static void br_multicast_del_pg(struct net_bridge *br, if (p != pg) continue; - rcu_assign_pointer(*pp, p->next); - hlist_del_init(&p->mglist); - del_timer(&p->timer); - hlist_for_each_entry_safe(ent, tmp, &pg->src_list, node) - br_multicast_del_group_src(ent); - br_mdb_notify(br->dev, p->port, &pg->addr, RTM_DELMDB, - p->flags); - kfree_rcu(p, rcu); - - if (!mp->ports && !mp->host_joined && - netif_running(br->dev)) - mod_timer(&mp->timer, jiffies); - + br_multicast_del_pg(mp, pg, pp); return; } @@ -221,7 +227,7 @@ static void br_multicast_port_group_expired(struct timer_list *t) hlist_unhashed(&pg->mglist) || pg->flags & MDB_PG_FLAGS_PERMANENT) goto out; - br_multicast_del_pg(br, pg); + br_multicast_find_del_pg(br, pg); out: spin_unlock(&br->multicast_lock); @@ -561,7 +567,7 @@ static void br_multicast_group_src_expired(struct timer_list *t) br_multicast_del_group_src(src); if (!hlist_empty(&pg->src_list)) goto out; - br_multicast_del_pg(br, pg); + br_multicast_find_del_pg(br, pg); } out: spin_unlock(&br->multicast_lock); @@ -1018,7 +1024,7 @@ void br_multicast_del_port(struct net_bridge_port *port) /* Take care of the remaining groups, only perm ones should be left */ spin_lock_bh(&br->multicast_lock); hlist_for_each_entry_safe(pg, n, &port->mglist, mglist) - br_multicast_del_pg(br, pg); + br_multicast_find_del_pg(br, pg); spin_unlock_bh(&br->multicast_lock); del_timer_sync(&port->multicast_router_timer); free_percpu(port->mcast_stats); @@ -1067,7 +1073,7 @@ void br_multicast_disable_port(struct net_bridge_port *port) spin_lock(&br->multicast_lock); hlist_for_each_entry_safe(pg, n, &port->mglist, mglist) if (!(pg->flags & MDB_PG_FLAGS_PERMANENT)) - br_multicast_del_pg(br, pg); + br_multicast_find_del_pg(br, pg); __del_port_router(port); @@ -1573,16 +1579,7 @@ br_multicast_leave_group(struct net_bridge *br, if (p->flags & MDB_PG_FLAGS_PERMANENT) break; - rcu_assign_pointer(*pp, p->next); - hlist_del_init(&p->mglist); - del_timer(&p->timer); - kfree_rcu(p, rcu); - br_mdb_notify(br->dev, port, group, RTM_DELMDB, - p->flags | MDB_PG_FLAGS_FAST_LEAVE); - - if (!mp->ports && !mp->host_joined && - netif_running(br->dev)) - mod_timer(&mp->timer, jiffies); + br_multicast_del_pg(mp, p, pp); } goto out; } diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 87610d6c0b3f..e363b508db92 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -801,6 +801,9 @@ void br_mdb_notify(struct net_device *dev, struct net_bridge_port *port, struct br_ip *group, int type, u8 flags); void br_rtr_notify(struct net_device *dev, struct net_bridge_port *port, int type); +void br_multicast_del_pg(struct net_bridge_mdb_entry *mp, + struct net_bridge_port_group *pg, + struct net_bridge_port_group __rcu **pp); void br_multicast_count(struct net_bridge *br, const struct net_bridge_port *p, const struct sk_buff *skb, u8 type, u8 dir); int br_multicast_init_stats(struct net_bridge *br); -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 06/15] net: bridge: mcast: add support for group query retransmit
We need to be able to retransmit group-specific and group-and-source specific queries. The new timer takes care of those. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_multicast.c | 65 ++++++++++++++++++++++++++++++++++----- net/bridge/br_private.h | 8 +++++ 2 files changed, 65 insertions(+), 8 deletions(-) diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index ff73d1760b2f..1a910f139ee2 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -50,6 +50,7 @@ static void br_ip4_multicast_leave_group(struct net_bridge *br, __be32 group, __u16 vid, const unsigned char *src); +static void br_multicast_port_group_rexmit(struct timer_list *t); static void __del_port_router(struct net_bridge_port *p); #if IS_ENABLED(CONFIG_IPV6) @@ -184,6 +185,7 @@ void br_multicast_del_pg(struct net_bridge_mdb_entry *mp, rcu_assign_pointer(*pp, pg->next); hlist_del_init(&pg->mglist); del_timer(&pg->timer); + del_timer(&pg->rexmit_timer); hlist_for_each_entry_safe(ent, tmp, &pg->src_list, node) br_multicast_del_group_src(ent); br_mdb_notify(br->dev, pg->port, &pg->addr, RTM_DELMDB, pg->flags); @@ -237,7 +239,8 @@ static struct sk_buff *br_ip4_multicast_alloc_query(struct net_bridge *br, struct net_bridge_port_group *pg, __be32 ip_dst, __be32 group, bool with_srcs, bool over_lmqt, - u8 sflag, u8 *igmp_type) + u8 sflag, u8 *igmp_type, + bool *need_rexmit) { struct net_bridge_port *p = pg ? pg->port : NULL; struct net_bridge_group_src *ent; @@ -352,6 +355,8 @@ static struct sk_buff *br_ip4_multicast_alloc_query(struct net_bridge *br, ent->src_query_rexmit_cnt > 0) { ihv3->srcs[lmqt_srcs++] = ent->addr.u.ip4; ent->src_query_rexmit_cnt--; + if (need_rexmit && ent->src_query_rexmit_cnt) + *need_rexmit = true; } } if (WARN_ON(lmqt_srcs != ntohs(ihv3->nsrcs))) { @@ -493,7 +498,8 @@ static struct sk_buff *br_multicast_alloc_query(struct net_bridge *br, struct br_ip *ip_dst, struct br_ip *group, bool with_srcs, bool over_lmqt, - u8 sflag, u8 *igmp_type) + u8 sflag, u8 *igmp_type, + bool *need_rexmit) { __be32 ip4_dst; @@ -503,7 +509,8 @@ static struct sk_buff *br_multicast_alloc_query(struct net_bridge *br, return br_ip4_multicast_alloc_query(br, pg, ip4_dst, group->u.ip4, with_srcs, over_lmqt, - sflag, igmp_type); + sflag, igmp_type, + need_rexmit); #if IS_ENABLED(CONFIG_IPV6) case htons(ETH_P_IPV6): return br_ip6_multicast_alloc_query(br, &group->u.ip6, @@ -647,8 +654,9 @@ struct net_bridge_port_group *br_multicast_new_port_group( p->filter_mode = MCAST_EXCLUDE; INIT_HLIST_HEAD(&p->src_list); rcu_assign_pointer(p->next, next); - hlist_add_head(&p->mglist, &port->mglist); timer_setup(&p->timer, br_multicast_port_group_expired, 0); + timer_setup(&p->rexmit_timer, br_multicast_port_group_rexmit, 0); + hlist_add_head(&p->mglist, &port->mglist); if (src) memcpy(p->eth_addr, src, ETH_ALEN); @@ -875,7 +883,8 @@ static void __br_multicast_send_query(struct net_bridge *br, struct br_ip *ip_dst, struct br_ip *group, bool with_srcs, - u8 sflag) + u8 sflag, + bool *need_rexmit) { bool over_lmqt = !!sflag; struct sk_buff *skb; @@ -883,7 +892,8 @@ static void __br_multicast_send_query(struct net_bridge *br, again_under_lmqt: skb = br_multicast_alloc_query(br, pg, ip_dst, group, with_srcs, - over_lmqt, sflag, &igmp_type); + over_lmqt, sflag, &igmp_type, + need_rexmit); if (!skb) return; @@ -936,7 +946,8 @@ static void br_multicast_send_query(struct net_bridge *br, if (!other_query || timer_pending(&other_query->timer)) return; - __br_multicast_send_query(br, port, NULL, NULL, &br_group, false, 0); + __br_multicast_send_query(br, port, NULL, NULL, &br_group, false, 0, + NULL); time = jiffies; time += own_query->startup_sent < br->multicast_startup_query_count ? @@ -981,6 +992,44 @@ static void br_ip6_multicast_port_query_expired(struct timer_list *t) } #endif +static void br_multicast_port_group_rexmit(struct timer_list *t) +{ + struct net_bridge_port_group *pg = from_timer(pg, t, rexmit_timer); + struct bridge_mcast_other_query *other_query = NULL; + struct net_bridge *br = pg->port->br; + bool need_rexmit = false; + + spin_lock(&br->multicast_lock); + if (!netif_running(br->dev) || hlist_unhashed(&pg->mglist) || + !br_opt_get(br, BROPT_MULTICAST_ENABLED) || + !br_opt_get(br, BROPT_MULTICAST_QUERIER)) + goto out; + + if (pg->addr.proto == htons(ETH_P_IP)) + other_query = &br->ip4_other_query; +#if IS_ENABLED(CONFIG_IPV6) + else + other_query = &br->ip6_other_query; +#endif + + if (!other_query || timer_pending(&other_query->timer)) + goto out; + + if (pg->grp_query_rexmit_cnt) { + pg->grp_query_rexmit_cnt--; + __br_multicast_send_query(br, pg->port, pg, &pg->addr, + &pg->addr, false, 1, NULL); + } + __br_multicast_send_query(br, pg->port, pg, &pg->addr, + &pg->addr, true, 0, &need_rexmit); + + if (pg->grp_query_rexmit_cnt || need_rexmit) + mod_timer(&pg->rexmit_timer, jiffies + + br->multicast_last_member_interval); +out: + spin_unlock(&br->multicast_lock); +} + static void br_mc_disabled_update(struct net_device *dev, bool value) { struct switchdev_attr attr = { @@ -1589,7 +1638,7 @@ br_multicast_leave_group(struct net_bridge *br, if (br_opt_get(br, BROPT_MULTICAST_QUERIER)) { __br_multicast_send_query(br, port, NULL, NULL, &mp->addr, - false, 0); + false, 0, NULL); time = jiffies + br->multicast_last_member_count * br->multicast_last_member_interval; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index e363b508db92..34acad4a455a 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -240,10 +240,12 @@ struct net_bridge_port_group { unsigned char eth_addr[ETH_ALEN] __aligned(2); unsigned char flags; unsigned char filter_mode; + unsigned char grp_query_rexmit_cnt; struct hlist_head src_list; unsigned int src_ents; struct timer_list timer; + struct timer_list rexmit_timer; struct hlist_node mglist; struct rcu_head rcu; @@ -867,6 +869,12 @@ static inline int br_multicast_igmp_type(const struct sk_buff *skb) { return BR_INPUT_SKB_CB(skb)->igmp; } + +static inline unsigned long br_multicast_lmqt(const struct net_bridge *br) +{ + return br->multicast_last_member_interval * + br->multicast_last_member_count; +} #else static inline int br_multicast_rcv(struct net_bridge *br, struct net_bridge_port *port, -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 07/15] net: bridge: mdb: push notifications in __br_mdb_add/del
This change is in preparation for using the mdb port group entries when sending a notification, so their full state and additional attributes can be filled in. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_mdb.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c index 7625db4b7fb9..f5290021310a 100644 --- a/net/bridge/br_mdb.c +++ b/net/bridge/br_mdb.c @@ -663,7 +663,7 @@ static int br_mdb_parse(struct sk_buff *skb, struct nlmsghdr *nlh, } static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port, - struct br_ip *group, unsigned char state) + struct br_ip *group, struct br_mdb_entry *entry) { struct net_bridge_mdb_entry *mp; struct net_bridge_port_group *p; @@ -682,12 +682,13 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port, /* host join */ if (!port) { /* don't allow any flags for host-joined groups */ - if (state) + if (entry->state) return -EINVAL; if (mp->host_joined) return -EEXIST; br_multicast_host_join(mp, false); + __br_mdb_notify(br->dev, NULL, entry, RTM_NEWMDB); return 0; } @@ -701,13 +702,14 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port, break; } - p = br_multicast_new_port_group(port, group, *pp, state, NULL); + p = br_multicast_new_port_group(port, group, *pp, entry->state, NULL); if (unlikely(!p)) return -ENOMEM; p->filter_mode = MCAST_EXCLUDE; rcu_assign_pointer(*pp, p); - if (state == MDB_TEMPORARY) + if (entry->state == MDB_TEMPORARY) mod_timer(&p->timer, now + br->multicast_membership_interval); + __br_mdb_notify(br->dev, port, entry, RTM_NEWMDB); return 0; } @@ -736,7 +738,7 @@ static int __br_mdb_add(struct net *net, struct net_bridge *br, __mdb_entry_to_br_ip(entry, &ip); spin_lock_bh(&br->multicast_lock); - ret = br_mdb_add_group(br, p, &ip, entry->state); + ret = br_mdb_add_group(br, p, &ip, entry); spin_unlock_bh(&br->multicast_lock); return ret; } @@ -781,12 +783,9 @@ static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, err = __br_mdb_add(net, br, entry); if (err) break; - __br_mdb_notify(dev, p, entry, RTM_NEWMDB); } } else { err = __br_mdb_add(net, br, entry); - if (!err) - __br_mdb_notify(dev, p, entry, RTM_NEWMDB); } return err; @@ -814,6 +813,7 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry) if (entry->ifindex == mp->br->dev->ifindex && mp->host_joined) { br_multicast_host_leave(mp, false); err = 0; + __br_mdb_notify(br->dev, NULL, entry, RTM_DELMDB); if (!mp->ports && netif_running(br->dev)) mod_timer(&mp->timer, jiffies); goto unlock; @@ -876,13 +876,9 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, list_for_each_entry(v, &vg->vlan_list, vlist) { entry->vid = v->vid; err = __br_mdb_del(br, entry); - if (!err) - __br_mdb_notify(dev, p, entry, RTM_DELMDB); } } else { err = __br_mdb_del(br, entry); - if (!err) - __br_mdb_notify(dev, p, entry, RTM_DELMDB); } return err; -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 08/15] net: bridge: mdb: use mdb and port entries in notifications
We have to use mdb and port entries when sending mdb notifications in order to fill in all group attributes properly. Before this change we would've used a fake br_mdb_entry struct to fill in only partial information about the mdb. Now we can also reuse the mdb dump fill function and thus have only a single central place which fills the mdb attributes. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_mdb.c | 131 ++++++++++++++++++++------------------ net/bridge/br_multicast.c | 10 +-- net/bridge/br_private.h | 4 +- 3 files changed, 77 insertions(+), 68 deletions(-) diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c index f5290021310a..33236c2a14a3 100644 --- a/net/bridge/br_mdb.c +++ b/net/bridge/br_mdb.c @@ -326,14 +326,15 @@ static int br_mdb_dump(struct sk_buff *skb, struct netlink_callback *cb) static int nlmsg_populate_mdb_fill(struct sk_buff *skb, struct net_device *dev, - struct br_mdb_entry *entry, u32 pid, - u32 seq, int type, unsigned int flags) + struct net_bridge_mdb_entry *mp, + struct net_bridge_port_group *pg, + int type) { struct nlmsghdr *nlh; struct br_port_msg *bpm; struct nlattr *nest, *nest2; - nlh = nlmsg_put(skb, pid, seq, type, sizeof(*bpm), 0); + nlh = nlmsg_put(skb, 0, 0, type, sizeof(*bpm), 0); if (!nlh) return -EMSGSIZE; @@ -348,7 +349,7 @@ static int nlmsg_populate_mdb_fill(struct sk_buff *skb, if (nest2 == NULL) goto end; - if (nla_put(skb, MDBA_MDB_ENTRY_INFO, sizeof(*entry), entry)) + if (__mdb_fill_info(skb, mp, pg)) goto end; nla_nest_end(skb, nest2); @@ -363,10 +364,34 @@ static int nlmsg_populate_mdb_fill(struct sk_buff *skb, return -EMSGSIZE; } -static inline size_t rtnl_mdb_nlmsg_size(void) +static size_t rtnl_mdb_nlmsg_size(struct net_bridge_port_group *pg) { - return NLMSG_ALIGN(sizeof(struct br_port_msg)) - + nla_total_size(sizeof(struct br_mdb_entry)); + size_t nlmsg_size = NLMSG_ALIGN(sizeof(struct br_port_msg)) + + nla_total_size(sizeof(struct br_mdb_entry)) + + nla_total_size(sizeof(u32)); + + if (pg && pg->port->br->multicast_igmp_version == 3 && + pg->addr.proto == htons(ETH_P_IP)) { + struct net_bridge_group_src *ent; + + /* MDBA_MDB_EATTR_GROUP_MODE */ + nlmsg_size += nla_total_size(sizeof(u8)); + + /* MDBA_MDB_EATTR_SRC_LIST nested attr */ + if (!hlist_empty(&pg->src_list)) + nlmsg_size += nla_total_size(0); + + hlist_for_each_entry(ent, &pg->src_list, node) { + /* MDBA_MDB_SRCLIST_ENTRY nested attr + + * MDBA_MDB_SRCATTR_ADDRESS + MDBA_MDB_SRCATTR_TIMER + */ + nlmsg_size += nla_total_size(0) + + nla_total_size(sizeof(__be32)) + + nla_total_size(sizeof(u32)); + } + } + + return nlmsg_size; } struct br_mdb_complete_info { @@ -404,21 +429,22 @@ static void br_mdb_complete(struct net_device *dev, int err, void *priv) static void br_mdb_switchdev_host_port(struct net_device *dev, struct net_device *lower_dev, - struct br_mdb_entry *entry, int type) + struct net_bridge_mdb_entry *mp, + int type) { struct switchdev_obj_port_mdb mdb = { .obj = { .id = SWITCHDEV_OBJ_ID_HOST_MDB, .flags = SWITCHDEV_F_DEFER, }, - .vid = entry->vid, + .vid = mp->addr.vid, }; - if (entry->addr.proto == htons(ETH_P_IP)) - ip_eth_mc_map(entry->addr.u.ip4, mdb.addr); + if (mp->addr.proto == htons(ETH_P_IP)) + ip_eth_mc_map(mp->addr.u.ip4, mdb.addr); #if IS_ENABLED(CONFIG_IPV6) else - ipv6_eth_mc_map(&entry->addr.u.ip6, mdb.addr); + ipv6_eth_mc_map(&mp->addr.u.ip6, mdb.addr); #endif mdb.obj.orig_dev = dev; @@ -433,17 +459,19 @@ static void br_mdb_switchdev_host_port(struct net_device *dev, } static void br_mdb_switchdev_host(struct net_device *dev, - struct br_mdb_entry *entry, int type) + struct net_bridge_mdb_entry *mp, int type) { struct net_device *lower_dev; struct list_head *iter; netdev_for_each_lower_dev(dev, lower_dev, iter) - br_mdb_switchdev_host_port(dev, lower_dev, entry, type); + br_mdb_switchdev_host_port(dev, lower_dev, mp, type); } -static void __br_mdb_notify(struct net_device *dev, struct net_bridge_port *p, - struct br_mdb_entry *entry, int type) +void br_mdb_notify(struct net_device *dev, + struct net_bridge_mdb_entry *mp, + struct net_bridge_port_group *pg, + int type) { struct br_mdb_complete_info *complete_info; struct switchdev_obj_port_mdb mdb = { @@ -451,44 +479,45 @@ static void __br_mdb_notify(struct net_device *dev, struct net_bridge_port *p, .id = SWITCHDEV_OBJ_ID_PORT_MDB, .flags = SWITCHDEV_F_DEFER, }, - .vid = entry->vid, + .vid = mp->addr.vid, }; - struct net_device *port_dev; struct net *net = dev_net(dev); struct sk_buff *skb; int err = -ENOBUFS; - port_dev = __dev_get_by_index(net, entry->ifindex); - if (entry->addr.proto == htons(ETH_P_IP)) - ip_eth_mc_map(entry->addr.u.ip4, mdb.addr); + if (pg) { + if (mp->addr.proto == htons(ETH_P_IP)) + ip_eth_mc_map(mp->addr.u.ip4, mdb.addr); #if IS_ENABLED(CONFIG_IPV6) - else - ipv6_eth_mc_map(&entry->addr.u.ip6, mdb.addr); + else + ipv6_eth_mc_map(&mp->addr.u.ip6, mdb.addr); #endif - - mdb.obj.orig_dev = port_dev; - if (p && port_dev && type == RTM_NEWMDB) { - complete_info = kmalloc(sizeof(*complete_info), GFP_ATOMIC); - if (complete_info) { - complete_info->port = p; - __mdb_entry_to_br_ip(entry, &complete_info->ip); + mdb.obj.orig_dev = pg->port->dev; + switch (type) { + case RTM_NEWMDB: + complete_info = kmalloc(sizeof(*complete_info), GFP_ATOMIC); + if (!complete_info) + break; + complete_info->port = pg->port; + complete_info->ip = mp->addr; mdb.obj.complete_priv = complete_info; mdb.obj.complete = br_mdb_complete; - if (switchdev_port_obj_add(port_dev, &mdb.obj, NULL)) + if (switchdev_port_obj_add(pg->port->dev, &mdb.obj, NULL)) kfree(complete_info); + break; + case RTM_DELMDB: + switchdev_port_obj_del(pg->port->dev, &mdb.obj); + break; } - } else if (p && port_dev && type == RTM_DELMDB) { - switchdev_port_obj_del(port_dev, &mdb.obj); + } else { + br_mdb_switchdev_host(dev, mp, type); } - if (!p) - br_mdb_switchdev_host(dev, entry, type); - - skb = nlmsg_new(rtnl_mdb_nlmsg_size(), GFP_ATOMIC); + skb = nlmsg_new(rtnl_mdb_nlmsg_size(pg), GFP_ATOMIC); if (!skb) goto errout; - err = nlmsg_populate_mdb_fill(skb, dev, entry, 0, 0, type, NTF_SELF); + err = nlmsg_populate_mdb_fill(skb, dev, mp, pg, type); if (err < 0) { kfree_skb(skb); goto errout; @@ -500,26 +529,6 @@ static void __br_mdb_notify(struct net_device *dev, struct net_bridge_port *p, rtnl_set_sk_err(net, RTNLGRP_MDB, err); } -void br_mdb_notify(struct net_device *dev, struct net_bridge_port *port, - struct br_ip *group, int type, u8 flags) -{ - struct br_mdb_entry entry; - - memset(&entry, 0, sizeof(entry)); - if (port) - entry.ifindex = port->dev->ifindex; - else - entry.ifindex = dev->ifindex; - entry.addr.proto = group->proto; - entry.addr.u.ip4 = group->u.ip4; -#if IS_ENABLED(CONFIG_IPV6) - entry.addr.u.ip6 = group->u.ip6; -#endif - entry.vid = group->vid; - __mdb_entry_fill_flags(&entry, flags); - __br_mdb_notify(dev, port, &entry, type); -} - static int nlmsg_populate_rtr_fill(struct sk_buff *skb, struct net_device *dev, int ifindex, u32 pid, @@ -688,7 +697,7 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port, return -EEXIST; br_multicast_host_join(mp, false); - __br_mdb_notify(br->dev, NULL, entry, RTM_NEWMDB); + br_mdb_notify(br->dev, mp, NULL, RTM_NEWMDB); return 0; } @@ -709,7 +718,7 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port, rcu_assign_pointer(*pp, p); if (entry->state == MDB_TEMPORARY) mod_timer(&p->timer, now + br->multicast_membership_interval); - __br_mdb_notify(br->dev, port, entry, RTM_NEWMDB); + br_mdb_notify(br->dev, mp, p, RTM_NEWMDB); return 0; } @@ -813,7 +822,7 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry) if (entry->ifindex == mp->br->dev->ifindex && mp->host_joined) { br_multicast_host_leave(mp, false); err = 0; - __br_mdb_notify(br->dev, NULL, entry, RTM_DELMDB); + br_mdb_notify(br->dev, mp, NULL, RTM_DELMDB); if (!mp->ports && netif_running(br->dev)) mod_timer(&mp->timer, jiffies); goto unlock; diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 1a910f139ee2..0ec43d549137 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -188,7 +188,7 @@ void br_multicast_del_pg(struct net_bridge_mdb_entry *mp, del_timer(&pg->rexmit_timer); hlist_for_each_entry_safe(ent, tmp, &pg->src_list, node) br_multicast_del_group_src(ent); - br_mdb_notify(br->dev, pg->port, &pg->addr, RTM_DELMDB, pg->flags); + br_mdb_notify(br->dev, mp, pg, RTM_DELMDB); kfree_rcu(pg, rcu); if (!mp->ports && !mp->host_joined && netif_running(br->dev)) @@ -684,8 +684,7 @@ void br_multicast_host_join(struct net_bridge_mdb_entry *mp, bool notify) if (!mp->host_joined) { mp->host_joined = true; if (notify) - br_mdb_notify(mp->br->dev, NULL, &mp->addr, - RTM_NEWMDB, 0); + br_mdb_notify(mp->br->dev, mp, NULL, RTM_NEWMDB); } mod_timer(&mp->timer, jiffies + mp->br->multicast_membership_interval); } @@ -697,7 +696,7 @@ void br_multicast_host_leave(struct net_bridge_mdb_entry *mp, bool notify) mp->host_joined = false; if (notify) - br_mdb_notify(mp->br->dev, NULL, &mp->addr, RTM_DELMDB, 0); + br_mdb_notify(mp->br->dev, mp, NULL, RTM_DELMDB); } static int br_multicast_add_group(struct net_bridge *br, @@ -739,10 +738,11 @@ static int br_multicast_add_group(struct net_bridge *br, if (unlikely(!p)) goto err; rcu_assign_pointer(*pp, p); - br_mdb_notify(br->dev, port, group, RTM_NEWMDB, 0); + br_mdb_notify(br->dev, mp, p, RTM_NEWMDB); found: mod_timer(&p->timer, now + br->multicast_membership_interval); + out: err = 0; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 34acad4a455a..78822d3a3b7c 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -799,8 +799,8 @@ br_multicast_new_port_group(struct net_bridge_port *port, struct br_ip *group, unsigned char flags, const unsigned char *src); int br_mdb_hash_init(struct net_bridge *br); void br_mdb_hash_fini(struct net_bridge *br); -void br_mdb_notify(struct net_device *dev, struct net_bridge_port *port, - struct br_ip *group, int type, u8 flags); +void br_mdb_notify(struct net_device *dev, struct net_bridge_mdb_entry *mp, + struct net_bridge_port_group *pg, int type); void br_rtr_notify(struct net_device *dev, struct net_bridge_port *port, int type); void br_multicast_del_pg(struct net_bridge_mdb_entry *mp, -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 09/15] net: bridge: mcast: delete expired port groups without srcs
If an expired port group is in EXCLUDE mode, then we have to turn it into INCLUDE mode, remove all srcs with zero timer and finally remove the group itself if there are no more srcs with an active timer. For IGMPv2 use there would be no sources, so this will reduce to just removing the group as before. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_multicast.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 0ec43d549137..aabb1fcc7fa1 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -222,15 +222,34 @@ static void br_multicast_find_del_pg(struct net_bridge *br, static void br_multicast_port_group_expired(struct timer_list *t) { struct net_bridge_port_group *pg = from_timer(pg, t, timer); + struct net_bridge_group_src *src_ent; struct net_bridge *br = pg->port->br; + struct hlist_node *tmp; + bool changed; spin_lock(&br->multicast_lock); if (!netif_running(br->dev) || timer_pending(&pg->timer) || hlist_unhashed(&pg->mglist) || pg->flags & MDB_PG_FLAGS_PERMANENT) goto out; - br_multicast_find_del_pg(br, pg); + changed = !!(pg->filter_mode == MCAST_EXCLUDE); + pg->filter_mode = MCAST_INCLUDE; + hlist_for_each_entry_safe(src_ent, tmp, &pg->src_list, node) { + if (!timer_pending(&src_ent->timer)) { + br_multicast_del_group_src(src_ent); + changed = true; + } + } + if (hlist_empty(&pg->src_list)) { + br_multicast_find_del_pg(br, pg); + } else if (changed) { + struct net_bridge_mdb_entry *mp = br_mdb_ip_get(br, &pg->addr); + + if (WARN_ON(!mp)) + goto out; + br_mdb_notify(br->dev, mp, pg, RTM_NEWMDB); + } out: spin_unlock(&br->multicast_lock); } -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 10/15] net: bridge: mcast: support for IGMPv3 IGMPV3_ALLOW_NEW_SOURCES report
This patch adds handling for the IGMPV3_ALLOW_NEW_SOURCES IGMPv3 report type and limits it only when multicast_igmp_version == 3. Now that IGMPv3 handling functions will be managing timers we need to delay their activation, thus a new argument is added which controls if the timer should be updated. We also disable host IGMPv3 handling as it's not yet implemented and could cause inconsistent group state, the host can only join a group as EXCLUDE {} or leave it. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_multicast.c | 113 ++++++++++++++++++++++++++++++++------ net/bridge/br_private.h | 7 +++ 2 files changed, 103 insertions(+), 17 deletions(-) diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index aabb1fcc7fa1..93771309f59f 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -721,7 +721,8 @@ void br_multicast_host_leave(struct net_bridge_mdb_entry *mp, bool notify) static int br_multicast_add_group(struct net_bridge *br, struct net_bridge_port *port, struct br_ip *group, - const unsigned char *src) + const unsigned char *src, + bool update_timer) { struct net_bridge_port_group __rcu **pp; struct net_bridge_port_group *p; @@ -760,7 +761,8 @@ static int br_multicast_add_group(struct net_bridge *br, br_mdb_notify(br->dev, mp, p, RTM_NEWMDB); found: - mod_timer(&p->timer, now + br->multicast_membership_interval); + if (update_timer) + mod_timer(&p->timer, now + br->multicast_membership_interval); out: err = 0; @@ -774,7 +776,8 @@ static int br_ip4_multicast_add_group(struct net_bridge *br, struct net_bridge_port *port, __be32 group, __u16 vid, - const unsigned char *src) + const unsigned char *src, + bool update_timer) { struct br_ip br_group; @@ -786,7 +789,7 @@ static int br_ip4_multicast_add_group(struct net_bridge *br, br_group.proto = htons(ETH_P_IP); br_group.vid = vid; - return br_multicast_add_group(br, port, &br_group, src); + return br_multicast_add_group(br, port, &br_group, src, update_timer); } #if IS_ENABLED(CONFIG_IPV6) @@ -806,7 +809,7 @@ static int br_ip6_multicast_add_group(struct net_bridge *br, br_group.proto = htons(ETH_P_IPV6); br_group.vid = vid; - return br_multicast_add_group(br, port, &br_group, src); + return br_multicast_add_group(br, port, &br_group, src, true); } #endif @@ -1153,20 +1156,71 @@ void br_multicast_disable_port(struct net_bridge_port *port) spin_unlock(&br->multicast_lock); } +/* State Msg type New state Actions + * INCLUDE (A) IS_IN (B) INCLUDE (A+B) (B)=GMI + * INCLUDE (A) ALLOW (B) INCLUDE (A+B) (B)=GMI + * EXCLUDE (X,Y) ALLOW (A) EXCLUDE (X+A,Y-A) (A)=GMI + */ +static bool br_multicast_isinc_allow(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge *br = pg->port->br; + struct net_bridge_group_src *ent; + unsigned long now = jiffies; + bool changed = false; + struct br_ip src_ip; + u32 src_idx; + + memset(&src_ip, 0, sizeof(src_ip)); + src_ip.proto = htons(ETH_P_IP); + for (src_idx = 0; src_idx < nsrcs; src_idx++) { + src_ip.u.ip4 = srcs[src_idx]; + ent = br_multicast_find_group_src(pg, &src_ip); + if (!ent) { + ent = br_multicast_new_group_src(pg, &src_ip); + if (ent) + changed = true; + } + + if (ent) + mod_timer(&ent->timer, now + br_multicast_gmi(br)); + } + + return changed; +} + +static struct net_bridge_port_group * +br_multicast_find_port(struct net_bridge_mdb_entry *mp, + struct net_bridge_port *p, + const unsigned char *src) +{ + struct net_bridge_port_group *pg; + struct net_bridge *br = mp->br; + + for (pg = mlock_dereference(mp->ports, br); + pg; + pg = mlock_dereference(pg->next, br)) + if (br_port_group_equal(pg, p, src)) + return pg; + + return NULL; +} + static int br_ip4_multicast_igmp3_report(struct net_bridge *br, struct net_bridge_port *port, struct sk_buff *skb, u16 vid) { + bool igmpv2 = br->multicast_igmp_version == 2; + struct net_bridge_mdb_entry *mdst; + struct net_bridge_port_group *pg; const unsigned char *src; struct igmpv3_report *ih; struct igmpv3_grec *grec; - int i; - int len; - int num; - int type; - int err = 0; + int i, len, num, type; + bool changed = false; __be32 group; + int err = 0; u16 nsrcs; ih = igmpv3_report_hdr(skb); @@ -1187,7 +1241,6 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br, if (!ip_mc_may_pull(skb, len)) return -EINVAL; - /* We treat this as an IGMPv2 report for now. */ switch (type) { case IGMPV3_MODE_IS_INCLUDE: case IGMPV3_MODE_IS_EXCLUDE: @@ -1202,16 +1255,41 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br, } src = eth_hdr(skb)->h_source; - if ((type == IGMPV3_CHANGE_TO_INCLUDE || - type == IGMPV3_MODE_IS_INCLUDE) && - nsrcs == 0) { - br_ip4_multicast_leave_group(br, port, group, vid, src); + if (nsrcs == 0 && + (type == IGMPV3_CHANGE_TO_INCLUDE || + type == IGMPV3_MODE_IS_INCLUDE)) { + if (igmpv2 || !port) { + br_ip4_multicast_leave_group(br, port, group, vid, src); + continue; + } } else { err = br_ip4_multicast_add_group(br, port, group, vid, - src); + src, igmpv2); if (err) break; } + + if (!port || igmpv2) + continue; + + spin_lock_bh(&br->multicast_lock); + mdst = br_mdb_ip4_get(br, group, vid); + if (!mdst) + goto unlock_continue; + pg = br_multicast_find_port(mdst, port, src); + if (!pg || (pg->flags & MDB_PG_FLAGS_PERMANENT)) + goto unlock_continue; + /* reload grec */ + grec = (void *)(skb->data + len - sizeof(*grec) - (nsrcs * 4)); + switch (type) { + case IGMPV3_ALLOW_NEW_SOURCES: + changed = br_multicast_isinc_allow(pg, grec->grec_src, nsrcs); + break; + } + if (changed) + br_mdb_notify(br->dev, mdst, pg, RTM_NEWMDB); +unlock_continue: + spin_unlock_bh(&br->multicast_lock); } return err; @@ -1859,7 +1937,8 @@ static int br_multicast_ipv4_rcv(struct net_bridge *br, case IGMP_HOST_MEMBERSHIP_REPORT: case IGMPV2_HOST_MEMBERSHIP_REPORT: BR_INPUT_SKB_CB(skb)->mrouters_only = 1; - err = br_ip4_multicast_add_group(br, port, ih->group, vid, src); + err = br_ip4_multicast_add_group(br, port, ih->group, vid, src, + true); break; case IGMPV3_HOST_MEMBERSHIP_REPORT: err = br_ip4_multicast_igmp3_report(br, port, skb, vid); diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 78822d3a3b7c..a18bd67dab34 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -875,6 +875,13 @@ static inline unsigned long br_multicast_lmqt(const struct net_bridge *br) return br->multicast_last_member_interval * br->multicast_last_member_count; } + +static inline unsigned long br_multicast_gmi(const struct net_bridge *br) +{ + /* use the RFC default of 2 for QRV */ + return 2 * br->multicast_query_interval + + br->multicast_query_response_interval; +} #else static inline int br_multicast_rcv(struct net_bridge *br, struct net_bridge_port *port, -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 11/15] net: bridge: mcast: support for IGMPV3_MODE_IS_INCLUDE/EXCLUDE report
In order to process IGMPV3_MODE_IS_INCLUDE/EXCLUDE report types we need some new helpers which allow us to set/clear flags for all current entries and later delete marked entries after the report sources have been processed. v2: drop flag helpers and directly do flag bit operations Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_multicast.c | 114 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 114 insertions(+) diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 93771309f59f..9e0c8a462343 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -1156,6 +1156,21 @@ void br_multicast_disable_port(struct net_bridge_port *port) spin_unlock(&br->multicast_lock); } +static int __grp_src_delete_marked(struct net_bridge_port_group *pg) +{ + struct net_bridge_group_src *ent; + struct hlist_node *tmp; + int deleted = 0; + + hlist_for_each_entry_safe(ent, tmp, &pg->src_list, node) + if (ent->flags & BR_SGRP_F_DELETE) { + br_multicast_del_group_src(ent); + deleted++; + } + + return deleted; +} + /* State Msg type New state Actions * INCLUDE (A) IS_IN (B) INCLUDE (A+B) (B)=GMI * INCLUDE (A) ALLOW (B) INCLUDE (A+B) (B)=GMI @@ -1189,6 +1204,99 @@ static bool br_multicast_isinc_allow(struct net_bridge_port_group *pg, return changed; } +/* State Msg type New state Actions + * INCLUDE (A) IS_EX (B) EXCLUDE (A*B,B-A) (B-A)=0 + * Delete (A-B) + * Group Timer=GMI + */ +static void __grp_src_isexc_incl(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge_group_src *ent; + struct br_ip src_ip; + u32 src_idx; + + hlist_for_each_entry(ent, &pg->src_list, node) + ent->flags |= BR_SGRP_F_DELETE; + + memset(&src_ip, 0, sizeof(src_ip)); + src_ip.proto = htons(ETH_P_IP); + for (src_idx = 0; src_idx < nsrcs; src_idx++) { + src_ip.u.ip4 = srcs[src_idx]; + ent = br_multicast_find_group_src(pg, &src_ip); + if (ent) + ent->flags &= ~BR_SGRP_F_DELETE; + else + br_multicast_new_group_src(pg, &src_ip); + } + + __grp_src_delete_marked(pg); +} + +/* State Msg type New state Actions + * EXCLUDE (X,Y) IS_EX (A) EXCLUDE (A-Y,Y*A) (A-X-Y)=GMI + * Delete (X-A) + * Delete (Y-A) + * Group Timer=GMI + */ +static bool __grp_src_isexc_excl(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge *br = pg->port->br; + struct net_bridge_group_src *ent; + unsigned long now = jiffies; + bool changed = false; + struct br_ip src_ip; + u32 src_idx; + + hlist_for_each_entry(ent, &pg->src_list, node) + ent->flags |= BR_SGRP_F_DELETE; + + memset(&src_ip, 0, sizeof(src_ip)); + src_ip.proto = htons(ETH_P_IP); + for (src_idx = 0; src_idx < nsrcs; src_idx++) { + src_ip.u.ip4 = srcs[src_idx]; + ent = br_multicast_find_group_src(pg, &src_ip); + if (ent) { + ent->flags &= ~BR_SGRP_F_DELETE; + } else { + ent = br_multicast_new_group_src(pg, &src_ip); + if (ent) { + mod_timer(&ent->timer, + now + br_multicast_gmi(br)); + changed = true; + } + } + } + + if (__grp_src_delete_marked(pg)) + changed = true; + + return changed; +} + +static bool br_multicast_isexc(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge *br = pg->port->br; + bool changed = false; + + switch (pg->filter_mode) { + case MCAST_INCLUDE: + __grp_src_isexc_incl(pg, srcs, nsrcs); + changed = true; + break; + case MCAST_EXCLUDE: + changed = __grp_src_isexc_excl(pg, srcs, nsrcs); + break; + } + + pg->filter_mode = MCAST_EXCLUDE; + mod_timer(&pg->timer, jiffies + br_multicast_gmi(br)); + + return changed; +} + static struct net_bridge_port_group * br_multicast_find_port(struct net_bridge_mdb_entry *mp, struct net_bridge_port *p, @@ -1285,6 +1393,12 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br, case IGMPV3_ALLOW_NEW_SOURCES: changed = br_multicast_isinc_allow(pg, grec->grec_src, nsrcs); break; + case IGMPV3_MODE_IS_INCLUDE: + changed = br_multicast_isinc_allow(pg, grec->grec_src, nsrcs); + break; + case IGMPV3_MODE_IS_EXCLUDE: + changed = br_multicast_isexc(pg, grec->grec_src, nsrcs); + break; } if (changed) br_mdb_notify(br->dev, mdst, pg, RTM_NEWMDB); -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 12/15] net: bridge: mcast: support for IGMPV3_CHANGE_TO_INCLUDE/EXCLUDE report
In order to process IGMPV3_CHANGE_TO_INCLUDE/EXCLUDE report types we need new helpers which allow us to mark entries based on their timer state and to query only marked entries. v2: directly do flag bit operations Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_multicast.c | 267 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 267 insertions(+) diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 9e0c8a462343..0dffd5a26110 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -1171,6 +1171,61 @@ static int __grp_src_delete_marked(struct net_bridge_port_group *pg) return deleted; } +static void __grp_src_query_marked_and_rexmit(struct net_bridge_port_group *pg) +{ + struct net_bridge *br = pg->port->br; + u32 lmqc = br->multicast_last_member_count; + unsigned long lmqt, lmi, now = jiffies; + struct net_bridge_group_src *ent; + + lmqt = now + br_multicast_lmqt(br); + hlist_for_each_entry(ent, &pg->src_list, node) { + if (ent->flags & BR_SGRP_F_SEND) { + ent->flags &= ~BR_SGRP_F_SEND; + if (ent->timer.expires > lmqt) { + if (br_opt_get(br, BROPT_MULTICAST_QUERIER) && + !timer_pending(&br->ip4_other_query.timer)) + ent->src_query_rexmit_cnt = lmqc; + mod_timer(&ent->timer, lmqt); + } + } + } + + if (!br_opt_get(br, BROPT_MULTICAST_QUERIER) || + timer_pending(&br->ip4_other_query.timer)) + return; + + __br_multicast_send_query(br, pg->port, pg, &pg->addr, + &pg->addr, true, 1, NULL); + + lmi = now + br->multicast_last_member_interval; + if (!timer_pending(&pg->rexmit_timer) || + time_after(pg->rexmit_timer.expires, lmi)) + mod_timer(&pg->rexmit_timer, lmi); +} + +static void __grp_send_query_and_rexmit(struct net_bridge_port_group *pg) +{ + struct net_bridge *br = pg->port->br; + unsigned long now = jiffies, lmi; + + if (br_opt_get(br, BROPT_MULTICAST_QUERIER) && + timer_pending(&br->ip4_other_query.timer)) { + lmi = now + br->multicast_last_member_interval; + pg->grp_query_rexmit_cnt = br->multicast_last_member_count - 1; + __br_multicast_send_query(br, pg->port, pg, &pg->addr, + &pg->addr, false, 0, NULL); + if (!timer_pending(&pg->rexmit_timer) || + time_after(pg->rexmit_timer.expires, lmi)) + mod_timer(&pg->rexmit_timer, lmi); + } + + if (pg->filter_mode == MCAST_EXCLUDE && + (!timer_pending(&pg->timer) || + time_after(pg->timer.expires, now + br_multicast_lmqt(br)))) + mod_timer(&pg->timer, now + br_multicast_lmqt(br)); +} + /* State Msg type New state Actions * INCLUDE (A) IS_IN (B) INCLUDE (A+B) (B)=GMI * INCLUDE (A) ALLOW (B) INCLUDE (A+B) (B)=GMI @@ -1297,6 +1352,212 @@ static bool br_multicast_isexc(struct net_bridge_port_group *pg, return changed; } +/* State Msg type New state Actions + * INCLUDE (A) TO_IN (B) INCLUDE (A+B) (B)=GMI + * Send Q(G,A-B) + */ +static bool __grp_src_toin_incl(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge *br = pg->port->br; + u32 src_idx, to_send = pg->src_ents; + struct net_bridge_group_src *ent; + unsigned long now = jiffies; + bool changed = false; + struct br_ip src_ip; + + hlist_for_each_entry(ent, &pg->src_list, node) + ent->flags |= BR_SGRP_F_SEND; + + memset(&src_ip, 0, sizeof(src_ip)); + src_ip.proto = htons(ETH_P_IP); + for (src_idx = 0; src_idx < nsrcs; src_idx++) { + src_ip.u.ip4 = srcs[src_idx]; + ent = br_multicast_find_group_src(pg, &src_ip); + if (ent) { + ent->flags &= ~BR_SGRP_F_SEND; + to_send--; + } else { + ent = br_multicast_new_group_src(pg, &src_ip); + if (ent) + changed = true; + } + if (ent) + mod_timer(&ent->timer, now + br_multicast_gmi(br)); + } + + if (to_send) + __grp_src_query_marked_and_rexmit(pg); + + return changed; +} + +/* State Msg type New state Actions + * EXCLUDE (X,Y) TO_IN (A) EXCLUDE (X+A,Y-A) (A)=GMI + * Send Q(G,X-A) + * Send Q(G) + */ +static bool __grp_src_toin_excl(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge *br = pg->port->br; + u32 src_idx, to_send = pg->src_ents; + struct net_bridge_group_src *ent; + unsigned long now = jiffies; + bool changed = false; + struct br_ip src_ip; + + hlist_for_each_entry(ent, &pg->src_list, node) + if (timer_pending(&ent->timer)) + ent->flags |= BR_SGRP_F_SEND; + + memset(&src_ip, 0, sizeof(src_ip)); + src_ip.proto = htons(ETH_P_IP); + for (src_idx = 0; src_idx < nsrcs; src_idx++) { + src_ip.u.ip4 = srcs[src_idx]; + ent = br_multicast_find_group_src(pg, &src_ip); + if (ent) { + if (timer_pending(&ent->timer)) { + ent->flags &= ~BR_SGRP_F_SEND; + to_send--; + } + } else { + ent = br_multicast_new_group_src(pg, &src_ip); + if (ent) + changed = true; + } + if (ent) + mod_timer(&ent->timer, now + br_multicast_gmi(br)); + } + + if (to_send) + __grp_src_query_marked_and_rexmit(pg); + + __grp_send_query_and_rexmit(pg); + + return changed; +} + +static bool br_multicast_toin(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + bool changed = false; + + switch (pg->filter_mode) { + case MCAST_INCLUDE: + changed = __grp_src_toin_incl(pg, srcs, nsrcs); + break; + case MCAST_EXCLUDE: + changed = __grp_src_toin_excl(pg, srcs, nsrcs); + break; + } + + return changed; +} + +/* State Msg type New state Actions + * INCLUDE (A) TO_EX (B) EXCLUDE (A*B,B-A) (B-A)=0 + * Delete (A-B) + * Send Q(G,A*B) + * Group Timer=GMI + */ +static void __grp_src_toex_incl(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge_group_src *ent; + u32 src_idx, to_send = 0; + struct br_ip src_ip; + + hlist_for_each_entry(ent, &pg->src_list, node) + ent->flags = (ent->flags & ~BR_SGRP_F_SEND) | BR_SGRP_F_DELETE; + + memset(&src_ip, 0, sizeof(src_ip)); + src_ip.proto = htons(ETH_P_IP); + for (src_idx = 0; src_idx < nsrcs; src_idx++) { + src_ip.u.ip4 = srcs[src_idx]; + ent = br_multicast_find_group_src(pg, &src_ip); + if (ent) { + ent->flags = (ent->flags & ~BR_SGRP_F_DELETE) | + BR_SGRP_F_SEND; + to_send++; + } else { + br_multicast_new_group_src(pg, &src_ip); + } + } + + __grp_src_delete_marked(pg); + if (to_send) + __grp_src_query_marked_and_rexmit(pg); +} + +/* State Msg type New state Actions + * EXCLUDE (X,Y) TO_EX (A) EXCLUDE (A-Y,Y*A) (A-X-Y)=Group Timer + * Delete (X-A) + * Delete (Y-A) + * Send Q(G,A-Y) + * Group Timer=GMI + */ +static bool __grp_src_toex_excl(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge_group_src *ent; + u32 src_idx, to_send = 0; + bool changed = false; + struct br_ip src_ip; + + hlist_for_each_entry(ent, &pg->src_list, node) + ent->flags = (ent->flags & ~BR_SGRP_F_SEND) | BR_SGRP_F_DELETE; + + memset(&src_ip, 0, sizeof(src_ip)); + src_ip.proto = htons(ETH_P_IP); + for (src_idx = 0; src_idx < nsrcs; src_idx++) { + src_ip.u.ip4 = srcs[src_idx]; + ent = br_multicast_find_group_src(pg, &src_ip); + if (ent) { + ent->flags &= ~BR_SGRP_F_DELETE; + } else { + ent = br_multicast_new_group_src(pg, &src_ip); + if (ent) { + mod_timer(&ent->timer, pg->timer.expires); + changed = true; + } + } + if (ent && timer_pending(&ent->timer)) { + ent->flags |= BR_SGRP_F_SEND; + to_send++; + } + } + + if (__grp_src_delete_marked(pg)) + changed = true; + if (to_send) + __grp_src_query_marked_and_rexmit(pg); + + return changed; +} + +static bool br_multicast_toex(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge *br = pg->port->br; + bool changed = false; + + switch (pg->filter_mode) { + case MCAST_INCLUDE: + __grp_src_toex_incl(pg, srcs, nsrcs); + changed = true; + break; + case MCAST_EXCLUDE: + __grp_src_toex_excl(pg, srcs, nsrcs); + break; + } + + pg->filter_mode = MCAST_EXCLUDE; + mod_timer(&pg->timer, jiffies + br_multicast_gmi(br)); + + return changed; +} + static struct net_bridge_port_group * br_multicast_find_port(struct net_bridge_mdb_entry *mp, struct net_bridge_port *p, @@ -1399,6 +1660,12 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br, case IGMPV3_MODE_IS_EXCLUDE: changed = br_multicast_isexc(pg, grec->grec_src, nsrcs); break; + case IGMPV3_CHANGE_TO_INCLUDE: + changed = br_multicast_toin(pg, grec->grec_src, nsrcs); + break; + case IGMPV3_CHANGE_TO_EXCLUDE: + changed = br_multicast_toex(pg, grec->grec_src, nsrcs); + break; } if (changed) br_mdb_notify(br->dev, mdst, pg, RTM_NEWMDB); -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 13/15] net: bridge: mcast: support for IGMPV3_BLOCK_OLD_SOURCES report
We already have all necessary helpers, so process IGMPV3_BLOCK_OLD_SOURCES as per the RFC. v2: directly do flag bit operations Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_multicast.c | 90 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 90 insertions(+) diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 0dffd5a26110..7df192e9ec50 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -1558,6 +1558,93 @@ static bool br_multicast_toex(struct net_bridge_port_group *pg, return changed; } +/* State Msg type New state Actions + * INCLUDE (A) BLOCK (B) INCLUDE (A) Send Q(G,A*B) + */ +static void __grp_src_block_incl(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge_group_src *ent; + u32 src_idx, to_send = 0; + struct br_ip src_ip; + + hlist_for_each_entry(ent, &pg->src_list, node) + ent->flags &= ~BR_SGRP_F_SEND; + + memset(&src_ip, 0, sizeof(src_ip)); + src_ip.proto = htons(ETH_P_IP); + for (src_idx = 0; src_idx < nsrcs; src_idx++) { + src_ip.u.ip4 = srcs[src_idx]; + ent = br_multicast_find_group_src(pg, &src_ip); + if (ent) { + ent->flags |= BR_SGRP_F_SEND; + to_send++; + } + } + + if (to_send) + __grp_src_query_marked_and_rexmit(pg); + + if (pg->filter_mode == MCAST_INCLUDE && hlist_empty(&pg->src_list)) + br_multicast_find_del_pg(pg->port->br, pg); +} + +/* State Msg type New state Actions + * EXCLUDE (X,Y) BLOCK (A) EXCLUDE (X+(A-Y),Y) (A-X-Y)=Group Timer + * Send Q(G,A-Y) + */ +static bool __grp_src_block_excl(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + struct net_bridge_group_src *ent; + u32 src_idx, to_send = 0; + bool changed = false; + struct br_ip src_ip; + + hlist_for_each_entry(ent, &pg->src_list, node) + ent->flags &= ~BR_SGRP_F_SEND; + + memset(&src_ip, 0, sizeof(src_ip)); + src_ip.proto = htons(ETH_P_IP); + for (src_idx = 0; src_idx < nsrcs; src_idx++) { + src_ip.u.ip4 = srcs[src_idx]; + ent = br_multicast_find_group_src(pg, &src_ip); + if (!ent) { + ent = br_multicast_new_group_src(pg, &src_ip); + if (ent) { + mod_timer(&ent->timer, pg->timer.expires); + changed = true; + } + } + if (ent && timer_pending(&ent->timer)) { + ent->flags |= BR_SGRP_F_SEND; + to_send++; + } + } + + if (to_send) + __grp_src_query_marked_and_rexmit(pg); + + return changed; +} + +static bool br_multicast_block(struct net_bridge_port_group *pg, + __be32 *srcs, u32 nsrcs) +{ + bool changed = false; + + switch (pg->filter_mode) { + case MCAST_INCLUDE: + __grp_src_block_incl(pg, srcs, nsrcs); + break; + case MCAST_EXCLUDE: + changed = __grp_src_block_excl(pg, srcs, nsrcs); + break; + } + + return changed; +} + static struct net_bridge_port_group * br_multicast_find_port(struct net_bridge_mdb_entry *mp, struct net_bridge_port *p, @@ -1666,6 +1753,9 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br, case IGMPV3_CHANGE_TO_EXCLUDE: changed = br_multicast_toex(pg, grec->grec_src, nsrcs); break; + case IGMPV3_BLOCK_OLD_SOURCES: + changed = br_multicast_block(pg, grec->grec_src, nsrcs); + break; } if (changed) br_mdb_notify(br->dev, mdst, pg, RTM_NEWMDB); -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 14/15] net: bridge: mcast: improve v3 query processing
When an IGMPv3 query is received and we're operating in v3 mode then we need to avoid updating group timers if the suppress flag is set. Also we should update only timers for groups in exclude mode. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_multicast.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 7df192e9ec50..db4b2621631c 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -2034,7 +2034,8 @@ static void br_ip4_multicast_query(struct net_bridge *br, } } else if (transport_len >= sizeof(*ih3)) { ih3 = igmpv3_query_hdr(skb); - if (ih3->nsrcs) + if (ih3->nsrcs || + (br->multicast_igmp_version == 3 && group && ih3->suppress)) goto out; max_delay = ih3->code ? @@ -2069,7 +2070,9 @@ static void br_ip4_multicast_query(struct net_bridge *br, pp = &p->next) { if (timer_pending(&p->timer) ? time_after(p->timer.expires, now + max_delay) : - try_to_del_timer_sync(&p->timer) >= 0) + try_to_del_timer_sync(&p->timer) >= 0 && + (br->multicast_igmp_version == 2 || + p->filter_mode == MCAST_EXCLUDE)) mod_timer(&p->timer, now + max_delay); } -- 2.25.4
Nikolay Aleksandrov
2020-Sep-02 11:25 UTC
[Bridge] [PATCH net-next v2 15/15] net: bridge: mcast: destroy all entries via gc
Since each entry type has timers that can be running simultaneously we need to make sure that entries are not freed before their timers have finished. In order to do that generalize the src gc work to mcast gc work and use a callback to free the entries (mdb, port group or src). Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_multicast.c | 103 ++++++++++++++++++++++++++------------ net/bridge/br_private.h | 13 +++-- 2 files changed, 80 insertions(+), 36 deletions(-) diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index db4b2621631c..f5fdd1e63f31 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -140,6 +140,29 @@ struct net_bridge_mdb_entry *br_mdb_get(struct net_bridge *br, return br_mdb_ip_get_rcu(br, &ip); } +static void br_multicast_destroy_mdb_entry(struct net_bridge_mcast_gc *gc) +{ + struct net_bridge_mdb_entry *mp; + + mp = container_of(gc, struct net_bridge_mdb_entry, mcast_gc); + WARN_ON(!hlist_unhashed(&mp->mdb_node)); + WARN_ON(mp->ports); + + del_timer_sync(&mp->timer); + kfree_rcu(mp, rcu); +} + +static void br_multicast_del_mdb_entry(struct net_bridge_mdb_entry *mp) +{ + struct net_bridge *br = mp->br; + + rhashtable_remove_fast(&br->mdb_hash_tbl, &mp->rhnode, + br_mdb_rht_params); + hlist_del_init_rcu(&mp->mdb_node); + hlist_add_head(&mp->mcast_gc.gc_node, &br->mcast_gc_list); + queue_work(system_long_wq, &br->mcast_gc_work); +} + static void br_multicast_group_expired(struct timer_list *t) { struct net_bridge_mdb_entry *mp = from_timer(mp, t, timer); @@ -153,15 +176,20 @@ static void br_multicast_group_expired(struct timer_list *t) if (mp->ports) goto out; + br_multicast_del_mdb_entry(mp); +out: + spin_unlock(&br->multicast_lock); +} - rhashtable_remove_fast(&br->mdb_hash_tbl, &mp->rhnode, - br_mdb_rht_params); - hlist_del_rcu(&mp->mdb_node); +static void br_multicast_destroy_group_src(struct net_bridge_mcast_gc *gc) +{ + struct net_bridge_group_src *src; - kfree_rcu(mp, rcu); + src = container_of(gc, struct net_bridge_group_src, mcast_gc); + WARN_ON(!hlist_unhashed(&src->node)); -out: - spin_unlock(&br->multicast_lock); + del_timer_sync(&src->timer); + kfree_rcu(src, rcu); } static void br_multicast_del_group_src(struct net_bridge_group_src *src) @@ -170,8 +198,21 @@ static void br_multicast_del_group_src(struct net_bridge_group_src *src) hlist_del_init_rcu(&src->node); src->pg->src_ents--; - hlist_add_head(&src->del_node, &br->src_gc_list); - queue_work(system_long_wq, &br->src_gc_work); + hlist_add_head(&src->mcast_gc.gc_node, &br->mcast_gc_list); + queue_work(system_long_wq, &br->mcast_gc_work); +} + +static void br_multicast_destroy_port_group(struct net_bridge_mcast_gc *gc) +{ + struct net_bridge_port_group *pg; + + pg = container_of(gc, struct net_bridge_port_group, mcast_gc); + WARN_ON(!hlist_unhashed(&pg->mglist)); + WARN_ON(!hlist_empty(&pg->src_list)); + + del_timer_sync(&pg->rexmit_timer); + del_timer_sync(&pg->timer); + kfree_rcu(pg, rcu); } void br_multicast_del_pg(struct net_bridge_mdb_entry *mp, @@ -184,12 +225,11 @@ void br_multicast_del_pg(struct net_bridge_mdb_entry *mp, rcu_assign_pointer(*pp, pg->next); hlist_del_init(&pg->mglist); - del_timer(&pg->timer); - del_timer(&pg->rexmit_timer); hlist_for_each_entry_safe(ent, tmp, &pg->src_list, node) br_multicast_del_group_src(ent); br_mdb_notify(br->dev, mp, pg, RTM_DELMDB); - kfree_rcu(pg, rcu); + hlist_add_head(&pg->mcast_gc.gc_node, &br->mcast_gc_list); + queue_work(system_long_wq, &br->mcast_gc_work); if (!mp->ports && !mp->host_joined && netif_running(br->dev)) mod_timer(&mp->timer, jiffies); @@ -560,6 +600,7 @@ struct net_bridge_mdb_entry *br_multicast_new_group(struct net_bridge *br, mp->br = br; mp->addr = *group; + mp->mcast_gc.destroy = br_multicast_destroy_mdb_entry; timer_setup(&mp->timer, br_multicast_group_expired, 0); err = rhashtable_lookup_insert_fast(&br->mdb_hash_tbl, &mp->rhnode, br_mdb_rht_params); @@ -642,6 +683,7 @@ br_multicast_new_group_src(struct net_bridge_port_group *pg, struct br_ip *src_i grp_src->pg = pg; grp_src->br = pg->port->br; grp_src->addr = *src_ip; + grp_src->mcast_gc.destroy = br_multicast_destroy_group_src; timer_setup(&grp_src->timer, br_multicast_group_src_expired, 0); hlist_add_head_rcu(&grp_src->node, &pg->src_list); @@ -671,6 +713,7 @@ struct net_bridge_port_group *br_multicast_new_port_group( p->filter_mode = MCAST_INCLUDE; else p->filter_mode = MCAST_EXCLUDE; + p->mcast_gc.destroy = br_multicast_destroy_port_group; INIT_HLIST_HEAD(&p->src_list); rcu_assign_pointer(p->next, next); timer_setup(&p->timer, br_multicast_port_group_expired, 0); @@ -2566,29 +2609,28 @@ static void br_ip6_multicast_query_expired(struct timer_list *t) } #endif -static void __grp_src_gc(struct hlist_head *head) +static void br_multicast_do_gc(struct hlist_head *head) { - struct net_bridge_group_src *ent; + struct net_bridge_mcast_gc *gcent; struct hlist_node *tmp; - hlist_for_each_entry_safe(ent, tmp, head, del_node) { - hlist_del_init(&ent->del_node); - del_timer_sync(&ent->timer); - kfree_rcu(ent, rcu); + hlist_for_each_entry_safe(gcent, tmp, head, gc_node) { + hlist_del_init(&gcent->gc_node); + gcent->destroy(gcent); } } -static void br_multicast_src_gc(struct work_struct *work) +static void br_multicast_gc(struct work_struct *work) { struct net_bridge *br = container_of(work, struct net_bridge, - src_gc_work); + mcast_gc_work); HLIST_HEAD(deleted_head); spin_lock_bh(&br->multicast_lock); - hlist_move_list(&br->src_gc_list, &deleted_head); + hlist_move_list(&br->mcast_gc_list, &deleted_head); spin_unlock_bh(&br->multicast_lock); - __grp_src_gc(&deleted_head); + br_multicast_do_gc(&deleted_head); } void br_multicast_init(struct net_bridge *br) @@ -2631,8 +2673,8 @@ void br_multicast_init(struct net_bridge *br) br_ip6_multicast_query_expired, 0); #endif INIT_HLIST_HEAD(&br->mdb_list); - INIT_HLIST_HEAD(&br->src_gc_list); - INIT_WORK(&br->src_gc_work, br_multicast_src_gc); + INIT_HLIST_HEAD(&br->mcast_gc_list); + INIT_WORK(&br->mcast_gc_work, br_multicast_gc); } static void br_ip4_multicast_join_snoopers(struct net_bridge *br) @@ -2740,18 +2782,13 @@ void br_multicast_dev_del(struct net_bridge *br) struct hlist_node *tmp; spin_lock_bh(&br->multicast_lock); - hlist_for_each_entry_safe(mp, tmp, &br->mdb_list, mdb_node) { - del_timer(&mp->timer); - rhashtable_remove_fast(&br->mdb_hash_tbl, &mp->rhnode, - br_mdb_rht_params); - hlist_del_rcu(&mp->mdb_node); - kfree_rcu(mp, rcu); - } - hlist_move_list(&br->src_gc_list, &deleted_head); + hlist_for_each_entry_safe(mp, tmp, &br->mdb_list, mdb_node) + br_multicast_del_mdb_entry(mp); + hlist_move_list(&br->mcast_gc_list, &deleted_head); spin_unlock_bh(&br->multicast_lock); - __grp_src_gc(&deleted_head); - cancel_work_sync(&br->src_gc_work); + br_multicast_do_gc(&deleted_head); + cancel_work_sync(&br->mcast_gc_work); rcu_barrier(); } diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index a18bd67dab34..478857563957 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -219,6 +219,11 @@ struct net_bridge_fdb_entry { #define BR_SGRP_F_DELETE BIT(0) #define BR_SGRP_F_SEND BIT(1) +struct net_bridge_mcast_gc { + struct hlist_node gc_node; + void (*destroy)(struct net_bridge_mcast_gc *gc); +}; + struct net_bridge_group_src { struct hlist_node node; @@ -229,7 +234,7 @@ struct net_bridge_group_src { struct timer_list timer; struct net_bridge *br; - struct hlist_node del_node; + struct net_bridge_mcast_gc mcast_gc; struct rcu_head rcu; }; @@ -248,6 +253,7 @@ struct net_bridge_port_group { struct timer_list rexmit_timer; struct hlist_node mglist; + struct net_bridge_mcast_gc mcast_gc; struct rcu_head rcu; }; @@ -261,6 +267,7 @@ struct net_bridge_mdb_entry { struct timer_list timer; struct hlist_node mdb_node; + struct net_bridge_mcast_gc mcast_gc; struct rcu_head rcu; }; @@ -434,7 +441,7 @@ struct net_bridge { struct rhashtable mdb_hash_tbl; - struct hlist_head src_gc_list; + struct hlist_head mcast_gc_list; struct hlist_head mdb_list; struct hlist_head router_list; @@ -448,7 +455,7 @@ struct net_bridge { struct bridge_mcast_own_query ip6_own_query; struct bridge_mcast_querier ip6_querier; #endif /* IS_ENABLED(CONFIG_IPV6) */ - struct work_struct src_gc_work; + struct work_struct mcast_gc_work; #endif struct timer_list hello_timer; -- 2.25.4
David Miller
2020-Sep-02 19:58 UTC
[Bridge] [PATCH net-next v2 00/15] net: bridge: mcast: initial IGMPv3 support (part 1)
From: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> Date: Wed, 2 Sep 2020 14:25:14 +0300> Here're the sets that will come next (in order): > - Fast path patch-set which adds support for (S, G) mdb entries needed > for IGMPv3 forwarding, entry add source (kernel, user-space etc) > needed for IGMPv3 entry management, entry block mode needed for > IGMPv3 exclude mode. This set will also add iproute2 support for > manipulating and showing all the new state. > - Selftests patch-set which will verify all IGMPv3 state transitions > and forwarding > - Explicit host tracking patch-set, needed for proper fast leave and > with it fast leave will be enabled for IGMPv3 > > Not implemented yet: > - Host IGMPv3 support (currently we handle only join/leave as before) > - Proper other querier source timer and value updates > - IGMPv3/v2 compat (I have a few rough patches for this one)What about ipv6 support? The first thing I notice when reading these patches is the source filter code only supports ipv4.