Andrew Lunn
2019-Jul-31 03:31 UTC
[Bridge] [PATCH] net: bridge: Allow bridge to joing multicast groups
> Our plan was to implement this in pure SW, and then look at how to HW
> offload it.

Great.

> But this will take some time before we have anything meaningful to
> show.
>
> > Make it an alternative to the STP code?
> I'm still working on learning the details of DLR, but I actually
> believe that it in some situations may co-exist with STP ;-)

The PDF you linked to suggests this as well. But I think you will need
to make some core changes to the bridge. At the moment, STP is a
bridge-level property, but you are going to need it to be a per-port
option. You can then use DLR on the ring ports, and optionally STP on
the other ports.

> But what we are looking at here is to offload a
> non-aware-(DLR|MRP) switch which happens to be placed in a network
> with these protocols running.

So we need to think about why we are passing traffic to the CPU port,
and under what conditions it can be blocked.

1) The interface is not part of a bridge. In this case, we only need
the switch to pass to the CPU port MC addresses which have been set
via set_rx_mode(). I think this case does not apply for what you want:
you have two ports bridged together as part of the ring.

2) The interface is part of a bridge. There are a few sub-cases:

a) IGMP snooping is being performed. We can block multicast where
there is no interest in the group. But this is limited to IP
multicast.

b) IGMP snooping is not being used and all interfaces in the bridge
are ports of the switch. IP multicast can be blocked to the CPU.

c) IGMP snooping is not being used and there is a non-switch interface
in the bridge. Multicast is needed, so it can be flooded out this
port.

d) set_rx_mode() has been called on the br0 interface, indicating
there is interest in the packets on the host. They must be sent to the
CPU so they can be delivered locally.

e) ????

Does the multicast MAC address being used by DLR also map to an IP
multicast address? 01:21:6C:00:00:0[123] appear to be the MAC
addresses used by DLR. IPv4 multicast MAC addresses are
01:00:5E:XX:XX:XX. IPv6 multicast MAC addresses are 33:33:XX:XX:XX:XX.

So one possibility here is to teach the SW bridge about non-IP
multicast addresses. Initially the switch should forward all MAC
multicast frames to the CPU. If the frame is not an IPv4 or IPv6
frame, there has not been a call to set_rx_mode() for the MAC address
on the br0 interface, and the bridge only contains switch ports,
switchdev could be used to block the multicast frame to the CPU, but
forward it out all other ports of the bridge.

	Andrew
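To make the address ranges in the last two paragraphs concrete, here is
a minimal sketch (plain C, not code from the thread; the helper names
are made up for illustration) that classifies a destination MAC as IPv4
multicast, IPv6 multicast, DLR, or other non-IP L2 multicast:

#include <stdbool.h>
#include <stdint.h>

/* IPv4 multicast groups map to 01:00:5E:xx:xx:xx, with only the low
 * 23 bits of the group address carried in the MAC (top bit of the
 * fourth octet is therefore 0).
 */
static bool mac_is_ipv4_mcast(const uint8_t mac[6])
{
	return mac[0] == 0x01 && mac[1] == 0x00 && mac[2] == 0x5e &&
	       !(mac[3] & 0x80);
}

/* IPv6 multicast groups map to 33:33:xx:xx:xx:xx. */
static bool mac_is_ipv6_mcast(const uint8_t mac[6])
{
	return mac[0] == 0x33 && mac[1] == 0x33;
}

/* DLR frames use 01:21:6C:00:00:01..03, which is neither an IPv4 nor
 * an IPv6 multicast MAC, i.e. a "non-IP" L2 multicast group.
 */
static bool mac_is_dlr(const uint8_t mac[6])
{
	return mac[0] == 0x01 && mac[1] == 0x21 && mac[2] == 0x6c &&
	       mac[3] == 0x00 && mac[4] == 0x00 &&
	       mac[5] >= 0x01 && mac[5] <= 0x03;
}

/* Any other address with the group bit set is generic L2 multicast. */
static bool mac_is_other_l2_mcast(const uint8_t mac[6])
{
	return (mac[0] & 0x01) &&
	       !mac_is_ipv4_mcast(mac) && !mac_is_ipv6_mcast(mac);
}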
Allan W. Nielsen
2019-Jul-31 08:01 UTC
[Bridge] [PATCH] net: bridge: Allow bridge to joing multicast groups
The 07/31/2019 05:31, Andrew Lunn wrote:

> The PDF you linked to suggests this as well. But I think you will need
> to make some core changes to the bridge. At the moment, STP is a
> bridge-level property, but you are going to need it to be a per-port
> option. You can then use DLR on the ring ports, and optionally STP on
> the other ports.

I think you are right. I have not looked into the details of this. Our
plan was not to implement this to begin with. This will mean that it
cannot act as a gateway until this is done, but it still has good value
to be able to implement a DLR ring node.

> > But what we are looking at here is to offload a
> > non-aware-(DLR|MRP) switch which happens to be placed in a network
> > with these protocols running.
>
> So we need to think about why we are passing traffic to the CPU port,
> and under what conditions it can be blocked.
>
> 1) The interface is not part of a bridge. In this case, we only need
> the switch to pass to the CPU port MC addresses which have been set
> via set_rx_mode().

Yes. In our HW we are using the MAC table to do this, but this is an
implementation detail.

> I think this case does not apply for what you want. You have two ports
> bridged together as part of the ring.

Correct.

> 2) The interface is part of a bridge. There are a few sub-cases:
>
> a) IGMP snooping is being performed. We can block multicast where
> there is no interest in the group. But this is limited to IP
> multicast.

Agree. And this is done today by installing an explicit offload rule to
limit the flooding of a specific group.

> b) IGMP snooping is not being used and all interfaces in the bridge
> are ports of the switch. IP multicast can be blocked to the CPU.

Does it matter if IGMP snooping is used or not? A more general
statement could be:

- "All interfaces in the bridge are ports of the switch. IP multicast
  can be blocked to the CPU."

This assumes that if br0 joins an IP multicast group, then the needed
MAC addresses are added via set_rx_mode() (which I'm pretty sure is the
case).

Or much more aggressive (which is where we started this entire
discussion):

- "All interfaces in the bridge are ports of the switch. All multicast
  (except ff:ff:ff:ff:ff:ff, 33:33:xx:xx:xx:xx, 01:80:C2:xx:xx:xx) can
  be blocked to the CPU."

But then we are back to needing a way to request additional
L2-multicast MAC addresses towards the CPU. This has been discussed,
and I do not want to re-start that discussion, I just wanted to mention
it to have the complete picture.

> c) IGMP snooping is not being used and there is a non-switch interface
> in the bridge. Multicast is needed, so it can be flooded out this
> port.

Yes. And L2 multicast also always needs to go to the CPU regardless of
IGMP.

> d) set_rx_mode() has been called on the br0 interface, indicating
> there is interest in the packets on the host. They must be sent to the
> CPU so they can be delivered locally.

Yes - but today set_rx_mode() is not doing anything on br0. This was
one problem we had a few days ago, when we did not forward all traffic
to the CPU by default.

When looking at these cases, I'm not sure it matters if IGMP snooping
is running or not.
Here is how I understand what you say:

               || Foreign interfaces   | No foreign interfaces
===============||======================|============================
IGMP Snoop on  || IP multicast must    | IP multicast is not needed
               || go to CPU by default | in the CPU (by default*)
---------------||----------------------|----------------------------
IGMP Snoop off || IP multicast must    | IP multicast is not needed
               || go to CPU by default | in the CPU (by default*)

* This assumes that set_rx_mode() starts to do something for the br0
  interface (which does not seem to be the case right now - see
  br_dev_set_multicast_list).

> e) ????

e) Another case to consider is similar to d), but with VLANs. Assume
that br0.100 joins an IP multicast group, which will cause
set_rx_mode() to be called. As I read the code (I have not tried it, so
I might have gotten this wrong), we lose the VLAN information, because
the mc list is not VLAN aware. This means that if br0.100 joins a
group, then we will get that group for all VLANs.

Maybe it is better (for now) to hold on to the existing behaviour and
let all multicast go to the CPU by default, leave
br_dev_set_multicast_list empty, use IGMP snooping to optimize the
flooding, and hopefully we will have a way to limit the L2-multicast
flooding as an optimization which can be applied to "noisy" L2
multicast streams.

> Does the multicast MAC address being used by DLR also map to an IP
> multicast address?

No.

> 01:21:6C:00:00:0[123] appear to be the MAC addresses used by DLR.

Yes.

> IPv4 multicast MAC addresses are 01:00:5E:XX:XX:XX. IPv6 multicast MAC
> addresses are 33:33:XX:XX:XX:XX.

Yes, and there is no overlap.

> So one possibility here is to teach the SW bridge about non-IP
> multicast addresses. Initially the switch should forward all MAC
> multicast frames to the CPU. If the frame is not an IPv4 or IPv6
> frame, there has not been a call to set_rx_mode() for the MAC address
> on the br0 interface, and the bridge only contains switch ports,
> switchdev could be used to block the multicast frame to the CPU, but
> forward it out all other ports of the bridge.

Close, but not exactly (due to the arguments above). Here is how I see
it:

Teach the SW bridge about non-IP multicast addresses. Initially the
switch should forward all MAC multicast frames to the CPU. Today MDB
rules can be installed (either static or dynamic by IGMP) which limit
the flooding of IPv4/6 multicast streams. In the same way, we should
have a way to install a rule (FDB or MDB) to limit the flooding of L2
multicast frames.

If foreign interfaces (or br0 itself) are part of the destination list,
then traffic also needs to go to the CPU.

By doing this, we can, for an explicitly configured dst MAC address:
- limit the flooding on the SW bridge interfaces
- limit the flooding on the HW bridge interfaces
- prevent the frames from going to the CPU if they are not needed

-- 
/Allan
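As a rough illustration of the table and the rule proposed above, the
following sketch models the "does this multicast destination need the
CPU port?" decision. The struct and field names are invented for the
example and are not bridge or switchdev API:

#include <stdbool.h>

/*
 * Illustrative only: a condensed model of the cases discussed above,
 * not the bridge's real data structures.
 */
struct mcast_dst_state {
	bool bridge_has_foreign_port;  /* non-switchdev port in br0       */
	bool host_joined;              /* set_rx_mode() seen on br0       */
	bool offload_rule_installed;   /* static/IGMP MDB or L2 FDB entry */
};

/* Does a frame to this multicast destination need to hit the CPU port? */
static bool mcast_needs_cpu(const struct mcast_dst_state *s)
{
	/* d) the host itself is interested in the group */
	if (s->host_joined)
		return true;

	/* c) a foreign interface can only be reached via the CPU port */
	if (s->bridge_has_foreign_port)
		return true;

	/*
	 * Without an explicit rule, keep today's behaviour and flood to
	 * the CPU; only a configured MDB/FDB entry allows blocking it.
	 */
	return !s->offload_rule_installed;
}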
Andrew Lunn
2019-Jul-31 13:45 UTC
[Bridge] [PATCH] net: bridge: Allow bridge to joing multicast groups
> > 2) The interface is part of a bridge. There are a few sub-cases:
> >
> > a) IGMP snooping is being performed. We can block multicast where
> > there is no interest in the group. But this is limited to IP
> > multicast.
>
> Agree. And this is done today by installing an explicit offload rule
> to limit the flooding of a specific group.
>
> > b) IGMP snooping is not being used and all interfaces in the bridge
> > are ports of the switch. IP multicast can be blocked to the CPU.
>
> Does it matter if IGMP snooping is used or not? A more general
> statement could be:
>
> - "All interfaces in the bridge are ports of the switch. IP multicast
>   can be blocked to the CPU."

We have seen IPv6 neighbour discovery break in conditions like this. I
don't know the exact details.

You also need to watch out for 224.0.0.0/24. This is the link-local
multicast range. There is no need to join MC addresses in that range;
it is assumed they will always be received. So even if IGMP is enabled,
you still need to pass some multicast traffic to the CPU.

> > So one possibility here is to teach the SW bridge about non-IP
> > multicast addresses. Initially the switch should forward all MAC
> > multicast frames to the CPU. If the frame is not an IPv4 or IPv6
> > frame, there has not been a call to set_rx_mode() for the MAC
> > address on the br0 interface, and the bridge only contains switch
> > ports, switchdev could be used to block the multicast frame to the
> > CPU, but forward it out all other ports of the bridge.
>
> Close, but not exactly (due to the arguments above).
>
> Here is how I see it:
>
> Teach the SW bridge about non-IP multicast addresses. Initially the
> switch should forward all MAC multicast frames to the CPU. Today MDB
> rules can be installed (either static or dynamic by IGMP) which limit
> the flooding of IPv4/6 multicast streams. In the same way, we should
> have a way to install a rule (FDB or MDB) to limit the flooding of L2
> multicast frames.
>
> If foreign interfaces (or br0 itself) are part of the destination
> list, then traffic also needs to go to the CPU.
>
> By doing this, we can, for an explicitly configured dst MAC address:
> - limit the flooding on the SW bridge interfaces
> - limit the flooding on the HW bridge interfaces
> - prevent the frames from going to the CPU if they are not needed

This is all very complex because of all the different corner cases. So
I don't think we want a user API to do the CPU part; we want the
network stack to do it. Otherwise the user is going to get it wrong,
break their network, and then come running to the list for help.

This also fits with how we do things in DSA. There is deliberately no
user space concept for configuring the DSA CPU port. To user space, the
switch is just a bunch of Linux interfaces. Everything to do with the
CPU port is hidden away in the DSA core layer, the DSA drivers, and a
little bit in the bridge.

	Andrew
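To illustrate the 224.0.0.0/24 caveat: the whole link-local block maps
onto 01:00:5E:00:00:xx, no IGMP join is ever sent for it, so those
frames must keep going to the CPU regardless of snooping state. A small
sketch (plain C, helper names assumed, not kernel code):

#include <stdbool.h>
#include <stdint.h>

/* 224.0.0.0/24 is the IPv4 link-local multicast block; hosts never
 * send IGMP joins for it, so it must always reach the CPU port.
 */
static bool ipv4_mcast_is_link_local(uint32_t group_host_order)
{
	return (group_host_order & 0xffffff00u) == 0xe0000000u;
}

/* IPv4 group -> 01:00:5E + low 23 bits of the group address. The
 * whole 224.0.0.0/24 block therefore lands on 01:00:5E:00:00:00..FF.
 */
static void ipv4_mcast_to_mac(uint32_t group_host_order, uint8_t mac[6])
{
	mac[0] = 0x01;
	mac[1] = 0x00;
	mac[2] = 0x5e;
	mac[3] = (group_host_order >> 16) & 0x7f;
	mac[4] = (group_host_order >> 8) & 0xff;
	mac[5] = group_host_order & 0xff;
}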