Andrew Lunn
2019-Jul-30 14:34 UTC
[Bridge] [PATCH] net: bridge: Allow bridge to joing multicast groups
Hi Allan

Just throwing out another idea....

The whole offloading story has been you use the hardware to accelerate what the Linux stack can already do. In this case, you want to accelerate Device Level Ring, DLR. But i've not yet seen a software implementation of DLR. Should we really be considering first adding DLR to the SW bridge? Make it an alternative to the STP code?

Once we have a generic implementation we can then look at how it can be accelerated using switchdev.

Andrew
Allan W. Nielsen
2019-Jul-30 19:00 UTC
[Bridge] [PATCH] net: bridge: Allow bridge to joing multicast groups
The 07/30/2019 16:34, Andrew Lunn wrote:
> The whole offloading story has been you use the hardware to accelerate
> what the Linux stack can already do.

It is true, I have been quite keen on finding a way to control the forwarding of L2 multicast which will work in the same way with and without HW acceleration (and which we can HW offload with the HW I'm working on).

> In this case, you want to accelerate Device Level Ring, DLR.

It is actually not only for DLR; there are other ring protocols which have the same needs. MRP (Media Redundancy Protocol) is another example.

I just used DLR as an example because this is the one we expect to implement the protocol for first. There are other just as important use-cases.

> But i've not yet seen a software implementation of DLR. Should we really be
> considering first adding DLR to the SW bridge?

We have actually (slowly) started to work on a DLR SW implementation. We want to do this as a Linux driver instead of a user-space implementation, because there are other HW facilities we would like to offload (the HW has an automatic frame generator, which can generate the beacon frames, and a unit which can terminate the beacon frames and generate an interrupt if the beacon frames are not received).

Our plan was to implement this in pure SW, and then look at how to HW offload it.

But this will take some time before we have anything meaningful to show.

> Make it an alternative to the STP code?

I'm still working on learning the details of DLR, but I actually believe that it in some situations may co-exist with STP ;-)

DLR only works on ring topologies, but it is possible to connect a ring to a classic STP network. If doing so, then you are supposed to run DLR on the two ring ports, and (M)STP on the ports connecting to the remaining part of the network. As far as I recall, this is called a gateway node. But supporting this is optional, and will probably not be supported in the first implementation.

> Once we have a generic implementation we can then look at how it can
> be accelerated using switchdev.

I agree with you that we need a SW implementation of DLR before we can offload the DLR protocol to HW. But what we are looking at here is to offload a non-aware-(DLR|MRP)-switch which happens to be placed in a network with these protocols running.

It is not really DLR specific, which is why it seems reasonable to implement this without a DLR SW implementation up front.

--
/Allan
Andrew Lunn
2019-Jul-31 03:31 UTC
[Bridge] [PATCH] net: bridge: Allow bridge to joing multicast groups
> Our plan was to implement this in pure SW, and then look at how to HW offload
> it.

Great.

> But this will take some time before we have anything meaningful to show.
>
> > Make it an alternative to the STP code?
> I'm still working on learning the details of DLR, but I actually believe that it
> in some situations may co-exist with STP ;-)

The PDF you linked to suggests this as well. But i think you will need to make some core changes to the bridge. At the moment, STP is a bridge level property. But you are going to need it to be a per-port option. You can then use DLR on the ring ports, and optionally STP on the other ports.

> But what we are looking at here, is to offload a
> non-aware-(DLR|MRP)-switch which happens to be placed in a network
> with these protocols running.

So we need to think about why we are passing traffic to the CPU port, and under what conditions it can be blocked.

1) The interface is not part of a bridge. In this case, we only need the switch to pass to the CPU port MC addresses which have been set via set_rx_mode().

I think this case does not apply for what you want. You have two ports bridged together as part of the ring.

2) The interface is part of a bridge. There are a few sub-cases:

a) IGMP snooping is being performed. We can block multicast where there is no interest in the group. But this is limited to IP multicast.

b) IGMP snooping is not being used and all interfaces in the bridge are ports of the switch. IP multicast can be blocked to the CPU.

c) IGMP snooping is not being used and there is a non-switch interface in the bridge. Multicast is needed, so it can be flooded out this port.

d) set_rx_mode() has been called on the br0 interface, indicating there is interest in the packets on the host. They must be sent to the CPU so they can be delivered locally.

e) ????

Does the multicast MAC address being used by DLR also map to an IP multicast address? 01:21:6C:00:00:0[123] appear to be the MAC addresses used by DLR. IPv4 multicast MAC addresses are 01:00:5E:XX:XX:XX. IPv6 multicast MAC addresses are 33:33:XX:XX:XX:XX.

So one possibility here is to teach the SW bridge about non-IP multicast addresses. Initially the switch should forward all MAC multicast frames to the CPU. If the frame is not an IPv4 or IPv6 frame, and there has not been a call to set_rx_mode() for the MAC address on the br0 interface, and the bridge only contains switch ports, switchdev could be used to block the multicast frame to the CPU, but forward it out all other ports of the bridge.

Andrew
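
The rule proposed at the end of this message can be sketched in user space as below. This is only an illustration of the decision logic, not bridge or switchdev code. The names `Bridge`, `classify` and `cpu_copy_needed` are hypothetical, and the sketch assumes that "not an IPv4 or IPv6 frame" can be approximated by looking at the destination MAC prefix alone. For IPv4 it also applies the usual restriction that the top bit of the fourth octet is zero, which is slightly narrower than the 01:00:5E:XX:XX:XX pattern quoted above.

```python
from dataclasses import dataclass, field

IPV4_MCAST_PREFIX = bytes.fromhex("01005e")  # 01:00:5E:XX:XX:XX
IPV6_MCAST_PREFIX = bytes.fromhex("3333")    # 33:33:XX:XX:XX:XX


def parse_mac(text):
    """Turn 'aa:bb:cc:dd:ee:ff' into 6 raw bytes (case-insensitive)."""
    raw = bytes.fromhex(text.replace(":", ""))
    if len(raw) != 6:
        raise ValueError(f"not a 48-bit MAC address: {text!r}")
    return raw


def classify(mac):
    """Classify a destination MAC address by its prefix only."""
    raw = parse_mac(mac)
    if raw == b"\xff" * 6:
        return "broadcast"
    if not raw[0] & 0x01:  # group bit clear
        return "unicast"
    # IPv4 multicast uses only the lower half of the 01:00:5E block.
    if raw[:3] == IPV4_MCAST_PREFIX and not raw[3] & 0x80:
        return "ipv4-multicast"
    if raw[:2] == IPV6_MCAST_PREFIX:
        return "ipv6-multicast"
    return "non-ip-multicast"


@dataclass
class Bridge:
    """Hypothetical model of the bridge state the rule depends on."""
    only_switch_ports: bool
    # Addresses the host asked for via set_rx_mode() on br0 (raw bytes).
    rx_mode_addrs: set = field(default_factory=set)


def cpu_copy_needed(bridge, mac):
    """Return False only when the proposed rule allows blocking to the CPU."""
    if classify(mac) != "non-ip-multicast":
        return True  # outside the proposal: keep forwarding to the CPU
    if parse_mac(mac) in bridge.rx_mode_addrs:
        return True  # case d): the host has interest in this address
    if not bridge.only_switch_ports:
        return True  # case c): SW must flood to the non-switch interface
    return False     # block to the CPU, still forward out the other ports
```

With this model, a DLR address such as 01:21:6C:00:00:01 on a bridge made up only of switch ports is not copied to the CPU until the host registers interest in it.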