Nikolay Aleksandrov
2022-Mar-17 09:00 UTC
[Bridge] [PATCH v5 net-next 00/15] net: bridge: Multiple Spanning Trees
On 16/03/2022 17:08, Tobias Waldekranz wrote:> The bridge has had per-VLAN STP support for a while now, since: > > https://lore.kernel.org/netdev/20200124114022.10883-1-nikolay at cumulusnetworks.com/ > > The current implementation has some problems: > > - The mapping from VLAN to STP state is fixed as 1:1, i.e. each VLAN > is managed independently. This is awkward from an MSTP (802.1Q-2018, > Clause 13.5) point of view, where the model is that multiple VLANs > are grouped into MST instances. > > Because of the way that the standard is written, presumably, this is > also reflected in hardware implementations. It is not uncommon for a > switch to support the full 4k range of VIDs, but that the pool of > MST instances is much smaller. Some examples: > > Marvell LinkStreet (mv88e6xxx): 4k VLANs, but only 64 MSTIs > Marvell Prestera: 4k VLANs, but only 128 MSTIs > Microchip SparX-5i: 4k VLANs, but only 128 MSTIs > > - By default, the feature is enabled, and there is no way to disable > it. This makes it hard to add offloading in a backwards compatible > way, since any underlying switchdevs have no way to refuse the > function if the hardware does not support it > > - The port-global STP state has precedence over per-VLAN states. In > MSTP, as far as I understand it, all VLANs will use the common > spanning tree (CST) by default - through traffic engineering you can > then optimize your network to group subsets of VLANs to use > different trees (MSTI). To my understanding, the way this is > typically managed in silicon is roughly: > > Incoming packet: > .----.----.--------------.----.------------- > | DA | SA | 802.1Q VID=X | ET | Payload ... > '----'----'--------------'----'------------- > | > '->|\ .----------------------------. > | +--> | VID | Members | ... | MSTI | > PVID -->|/ |-----|---------|-----|------| > | 1 | 0001001 | ... | 0 | > | 2 | 0001010 | ... | 10 | > | 3 | 0001100 | ... | 10 | > '----------------------------' > | > .-----------------------------' > | .------------------------. > '->| MSTI | Fwding | Lrning | > |------|--------|--------| > | 0 | 111110 | 111110 | > | 10 | 110111 | 110111 | > '------------------------' > > What this is trying to show is that the STP state (whether MSTP is > used, or ye olde STP) is always accessed via the VLAN table. If STP > is running, all MSTI pointers in that table will reference the same > index in the STP stable - if MSTP is running, some VLANs may point > to other trees (like in this example). > > The fact that in the Linux bridge, the global state (think: index 0 > in most hardware implementations) is supposed to override the > per-VLAN state, is very awkward to offload. In effect, this means > that when the global state changes to blocking, drivers will have to > iterate over all MSTIs in use, and alter them all to match. This > also means that you have to cache whether the hardware state is > currently tracking the global state or the per-VLAN state. In the > first case, you also have to cache the per-VLAN state so that you > can restore it if the global state transitions back to forwarding. > > This series adds a new mst_enable bridge setting (as suggested by Nik) > that can only be changed when no VLANs are configured on the > bridge. Enabling this mode has the following effect: > > - The port-global STP state is used to represent the CST (Common > Spanning Tree) (1/15) > > - Ingress STP filtering is deferred until the frame's VLAN has been > resolved (1/15) > > - The preexisting per-VLAN states can no longer be controlled directly > (1/15). They are instead placed under the MST module's control, > which is managed using a new netlink interface (described in 3/15) > > - VLANs can br mapped to MSTIs in an arbitrary M:N fashion, using a > new global VLAN option (2/15) > > Switchdev notifications are added so that a driver can track: > - MST enabled state > - VID to MSTI mappings > - MST port states > > An offloading implementation is this provided for mv88e6xxx. > > A proposal for the corresponding iproute2 interface is available here: > > https://github.com/wkz/iproute2/tree/mst >Hi Tobias, One major missing thing is the selftests for this new feature. Do you have a plan to upstream them? Cheers, Nik
Tobias Waldekranz
2022-Mar-17 09:50 UTC
[Bridge] [PATCH v5 net-next 00/15] net: bridge: Multiple Spanning Trees
On Thu, Mar 17, 2022 at 11:00, Nikolay Aleksandrov <razor at blackwall.org> wrote:> On 16/03/2022 17:08, Tobias Waldekranz wrote: >> The bridge has had per-VLAN STP support for a while now, since: >> >> https://lore.kernel.org/netdev/20200124114022.10883-1-nikolay at cumulusnetworks.com/ >> >> The current implementation has some problems: >> >> - The mapping from VLAN to STP state is fixed as 1:1, i.e. each VLAN >> is managed independently. This is awkward from an MSTP (802.1Q-2018, >> Clause 13.5) point of view, where the model is that multiple VLANs >> are grouped into MST instances. >> >> Because of the way that the standard is written, presumably, this is >> also reflected in hardware implementations. It is not uncommon for a >> switch to support the full 4k range of VIDs, but that the pool of >> MST instances is much smaller. Some examples: >> >> Marvell LinkStreet (mv88e6xxx): 4k VLANs, but only 64 MSTIs >> Marvell Prestera: 4k VLANs, but only 128 MSTIs >> Microchip SparX-5i: 4k VLANs, but only 128 MSTIs >> >> - By default, the feature is enabled, and there is no way to disable >> it. This makes it hard to add offloading in a backwards compatible >> way, since any underlying switchdevs have no way to refuse the >> function if the hardware does not support it >> >> - The port-global STP state has precedence over per-VLAN states. In >> MSTP, as far as I understand it, all VLANs will use the common >> spanning tree (CST) by default - through traffic engineering you can >> then optimize your network to group subsets of VLANs to use >> different trees (MSTI). To my understanding, the way this is >> typically managed in silicon is roughly: >> >> Incoming packet: >> .----.----.--------------.----.------------- >> | DA | SA | 802.1Q VID=X | ET | Payload ... >> '----'----'--------------'----'------------- >> | >> '->|\ .----------------------------. >> | +--> | VID | Members | ... | MSTI | >> PVID -->|/ |-----|---------|-----|------| >> | 1 | 0001001 | ... | 0 | >> | 2 | 0001010 | ... | 10 | >> | 3 | 0001100 | ... | 10 | >> '----------------------------' >> | >> .-----------------------------' >> | .------------------------. >> '->| MSTI | Fwding | Lrning | >> |------|--------|--------| >> | 0 | 111110 | 111110 | >> | 10 | 110111 | 110111 | >> '------------------------' >> >> What this is trying to show is that the STP state (whether MSTP is >> used, or ye olde STP) is always accessed via the VLAN table. If STP >> is running, all MSTI pointers in that table will reference the same >> index in the STP stable - if MSTP is running, some VLANs may point >> to other trees (like in this example). >> >> The fact that in the Linux bridge, the global state (think: index 0 >> in most hardware implementations) is supposed to override the >> per-VLAN state, is very awkward to offload. In effect, this means >> that when the global state changes to blocking, drivers will have to >> iterate over all MSTIs in use, and alter them all to match. This >> also means that you have to cache whether the hardware state is >> currently tracking the global state or the per-VLAN state. In the >> first case, you also have to cache the per-VLAN state so that you >> can restore it if the global state transitions back to forwarding. >> >> This series adds a new mst_enable bridge setting (as suggested by Nik) >> that can only be changed when no VLANs are configured on the >> bridge. Enabling this mode has the following effect: >> >> - The port-global STP state is used to represent the CST (Common >> Spanning Tree) (1/15) >> >> - Ingress STP filtering is deferred until the frame's VLAN has been >> resolved (1/15) >> >> - The preexisting per-VLAN states can no longer be controlled directly >> (1/15). They are instead placed under the MST module's control, >> which is managed using a new netlink interface (described in 3/15) >> >> - VLANs can br mapped to MSTIs in an arbitrary M:N fashion, using a >> new global VLAN option (2/15) >> >> Switchdev notifications are added so that a driver can track: >> - MST enabled state >> - VID to MSTI mappings >> - MST port states >> >> An offloading implementation is this provided for mv88e6xxx. >> >> A proposal for the corresponding iproute2 interface is available here: >> >> https://github.com/wkz/iproute2/tree/mst >> > > Hi Tobias, > One major missing thing is the selftests for this new feature. Do you > have a plan to upstream them?100% agree. I have an internal test that I plan to adapt to run as a kselftest. There's a bootstrapping problem here though. I can't send the iproute2 series until the kernel support is merged - and until I know how the iproute2 support ends up looking I can't add a kselftest. Ideally, tools/iproute2 would be a thing in the kernel. Then you could send the entire implementation as one series. I'm sure that's probably been discussed many times already, but my Google-fu fails me.