Stephen Hemminger
2020-Jan-09 16:19 UTC
[Bridge] [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)
On Thu, 9 Jan 2020 16:06:37 +0100
Horatiu Vultur <horatiu.vultur at microchip.com> wrote:

> Media Redundancy Protocol is a data network protocol standardized by
> International Electrotechnical Commission as IEC 62439-2. It allows rings of
> Ethernet switches to overcome any single failure with recovery time faster than
> STP. It is primarily used in Industrial Ethernet applications.
>
> This is the first proposal of implementing a subset of the standard. It supports
> only 2 roles of an MRP node: Media Redundancy Manager (MRM) and
> Media Redundancy Client (MRC). In an MRP ring, each node needs to support MRP and
> a ring can have only one MRM and multiple MRCs. It is possible to have multiple
> instances of MRP on a single node, but a port can be part of only one MRP
> instance.
>
> The MRM is responsible for detecting when there is a loop in the ring. It
> sends the frame MRP_Test to detect loops. It sends MRP_Test on both
> ports in the ring and if the frame is received at the other end, then the ring
> is closed, meaning that there is a loop. In this case it sets the port state to
> BLOCKED, not allowing traffic to pass through except MRP frames. If it
> stops receiving MRP_Test frames from itself, then the MRM detects that the
> ring is open; it therefore notifies the other nodes of this change and
> sets the state of the port to FORWARDING.
>
> The MRC is responsible for forwarding MRP_Test frames between the ring ports
> (and not flooding them on other ports) and for listening for changes in the
> network in order to clear the FDB.
>
> Similar to STP, MRP is implemented on top of the bridge and they can't be
> enabled at the same time. While STP runs on all ports of the bridge, MRP needs to
> run only on 2 ports.
>
> The bridge needs to:
> - notify when the link of one of the ports goes down or up, because the MRP
>   instance needs to react to link changes by sending MRP_LinkChange frames.
> - notify when one of the ports is removed from the bridge or when the bridge
>   is destroyed, because if the port is part of the MRP ring then the MRP state
>   machine should be stopped.
> - add a handler to allow the MRP instance to process MRP frames, if MRP is
>   enabled. This is similar to the STP design.
> - add logic for MRP frames inside the bridge. The bridge will just detect MRP
>   frames and forward them to the upper layer for processing.
> - update the forwarding logic for non-MRP frames. If MRP is enabled, then also
>   look at the state of the port to decide whether to forward or not.
>
> To create an MRP instance on the bridge:
> $ bridge mrp add dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1
>
> Where:
> p_port, s_port: can be any port under the bridge
> ring_role: can have the value 1 (MRC - Media Redundancy Client) or
>   2 (MRM - Media Redundancy Manager). A ring can have only one MRM.
> ring_id: unique id for each MRP instance.
>
> It is possible to create multiple instances. Each instance has to have its own
> ring_id and a port can't be part of multiple instances:
> $ bridge mrp add dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2
>
> To see current MRP instances and their status:
> $ bridge mrp show
> dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2 ring_state 3
> dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1 ring_state 4
>
> If this patch series is well received, then in the future it could be extended
> with the following:
> - add support for Media Redundancy Automanager. This role allows a node to
>   detect whether it needs to behave as an MRM or MRC. The advantage of this role
>   is that the user doesn't need to configure the nodes each time they are
>   added/removed from a ring, and it adds redundancy to the manager.
> - add support for Interconnect rings. This allows connecting multiple rings.
> - add HW offloading. The standard defines 4 recovery times (500, 200, 30 and 10
>   ms). To be able to achieve 30 and 10 ms, the HW is required to generate the
>   MRP_Test frames and detect when the ring is open/closed.
>
> Horatiu Vultur (3):
>   net: bridge: mrp: Add support for Media Redundancy Protocol
>   net: bridge: mrp: Integrate MRP into the bridge
>   net: bridge: mrp: Add netlink support to configure MRP
>
>  include/uapi/linux/if_bridge.h |   27 +
>  include/uapi/linux/if_ether.h  |    1 +
>  include/uapi/linux/rtnetlink.h |    7 +
>  net/bridge/Kconfig             |   12 +
>  net/bridge/Makefile            |    2 +
>  net/bridge/br.c                |   19 +
>  net/bridge/br_device.c         |    3 +
>  net/bridge/br_forward.c        |    1 +
>  net/bridge/br_if.c             |   10 +
>  net/bridge/br_input.c          |   22 +
>  net/bridge/br_mrp.c            | 1517 ++++++++++++++++++++++++++++++++
>  net/bridge/br_mrp_timer.c      |  227 +++++
>  net/bridge/br_netlink.c        |    9 +
>  net/bridge/br_private.h        |   30 +
>  net/bridge/br_private_mrp.h    |  208 +++++
>  security/selinux/nlmsgtab.c    |    5 +-
>  16 files changed, 2099 insertions(+), 1 deletion(-)
>  create mode 100644 net/bridge/br_mrp.c
>  create mode 100644 net/bridge/br_mrp_timer.c
>  create mode 100644 net/bridge/br_private_mrp.h

Can this be implemented in userspace?

Putting STP in the kernel was a mistake (even the original author says so).
Adding more control protocols in the kernel is a security and stability risk.
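For readers following the thread, the MRM behaviour described in the cover letter (send MRP_Test, block a ring port when the frame comes back, unblock and notify when it stops coming back) can be sketched roughly as below. This is only an illustration in Python, not code from the patch series: the class and method names, the choice of the secondary port as the one that gets blocked, and the `max_missed` threshold are all assumptions made for the example.

```python
from enum import Enum


class PortState(Enum):
    BLOCKED = "blocked"
    FORWARDING = "forwarding"


class RingState(Enum):
    OPEN = "open"
    CLOSED = "closed"


class MrmSketch:
    """Toy model of a Media Redundancy Manager; all names are hypothetical."""

    def __init__(self, max_missed=3):
        # max_missed is an illustrative threshold, not a value from IEC 62439-2.
        self.max_missed = max_missed
        self.missed = 0
        self.ring_state = RingState.OPEN
        # Assumption: the secondary ring port is the one that gets blocked.
        self.s_port_state = PortState.FORWARDING
        # Stands in for the frames the MRM would send to the other nodes.
        self.notifications = []

    def test_interval_elapsed(self, own_test_frame_seen):
        """Call once per MRP_Test interval.

        own_test_frame_seen is True when the MRP_Test frame sent on one ring
        port was received on the other ring port during this interval.
        """
        if own_test_frame_seen:
            self.missed = 0
            if self.ring_state is RingState.OPEN:
                # The test frame travelled all the way round: there is a loop,
                # so block one ring port for everything except MRP frames.
                self.ring_state = RingState.CLOSED
                self.s_port_state = PortState.BLOCKED
        else:
            self.missed += 1
            if (self.ring_state is RingState.CLOSED
                    and self.missed >= self.max_missed):
                # Test frames stopped coming back: the ring is open, so
                # forward on the blocked port and tell the other nodes.
                self.ring_state = RingState.OPEN
                self.s_port_state = PortState.FORWARDING
                self.notifications.append("ring_open")
        return self.ring_state
```

The method would be driven by a periodic timer; in the HW offload case discussed in this thread, the equivalent decision would be made from an interrupt rather than from per-frame processing on the CPU.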
Asbjørn Sloth Tønnesen
2020-Jan-09 17:41 UTC
[Bridge] [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)
Hi Horatiu and Stephen,

Horatiu, thanks for giving this a try. I am looking forward to maybe someday
being able to run ERPS on white box switches.

On 1/9/20 4:19 PM, Stephen Hemminger wrote:
> Can this be implemented in userspace?
>
> Putting STP in the kernel was a mistake (even original author says so).
> Adding more control protocols in kernel is a security and stability risk.

Other cases are VRRP, ERPS (ITU-T G.8032) and VRRP group.

My use-case might not be common, but I have machines with about 10k net_dev
(QinQ). I would like to be able to do VRRP group on the outer VLANs, which are
only a few hundred instances, without excessive context switching.

I would then keep the normal keep-alive state machine in the kernel: basically
a BPF-based timed periodic packet emitter facility and an XDP receive hook. So
only setup and event handling has to be context-switched to user-space.

Unfortunately I haven't had time to explore this yet, but I think such an
approach could solve a few of the reasons that scalable bridge/ring/HA
protocols have to wait 20 years before being implemented in Linux.

--
Best regards Asbjørn Sloth Tønnesen
Network Engineer
Fiberby ApS - AS42541
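The split proposed above (keep-alive handling stays close to the packets, only state changes cross into user-space) might look something like the following sketch. It is plain Python standing in for what would really be a BPF/XDP facility; `KeepAliveTable`, its methods and the instance names are hypothetical and only illustrate that per-packet work produces no event unless a peer's state actually changes.

```python
class KeepAliveTable:
    """Tracks the last keep-alive per instance; hypothetical illustration."""

    def __init__(self, timeout):
        self.timeout = timeout      # seconds without a keep-alive before "down"
        self.last_seen = {}         # instance -> timestamp of last keep-alive
        self.down = set()           # instances currently considered down

    def on_receive(self, instance, now):
        """Receive-hook side: runs for every keep-alive frame.

        Returns an event only when the instance comes back up, so the common
        case (peer still alive) needs no hand-off to user-space.
        """
        self.last_seen[instance] = now
        if instance in self.down:
            self.down.discard(instance)
            return ("up", instance)
        return None

    def on_tick(self, now):
        """Timer side: returns the instances that have just timed out."""
        events = []
        for instance, seen in self.last_seen.items():
            if instance not in self.down and now - seen > self.timeout:
                self.down.add(instance)
                events.append(("down", instance))
        return events
```

With a few hundred instances, user-space would then only be woken for the occasional `("down", ...)` or `("up", ...)` event instead of for every keep-alive frame.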
Horatiu Vultur
2020-Jan-10 09:02 UTC
[Bridge] [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)
The 01/09/2020 08:19, Stephen Hemminger wrote:
> On Thu, 9 Jan 2020 16:06:37 +0100
> Horatiu Vultur <horatiu.vultur at microchip.com> wrote:
>
> > [...]
>
> Can this be implemented in userspace?

The reason for putting this in kernel space is to be able to HW offload it in
the switchdev/DSA driver. The switches which typically support this are small
and don't have a lot of CPU power, and the bandwidth between the CPU and the
switch core is typically limited (at least this is the case with the switches
that we are working with). Therefore we need to use HW offload components which
can inject the frames at the needed frequency, and other components which can
terminate the expected frames and just raise an interrupt if the test frames
are not received as expected (and a few other HW features).

To put this in user-space we see two options:

1. We need to define a netlink interface which allows a user-space control
   application to ask the kernel to ask the switchdev driver to set up the
   frame-injector or frame-terminator. In theory this would be possible, and we
   have considered it, but we think that this interface would be too specific to
   our HW and would need to be changed every time we want to add support for a
   new SoC. By focusing the user-space interfaces on the protocol requirements,
   we feel more confident that we have an interface with which we can remain
   backwards compatible, and which can also support future/other chips with
   whatever facilities (if any) they have for HW offload.

2. Do a UIO driver and keep protocol and driver in user-space. We do not really
   like this approach for many reasons: it pretty much prevents us from
   collaborating with the community to solve this, and it would be really hard
   to have the switchdev driver controlling part of the chip and a user-space
   driver controlling other parts.

> Putting STP in the kernel was a mistake (even original author says so).
> Adding more control protocols in kernel is a security and stability risk.

--
/Horatiu
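The argument in option 1, that the interface should describe the protocol requirement and leave the mechanism to the driver, can be illustrated with a small sketch. It is Python purely for brevity and does not reflect any real switchdev or netlink API: `RingTestBackend`, `SoftwareBackend`, `OffloadBackend` and `configure_ring` are invented names, and the logged strings just stand in for whatever a real driver would do.

```python
from abc import ABC, abstractmethod


class RingTestBackend(ABC):
    """Protocol-level request: 'send MRP_Test every interval_ms'."""

    def __init__(self):
        self.log = []  # records what this backend did, for illustration only

    @abstractmethod
    def start_test(self, interval_ms):
        raise NotImplementedError


class SoftwareBackend(RingTestBackend):
    """No offload available: the CPU generates and checks the test frames."""

    def start_test(self, interval_ms):
        self.log.append(f"software timer: send MRP_Test every {interval_ms} ms")


class OffloadBackend(RingTestBackend):
    """Hypothetical chip with a frame injector and a frame terminator."""

    def start_test(self, interval_ms):
        self.log.append(f"frame injector: MRP_Test every {interval_ms} ms")
        self.log.append("frame terminator: interrupt on missing MRP_Test")


def configure_ring(backend, interval_ms):
    # The caller states only the protocol requirement; it never names the
    # injector or terminator, so a new SoC does not change this call.
    backend.start_test(interval_ms)
    return backend.log
```

The same `configure_ring()` call works against either backend, which is the property the reply is arguing for: chip-specific facilities can change or be absent without changing the interface that user-space sees.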