Horatiu Vultur
2020-Jan-09 15:06 UTC
[Bridge] [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)
Media Redundancy Protocol is a data network protocol standardized by the
International Electrotechnical Commission as IEC 62439-2. It allows rings
of Ethernet switches to overcome any single failure with a recovery time
faster than STP. It is primarily used in Industrial Ethernet applications.

This is the first proposal implementing a subset of the standard. It
supports only 2 roles of an MRP node: Media Redundancy Manager (MRM) and
Media Redundancy Client (MRC).

In an MRP ring, each node needs to support MRP, and a ring can have only
one MRM and multiple MRCs. It is possible to have multiple instances of
MRP on a single node, but a port can be part of only one MRP instance.

The MRM is responsible for detecting when there is a loop in the ring. It
sends MRP_Test frames on both ring ports, and if a frame is received at
the other end, then the ring is closed, meaning that there is a loop. In
this case it sets the port state to BLOCKED, not allowing traffic to pass
through except MRP frames. If the MRM stops receiving its own MRP_Test
frames, it detects that the ring is open; it then notifies the other
nodes of this change and sets the state of the port to FORWARDING.

The MRC is responsible for forwarding MRP_Test frames between the ring
ports (without flooding them on other ports) and for listening for
changes in the network so that it can clear the FDB.

Similar to STP, MRP is implemented on top of the bridge, and the two
can't be enabled at the same time. While STP runs on all ports of the
bridge, MRP needs to run only on 2 ports.

The bridge needs to:
- notify when the link of one of the ports goes down or up, because the
  MRP instance needs to react to link changes by sending MRP_LinkChange
  frames.
- notify when one of the ports is removed from the bridge or when the
  bridge is destroyed, because if the port is part of the MRP ring then
  the MRP state machine should be stopped.
- add a handler to allow the MRP instance to process MRP frames, if MRP
  is enabled. This is similar to the STP design.
- add logic for MRP frames inside the bridge. The bridge will just detect
  MRP frames and forward them to the upper layer for processing.
- update the logic that handles non-MRP frames. If MRP is enabled, then
  also look at the MRP state of the port to decide whether to forward or
  not.

To create an MRP instance on the bridge:
$ bridge mrp add dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1

Where:
p_port, s_port: can be any port under the bridge
ring_role: can have the value 1 (MRC - Media Redundancy Client) or
           2 (MRM - Media Redundancy Manager). A ring can have only one
           MRM.
ring_id: unique id for each MRP instance.

It is possible to create multiple instances. Each instance has to have
its own ring_id and a port can't be part of multiple instances:
$ bridge mrp add dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2

To see current MRP instances and their status:
$ bridge mrp show
dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2 ring_state 3
dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1 ring_state 4

If this patch series is well received, then in the future it could be
extended with the following:
- add support for Media Redundancy Automanager. This role allows a node
  to detect whether it needs to behave as an MRM or an MRC. The advantage
  of this role is that the user doesn't need to configure the nodes each
  time they are added to or removed from a ring, and it adds redundancy
  to the manager.
- add support for Interconnect rings. This allows multiple rings to be
  connected.
- add HW offloading. The standard defines 4 recovery times (500, 200, 30
  and 10 ms). To achieve 30 and 10 ms, the HW is required to generate the
  MRP_Test frames and to detect when the ring is open/closed.
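
As an illustration of the MRM behaviour described above, here is a minimal
user-space sketch. It is not the kernel code from this series: the names
(mrm_sim, mrm_tick, MAX_MISSED) are hypothetical, and the threshold of 3
missed test intervals is an assumption chosen only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the structures used in the patches. */
enum port_state { PORT_BLOCKED, PORT_FORWARDING };

struct mrm_sim {
	enum port_state s_port;	/* state of the blocked/unblocked ring port */
	int missed;		/* consecutive intervals without our MRP_Test */
};

#define MAX_MISSED 3	/* assumed threshold, for illustration only */

/* Called once per test interval. test_returned tells whether the MRM's own
 * MRP_Test frame arrived back on the other ring port during that interval.
 */
void mrm_tick(struct mrm_sim *m, bool test_returned)
{
	assert(m != NULL);

	if (test_returned) {
		/* Ring is closed: there is a loop, so keep one port blocked. */
		m->missed = 0;
		m->s_port = PORT_BLOCKED;
		return;
	}

	if (++m->missed >= MAX_MISSED) {
		/* Ring is open: let traffic through the port again. */
		m->s_port = PORT_FORWARDING;
	}
}
```

Calling mrm_tick() with test_returned set to false for MAX_MISSED
consecutive intervals moves the port to PORT_FORWARDING; the next returned
test frame moves it back to PORT_BLOCKED. The real state machine also
sends topology change notifications on these transitions, which this
sketch leaves out.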

Horatiu Vultur (3):
  net: bridge: mrp: Add support for Media Redundancy Protocol
  net: bridge: mrp: Integrate MRP into the bridge
  net: bridge: mrp: Add netlink support to configure MRP

 include/uapi/linux/if_bridge.h |   27 +
 include/uapi/linux/if_ether.h  |    1 +
 include/uapi/linux/rtnetlink.h |    7 +
 net/bridge/Kconfig             |   12 +
 net/bridge/Makefile            |    2 +
 net/bridge/br.c                |   19 +
 net/bridge/br_device.c         |    3 +
 net/bridge/br_forward.c        |    1 +
 net/bridge/br_if.c             |   10 +
 net/bridge/br_input.c          |   22 +
 net/bridge/br_mrp.c            | 1517 ++++++++++++++++++++++++++++++++
 net/bridge/br_mrp_timer.c      |  227 +++++
 net/bridge/br_netlink.c        |    9 +
 net/bridge/br_private.h        |   30 +
 net/bridge/br_private_mrp.h    |  208 +++++
 security/selinux/nlmsgtab.c    |    5 +-
 16 files changed, 2099 insertions(+), 1 deletion(-)
 create mode 100644 net/bridge/br_mrp.c
 create mode 100644 net/bridge/br_mrp_timer.c
 create mode 100644 net/bridge/br_private_mrp.h

--
2.17.1

Horatiu Vultur
2020-Jan-09 15:06 UTC
[Bridge] [RFC net-next Patch 1/3] net: bridge: mrp: Add support for Media Redundancy Protocol
This patch implements the core MRP state machines and the generation and
parsing of the frames. The implementation is on top of the bridge.

All MRP frames are received by the function 'br_mrp_recv', which adds
each frame to a queue for processing. For each frame it needs to decide
whether the frame is dropped, processed or forwarded. These decisions are
taken by the functions 'br_mrp_should_drop', 'br_mrp_should_process' and
'br_mrp_should_forward'.

Currently only 3 MRP frame types are supported: MRP_Test, MRP_Topo and
MRP_LinkChange.
- MRP_Test frames are generated by the MRM to detect if the ring is open
  or closed.
- MRP_Topo frames are generated by the MRM to notify the network that the
  topology has changed.
- MRP_LinkChange frames are generated by an MRC to notify the MRM that
  the node lost connectivity on one of its ports.

All these frames need to be sent out multiple times at different
intervals. To do that, there is a special workqueue to which all this
work is added. This is implemented in the file 'br_mrp_timer.c'.

Each role has its own state machine:
- Media Redundancy Manager (MRM) can be in one of the states: AC_STAT1,
  PRM_UP, CHK_RO, CHK_RC. The MRM is responsible for sending MRP_Test and
  MRP_Topo frames on both of the ring ports. It also needs to process
  MRP_Test frames; if it receives one, the ring is closed and it changes
  the state to CHK_RC. Whenever it detects that the ring is open (it
  didn't receive MRP_Test frames within a configured interval), it sends
  MRP_Topo frames on both of the ring ports to notify the other nodes in
  the ring that the topology of the network has changed. The MRM needs to
  process MRP_LinkChange frames because they indicate a change in the
  topology. If the MRM is in the state CHK_RC, it blocks one of the
  ports, not allowing traffic to flow except MRP frames and the frames
  specified in IEEE 802.1D-2004 Table 7-10.
- Media Redundancy Client (MRC) can be in one of the states: AC_STAT1,
  DE_IDLE, PT, DE, PT_IDLE. The MRC is responsible for sending
  MRP_LinkChange frames when one of the ring ports loses connectivity. It
  needs to process MRP_Topo frames; this frame contains a field which
  indicates the time after which the FDB needs to be cleared. The MRC
  needs to forward all the MRP frames. In all the states the MRC sets the
  port to the forwarding state, except when the port is down.

The standard supports 4 recovery times: 500ms, 200ms, 30ms and 10ms. The
implementation adds support for all of them, with 500ms as the default,
but it looks hard to achieve 30ms or 10ms without hardware support.

To decide if a non-MRP frame can be sent to the other ring port, the
function 'should_deliver' is extended to also check the function
'br_mrp_allow_egress'.

Question: the function 'br_mrp_allow_egress' looks at the MRP state of
the port, which is reached through a pointer. This could be a race
condition: the port can be removed while the function is running, because
'br_mrp_allow_egress' is not protected by rtnl_lock. It would be overkill
to take this lock for each frame. What is the correct solution here?
Should I make the mrp_port an RCU pointer?

If MRP is not enabled, then MRP frames and non-MRP frames are forwarded
as before.
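
The process/forward decisions described above can be summarised in a
small user-space sketch. It is a simplified illustration, not the code
from this patch: the enum and function names are hypothetical stand-ins,
the port-state checks done by 'br_mrp_should_drop' are left out, and only
the role values (1 for MRC, 2 for MRM) are taken from the cover letter.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the enums in the patch. */
enum ring_role { ROLE_MRC = 1, ROLE_MRM = 2 };

enum frame_type {
	FRAME_TEST,
	FRAME_TOPO,
	FRAME_LINK_DOWN,
	FRAME_LINK_UP,
	FRAME_UNKNOWN,
};

/* The MRM consumes MRP_Test and MRP_LinkChange frames, while an MRC
 * consumes MRP_Topo frames. Unknown frame types are never processed.
 */
bool should_process(enum ring_role role, enum frame_type type)
{
	assert(role == ROLE_MRC || role == ROLE_MRM);

	switch (type) {
	case FRAME_TEST:
	case FRAME_LINK_DOWN:
	case FRAME_LINK_UP:
		return role == ROLE_MRM;
	case FRAME_TOPO:
		return role == ROLE_MRC;
	default:
		return false;
	}
}

/* Only an MRC passes known MRP frames on to its other ring port. */
bool should_forward(enum ring_role role, enum frame_type type)
{
	assert(role == ROLE_MRC || role == ROLE_MRM);

	if (type == FRAME_UNKNOWN)
		return false;

	return role == ROLE_MRC;
}
```

For example, an MRC receiving an MRP_Topo frame both processes it (to
schedule the FDB flush) and forwards it, which is why the receive path in
the patch clones the skb before processing.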

Signed-off-by: Horatiu Vultur <horatiu.vultur at microchip.com>
---
 include/uapi/linux/if_ether.h |    1 +
 net/bridge/Kconfig            |   12 +
 net/bridge/Makefile           |    2 +
 net/bridge/br_mrp.c           | 1236 +++++++++++++++++++++++++++++++++
 net/bridge/br_mrp_timer.c     |  227 ++++++
 net/bridge/br_private_mrp.h   |  199 ++++++
 6 files changed, 1677 insertions(+)
 create mode 100644 net/bridge/br_mrp.c
 create mode 100644 net/bridge/br_mrp_timer.c
 create mode 100644 net/bridge/br_private_mrp.h

diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
index f6ceb2e63d1e..fb8a140b3bba 100644
--- a/include/uapi/linux/if_ether.h
+++ b/include/uapi/linux/if_ether.h
@@ -92,6 +92,7 @@
 #define ETH_P_PREAUTH	0x88C7		/* 802.11 Preauthentication */
 #define ETH_P_TIPC	0x88CA		/* TIPC */
 #define ETH_P_LLDP	0x88CC		/* Link Layer Discovery Protocol */
+#define ETH_P_MRP	0x88E3		/* Media Redundancy Protocol */
 #define ETH_P_MACSEC	0x88E5		/* 802.1ae MACsec */
 #define ETH_P_8021AH	0x88E7		/* 802.1ah Backbone Service Tag */
 #define ETH_P_MVRP	0x88F5		/* 802.1Q MVRP */
diff --git a/net/bridge/Kconfig b/net/bridge/Kconfig
index e4fb050e2078..d07e3901aff6 100644
--- a/net/bridge/Kconfig
+++ b/net/bridge/Kconfig
@@ -61,3 +61,15 @@ config BRIDGE_VLAN_FILTERING
 	  Say N to exclude this support and reduce the binary size.
 
 	  If unsure, say Y.
+
+config BRIDGE_MRP
+	bool "MRP protocol"
+	depends on BRIDGE
+	default n
+	help
+	  If you say Y here, then the Ethernet bridge will be able to run MRP
+	  protocol to detect loops.
+
+	  Say N to exclude this support and reduce the binary size.
+
+	  If unsure, say N.
diff --git a/net/bridge/Makefile b/net/bridge/Makefile
index ac9ef337f0fa..917826c9d8de 100644
--- a/net/bridge/Makefile
+++ b/net/bridge/Makefile
@@ -25,3 +25,5 @@ bridge-$(CONFIG_BRIDGE_VLAN_FILTERING) += br_vlan.o br_vlan_tunnel.o
 bridge-$(CONFIG_NET_SWITCHDEV) += br_switchdev.o
 
 obj-$(CONFIG_NETFILTER) += netfilter/
+
+bridge-$(CONFIG_BRIDGE_MRP) += br_mrp.o br_mrp_timer.o
diff --git a/net/bridge/br_mrp.c b/net/bridge/br_mrp.c
new file mode 100644
index 000000000000..a84aab3f7114
--- /dev/null
+++ b/net/bridge/br_mrp.c
@@ -0,0 +1,1236 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2020 Microchip Corporation */
+
+#include <linux/netdevice.h>
+#include <linux/netfilter_bridge.h>
+
+#include "br_private_mrp.h"
+
+static const u8 mrp_test_dmac[ETH_ALEN] = { 0x1, 0x15, 0x4e, 0x0, 0x0, 0x1 };
+static const u8 mrp_control_dmac[ETH_ALEN] = { 0x1, 0x15, 0x4e, 0x0, 0x0, 0x2 };
+
+static bool br_mrp_is_port_up(const struct net_bridge_port *p)
+{
+	return netif_running(p->dev) && netif_oper_up(p->dev);
+}
+
+static bool br_mrp_is_ring_port(const struct net_bridge_port *p)
+{
+	return p->mrp_port->role == BR_MRP_PORT_ROLE_PRIMARY ||
+	       p->mrp_port->role == BR_MRP_PORT_ROLE_SECONDARY;
+}
+
+/* Determines if a port is part of a MRP instance */
+static bool br_mrp_is_mrp_port(const struct net_bridge_port *p)
+{
+	if (!p->mrp_port || !p->mrp_port->mrp)
+		return false;
+
+	return true;
+}
+
+static void br_mrp_reset_ring_state(struct br_mrp *mrp)
+{
+	br_mrp_timer_stop(mrp);
+	mrp->mrm_state = BR_MRP_MRM_STATE_AC_STAT1;
+	mrp->mrc_state = BR_MRP_MRC_STATE_AC_STAT1;
+}
+
+static char *br_mrp_get_mrm_state(enum br_mrp_mrm_state_type state)
+{
+	switch (state) {
+	case BR_MRP_MRM_STATE_AC_STAT1: return "AC_STAT1";
+	case BR_MRP_MRM_STATE_PRM_UP: return "PRM_UP";
+	case BR_MRP_MRM_STATE_CHK_RO: return "CHK_RO";
+	case BR_MRP_MRM_STATE_CHK_RC: return "CHK_RC";
+	default: return "Unknown MRM state";
+	}
+}
+
+static char *br_mrp_get_mrc_state(enum br_mrp_mrc_state_type state)
+{
+	switch (state) {
+	case BR_MRP_MRC_STATE_AC_STAT1: return "AC_STAT1";
+	case BR_MRP_MRC_STATE_DE_IDLE: return "DE_IDLE";
+	case BR_MRP_MRC_STATE_PT: return "PT";
+	case BR_MRP_MRC_STATE_DE: return "DE";
+	case BR_MRP_MRC_STATE_PT_IDLE: return "PT_IDLE";
+	default: return "Unknown MRC state";
+	}
+}
+
+static void br_mrp_set_mrm_init(struct br_mrp *mrp)
+{
+	mrp->add_test = false;
+	mrp->no_tc = false;
+	mrp->ring_test_curr = 0;
+}
+
+static void br_mrp_set_mrc_init(struct br_mrp *mrp)
+{
+	mrp->ring_link_curr_max = mrp->ring_link_conf_max;
+	mrp->ring_test_curr = 0;
+}
+
+void br_mrp_set_mrm_state(struct br_mrp *mrp,
+			  enum br_mrp_mrm_state_type state)
+{
+	br_debug(mrp->br, "mrm_state: %s\n", br_mrp_get_mrm_state(state));
+	mrp->mrm_state = state;
+}
+
+void br_mrp_set_mrc_state(struct br_mrp *mrp,
+			  enum br_mrp_mrc_state_type state)
+{
+	br_debug(mrp->br, "mrc_state: %s\n", br_mrp_get_mrc_state(state));
+	mrp->mrc_state = state;
+}
+
+static int br_mrp_set_mrm_role(struct br_mrp *mrp)
+{
+	/* If MRP instance doesn't have set both ports, then it can't have a
+	 * role
+	 */
+	if (!mrp->p_port || !mrp->s_port)
+		return -EINVAL;
+
+	/* When changing the role everything is reset */
+	br_mrp_reset_ring_state(mrp);
+	br_mrp_set_mrm_init(mrp);
+
+	br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_AC_STAT1);
+
+	mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED;
+	mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED;
+	mrp->ring_role = BR_MRP_RING_ROLE_MRM;
+
+	if (br_mrp_is_port_up(mrp->p_port))
+		br_mrp_port_link_change(mrp->p_port, true);
+
+	if (br_mrp_is_port_up(mrp->s_port))
+		br_mrp_port_link_change(mrp->s_port, true);
+
+	return 0;
+}
+
+static int br_mrp_set_mrc_role(struct br_mrp *mrp)
+{
+	/* If MRP instance doesn't have set both ports, then it can't have a
+	 * role
+	 */
+	if (!mrp->p_port || !mrp->s_port)
+		return -EINVAL;
+
+	/* When changing the role everything is reset */
+	br_mrp_reset_ring_state(mrp);
+	br_mrp_set_mrc_init(mrp);
+
br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_AC_STAT1); + + mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + mrp->ring_role = BR_MRP_RING_ROLE_MRC; + + if (br_mrp_is_port_up(mrp->p_port)) + br_mrp_port_link_change(mrp->p_port, true); + + if (br_mrp_is_port_up(mrp->s_port)) + br_mrp_port_link_change(mrp->s_port, true); + + return 0; +} + +static int br_mrp_send_finish(struct net *net, struct sock *sk, + struct sk_buff *skb) +{ + return dev_queue_xmit(skb); +} + +/* According to the standard each frame has a different sequence number. If it + * is MRP_Test, MRP_TopologyChange or MRP_LinkChange + */ +static u16 br_mrp_next_seq(struct br_mrp *mrp) +{ + mrp->seq_id++; + return mrp->seq_id; +} + +static enum br_mrp_ring_state_type br_mrp_ring_state(struct br_mrp *mrp) +{ + return mrp->mrm_state == BR_MRP_MRM_STATE_CHK_RC ? + BR_MRP_RING_STATE_CLOSED : BR_MRP_RING_STATE_OPEN; +} + +/* Allocates MRP frame and set head part of the frames. 
This is the ethernet + * and the MRP version + */ +static struct sk_buff *br_mrp_skb_alloc(struct net_bridge_port *p, + const u8 *src, const u8 *dst) +{ + struct ethhdr *eth_hdr; + struct sk_buff *skb; + u16 *version; + + skb = dev_alloc_skb(MRP_MAX_FRAME_LENGTH); + if (!skb) + return NULL; + + skb->dev = p->dev; + skb->protocol = htons(ETH_P_MRP); + skb->priority = MRP_FRAME_PRIO; + skb_reserve(skb, sizeof(*eth_hdr)); + + eth_hdr = skb_push(skb, sizeof(*eth_hdr)); + ether_addr_copy(eth_hdr->h_dest, dst); + ether_addr_copy(eth_hdr->h_source, src); + eth_hdr->h_proto = htons(ETH_P_MRP); + + version = skb_put(skb, sizeof(*version)); + *version = cpu_to_be16(MRP_VERSION); + + return skb; +} + +static void br_mrp_skb_tlv(struct sk_buff *skb, + enum br_mrp_tlv_header_type type, + u8 length) +{ + struct br_mrp_tlv_hdr *hdr; + + hdr = skb_put(skb, sizeof(*hdr)); + hdr->type = type; + hdr->length = length; +} + +static void br_mrp_skb_common(struct sk_buff *skb, + struct net_bridge_port *p) +{ + struct br_mrp_common_hdr *hdr; + + br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_COMMON, sizeof(*hdr)); + + hdr = skb_put(skb, sizeof(*hdr)); + hdr->seq_id = cpu_to_be16(br_mrp_next_seq(p->mrp_port->mrp)); + memcpy(hdr->domain, p->mrp_port->mrp->domain, MRP_DOMAIN_UUID_LENGTH); +} + +/* Compose MRP_Test frame and forward the frame to the port p. 
+ * The MRP_Test frame has the following format: + * MRP_Version, MRP_TLVHeader, MRP_Prio, MRP_SA, MRP_PortRole, MRP_RingState, + * MRP_Transitions, MRP_Timestamping, MRP_Common + */ +static void br_mrp_send_ring_test(struct net_bridge_port *p) +{ + struct br_mrp_ring_test_hdr *hdr = NULL; + struct br_mrp *mrp = p->mrp_port->mrp; + struct sk_buff *skb = NULL; + + skb = br_mrp_skb_alloc(p, p->dev->dev_addr, mrp_test_dmac); + if (!skb) + return; + + br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_RING_TEST, sizeof(*hdr)); + hdr = skb_put(skb, sizeof(*hdr)); + + hdr->prio = cpu_to_be16(mrp->prio); + ether_addr_copy(hdr->sa, p->br->dev->dev_addr); + hdr->port_role = cpu_to_be16(p->mrp_port->role); + hdr->state = cpu_to_be16(br_mrp_ring_state(mrp)); + hdr->transitions = cpu_to_be16(mrp->ring_transitions); + hdr->timestamp = cpu_to_be32(jiffies_to_msecs(jiffies)); + + br_mrp_skb_common(skb, p); + br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_END, 0x0); + + NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_OUT, + dev_net(p->dev), NULL, skb, NULL, skb->dev, + br_mrp_send_finish); +} + +/* Send MRP_Test frames on both MRP ports and start a timer to send continuously + * frames with specific interval + */ +void br_mrp_ring_test_req(struct br_mrp *mrp, u32 interval) +{ + br_mrp_send_ring_test(mrp->p_port); + br_mrp_send_ring_test(mrp->s_port); + + br_mrp_ring_test_start(mrp, interval); +} + +/* Compose MRP_TopologyChange frame and forward the frame to the port p. 
+ * The MRP_TopologyChange frame has the following format: + * MRP_Version, MRP_TLVHeader, MRP_Prio, MRP_SA, MRP_Interval + */ +static void br_mrp_send_ring_topo(struct net_bridge_port *p, u32 interval) +{ + struct br_mrp_ring_topo_hdr *hdr = NULL; + struct br_mrp *mrp = p->mrp_port->mrp; + struct sk_buff *skb = NULL; + + skb = br_mrp_skb_alloc(p, p->dev->dev_addr, mrp_control_dmac); + if (!skb) + return; + + br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_RING_TOPO, sizeof(*hdr)); + hdr = skb_put(skb, sizeof(*hdr)); + + hdr->prio = cpu_to_be16(mrp->prio); + ether_addr_copy(hdr->sa, p->br->dev->dev_addr); + hdr->interval = interval == 0 ? 0 : cpu_to_be16(interval / 1000); + + br_mrp_skb_common(skb, p); + br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_END, 0x0); + + NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_OUT, + dev_net(p->dev), NULL, skb, NULL, skb->dev, + br_mrp_send_finish); +} + +/* Send MRP_TopologyChange frames on both MRP ports and start a timer to send + * continuously frames with specific interval. If the interval is 0, then the + * FDB needs to be clear, meaning that there was a change in the topology of the + * network. + */ +void br_mrp_ring_topo_req(struct br_mrp *mrp, u32 time) +{ + br_debug(mrp->br, "topo_req: %d\n", time); + + br_mrp_send_ring_topo(mrp->p_port, time * mrp->ring_topo_conf_max); + br_mrp_send_ring_topo(mrp->s_port, time * mrp->ring_topo_conf_max); + + if (!time) { + br_fdb_flush(mrp->br); + } else { + u32 delay = mrp->ring_topo_conf_interval; + + br_mrp_ring_topo_start(mrp, delay); + } +} + +/* Compose MRP_LinkChange frame and forward the frame to the port p. 
+ * The MRP_LinkChange frame has the following format: + * MRP_Version, MRP_TLVHeader, MRP_SA, MRP_PortRole, MRP_Interval, MRP_Blocked + */ +static void br_mrp_send_ring_link(struct net_bridge_port *p, bool up, + u32 interval) +{ + struct br_mrp_ring_link_hdr *hdr = NULL; + struct br_mrp *mrp = p->mrp_port->mrp; + struct sk_buff *skb = NULL; + + skb = br_mrp_skb_alloc(p, p->dev->dev_addr, mrp_control_dmac); + if (!skb) + return; + + br_mrp_skb_tlv(skb, up ? BR_MRP_TLV_HEADER_RING_LINK_UP : + BR_MRP_TLV_HEADER_RING_LINK_DOWN, + sizeof(*hdr)); + hdr = skb_put(skb, sizeof(*hdr)); + + ether_addr_copy(hdr->sa, p->br->dev->dev_addr); + hdr->port_role = cpu_to_be16(p->mrp_port->role); + hdr->interval = interval == 0 ? 0 : cpu_to_be16(interval / 1000); + hdr->blocked = cpu_to_be16(mrp->blocked); + + br_mrp_skb_common(skb, p); + br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_END, 0x0); + + NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_OUT, + dev_net(p->dev), NULL, skb, NULL, skb->dev, + br_mrp_send_finish); +} + +/* Send MRP_LinkChange frames on one of MRP ports */ +void br_mrp_ring_link_req(struct net_bridge_port *p, bool up, u32 interval) +{ + br_debug(p->br, "link_req up: %d interval: %d\n", up, interval); + + br_mrp_send_ring_link(p, up, interval); +} + +/* Returns the MRP_TLVHeader */ +static enum br_mrp_tlv_header_type +br_mrp_get_tlv_hdr(const struct sk_buff *skb) +{ + struct br_mrp_tlv_hdr *hdr; + + /* First 2 bytes in each MRP frame is the version and after that + * is the tlv header, therefor skip the version + */ + hdr = (struct br_mrp_tlv_hdr *)(skb->data + sizeof(u16)); + return hdr->type; +} + +/* Represents the state machine for when a MRP_Test frame was received on one + * of the MRP ports and the MRP instance has the role MRM. When MRP instance has + * the role MRC, it doesn't need to process MRP_Test frames. 
+ */ +static void br_mrp_mrm_recv_ring_test(struct br_mrp *mrp) +{ + u32 topo_interval = mrp->ring_topo_conf_interval; + + switch (mrp->mrm_state) { + case BR_MRP_MRM_STATE_AC_STAT1: + /* Ignore */ + break; + case BR_MRP_MRM_STATE_PRM_UP: + mrp->ring_test_curr_max = mrp->ring_test_conf_max - 1; + mrp->ring_test_curr = 0; + mrp->no_tc = false; + + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_CHK_RC); + break; + case BR_MRP_MRM_STATE_CHK_RO: + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + + mrp->ring_test_curr_max = mrp->ring_test_conf_max - 1; + mrp->ring_test_curr = 0; + mrp->no_tc = false; + + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + + topo_interval = !mrp->react_on_link_change ? 0 : topo_interval; + br_mrp_ring_topo_req(mrp, topo_interval); + + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_CHK_RC); + break; + case BR_MRP_MRM_STATE_CHK_RC: + mrp->ring_test_curr_max = mrp->ring_test_conf_max - 1; + mrp->ring_test_curr = 0; + mrp->no_tc = false; + + break; + } +} + +static void br_mrp_recv_ring_test(struct net_bridge_port *port, + struct sk_buff *skb) +{ + struct br_mrp *mrp = port->mrp_port->mrp; + struct br_mrp_ring_test_hdr *hdr; + + /* remove MRP version, tlv and get test header */ + hdr = skb_pull(skb, sizeof(struct br_mrp_tlv_hdr) + sizeof(u16)); + if (!hdr) + return; + + /* If the MRP_Test frames was not send by this instance, then don't + * process it. + */ + if (!ether_addr_equal(hdr->sa, port->br->dev->dev_addr)) + return; + + br_mrp_mrm_recv_ring_test(mrp); +} + +/* Represents the state machine for when a MRP_TopologyChange frame was + * received on one of the MRP ports and the MRP instance has the role MRC. When + * MRP instance has the role MRM it doesn't need to process the frame. 
+ */ +static void br_mrp_recv_ring_topo(struct net_bridge_port *port, + struct sk_buff *skb) +{ + struct br_mrp *mrp = port->mrp_port->mrp; + struct br_mrp_ring_topo_hdr *hdr; + + br_debug(mrp->br, "recv ring_topo, mrc state: %s\n", + br_mrp_get_mrc_state(mrp->mrc_state)); + + /* remove MRP version, tlv and get ring topo header */ + hdr = skb_pull(skb, sizeof(struct br_mrp_tlv_hdr) + sizeof(u16)); + if (!hdr) + return; + + switch (mrp->mrc_state) { + case BR_MRP_MRC_STATE_AC_STAT1: + /* Ignore */ + break; + case BR_MRP_MRC_STATE_DE_IDLE: + br_mrp_clear_fdb_start(mrp, ntohs(hdr->interval)); + break; + case BR_MRP_MRC_STATE_PT: + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + br_mrp_ring_link_up_stop(mrp); + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + br_mrp_clear_fdb_start(mrp, ntohs(hdr->interval)); + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_PT_IDLE); + break; + case BR_MRP_MRC_STATE_DE: + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + br_mrp_ring_link_down_stop(mrp); + br_mrp_clear_fdb_start(mrp, ntohs(hdr->interval)); + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_DE_IDLE); + break; + case BR_MRP_MRC_STATE_PT_IDLE: + br_mrp_clear_fdb_start(mrp, ntohs(hdr->interval)); + break; + } +} + +/* Represents the state machine for when a MRP_LinkChange frame was + * received on one of the MRP ports and the MRP instance has the role MRM. When + * MRP instance has the role MRC it doesn't need to process the frame. 
+ */ +static void br_mrp_recv_ring_link(struct net_bridge_port *port, + struct sk_buff *skb) +{ + struct br_mrp *mrp = port->mrp_port->mrp; + enum br_mrp_tlv_header_type type; + struct br_mrp_tlv_hdr *tlv; + + br_debug(mrp->br, "recv ring_link, mrm state: %s\n", + br_mrp_get_mrm_state(mrp->mrm_state)); + + /* remove MRP version to get the tlv */ + tlv = skb_pull(skb, sizeof(u16)); + if (!tlv) + return; + + type = tlv->type; + + switch (mrp->mrm_state) { + case BR_MRP_MRM_STATE_AC_STAT1: + /* Ignore */ + break; + case BR_MRP_MRM_STATE_PRM_UP: + if (mrp->blocked) { + if (mrp->add_test) + break; + mrp->add_test = true; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + } else { + if (type == BR_MRP_TLV_HEADER_RING_LINK_DOWN) + break; + + if (!mrp->add_test) { + mrp->add_test = true; + br_mrp_ring_test_req(mrp, + mrp->ring_test_conf_short); + } + br_mrp_ring_topo_req(mrp, 0); + } + break; + case BR_MRP_MRM_STATE_CHK_RO: + if (!mrp->add_test && + type == BR_MRP_TLV_HEADER_RING_LINK_UP && + mrp->blocked) { + mrp->add_test = true; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_short); + break; + } + + if (mrp->add_test && type == BR_MRP_TLV_HEADER_RING_LINK_UP && + mrp->blocked) + break; + + if (mrp->add_test && type == BR_MRP_TLV_HEADER_RING_LINK_DOWN) + break; + + if (!mrp->add_test && + type == BR_MRP_TLV_HEADER_RING_LINK_DOWN) { + mrp->add_test = true; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_short); + break; + } + + if (type == BR_MRP_TLV_HEADER_RING_LINK_UP && !mrp->blocked) { + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + mrp->ring_test_curr_max = mrp->ring_test_conf_max - 1; + mrp->ring_test_curr = 0; + + if (!mrp->add_test) { + br_mrp_ring_test_req(mrp, + mrp->ring_test_conf_short); + mrp->add_test = true; + } else { + br_mrp_ring_test_req(mrp, + mrp->ring_test_conf_interval); + } + + br_mrp_ring_topo_req(mrp, 0); + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_CHK_RC); + break; + } + break; + case BR_MRP_MRM_STATE_CHK_RC: + if 
(mrp->add_test && !mrp->react_on_link_change && + mrp->blocked) + break; + + if (!mrp->add_test && !mrp->react_on_link_change && + mrp->blocked) { + mrp->add_test = true; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_short); + break; + } + + if (type == BR_MRP_TLV_HEADER_RING_LINK_DOWN && + mrp->react_on_link_change) { + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + mrp->ring_transitions++; + br_mrp_ring_topo_req(mrp, 0); + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_CHK_RO); + break; + } + + if (type == BR_MRP_TLV_HEADER_RING_LINK_UP && + mrp->react_on_link_change && !mrp->blocked) { + mrp->ring_test_curr_max = mrp->ring_test_conf_max - 1; + br_mrp_ring_topo_req(mrp, 0); + } + + if (type == BR_MRP_TLV_HEADER_RING_LINK_UP && + mrp->react_on_link_change && mrp->blocked) { + mrp->ring_test_curr_max = mrp->ring_test_conf_max - 1; + br_mrp_ring_topo_req(mrp, 0); + } + break; + } +} + +/* Check if the MRP frame needs to be dropped */ +static bool br_mrp_should_drop(const struct net_bridge_port *p, + const struct sk_buff *skb, + enum br_mrp_tlv_header_type type) +{ + /* All frames should be dropped if the state of the port is disabled */ + if (p->mrp_port->state == BR_MRP_PORT_STATE_DISABLED) + return true; + + /* If receiving a MRP frame on a port which is not in MRP ring + * then the frame should be drop + */ + if (!br_mrp_is_mrp_port(p)) + return true; + + /* In case the port is in blocked state then the function + * br_handle_frame will drop all NON-MRP frames and it would send all + * MRP frames to the upper layer. So here is needed to drop MRP frames + * if the port is in blocked state. + */ + if (br_mrp_is_ring_port(p) && p->state == BR_MRP_PORT_STATE_BLOCKED && + type != BR_MRP_TLV_HEADER_RING_TOPO && + type != BR_MRP_TLV_HEADER_RING_TEST && + type != BR_MRP_TLV_HEADER_RING_LINK_UP && + type != BR_MRP_TLV_HEADER_RING_LINK_DOWN) { + return true; + } + + return false; +} + +/* Check if the MRP frame needs to be process. 
It depends of the MRP instance + * role and the frame type if the frame needs to be processed or not. + */ +static bool br_mrp_should_process(const struct net_bridge_port *p, + const struct sk_buff *skb, + enum br_mrp_tlv_header_type type) +{ + struct br_mrp *mrp = p->mrp_port->mrp; + + switch (type) { + case BR_MRP_TLV_HEADER_RING_TEST: + case BR_MRP_TLV_HEADER_RING_LINK_DOWN: + case BR_MRP_TLV_HEADER_RING_LINK_UP: + if (mrp->ring_role == BR_MRP_RING_ROLE_MRM) + return true; + break; + case BR_MRP_TLV_HEADER_RING_TOPO: + if (mrp->ring_role == BR_MRP_RING_ROLE_MRC) + return true; + break; + default: + break; + } + + return false; +} + +static void br_mrp_process(struct net_bridge_port *p, struct sk_buff *skb, + enum br_mrp_tlv_header_type type) +{ + switch (type) { + case BR_MRP_TLV_HEADER_RING_TEST: + br_mrp_recv_ring_test(p, skb); + break; + case BR_MRP_TLV_HEADER_RING_TOPO: + br_mrp_recv_ring_topo(p, skb); + break; + case BR_MRP_TLV_HEADER_RING_LINK_DOWN: + case BR_MRP_TLV_HEADER_RING_LINK_UP: + br_mrp_recv_ring_link(p, skb); + break; + default: + WARN(1, "Unknown type: %d\n", type); + } +} + +/* Check if the MRP frame needs to be forward to the other ports */ +static bool br_mrp_should_forward(const struct net_bridge_port *p, + struct sk_buff *skb, + enum br_mrp_tlv_header_type type) +{ + struct br_mrp *mrp = p->mrp_port->mrp; + + if (p->mrp_port->role == BR_MRP_PORT_ROLE_NONE) + return true; + + switch (type) { + case BR_MRP_TLV_HEADER_RING_TEST: + case BR_MRP_TLV_HEADER_RING_TOPO: + case BR_MRP_TLV_HEADER_RING_LINK_DOWN: + case BR_MRP_TLV_HEADER_RING_LINK_UP: + if (mrp->ring_role == BR_MRP_RING_ROLE_MRC) + return true; + break; + default: + /* All unknown frames types will not be processed */ + break; + } + + return false; +} + +/* Forward the frame to the port to */ +static void br_mrp_forward_to_port(struct net_bridge_port *to, + struct sk_buff *skb) +{ + skb->dev = to->dev; + skb = skb_clone(skb, GFP_ATOMIC); + + NF_HOOK(NFPROTO_BRIDGE, NF_BR_FORWARD, + 
dev_net(to->dev), NULL, skb, NULL, skb->dev, + br_mrp_send_finish); +} + +/* Recreate the data layer part of the frame and send forward the frame to + * port p + */ +static void br_mrp_forward_to_dst(struct sk_buff *skb, + struct net_bridge_port *p) +{ + skb_push(skb, sizeof(struct ethhdr)); + + br_mrp_forward_to_port(p, skb); +} + +/* All received MRP frames are added to a list of skbs and this function + * pops the frame and process them. It decides if the MRP instance needs to + * process it, forward it or dropp it + */ +static void br_mrp_process_skbs(struct work_struct *work) +{ + struct br_mrp *mrp = container_of(work, struct br_mrp, skbs_work); + struct sk_buff *skb; + + while ((skb = skb_dequeue(&mrp->skbs)) != NULL) { + struct net_bridge_port *port = br_port_get_rtnl(skb->dev); + enum br_mrp_tlv_header_type type; + + type = br_mrp_get_tlv_hdr(skb); + + mutex_lock(&mrp->lock); + + if (br_mrp_should_process(port, skb, type)) { + struct sk_buff *nskb; + + /* Because there are cases when a frame needs to be + * proccesed and also forward, it is required to clone + * the frame for processing not to alter the original + * one. 
+ */ + nskb = skb_clone(skb, GFP_KERNEL); + if (!nskb) + goto next_skb; + + br_mrp_process(port, nskb, type); + dev_kfree_skb_any(nskb); + } + + if (br_mrp_should_forward(port, skb, type)) { + if (port == mrp->p_port) + br_mrp_forward_to_dst(skb, mrp->s_port); + if (port == mrp->s_port) + br_mrp_forward_to_dst(skb, mrp->p_port); + } + +next_skb: + mutex_unlock(&mrp->lock); + + dev_kfree_skb_any(skb); + } +} + +/* Receives all MRP frames and add them in a queue to be processed */ +int br_mrp_recv(struct sk_buff *skb, struct net_device *dev, + struct packet_type *pt, struct net_device *orig_dev) +{ + enum br_mrp_tlv_header_type type; + struct net_bridge_port *port; + struct br_mrp *mrp; + + port = br_port_get_rtnl(dev); + if (!port) + goto out; + + type = br_mrp_get_tlv_hdr(skb); + + if (br_mrp_should_drop(port, skb, type)) + goto out; + + skb->dev = dev; + mrp = port->mrp_port->mrp; + + skb_queue_tail(&mrp->skbs, skb); + schedule_work(&mrp->skbs_work); + + return 0; +out: + kfree_skb(skb); + return 0; +} + +/* Represents the state machine for when MRP instance has the role MRM and the + * link of one of the MRP ports is changed. 
+ */ +static void br_mrp_mrm_port_link(struct net_bridge_port *p, bool up) +{ + struct br_mrp *mrp = p->mrp_port->mrp; + u32 topo_interval = mrp->ring_topo_conf_interval; + + br_debug(mrp->br, "port: %s, up: %d, mrm_state: %s\n", + p->dev->name, up, br_mrp_get_mrm_state(mrp->mrm_state)); + + switch (mrp->mrm_state) { + case BR_MRP_MRM_STATE_AC_STAT1: + if (up && p == mrp->p_port) { + mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_PRM_UP); + } + if (up && p != mrp->p_port) { + mrp->s_port = mrp->p_port; + mrp->p_port = p; + mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_PRM_UP); + } + break; + case BR_MRP_MRM_STATE_PRM_UP: + if (!up && p == mrp->p_port) { + br_mrp_ring_test_stop(mrp); + mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_AC_STAT1); + } + if (up && p != mrp->p_port) { + mrp->ring_test_curr_max = mrp->ring_test_conf_max - 1; + mrp->ring_test_curr = 0; + mrp->no_tc = true; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_CHK_RC); + } + break; + case BR_MRP_MRM_STATE_CHK_RO: + if (!up && p == mrp->p_port) { + mrp->s_port = mrp->p_port; + mrp->p_port = p; + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + br_mrp_ring_topo_req(mrp, topo_interval); + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_PRM_UP); + break; + } + if (!up && p != mrp->p_port) { + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_PRM_UP); + } + break; + case BR_MRP_MRM_STATE_CHK_RC: + if (!up && p == mrp->p_port) { + mrp->p_port = mrp->s_port; + mrp->s_port = p; + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + 
mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + br_mrp_ring_topo_req(mrp, topo_interval); + mrp->ring_transitions++; + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_PRM_UP); + break; + } + if (!up && p != mrp->p_port) { + mrp->ring_transitions++; + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_PRM_UP); + break; + } + + break; + } + + br_debug(mrp->br, "new mrm_state: %s\n", + br_mrp_get_mrm_state(mrp->mrm_state)); +} + +/* Represents the state machine for when MRP instance has the role MRC and the + * link of one of the MRP ports is changed. + */ +static void br_mrp_mrc_port_link(struct net_bridge_port *p, bool up) +{ + struct br_mrp *mrp = p->mrp_port->mrp; + + br_debug(mrp->br, "port: %s, up: %d, mrc_state: %s\n", + p->dev->name, up, br_mrp_get_mrc_state(mrp->mrc_state)); + + switch (mrp->mrc_state) { + case BR_MRP_MRC_STATE_AC_STAT1: + if (up && p == mrp->p_port) { + mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_DE_IDLE); + } + if (up && p != mrp->p_port) { + mrp->s_port = mrp->p_port; + mrp->p_port = p; + mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_DE_IDLE); + } + break; + case BR_MRP_MRC_STATE_DE_IDLE: + if (up && p != mrp->p_port) { + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + br_mrp_ring_link_up_start(mrp, + mrp->ring_link_conf_interval); + br_mrp_ring_link_req(mrp->p_port, up, + mrp->ring_link_curr_max * + mrp->ring_link_conf_interval); + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_PT); + } + if (!up && p == mrp->p_port) { + mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_AC_STAT1); + } + break; + case BR_MRP_MRC_STATE_PT: + if (!up && p != mrp->p_port) { + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + br_mrp_ring_link_up_stop(mrp); + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + 
br_mrp_ring_link_down_start(mrp, + mrp->ring_link_conf_interval); + br_mrp_ring_link_req(mrp->p_port, up, + mrp->ring_link_curr_max * + mrp->ring_link_conf_interval); + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_DE); + break; + } + if (!up && p == mrp->p_port) { + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + br_mrp_ring_link_up_stop(mrp); + mrp->p_port = mrp->s_port; + mrp->s_port = p; + mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + br_mrp_ring_link_down_start(mrp, + mrp->ring_link_conf_interval); + br_mrp_ring_link_req(mrp->p_port, up, + mrp->ring_link_curr_max * + mrp->ring_link_conf_interval); + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_DE); + } + break; + case BR_MRP_MRC_STATE_DE: + if (up && p != mrp->p_port) { + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + br_mrp_ring_link_down_stop(mrp); + br_mrp_ring_link_up_start(mrp, + mrp->ring_link_conf_interval); + br_mrp_ring_link_req(mrp->p_port, up, + mrp->ring_link_curr_max * + mrp->ring_link_conf_interval); + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_PT); + } + if (!up && p == mrp->p_port) { + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + mrp->p_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + br_mrp_ring_link_down_stop(mrp); + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_AC_STAT1); + } + break; + case BR_MRP_MRC_STATE_PT_IDLE: + if (!up && p != mrp->p_port) { + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + br_mrp_ring_link_down_start(mrp, + mrp->ring_link_conf_interval); + br_mrp_ring_link_req(mrp->p_port, up, + mrp->ring_link_curr_max * + mrp->ring_link_conf_interval); + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_DE); + } + if (!up && p == mrp->p_port) { + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + mrp->p_port = mrp->s_port; + mrp->s_port = p; + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_BLOCKED; + 
br_mrp_ring_link_down_start(mrp, + mrp->ring_link_conf_interval); + br_mrp_ring_link_req(mrp->p_port, up, + mrp->ring_link_curr_max * + mrp->ring_link_conf_interval); + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_DE); + } + break; + } + + br_debug(mrp->br, "new mrc_state: %s\n", + br_mrp_get_mrc_state(mrp->mrc_state)); +} + +/* Whenever the port link changes, this function is called */ +void br_mrp_port_link_change(struct net_bridge_port *p, bool up) +{ + struct br_mrp *mrp; + + /* If the port which changed its status is not a ring port then + * nothing to do + */ + if (!br_mrp_is_mrp_port(p)) + return; + + mrp = p->mrp_port->mrp; + + if (mrp->ring_role == BR_MRP_RING_ROLE_MRM) + return br_mrp_mrm_port_link(p, up); + + if (mrp->ring_role == BR_MRP_RING_ROLE_MRC) + return br_mrp_mrc_port_link(p, up); +} + +/* There are 4 different recovery times in which an MRP ring can recover. Update + * all the configuration variables based on the selected one. The intervals are + * in microseconds; the timer helpers pass them to usecs_to_jiffies(). + */ +static void br_mrp_update_recovery(struct br_mrp *mrp, + enum br_mrp_ring_recovery_type ring_recv) +{ + switch (ring_recv) { + case BR_MRP_RING_RECOVERY_500: + mrp->ring_topo_conf_interval = 20 * 1000; + mrp->ring_topo_conf_max = 3; + mrp->ring_topo_curr_max = mrp->ring_topo_conf_max - 1; + mrp->ring_test_conf_short = 30 * 1000; + mrp->ring_test_conf_interval = 50 * 1000; + mrp->ring_test_conf_max = 5; + mrp->ring_test_curr_max = mrp->ring_test_conf_max; + mrp->ring_test_conf_ext_max = 15; + mrp->ring_link_conf_interval = 100 * 1000; + mrp->ring_link_conf_max = 4; + mrp->ring_link_curr_max = 0; + break; + case BR_MRP_RING_RECOVERY_200: + mrp->ring_topo_conf_interval = 10 * 1000; + mrp->ring_topo_conf_max = 3; + mrp->ring_topo_curr_max = mrp->ring_topo_conf_max - 1; + mrp->ring_test_conf_short = 10 * 1000; + mrp->ring_test_conf_interval = 20 * 1000; + mrp->ring_test_conf_max = 3; + mrp->ring_test_curr_max = mrp->ring_test_conf_max; + mrp->ring_test_conf_ext_max = 15; + 
mrp->ring_link_conf_interval = 20 * 1000; + mrp->ring_link_conf_max = 4; + mrp->ring_link_curr_max = 0; + break; + case BR_MRP_RING_RECOVERY_30: + mrp->ring_topo_conf_interval = 500; + mrp->ring_topo_conf_max = 3; + mrp->ring_topo_curr_max = mrp->ring_topo_conf_max - 1; + mrp->ring_test_conf_short = 1 * 1000; + mrp->ring_test_conf_interval = 3500; + mrp->ring_test_conf_max = 3; + mrp->ring_test_curr_max = mrp->ring_test_conf_max; + mrp->ring_test_conf_ext_max = 15; + mrp->ring_link_conf_interval = 1; + mrp->ring_link_conf_max = 4; + mrp->ring_link_curr_max = 0; + break; + case BR_MRP_RING_RECOVERY_10: + mrp->ring_topo_conf_interval = 500; + mrp->ring_topo_conf_max = 3; + mrp->ring_topo_curr_max = mrp->ring_topo_conf_max - 1; + mrp->ring_test_conf_short = 500; + mrp->ring_test_conf_interval = 1000; + mrp->ring_test_conf_max = 3; + mrp->ring_test_curr_max = mrp->ring_test_conf_max; + mrp->ring_test_conf_ext_max = 15; + mrp->ring_link_conf_interval = 1; + mrp->ring_link_conf_max = 4; + mrp->ring_link_curr_max = 0; + break; + default: + break; + } +} + +bool br_mrp_allow_egress(const struct net_bridge_port *p, + const struct sk_buff *skb) +{ + /* TODO - Here can be a race condition. While this function is called + * it is possible to delete/add MRP instances, because this code is not + * protected by rtnl_lock. 
This needs to be fixed somehow + */ + return (!p->mrp_port || + (p->mrp_port && + p->mrp_port->state == BR_MRP_PORT_STATE_FORWARDING)); +} + +struct br_mrp *br_mrp_find(struct net_bridge *br, u32 ring_nr) +{ + struct br_mrp *mrp; + + list_for_each_entry(mrp, &br->mrp_list, list) { + if (mrp->ring_nr == ring_nr) + return mrp; + } + + return NULL; +} + +/* Creates an MRP instance and initializes it */ +static int br_mrp_create(struct net_bridge *br, u32 ring_nr) +{ + struct br_mrp *mrp; + + mrp = devm_kzalloc(&br->dev->dev, sizeof(struct br_mrp), GFP_KERNEL); + if (!mrp) + return -ENOMEM; + + mutex_init(&mrp->lock); + + INIT_WORK(&mrp->skbs_work, br_mrp_process_skbs); + skb_queue_head_init(&mrp->skbs); + + mrp->br = br; + mrp->p_port = NULL; + mrp->s_port = NULL; + mrp->ring_nr = ring_nr; + + mrp->ring_role = BR_MRP_RING_ROLE_MRC; + mrp->ring_transitions = 0; + + mrp->seq_id = 0; + mrp->prio = MRP_DEFAULT_PRIO; + memset(mrp->domain, 0xFF, MRP_DOMAIN_UUID_LENGTH); + + br_mrp_update_recovery(mrp, BR_MRP_RING_RECOVERY_200); + + mrp->blocked = 1; + mrp->react_on_link_change = 1; + + br_mrp_timer_init(mrp); + + list_add_tail(&mrp->list, &br->mrp_list); + + return 0; +} + +/* Uninitialize MRP instance and remove it */ +static void br_mrp_destroy(struct net_bridge *br, u32 ring_nr) +{ + struct br_mrp *mrp = br_mrp_find(br, ring_nr); + + if (!mrp) + return; + + mutex_lock(&mrp->lock); + + br_mrp_reset_ring_state(mrp); + + cancel_work_sync(&mrp->skbs_work); + skb_queue_purge(&mrp->skbs); + + mrp->ring_role = BR_MRP_RING_ROLE_DISABLED; + mrp->p_port = NULL; + mrp->s_port = NULL; + mrp->br = NULL; + + mutex_unlock(&mrp->lock); + + list_del(&mrp->list); + devm_kfree(&br->dev->dev, mrp); +} + +void br_mrp_uninit(struct net_bridge *br) +{ + struct br_mrp *mrp, *tmp; + + /* The MRP ports are already uninitialized, therefore only + * destroy the MRP instances. 
+ */ + list_for_each_entry_safe(mrp, tmp, &br->mrp_list, list) { + br_mrp_destroy(br, mrp->ring_nr); + } +} + +/* Initialize an MRP port */ +static int br_mrp_port_init(struct net_bridge_port *port, struct br_mrp *mrp, + enum br_mrp_port_role_type role) +{ + /* When a port is initialized, stop all timers and disable the states. + * The reason is that, it should not be possible to change the ports + * while MRP is running. Therefore after setting a port it is required + * to set again the role(MRM or MRC) + */ + br_mrp_reset_ring_state(mrp); + mrp->ring_role = BR_MRP_RING_ROLE_DISABLED; + + if (!port->mrp_port) { + port->mrp_port = devm_kzalloc(&port->br->dev->dev, + sizeof(struct br_mrp_port), + GFP_KERNEL); + if (!port->mrp_port) + return -ENOMEM; + } + + port->mrp_port->mrp = mrp; + port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + port->mrp_port->role = role; + + if (role == BR_MRP_PORT_ROLE_PRIMARY) + mrp->p_port = port; + if (role == BR_MRP_PORT_ROLE_SECONDARY) + mrp->s_port = port; + + return 0; +} + +/* Uninitialize MRP port */ +void br_mrp_port_uninit(struct net_bridge_port *port) +{ + struct br_mrp *mrp; + + if (!port->mrp_port || !port->mrp_port->mrp) + return; + + mrp = port->mrp_port->mrp; + + mutex_lock(&mrp->lock); + + br_mrp_reset_ring_state(mrp); + mrp->ring_role = BR_MRP_RING_ROLE_DISABLED; + + if (port->mrp_port->role == BR_MRP_PORT_ROLE_PRIMARY) + mrp->p_port = NULL; + if (port->mrp_port->role == BR_MRP_PORT_ROLE_SECONDARY) + mrp->s_port = NULL; + + port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + port->mrp_port->role = BR_MRP_PORT_ROLE_NONE; + port->mrp_port->mrp = NULL; + + devm_kfree(&port->br->dev->dev, port->mrp_port); + port->mrp_port = NULL; + + mutex_unlock(&mrp->lock); +} diff --git a/net/bridge/br_mrp_timer.c b/net/bridge/br_mrp_timer.c new file mode 100644 index 000000000000..59aa8c05724f --- /dev/null +++ b/net/bridge/br_mrp_timer.c @@ -0,0 +1,227 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* Copyright (c) 2020 Microchip 
Corporation */ + +#include "br_private_mrp.h" + +static void br_mrp_ring_open(struct br_mrp *mrp) +{ + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + + mrp->ring_test_curr_max = mrp->ring_test_conf_max - 1; + mrp->ring_test_curr = 0; + + mrp->add_test = false; + + if (!mrp->no_tc) + br_mrp_ring_topo_req(mrp, mrp->ring_topo_conf_interval); + + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + + mrp->ring_transitions++; + br_mrp_set_mrm_state(mrp, BR_MRP_MRM_STATE_CHK_RO); +} + +static void br_mrp_clear_fdb_expired(struct work_struct *work) +{ + struct delayed_work *del_work = to_delayed_work(work); + struct br_mrp *mrp = container_of(del_work, struct br_mrp, + clear_fdb_work); + + br_fdb_flush(mrp->br); +} + +static void br_mrp_ring_test_expired(struct work_struct *work) +{ + struct delayed_work *del_work = to_delayed_work(work); + struct br_mrp *mrp = container_of(del_work, struct br_mrp, + ring_test_work); + + mutex_lock(&mrp->lock); + + if (mrp->mrm_state == BR_MRP_MRM_STATE_AC_STAT1) + goto out; + + if (mrp->mrm_state == BR_MRP_MRM_STATE_CHK_RO || + mrp->mrm_state == BR_MRP_MRM_STATE_PRM_UP) { + mrp->add_test = false; + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + goto out; + } + + if (mrp->ring_test_curr < mrp->ring_test_curr_max) { + mrp->ring_test_curr++; + + mrp->add_test = false; + + br_mrp_ring_test_req(mrp, mrp->ring_test_conf_interval); + } else { + br_mrp_ring_open(mrp); + } + +out: + mutex_unlock(&mrp->lock); +} + +static void br_mrp_ring_topo_expired(struct work_struct *work) +{ + struct delayed_work *del_work = to_delayed_work(work); + struct br_mrp *mrp = container_of(del_work, struct br_mrp, + ring_topo_work); + + br_debug(mrp->br, "ring topo expired: ring_topo_curr_max: %d\n", + mrp->ring_topo_curr_max); + + mutex_lock(&mrp->lock); + + if (mrp->ring_topo_curr_max > 0) { + mrp->ring_topo_curr_max--; + + br_mrp_ring_topo_req(mrp, mrp->ring_topo_curr_max); + } else { + mrp->ring_topo_curr_max = 
mrp->ring_topo_conf_max - 1; + br_mrp_ring_topo_req(mrp, 0); + } + + mutex_unlock(&mrp->lock); +} + +static void br_mrp_ring_link_up_expired(struct work_struct *work) +{ + struct delayed_work *del_work = to_delayed_work(work); + struct br_mrp *mrp = container_of(del_work, struct br_mrp, + ring_link_up_work); + u32 interval; + u32 delay; + + br_debug(mrp->br, "ring link up expired: ring_link_curr_max: %d\n", + mrp->ring_link_curr_max); + + mutex_lock(&mrp->lock); + + delay = mrp->ring_link_conf_interval; + + if (mrp->ring_link_curr_max > 0) { + mrp->ring_link_curr_max--; + + br_mrp_ring_link_up_start(mrp, delay); + + interval = mrp->ring_link_curr_max * delay; + + br_mrp_ring_link_req(mrp->p_port, true, interval); + } else { + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + mrp->s_port->mrp_port->state = BR_MRP_PORT_STATE_FORWARDING; + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_PT_IDLE); + } + + mutex_unlock(&mrp->lock); +} + +static void br_mrp_ring_link_down_expired(struct work_struct *work) +{ + struct delayed_work *del_work = to_delayed_work(work); + struct br_mrp *mrp = container_of(del_work, struct br_mrp, + ring_link_down_work); + u32 interval; + u32 delay; + + br_debug(mrp->br, "ring link down expired: ring_link_curr_max: %d\n", + mrp->ring_link_curr_max); + + mutex_lock(&mrp->lock); + + delay = mrp->ring_link_conf_interval; + + if (mrp->ring_link_curr_max > 0) { + mrp->ring_link_curr_max--; + + br_mrp_ring_link_down_start(mrp, delay); + + interval = mrp->ring_link_curr_max * delay; + + br_mrp_ring_link_req(mrp->p_port, false, interval); + } else { + mrp->ring_link_curr_max = mrp->ring_link_conf_max; + + br_mrp_set_mrc_state(mrp, BR_MRP_MRC_STATE_DE_IDLE); + } + + mutex_unlock(&mrp->lock); +} + +void br_mrp_ring_test_start(struct br_mrp *mrp, u32 interval) +{ + queue_delayed_work(mrp->timers_queue, &mrp->ring_test_work, + usecs_to_jiffies(interval)); +} + +void br_mrp_ring_test_stop(struct br_mrp *mrp) +{ + cancel_delayed_work(&mrp->ring_test_work); +} 
+ +void br_mrp_ring_topo_start(struct br_mrp *mrp, u32 interval) +{ + queue_delayed_work(mrp->timers_queue, &mrp->ring_topo_work, + usecs_to_jiffies(interval)); +} + +static void br_mrp_ring_topo_stop(struct br_mrp *mrp) +{ + cancel_delayed_work(&mrp->ring_topo_work); +} + +void br_mrp_ring_link_up_start(struct br_mrp *mrp, u32 interval) +{ + queue_delayed_work(mrp->timers_queue, &mrp->ring_link_up_work, + usecs_to_jiffies(interval)); +} + +void br_mrp_ring_link_up_stop(struct br_mrp *mrp) +{ + cancel_delayed_work(&mrp->ring_link_up_work); +} + +void br_mrp_ring_link_down_start(struct br_mrp *mrp, u32 interval) +{ + queue_delayed_work(mrp->timers_queue, &mrp->ring_link_down_work, + usecs_to_jiffies(interval)); +} + +void br_mrp_ring_link_down_stop(struct br_mrp *mrp) +{ + cancel_delayed_work(&mrp->ring_link_down_work); +} + +void br_mrp_clear_fdb_start(struct br_mrp *mrp, u32 interval) +{ + queue_delayed_work(mrp->timers_queue, &mrp->clear_fdb_work, + usecs_to_jiffies(interval)); +} + +static void br_mrp_clear_fdb_stop(struct br_mrp *mrp) +{ + cancel_delayed_work(&mrp->clear_fdb_work); +} + +/* Stops all the timers */ +void br_mrp_timer_stop(struct br_mrp *mrp) +{ + br_mrp_clear_fdb_stop(mrp); + br_mrp_ring_topo_stop(mrp); + br_mrp_ring_link_up_stop(mrp); + br_mrp_ring_link_down_stop(mrp); + br_mrp_ring_test_stop(mrp); +} + +void br_mrp_timer_init(struct br_mrp *mrp) +{ + mrp->timers_queue = create_singlethread_workqueue("mrp_timers"); + INIT_DELAYED_WORK(&mrp->ring_topo_work, br_mrp_ring_topo_expired); + INIT_DELAYED_WORK(&mrp->clear_fdb_work, br_mrp_clear_fdb_expired); + INIT_DELAYED_WORK(&mrp->ring_test_work, br_mrp_ring_test_expired); + INIT_DELAYED_WORK(&mrp->ring_link_up_work, br_mrp_ring_link_up_expired); + INIT_DELAYED_WORK(&mrp->ring_link_down_work, + br_mrp_ring_link_down_expired); +} diff --git a/net/bridge/br_private_mrp.h b/net/bridge/br_private_mrp.h new file mode 100644 index 000000000000..00ee20582ac9 --- /dev/null +++ b/net/bridge/br_private_mrp.h 
@@ -0,0 +1,199 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ + +#ifndef _BR_PRIVATE_MRP_H_ +#define _BR_PRIVATE_MRP_H_ + +#include "br_private.h" + +#define MRP_MAX_FRAME_LENGTH 200 +#define MRP_DOMAIN_UUID_LENGTH 16 +#define MRP_VERSION 0x1 +#define MRP_FRAME_PRIO 7 +#define MRP_DEFAULT_PRIO 0x8000 + +enum br_mrp_port_state_type { + BR_MRP_PORT_STATE_DISABLED, + BR_MRP_PORT_STATE_BLOCKED, + BR_MRP_PORT_STATE_FORWARDING, + BR_MRP_PORT_STATE_NOT_CONNECTED, +}; + +enum br_mrp_port_role_type { + BR_MRP_PORT_ROLE_PRIMARY, + BR_MRP_PORT_ROLE_SECONDARY, + BR_MRP_PORT_ROLE_NONE, +}; + +enum br_mrp_ring_role_type { + BR_MRP_RING_ROLE_DISABLED, + BR_MRP_RING_ROLE_MRC, + BR_MRP_RING_ROLE_MRM, +}; + +enum br_mrp_ring_state_type { + BR_MRP_RING_STATE_OPEN, + BR_MRP_RING_STATE_CLOSED, +}; + +enum br_mrp_ring_recovery_type { + BR_MRP_RING_RECOVERY_500, + BR_MRP_RING_RECOVERY_200, + BR_MRP_RING_RECOVERY_30, + BR_MRP_RING_RECOVERY_10, +}; + +enum br_mrp_mrm_state_type { + /* Awaiting Connection State 1 */ + BR_MRP_MRM_STATE_AC_STAT1 = 0x0, + /* Primary Ring port with Link Up */ + BR_MRP_MRM_STATE_PRM_UP = 0x1, + /* Check Ring, Ring Open State */ + BR_MRP_MRM_STATE_CHK_RO = 0x2, + /* Check Ring, Ring Closed State */ + BR_MRP_MRM_STATE_CHK_RC = 0x3, +}; + +enum br_mrp_mrc_state_type { + /* Awaiting Connection State 1 */ + BR_MRP_MRC_STATE_AC_STAT1 = 0x0, + /* Data Exchange Idle state */ + BR_MRP_MRC_STATE_DE_IDLE = 0x1, + /* Pass Through */ + BR_MRP_MRC_STATE_PT = 0x2, + /* Data Exchange */ + BR_MRP_MRC_STATE_DE = 0x3, + /* Pass Through Idle state */ + BR_MRP_MRC_STATE_PT_IDLE = 0x4, +}; + +enum br_mrp_tlv_header_type { + BR_MRP_TLV_HEADER_END = 0x0, + BR_MRP_TLV_HEADER_COMMON = 0x1, + BR_MRP_TLV_HEADER_RING_TEST = 0x2, + BR_MRP_TLV_HEADER_RING_TOPO = 0x3, + BR_MRP_TLV_HEADER_RING_LINK_DOWN = 0x4, + BR_MRP_TLV_HEADER_RING_LINK_UP = 0x5, +}; + +struct br_mrp_tlv_hdr { + u8 type; + u8 length; +} __packed; + +struct br_mrp_end_hdr { + struct br_mrp_tlv_hdr hdr; +} __packed; + 
+struct br_mrp_common_hdr { + u16 seq_id; + u8 domain[MRP_DOMAIN_UUID_LENGTH]; +} __packed; + +struct br_mrp_ring_test_hdr { + u16 prio; + u8 sa[ETH_ALEN]; + u16 port_role; + u16 state; + u16 transitions; + u32 timestamp; +} __packed; + +struct br_mrp_ring_topo_hdr { + u16 prio; + u8 sa[ETH_ALEN]; + u16 interval; +} __packed; + +struct br_mrp_ring_link_hdr { + u8 sa[ETH_ALEN]; + u16 port_role; + u16 interval; + u16 blocked; +} __packed; + +struct br_mrp_port { + struct br_mrp *mrp; + enum br_mrp_port_state_type state; + enum br_mrp_port_role_type role; +}; + +struct br_mrp { + /* list of mrp instances */ + struct list_head list; + + struct sk_buff_head skbs; + struct work_struct skbs_work; + + /* lock for each MRP instance */ + struct mutex lock; + + struct net_bridge *br; + struct net_bridge_port *p_port; + struct net_bridge_port *s_port; + + u32 ring_nr; + enum br_mrp_ring_role_type ring_role; + enum br_mrp_ring_recovery_type ring_recv; + + enum br_mrp_mrm_state_type mrm_state; + enum br_mrp_mrc_state_type mrc_state; + + bool add_test; + bool no_tc; + + u16 ring_transitions; + + u16 seq_id; + u32 prio; + u8 domain[MRP_DOMAIN_UUID_LENGTH]; + + struct workqueue_struct *timers_queue; + + struct delayed_work clear_fdb_work; + + struct delayed_work ring_test_work; + u32 ring_test_conf_short; + u32 ring_test_conf_interval; + u32 ring_test_conf_max; + u32 ring_test_conf_ext_max; + u32 ring_test_curr; + u32 ring_test_curr_max; + + struct delayed_work ring_topo_work; + u32 ring_topo_conf_interval; + u32 ring_topo_conf_max; + u32 ring_topo_curr_max; + + struct delayed_work ring_link_up_work; + struct delayed_work ring_link_down_work; + u32 ring_link_conf_interval; + u32 ring_link_conf_max; + u32 ring_link_curr_max; + + u16 blocked; + u16 react_on_link_change; +}; + +/* br_mrp.c */ +void br_mrp_ring_test_req(struct br_mrp *mrp, u32 interval); +void br_mrp_ring_topo_req(struct br_mrp *mrp, u32 interval); +void br_mrp_ring_link_req(struct net_bridge_port *p, bool up, u32 
interval); + +void br_mrp_set_mrm_state(struct br_mrp *mrp, enum br_mrp_mrm_state_type state); +void br_mrp_set_mrc_state(struct br_mrp *mrp, enum br_mrp_mrc_state_type state); + +/* br_mrp_timer.c */ +void br_mrp_timer_init(struct br_mrp *mrp); +void br_mrp_timer_stop(struct br_mrp *mrp); + +void br_mrp_clear_fdb_start(struct br_mrp *mrp, u32 interval); + +void br_mrp_ring_test_start(struct br_mrp *mrp, u32 interval); +void br_mrp_ring_test_stop(struct br_mrp *mrp); +void br_mrp_ring_topo_start(struct br_mrp *mrp, u32 interval); +void br_mrp_ring_link_up_start(struct br_mrp *mrp, u32 interval); +void br_mrp_ring_link_up_stop(struct br_mrp *mrp); +void br_mrp_ring_link_down_start(struct br_mrp *mrp, u32 interval); +void br_mrp_ring_link_down_stop(struct br_mrp *mrp); + +#endif /* BR_PRIVATE_MRP_H_ */ -- 2.17.1
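As a rough illustration of how the recovery profiles in br_mrp_update_recovery() translate into detection time, here is a minimal userspace sketch. The names mrp_profile, mrp_profile_get() and mrp_test_loss_window_us() are hypothetical and not part of the patch. The values are copied from the patch and treated as microseconds, because the timer helpers pass them to usecs_to_jiffies(). The sketch assumes that ring_test_curr_max is ring_test_conf_max - 1, as set on the PRM_UP to CHK_RC transition, and that no MRP_Test frame returns in between, so the ring is declared open on the ring_test_conf_max-th consecutive expiry of the test timer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the test-timer fields set by
 * br_mrp_update_recovery(); intervals are in microseconds.
 */
struct mrp_profile {
	unsigned int recovery_ms;	/* nominal recovery class */
	uint32_t test_interval_us;	/* ring_test_conf_interval */
	uint32_t test_max;		/* ring_test_conf_max */
};

static const struct mrp_profile mrp_profiles[] = {
	{ 500, 50 * 1000, 5 },
	{ 200, 20 * 1000, 3 },
	{  30,      3500, 3 },
	{  10,      1000, 3 },
};

/* Look up a profile by its recovery class; returns NULL if unknown. */
const struct mrp_profile *mrp_profile_get(unsigned int recovery_ms)
{
	size_t i;

	for (i = 0; i < sizeof(mrp_profiles) / sizeof(mrp_profiles[0]); i++)
		if (mrp_profiles[i].recovery_ms == recovery_ms)
			return &mrp_profiles[i];

	return NULL;
}

/* Time without a returning MRP_Test frame before the ring is declared
 * open: the timer re-arms (test_max - 1) times and opens the ring on the
 * expiry after that, i.e. test_max expiries in total.
 */
uint32_t mrp_test_loss_window_us(const struct mrp_profile *p)
{
	assert(p != NULL);
	return p->test_interval_us * p->test_max;
}
```

For the 200 ms profile, which br_mrp_create() selects by default, this gives 3 * 20 ms = 60 ms before br_mrp_ring_open() sets the secondary port to FORWARDING.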
Horatiu Vultur
2020-Jan-09 15:06 UTC
[Bridge] [RFC net-next Patch 2/3] net: bridge: mrp: Integrate MRP into the bridge
To integrate MRP into the bridge, the bridge needs to do the following:
- notify when the link of one of the ports goes down or up, because the MRP instance needs to react to link changes by sending MRP frames.
- notify when one of the ports is removed from the bridge or when the bridge is destroyed, because if the port is part of the MRP ring then the MRP state machine should be stopped.
- add a handler that allows the MRP instance to process MRP frames, if MRP is enabled. This is similar to the STP design.
- add logic for MRP frames inside the bridge. The bridge only detects MRP frames and passes them to the upper layer for processing.
- update the logic for handling non-MRP frames. If MRP is enabled, the state of the port decides whether a frame is forwarded.

Signed-off-by: Horatiu Vultur <horatiu.vultur at microchip.com>
---
 net/bridge/br.c         | 19 +++++++++++++++++++
 net/bridge/br_device.c  |  3 +++
 net/bridge/br_forward.c |  1 +
 net/bridge/br_if.c      | 10 ++++++++++
 net/bridge/br_input.c   | 22 ++++++++++++++++++++++
 net/bridge/br_private.h | 28 ++++++++++++++++++++++++++++
 6 files changed, 83 insertions(+)

diff --git a/net/bridge/br.c b/net/bridge/br.c index b6fe30e3768f..9053378ca1e4 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -94,6 +94,7 @@ static int br_device_event(struct notifier_block *unused, unsigned long event, v spin_lock_bh(&br->lock); if (br->dev->flags & IFF_UP) { br_stp_disable_port(p); + br_mrp_port_link_change(p, false); notified = true; } spin_unlock_bh(&br->lock); @@ -103,6 +104,7 @@ static int br_device_event(struct notifier_block *unused, unsigned long event, v if (netif_running(br->dev) && netif_oper_up(dev)) { spin_lock_bh(&br->lock); br_stp_enable_port(p); + br_mrp_port_link_change(p, true); notified = true; spin_unlock_bh(&br->lock); } @@ -308,6 +310,13 @@ static const struct stp_proto br_stp_proto = { .rcv = br_stp_rcv, }; +#if IS_ENABLED(CONFIG_BRIDGE_MRP) +static struct packet_type mrp_packet_type __read_mostly = { + .type = 
cpu_to_be16(ETH_P_MRP), + .func = br_mrp_recv, +}; +#endif + static int __init br_init(void) { int err; @@ -320,6 +329,13 @@ static int __init br_init(void) return err; } +#if IS_ENABLED(CONFIG_BRIDGE_MRP) + /* Allow all MRP frames to be processed by the upper layer. The MRP + * frames can be dropped or forwarded to other MRP ports + */ + dev_add_pack(&mrp_packet_type); +#endif + err = br_fdb_init(); if (err) goto err_out; @@ -376,6 +392,9 @@ static int __init br_init(void) static void __exit br_deinit(void) { stp_proto_unregister(&br_stp_proto); +#if IS_ENABLED(CONFIG_BRIDGE_MRP) + dev_remove_pack(&mrp_packet_type); +#endif br_netlink_fini(); unregister_switchdev_notifier(&br_switchdev_notifier); unregister_netdevice_notifier(&br_device_notifier); diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index fb38add21b37..29966754d86a 100644 --- a/net/bridge/br_device.c +++ b/net/bridge/br_device.c @@ -464,6 +464,9 @@ void br_dev_setup(struct net_device *dev) spin_lock_init(&br->lock); INIT_LIST_HEAD(&br->port_list); INIT_HLIST_HEAD(&br->fdb_list); +#ifdef CONFIG_BRIDGE_MRP + INIT_LIST_HEAD(&br->mrp_list); +#endif spin_lock_init(&br->hash_lock); br->bridge_id.prio[0] = 0x80; diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c index 86637000f275..306425bc9899 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -27,6 +27,7 @@ static inline int should_deliver(const struct net_bridge_port *p, return ((p->flags & BR_HAIRPIN_MODE) || skb->dev != p->dev) && br_allowed_egress(vg, skb) && p->state == BR_STATE_FORWARDING && nbp_switchdev_allowed_egress(p, skb) && + br_mrp_allow_egress(p, skb) && !br_skb_isolated(p, skb); } diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 4fe30b182ee7..bf7a467b5f33 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -80,11 +80,13 @@ void br_port_carrier_check(struct net_bridge_port *p, bool *notified) br_stp_enable_port(p); *notified = true; } + br_mrp_port_link_change(p, true); } else { if 
(p->state != BR_STATE_DISABLED) { br_stp_disable_port(p); *notified = true; } + br_mrp_port_link_change(p, false); } spin_unlock_bh(&br->lock); } @@ -331,6 +333,9 @@ static void del_nbp(struct net_bridge_port *p) spin_lock_bh(&br->lock); br_stp_disable_port(p); +#ifdef CONFIG_BRIDGE_MRP + br_mrp_port_uninit(p); +#endif spin_unlock_bh(&br->lock); br_ifinfo_notify(RTM_DELLINK, NULL, p); @@ -373,6 +378,11 @@ void br_dev_delete(struct net_device *dev, struct list_head *head) del_nbp(p); } +#ifdef CONFIG_BRIDGE_MRP + /* Remove the MRP instances. The MRP ports were already removed in del_nbp() */ + br_mrp_uninit(br); +#endif + br_recalculate_neigh_suppress_enabled(br); br_fdb_delete_by_port(br, NULL, 0, 1); diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 8944ceb47fe9..c65049586dbd 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -21,6 +21,9 @@ #include <linux/rculist.h> #include "br_private.h" #include "br_private_tunnel.h" +#ifdef CONFIG_BRIDGE_MRP +#include "br_private_mrp.h" +#endif static int br_netif_receive_skb(struct net *net, struct sock *sk, struct sk_buff *skb) @@ -338,6 +341,25 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb) return RX_HANDLER_CONSUMED; } } +#ifdef CONFIG_BRIDGE_MRP + /* If the port is part of the MRP ring and the state of the port is + * disabled then all the frames must be dropped + */ + if (p->mrp_port && p->mrp_port->state == BR_MRP_PORT_STATE_DISABLED) + goto drop; + + /* MRP frames need special processing, therefore allow the upper layer + * to decide what to do with the frame + */ + if (p->mrp_port && skb->protocol == htons(ETH_P_MRP)) + return RX_HANDLER_PASS; + + /* Frames received on a blocked port shall be dropped, except + * MRP frames and frames specified in IEEE 802.1D-2004 Table 7-10. 
+ */ + if (p->mrp_port && p->mrp_port->state == BR_MRP_PORT_STATE_BLOCKED) + goto drop; +#endif forward: switch (p->state) { diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index f540f3bdf294..0c008b3d24cc 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -285,6 +285,9 @@ struct net_bridge_port { u16 backup_redirected_cnt; struct bridge_stp_xstats stp_xstats; +#ifdef CONFIG_BRIDGE_MRP + struct br_mrp_port *mrp_port; +#endif }; #define kobj_to_brport(obj) container_of(obj, struct net_bridge_port, kobj) @@ -424,6 +427,10 @@ struct net_bridge { int offload_fwd_mark; #endif struct hlist_head fdb_list; + +#ifdef CONFIG_BRIDGE_MRP + struct list_head mrp_list; +#endif }; struct br_input_skb_cb { @@ -1160,6 +1167,27 @@ void br_stp_timer_init(struct net_bridge *br); void br_stp_port_timer_init(struct net_bridge_port *p); unsigned long br_timer_value(const struct timer_list *timer); +#if IS_ENABLED(CONFIG_BRIDGE_MRP) +/* br_mrp.c */ +void br_mrp_uninit(struct net_bridge *br); +void br_mrp_port_uninit(struct net_bridge_port *p); +void br_mrp_port_link_change(struct net_bridge_port *br, bool up); +int br_mrp_recv(struct sk_buff *skb, struct net_device *dev, + struct packet_type *pt, struct net_device *orig_dev); +bool br_mrp_allow_egress(const struct net_bridge_port *p, + const struct sk_buff *skb); +#else +static inline bool br_mrp_allow_egress(const struct net_bridge_port *p, + const struct sk_buff *skb) +{ + return true; +} + +static inline void br_mrp_port_link_change(struct net_bridge_port *br, bool up) +{ +} +#endif + /* br.c */ #if IS_ENABLED(CONFIG_ATM_LANE) extern int (*br_fdb_test_addr_hook)(struct net_device *dev, unsigned char *addr); -- 2.17.1
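The order of the MRP checks added to br_handle_frame() matters: the disabled-port check comes first, then MRP frames are handed to the packet handler, and only then are the remaining frames on a blocked port dropped. Below is a minimal userspace sketch of that decision. The names mrp_ingress_verdict() and enum mrp_verdict are hypothetical and not part of the patch, the EtherType is taken in host byte order, and the link-local frames that br_handle_frame() handles before this point are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MRP_ETHERTYPE 0x88E3	/* value of ETH_P_MRP in linux/if_ether.h */

enum mrp_port_state {		/* same order as br_mrp_port_state_type */
	MRP_PORT_DISABLED,
	MRP_PORT_BLOCKED,
	MRP_PORT_FORWARDING,
	MRP_PORT_NOT_CONNECTED,
};

enum mrp_verdict {		/* hypothetical names for the three outcomes */
	MRP_VERDICT_DROP,
	MRP_VERDICT_PASS_UP,	/* RX_HANDLER_PASS: br_mrp_recv() sees it */
	MRP_VERDICT_BRIDGE,	/* continue with the normal bridge path */
};

/* Mirror of the check order in the br_handle_frame() hunk. */
enum mrp_verdict mrp_ingress_verdict(bool is_ring_port,
				     enum mrp_port_state state,
				     uint16_t ethertype)
{
	assert(state <= MRP_PORT_NOT_CONNECTED);

	if (!is_ring_port)
		return MRP_VERDICT_BRIDGE;
	if (state == MRP_PORT_DISABLED)
		return MRP_VERDICT_DROP;
	if (ethertype == MRP_ETHERTYPE)
		return MRP_VERDICT_PASS_UP;
	if (state == MRP_PORT_BLOCKED)
		return MRP_VERDICT_DROP;

	return MRP_VERDICT_BRIDGE;
}
```

With this order an MRP_Test frame still reaches the MRP instance on a BLOCKED ring port, while an IPv4 frame (EtherType 0x0800) on the same port is dropped.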
Horatiu Vultur
2020-Jan-09 15:06 UTC
[Bridge] [RFC net-next Patch 3/3] net: bridge: mrp: Add netlink support to configure MRP
Extend br_netlink to be able to create/delete MRP instances. The current configuration options for each instance are:
- set primary port
- set secondary port
- set MRP ring role (MRM or MRC)
- set MRP ring id.

To create an MRP instance on the bridge:
$ bridge mrp add dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1

Where:
p_port, s_port: can be any port under the bridge
ring_role: can have the value 1 (MRC - Media Redundancy Client) or 2 (MRM - Media Redundancy Manager). There can be only one MRM in a ring.
ring_id: unique id for each MRP instance.

It is possible to create multiple instances. Each instance has to have its own ring_id, and a port can't be part of multiple instances:
$ bridge mrp add dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2

To see current MRP instances and their status:
$ bridge mrp show
dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2 ring_state 3
dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1 ring_state 4

Where:
p_port, s_port, ring_role, ring_id: represent the configuration values. The primary port can swap roles with the secondary port, depending on the states through which the node goes.
ring_state: depends on the ring_role. If ring_role is 1 (MRC), ring_state can be: 0 (AC_STAT1), 1 (DE_IDLE), 2 (PT), 3 (DE), 4 (PT_IDLE). If ring_role is 2 (MRM), ring_state can be: 0 (AC_STAT1), 1 (PRM_UP), 2 (CHK_RO), 3 (CHK_RC).
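The ring_state numbers printed by "bridge mrp show" can be decoded with a small lookup keyed on ring_role. This is only a userspace sketch: mrp_ring_state_name() and mrp_ring_state_demo() are hypothetical helpers, not part of iproute2 or of this patch, and the tables are copied from the value lists above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ring_role values as documented in the commit message */
enum { MRP_ROLE_MRC = 1, MRP_ROLE_MRM = 2 };

static const char * const mrc_state_names[] = {
	"AC_STAT1", "DE_IDLE", "PT", "DE", "PT_IDLE",
};

static const char * const mrm_state_names[] = {
	"AC_STAT1", "PRM_UP", "CHK_RO", "CHK_RC",
};

#define N_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: name of ring_state for the given ring_role, or
 * "unknown" if the pair is outside the documented ranges.
 */
const char *mrp_ring_state_name(unsigned int ring_role,
				unsigned int ring_state)
{
	if (ring_role == MRP_ROLE_MRC && ring_state < N_ELEMS(mrc_state_names))
		return mrc_state_names[ring_state];
	if (ring_role == MRP_ROLE_MRM && ring_state < N_ELEMS(mrm_state_names))
		return mrm_state_names[ring_state];

	return "unknown";
}

/* Decode the two rows of the sample "bridge mrp show" output. */
void mrp_ring_state_demo(void)
{
	assert(strcmp(mrp_ring_state_name(1, 3), "DE") == 0);
	assert(strcmp(mrp_ring_state_name(2, 4), "unknown") == 0);
}
```

Note that the second sample row (ring_role 2, ring_state 4) falls outside the MRM values listed, so this helper reports it as unknown.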
Signed-off-by: Horatiu Vultur <horatiu.vultur at microchip.com>
---
 include/uapi/linux/if_bridge.h |  27 ++++
 include/uapi/linux/rtnetlink.h |   7 +
 net/bridge/br_mrp.c            | 281 +++++++++++++++++++++++++++++++++
 net/bridge/br_netlink.c        |   9 ++
 net/bridge/br_private.h        |   2 +
 net/bridge/br_private_mrp.h    |   9 ++
 security/selinux/nlmsgtab.c    |   5 +-
 7 files changed, 339 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 4a58e3d7de46..00f4f465d62a 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -265,6 +265,33 @@ enum {
 };
 #define MDBA_SET_ENTRY_MAX (__MDBA_SET_ENTRY_MAX - 1)

+#ifdef CONFIG_BRIDGE_MRP
+enum {
+	MRPA_UNSPEC,
+	MRPA_MRP,
+	__MRPA_MAX,
+};
+#define MRPA_MAX (__MRPA_MAX - 1)
+
+enum {
+	MRPA_MRP_UNSPEC,
+	MRPA_MRP_ENTRY,
+	__MRPA_MRP_MAX,
+};
+#define MRPA_MRP_MAX (__MRPA_MRP_MAX - 1)
+
+enum {
+	MRP_ATTR_UNSPEC,
+	MRP_ATTR_P_IFINDEX,
+	MRP_ATTR_S_IFINDEX,
+	MRP_ATTR_RING_ROLE,
+	MRP_ATTR_RING_NR,
+	MRP_ATTR_RING_STATE,
+	__MRP_ATTR_MAX,
+};
+#define MRP_ATTR_MAX (__MRP_ATTR_MAX - 1)
+#endif
+
 /* Embedded inside LINK_XSTATS_TYPE_BRIDGE */
 enum {
 	BRIDGE_XSTATS_UNSPEC,
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 1418a8362bb7..b1d72a5309cd 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -171,6 +171,13 @@ enum {
 	RTM_GETLINKPROP,
 #define RTM_GETLINKPROP RTM_GETLINKPROP

+	RTM_NEWMRP = 112,
+#define RTM_NEWMRP RTM_NEWMRP
+	RTM_DELMRP,
+#define RTM_DELMRP RTM_DELMRP
+	RTM_GETMRP,
+#define RTM_GETMRP RTM_GETMRP
+
 	__RTM_MAX,
 #define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1)
 };
diff --git a/net/bridge/br_mrp.c b/net/bridge/br_mrp.c
index a84aab3f7114..4173021d3bfa 100644
--- a/net/bridge/br_mrp.c
+++ b/net/bridge/br_mrp.c
@@ -1234,3 +1234,284 @@ void br_mrp_port_uninit(struct net_bridge_port *port)
 	mutex_unlock(&mrp->lock);
 }
+
+/* Do sanity checks and obtain device and the ring */
+static int br_mrp_parse(struct sk_buff *skb, struct nlmsghdr *nlh,
+			struct net_device **pdev, struct br_mrp_config *conf)
+{
+	struct nlattr *tb[MRP_ATTR_MAX + 1];
+	struct net *net = sock_net(skb->sk);
+	struct br_port_msg *bpm;
+	struct net_device *dev;
+	int err;
+
+	err = nlmsg_parse_deprecated(nlh, sizeof(*bpm), tb,
+				     MRP_ATTR_MAX, NULL, NULL);
+	if (err < 0)
+		return err;
+
+	bpm = nlmsg_data(nlh);
+	if (bpm->ifindex == 0) {
+		pr_info("PF_BRIDGE: %s with invalid ifindex\n", __func__);
+		return -EINVAL;
+	}
+
+	dev = __dev_get_by_index(net, bpm->ifindex);
+	if (!dev) {
+		pr_info("PF_BRIDGE: %s with unknown ifindex\n", __func__);
+		return -ENODEV;
+	}
+
+	if (!(dev->priv_flags & IFF_EBRIDGE)) {
+		pr_info("PF_BRIDGE: %s with non-bridge\n", __func__);
+		return -EOPNOTSUPP;
+	}
+
+	*pdev = dev;
+
+	if (tb[MRP_ATTR_P_IFINDEX])
+		conf->p_ifindex = nla_get_u32(tb[MRP_ATTR_P_IFINDEX]);
+	if (tb[MRP_ATTR_S_IFINDEX])
+		conf->s_ifindex = nla_get_u32(tb[MRP_ATTR_S_IFINDEX]);
+	if (tb[MRP_ATTR_RING_ROLE])
+		conf->ring_role = nla_get_u8(tb[MRP_ATTR_RING_ROLE]);
+	if (tb[MRP_ATTR_RING_NR])
+		conf->ring_nr = nla_get_u8(tb[MRP_ATTR_RING_NR]);
+
+	return 0;
+}
+
+static int br_mrp_fill_entry(struct sk_buff *skb, struct netlink_callback *cb,
+			     struct net_device *dev)
+{
+	int idx = 0, s_idx = cb->args[1], err = 0;
+	struct net_bridge *br = netdev_priv(dev);
+	struct br_mrp *mrp;
+	struct nlattr *nest, *nest2;
+
+	nest = nla_nest_start_noflag(skb, MRPA_MRP);
+	if (!nest)
+		return -EMSGSIZE;
+
+	list_for_each_entry_rcu(mrp, &br->mrp_list, list) {
+		if (idx < s_idx)
+			goto skip;
+
+		nest2 = nla_nest_start_noflag(skb, MRPA_MRP_ENTRY);
+		if (!nest2) {
+			err = -EMSGSIZE;
+			mutex_unlock(&mrp->lock);
+			break;
+		}
+
+		mutex_lock(&mrp->lock);
+
+		if (mrp->p_port)
+			nla_put_u32(skb, MRP_ATTR_P_IFINDEX,
+				    mrp->p_port->dev->ifindex);
+		if (mrp->s_port)
+			nla_put_u32(skb, MRP_ATTR_S_IFINDEX,
+				    mrp->s_port->dev->ifindex);
+
+		nla_put_u32(skb, MRP_ATTR_RING_NR, mrp->ring_nr);
+		nla_put_u32(skb, MRP_ATTR_RING_ROLE, mrp->ring_role);
+
+		if (mrp->ring_role == BR_MRP_RING_ROLE_MRM)
+			nla_put_u32(skb, MRP_ATTR_RING_STATE, mrp->mrm_state);
+		if (mrp->ring_role == BR_MRP_RING_ROLE_MRC)
+			nla_put_u32(skb, MRP_ATTR_RING_STATE, mrp->mrc_state);
+
+		mutex_unlock(&mrp->lock);
+
+		nla_nest_end(skb, nest2);
+skip:
+		idx++;
+	}
+
+	cb->args[1] = idx;
+	nla_nest_end(skb, nest);
+	return err;
+}
+
+static int br_mrp_dump(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	struct net *net = sock_net(skb->sk);
+	struct nlmsghdr *nlh = NULL;
+	struct net_device *dev;
+	int idx = 0, s_idx;
+
+	s_idx = cb->args[0];
+
+	rcu_read_lock();
+
+	cb->seq = net->dev_base_seq;
+
+	for_each_netdev_rcu(net, dev) {
+		if (dev->priv_flags & IFF_EBRIDGE) {
+			struct br_port_msg *bpm;
+
+			if (idx < s_idx)
+				goto skip;
+
+			nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid,
+					cb->nlh->nlmsg_seq, RTM_GETMRP,
+					sizeof(*bpm), NLM_F_MULTI);
+			if (!nlh)
+				break;
+
+			bpm = nlmsg_data(nlh);
+			memset(bpm, 0, sizeof(*bpm));
+			bpm->ifindex = dev->ifindex;
+			if (br_mrp_fill_entry(skb, cb, dev) < 0)
+				goto out;
+
+			cb->args[1] = 0;
+			nlmsg_end(skb, nlh);
+skip:
+			idx++;
+		}
+	}
+
+out:
+	if (nlh)
+		nlmsg_end(skb, nlh);
+	rcu_read_unlock();
+	cb->args[0] = idx;
+	return skb->len;
+}
+
+static int br_mrp_add(struct sk_buff *skb, struct nlmsghdr *nlh,
+		      struct netlink_ext_ack *extack)
+{
+	struct net_bridge_port *p_port, *s_port;
+	struct net *net = sock_net(skb->sk);
+	enum br_mrp_ring_role_type role;
+	struct br_mrp_config conf;
+	struct net_device *dev;
+	struct net_bridge *br;
+	struct br_mrp *mrp;
+	u32 ring_nr;
+	int err;
+
+	err = br_mrp_parse(skb, nlh, &dev, &conf);
+	if (err < 0)
+		return err;
+
+	br = netdev_priv(dev);
+
+	/* Get priority and secondary ports */
+	dev = __dev_get_by_index(net, conf.p_ifindex);
+	if (!dev)
+		return -ENODEV;
+
+	p_port = br_port_get_rtnl(dev);
+	if (!p_port || p_port->br != br)
+		return -EINVAL;
+
+	dev = __dev_get_by_index(net, conf.s_ifindex);
+	if (!dev)
+		return -ENODEV;
+
+	s_port = br_port_get_rtnl(dev);
+	if (!s_port || s_port->br != br)
+		return -EINVAL;
+
+	/* Get role */
+	role = conf.ring_role;
+
+	/* Get ring number */
+	ring_nr = conf.ring_nr;
+
+	/* It is not possible to have MRP instances with the same ID */
+	mrp = br_mrp_find(br, ring_nr);
+	if (mrp)
+		return -EINVAL;
+
+	/* Create the mrp instance */
+	err = br_mrp_create(br, ring_nr);
+	if (err < 0)
+		return err;
+
+	mrp = br_mrp_find(br, ring_nr);
+
+	mutex_lock(&mrp->lock);
+
+	/* Initialize the ports */
+	err = br_mrp_port_init(p_port, mrp, BR_MRP_PORT_ROLE_PRIMARY);
+	if (err < 0) {
+		mutex_unlock(&mrp->lock);
+		goto delete_mrp;
+	}
+
+	err = br_mrp_port_init(s_port, mrp, BR_MRP_PORT_ROLE_SECONDARY);
+	if (err < 0) {
+		mutex_unlock(&mrp->lock);
+		goto delete_port;
+	}
+
+	if (role == BR_MRP_RING_ROLE_MRM)
+		br_mrp_set_mrm_role(mrp);
+	if (role == BR_MRP_RING_ROLE_MRC)
+		br_mrp_set_mrc_role(mrp);
+
+	mutex_unlock(&mrp->lock);
+
+	return 0;
+
+delete_port:
+	br_mrp_port_uninit(p_port);
+
+delete_mrp:
+	br_mrp_destroy(br, ring_nr);
+	return err;
+}
+
+static int br_mrp_del(struct sk_buff *skb, struct nlmsghdr *nlh,
+		      struct netlink_ext_ack *extack)
+{
+	struct br_mrp_config conf;
+	struct net_device *dev;
+	struct net_bridge *br;
+	struct br_mrp *mrp;
+	u32 ring_nr;
+	int err;
+
+	err = br_mrp_parse(skb, nlh, &dev, &conf);
+	if (err < 0)
+		return err;
+
+	br = netdev_priv(dev);
+
+	/* Get ring number */
+	ring_nr = conf.ring_nr;
+
+	mrp = br_mrp_find(br, ring_nr);
+	if (!mrp) {
+		pr_info("PF_BRIDGE: %s with invalid ring_nr\n", __func__);
+		return -EINVAL;
+	}
+
+	br_mrp_port_uninit(mrp->p_port);
+	br_mrp_port_uninit(mrp->s_port);
+
+	br_mrp_destroy(br, ring_nr);
+
+	return 0;
+}
+
+void br_mrp_netlink_init(void)
+{
+	rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_GETMRP, NULL,
+			     br_mrp_dump, 0);
+	rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_NEWMRP, br_mrp_add,
+			     NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_DELMRP, br_mrp_del,
+			     NULL, 0);
+}
+
+void br_mrp_netlink_uninit(void)
+{
+	rtnl_unregister(PF_BRIDGE, RTM_GETMRP);
+	rtnl_unregister(PF_BRIDGE, RTM_NEWMRP);
+	rtnl_unregister(PF_BRIDGE, RTM_DELMRP);
+}
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 60136575aea4..6d8f84ed8b0d 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1664,6 +1664,9 @@ int __init br_netlink_init(void)
 	int err;

 	br_mdb_init();
+#ifdef CONFIG_BRIDGE_MRP
+	br_mrp_netlink_init();
+#endif
 	rtnl_af_register(&br_af_ops);

 	err = rtnl_link_register(&br_link_ops);
@@ -1674,12 +1677,18 @@ int __init br_netlink_init(void)

 out_af:
 	rtnl_af_unregister(&br_af_ops);
+#ifdef CONFIG_BRIDGE_MRP
+	br_mrp_netlink_uninit();
+#endif
 	br_mdb_uninit();
 	return err;
 }

 void br_netlink_fini(void)
 {
+#ifdef CONFIG_BRIDGE_MRP
+	br_mrp_netlink_uninit();
+#endif
 	br_mdb_uninit();
 	rtnl_af_unregister(&br_af_ops);
 	rtnl_link_unregister(&br_link_ops);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 0c008b3d24cc..9a060c3c7713 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -1169,6 +1169,8 @@ unsigned long br_timer_value(const struct timer_list *timer);

 #if IS_ENABLED(CONFIG_BRIDGE_MRP)
 /* br_mrp.c */
+void br_mrp_netlink_init(void);
+void br_mrp_netlink_uninit(void);
 void br_mrp_uninit(struct net_bridge *br);
 void br_mrp_port_uninit(struct net_bridge_port *p);
 void br_mrp_port_link_change(struct net_bridge_port *br, bool up);
diff --git a/net/bridge/br_private_mrp.h b/net/bridge/br_private_mrp.h
index 00ee20582ac9..13fd2330ccfc 100644
--- a/net/bridge/br_private_mrp.h
+++ b/net/bridge/br_private_mrp.h
@@ -174,6 +174,15 @@ struct br_mrp {
 	u16 react_on_link_change;
 };

+/* Represents the configuration of the MRP instance */
+struct br_mrp_config {
+	u32 p_ifindex;
+	u32 s_ifindex;
+	u32 ring_role;
+	u32 ring_nr;
+	u32 ring_state;
+};
+
 /* br_mrp.c */
 void br_mrp_ring_test_req(struct br_mrp *mrp, u32 interval);
 void br_mrp_ring_topo_req(struct br_mrp *mrp, u32 interval);
diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c
index c97fdae8f71b..7c110fdb9e1e 100644
--- a/security/selinux/nlmsgtab.c
+++ b/security/selinux/nlmsgtab.c
@@ -85,6 +85,9 @@ static const struct nlmsg_perm nlmsg_route_perms[] =
 	{ RTM_GETNEXTHOP, NETLINK_ROUTE_SOCKET__NLMSG_READ },
 	{ RTM_NEWLINKPROP, NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_DELLINKPROP, NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
+	{ RTM_NEWMRP, NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
+	{ RTM_DELMRP, NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
+	{ RTM_GETMRP, NETLINK_ROUTE_SOCKET__NLMSG_READ },
 };

 static const struct nlmsg_perm nlmsg_tcpdiag_perms[] =
@@ -168,7 +171,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm)
 		 * structures at the top of this file with the new mappings
 		 * before updating the BUILD_BUG_ON() macro!
 		 */
-		BUILD_BUG_ON(RTM_MAX != (RTM_NEWLINKPROP + 3));
+		BUILD_BUG_ON(RTM_MAX != (RTM_DELMRP + 3));
 		err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms,
 				 sizeof(nlmsg_route_perms));
 		break;
-- 
2.17.1
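
The RTM_MAX rounding used in the rtnetlink.h hunk can be worked through in isolation. The sketch below is only an illustration under stated assumptions: the `SKETCH_` names are hypothetical stand-ins, and the numbers are taken from the hunk (RTM_NEWMRP = 112, followed by RTM_DELMRP, RTM_GETMRP and __RTM_MAX).

```c
/* Hypothetical stand-ins for the message numbers in the rtnetlink.h hunk. */
enum {
	SKETCH_RTM_NEWMRP = 112,
	SKETCH_RTM_DELMRP,	/* 113 */
	SKETCH_RTM_GETMRP,	/* 114 */
	SKETCH___RTM_MAX,	/* 115 */
};

/* Same expression as RTM_MAX: round the sentinel up to a multiple of 4,
 * then subtract 1. */
#define SKETCH_RTM_MAX (((SKETCH___RTM_MAX + 3) & ~3) - 1)

/* Returns the value RTM_MAX would take with the numbers above. */
int sketch_rtm_max(void)
{
	return SKETCH_RTM_MAX;
}
```

Under these numbers `sketch_rtm_max()` evaluates to 115, which equals `SKETCH_RTM_NEWMRP + 3`, whereas `SKETCH_RTM_DELMRP + 3` is 116. If the enum values are as shown in the hunk, the BUILD_BUG_ON() expression in the nlmsgtab.c hunk may be worth double-checking against this.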
Stephen Hemminger
2020-Jan-09 16:19 UTC
[Bridge] [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)
On Thu, 9 Jan 2020 16:06:37 +0100
Horatiu Vultur <horatiu.vultur at microchip.com> wrote:

> Media Redundancy Protocol is a data network protocol standardized by
> International Electrotechnical Commission as IEC 62439-2. It allows rings of
> Ethernet switches to overcome any single failure with recovery time faster than
> STP. It is primarily used in Industrial Ethernet applications.
>
> This is the first proposal of implementing a subset of the standard. It supports
> only 2 roles of an MRP node. It supports only Media Redundancy Manager(MRM) and
> Media Redundancy Client(MRC). In a MRP ring, each node needs to support MRP and
> in a ring can be only one MRM and multiple MRC. It is possible to have multiple
> instances of MRP on a single node. But a port can be part of only one MRP
> instance.
>
> The MRM is responsible for detecting when there is a loop in the ring. It is
> sending the frame MRP_Test to detect the loops. It would send MRP_Test on both
> ports in the ring and if the frame is received at the other end, then the ring
> is closed. Meaning that there is a loop. In this case it sets the port state to
> BLOCKED, not allowing traffic to pass through except MRP frames. In case it
> stops receiving MRP_Test frames from itself then the MRM will detect that the
> ring is open, therefor it would notify the other nodes of this change and will
> set the state of the port to be FORWARDING.
>
> The MRC is responsible for forwarding MRP_Test frames between the ring ports
> (and not to flood on other ports) and to listen when there is a change in the
> network to clear the FDB.
>
> Similar with STP, MRP is implemented on top of the bridge and they can't be
> enable at the same time. While STP runs on all ports of the bridge, MRP needs to
> run only on 2 ports.
>
> The bridge needs to:
> - notify when the link of one of the ports goes down or up, because MRP instance
>   needs to react to link changes by sending MRP_LinkChange frames.
> - notify when one of the ports are removed from the bridge or when the bridge
>   is destroyed, because if the port is part of the MRP ring then MRP state
>   machine should be stopped.
> - add a handler to allow MRP instance to process MRP frames, if MRP is enabled.
>   This is similar with STP design.
> - add logic for MRP frames inside the bridge. The bridge will just detect MRP
>   frames and it would forward them to the upper layer to allow to process it.
> - update the logic to update non-MRP frames. If MRP is enabled, then look also
>   at the state of the port to decide to forward or not.
>
> To create a MRP instance on the bridge:
> $ bridge mrp add dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1
>
> Where:
> p_port, s_port: can be any port under the bridge
> ring_role: can have the value 1(MRC - Media Redundancy Client) or
>   2(MRM - Media Redundancy Manager). In a ring can be only one MRM.
> ring_id: unique id for each MRP instance.
>
> It is possible to create multiple instances. Each instance has to have it's own
> ring_id and a port can't be part of multiple instances:
> $ bridge mrp add dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2
>
> To see current MRP instances and their status:
> $ bridge mrp show
> dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2 ring_state 3
> dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1 ring_state 4
>
> If this patch series is well received, the in the future it could be extended
> with the following:
> - add support for Media Redundancy Automanager. This role allows a node to
>   detect if needs to behave as a MRM or MRC. The advantage of this role is that
>   the user doesn't need to configure the nodes each time they are added/removed
>   from a ring and it adds redundancy to the manager.
> - add support for Interconnect rings. This allow to connect multiple rings.
> - add HW offloading. The standard defines 4 recovery times (500, 200, 30 and 10
>   ms). To be able to achieve 30 and 10 it is required by the HW to generate the
>   MRP_Test frames and detect when the ring is open/closed.
>
> Horatiu Vultur (3):
>   net: bridge: mrp: Add support for Media Redundancy Protocol
>   net: bridge: mrp: Integrate MRP into the bridge
>   net: bridge: mrp: Add netlink support to configure MRP
>
>  include/uapi/linux/if_bridge.h |   27 +
>  include/uapi/linux/if_ether.h  |    1 +
>  include/uapi/linux/rtnetlink.h |    7 +
>  net/bridge/Kconfig             |   12 +
>  net/bridge/Makefile            |    2 +
>  net/bridge/br.c                |   19 +
>  net/bridge/br_device.c         |    3 +
>  net/bridge/br_forward.c        |    1 +
>  net/bridge/br_if.c             |   10 +
>  net/bridge/br_input.c          |   22 +
>  net/bridge/br_mrp.c            | 1517 ++++++++++++++++++++++++++++++++
>  net/bridge/br_mrp_timer.c      |  227 +++++
>  net/bridge/br_netlink.c        |    9 +
>  net/bridge/br_private.h        |   30 +
>  net/bridge/br_private_mrp.h    |  208 +++++
>  security/selinux/nlmsgtab.c    |    5 +-
>  16 files changed, 2099 insertions(+), 1 deletion(-)
>  create mode 100644 net/bridge/br_mrp.c
>  create mode 100644 net/bridge/br_mrp_timer.c
>  create mode 100644 net/bridge/br_private_mrp.h
>

Can this be implemented in userspace?

Putting STP in the kernel was a mistake (even original author says so).
Adding more control protocols in kernel is a security and stability risk.
Nikolay Aleksandrov
2020-Jan-10 14:13 UTC
[Bridge] [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)
On 09/01/2020 17:06, Horatiu Vultur wrote:
> [...]

Hi all,
I agree with Stephen here, IMO you have to take note of how STP has progressed
and that bringing it in the kernel was a mistake, these days mstpd has an active
community and much better support which is being extended. This looks best implemented
in user-space in my opinion with minimal kernel changes to support it. You could simply
open a packet socket with a filter and work through that, you don't need new netlink
sockets. I'm not familiar with the protocol so can't really be the judge of that, if
you present a good argument for needing a new netlink socket for these packets - then
sure, ok.

If you do decide to continue with the kernel version (which I would again discourage)
a few general points (from a quick scan):
 - the single 1.6+k line patch is just hard to review, please break it into more
   digestible and logical pieces
 - the locking is wrong, also there are a few use-after-free bugs
 - please re-work the bridge integration code, it can be simplified and tests can be eliminated
 - your netlink helpers usage is generally wrong and needs more work
 - use the already existing port states instead of adding new ones and you can avoid some tests
   in fast-path
 - perhaps look into using br_afspec() for configuration/retrieval initially ? I don't think
   you need the new rtm messages yet.
 - I'm sure I can go on, but I really think all of this should be put in user-space -
   in-kernel STP is a great example of how _not_ to do it. :) As a bonus you'll avoid 90% of the
   problems above just by making your own abstractions and using them for it.

Thanks,
 Nik
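
The user-space approach suggested above (a packet socket restricted to MRP frames) could start from something like the following. This is a rough sketch under stated assumptions, not code from the thread: the EtherType 0x88E3 for MRP is assumed here rather than taken from the patch, the helper names `mrp_is_mrp_frame` and `mrp_open_socket` are hypothetical, filtering is done only through the protocol argument of the socket rather than an attached BPF program, and VLAN-tagged frames are not handled.

```c
#include <arpa/inet.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumption: EtherType used by MRP (IEC 62439-2) frames. */
#define MRP_ETHERTYPE 0x88E3

/* Returns 1 if an untagged Ethernet frame carries the MRP EtherType. */
int mrp_is_mrp_frame(const uint8_t *frame, size_t len)
{
	if (len < 14)	/* shorter than an Ethernet header */
		return 0;
	return ((frame[12] << 8) | frame[13]) == MRP_ETHERTYPE;
}

/* Opens a raw packet socket bound to one interface that only receives
 * frames with the MRP EtherType. Returns the fd, or -1 on failure.
 * Creating the socket needs CAP_NET_RAW. */
int mrp_open_socket(const char *ifname)
{
	struct sockaddr_ll sll;
	unsigned int ifindex = if_nametoindex(ifname);
	int fd;

	if (ifindex == 0)
		return -1;

	fd = socket(AF_PACKET, SOCK_RAW, htons(MRP_ETHERTYPE));
	if (fd < 0)
		return -1;

	memset(&sll, 0, sizeof(sll));
	sll.sll_family = AF_PACKET;
	sll.sll_protocol = htons(MRP_ETHERTYPE);
	sll.sll_ifindex = (int)ifindex;

	if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A daemon built this way would open one such socket per ring port (for example the p_port and s_port of an instance) and run the MRM or MRC state machine on the received frames, leaving only port state changes and FDB flushes to the kernel.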