thr3ads.net - Linux Ethernet Bridging - [Bridge] [RFC net-next v3 06/10] net: bridge: mrp: switchdev: Extend switchdev API to offload MRP [Jan 2020]

Allan W. Nielsen

2020-Jan-27 11:04 UTC

[Bridge] [RFC net-next v3 06/10] net: bridge: mrp: switchdev: Extend switchdev API to offload MRP

On 26.01.2020 16:59, Andrew Lunn wrote:
>On Sun, Jan 26, 2020 at 02:22:13PM +0100, Horatiu Vultur wrote:
>> The 01/25/2020 17:35, Andrew Lunn wrote:
>> >
>> > > SWITCHDEV_OBJ_ID_RING_TEST_MRP: This is used to start/stop sending
>> > >   MRP_Test frames on the mrp ring ports. This is called only on nodes that have
>> > >   the role Media Redundancy Manager.
>> >
>> > How do you handle the 'headless chicken' scenario? User space tells
>> > the port to start sending MRP_Test frames. It then dies. The hardware
>> > continues sending these messages, and the neighbours think everything
>> > is O.K, but in reality the state machine is dead, and when the ring
>> > breaks, the daemon is not there to fix it?

I agree, we need to find a solution to this issue.

>> > And it is not just the daemon that could die. The kernel could oops or
>> > deadlock, etc.
>> >
>> > For a robust design, it seems like SWITCHDEV_OBJ_ID_RING_TEST_MRP
>> > should mean: start sending MRP_Test frames for the next X seconds, and
>> > then stop. And the request is repeated every X-1 seconds.

Sounds like a good idea to me.

>> I totally missed this case, I will update this as you suggest.
>
>What does your hardware actually provide?
>
>Given the design of the protocol, if the hardware decides the OS etc
>is dead, it should stop sending MRP_TEST frames and unblock the ports.
>It then becomes a 'dumb switch', and for a short time there will be a
>broadcast storm. Hopefully one of the other nodes will then take over
>the role and block a port.

As far as I know, the only feature HW has to prevent this is a
watch-dog timer. Which will reset the entire system (not a bad idea if
the kernel has dead-locked).

/Allan
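
For illustration only, the "send for the next X seconds, renew every X-1 seconds" semantics discussed above could be modelled roughly as below. This is a minimal sketch, not code from the patch series: the struct, the function names and the millisecond time base are all hypothetical, and a real driver would use the kernel's own timer facilities rather than an explicit `now_ms` argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lease on MRP_Test generation (names are assumptions). */
struct mrp_test_lease {
	uint64_t expires_ms;	/* absolute time at which sending must stop */
	bool active;		/* true while frames may be generated */
};

/* Control plane: start sending, or renew, for the next period_ms. */
static inline void mrp_test_lease_renew(struct mrp_test_lease *lease,
					uint64_t now_ms, uint64_t period_ms)
{
	assert(lease != NULL);
	lease->expires_ms = now_ms + period_ms;
	lease->active = true;
}

/*
 * Frame generator: checked before each MRP_Test transmission.
 * Once the lease runs out without a renewal, sending stops and stays
 * stopped until the control plane asks again.
 */
static inline bool mrp_test_lease_may_send(struct mrp_test_lease *lease,
					   uint64_t now_ms)
{
	assert(lease != NULL);
	if (lease->active && now_ms >= lease->expires_ms)
		lease->active = false;
	return lease->active;
}
```

With X = 3 s, the daemon would call the renew function every 2 s. If it dies, the generator stops by itself at most 3 s after the last renewal, so the neighbours see the MRP_Test frames disappear instead of being misled.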

Jürgen Lambrecht

2020-Jan-27 14:26 UTC

On 1/27/20 12:04 PM, Allan W. Nielsen wrote:
>>> > How do you handle the 'headless chicken' scenario? User space tells
>>> > the port to start sending MRP_Test frames. It then dies. The hardware

Andrew, I am a bit confused here - maybe I missed an email-thread, I'm sorry
then.

In previous emails you and others talked about hardware support to send packets
(inside the switch). But somebody also talked about data-plane and control-plane
(about STP in-kernel being a bad idea), and that data-plane is in-kernel, and
control plane is a mrp-daemon (in user space).
And in my mind, the "hardware" you mention is a frame-injector and can
be both real hardware and a driver in the kernel.

Do I see it right?

>>> > continues sending these messages, and the neighbours think everything
>>> > is O.K, but in reality the state machine is dead, and when the ring
>>> > breaks, the daemon is not there to fix it?
> I agree, we need to find a solution to this issue.
>

Jürgen Lambrecht

2020-Jan-27 14:41 UTC

On 1/27/20 12:04 PM, Allan W. Nielsen wrote:
> On 26.01.2020 16:59, Andrew Lunn wrote:
>> On Sun, Jan 26, 2020 at 02:22:13PM +0100, Horatiu Vultur wrote:
>>> The 01/25/2020 17:35, Andrew Lunn wrote:
>>> >
>>> > > SWITCHDEV_OBJ_ID_RING_TEST_MRP: This is used to start/stop sending
>>> > >   MRP_Test frames on the mrp ring ports. This is called only on nodes that have
>>> > >   the role Media Redundancy Manager.
>>> >
>>> > How do you handle the 'headless chicken' scenario? User space tells
>>> > the port to start sending MRP_Test frames. It then dies. The hardware
>>> > continues sending these messages, and the neighbours think everything
>>> > is O.K, but in reality the state machine is dead, and when the ring
>>> > breaks, the daemon is not there to fix it?
> I agree, we need to find a solution to this issue.
>
>>> > And it is not just the daemon that could die. The kernel could oops or
>>> > deadlock, etc.
>>> >
>>> > For a robust design, it seems like SWITCHDEV_OBJ_ID_RING_TEST_MRP
>>> > should mean: start sending MRP_Test frames for the next X seconds, and
>>> > then stop. And the request is repeated every X-1 seconds.
> Sounds like a good idea to me.

Indeed, and it should then do the same as mentioned below and "... become a
'dumb switch'", except that I propose to make the fallback configurable:
either auto-recovery ('dumb switch'), or a safe mode that keeps the ports
blocked until some higher-layer protocol fixes it.

>
>>> I totally missed this case, I will update this as you suggest.
>>
>> What does your hardware actually provide?
>>
>> Given the design of the protocol, if the hardware decides the OS etc
>> is dead, it should stop sending MRP_TEST frames and unblock the ports.
>> It then becomes a 'dumb switch', and for a short time there will be a
>> broadcast storm. Hopefully one of the other nodes will then take over
>> the role and block a port.
> As far as I know, the only feature HW has to prevent this is a
> watch-dog timer. Which will reset the entire system (not a bad idea if
> the kernel has dead-locked).

Indeed. Our designs always have a watchdog.

And then I again propose to have 2 bootup options.

I refer here also to my answer on Allan's answer on my email of 12:29PM.

Kind regards,

Jürgen
>
> /Allan
>
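
As a rough illustration of the configurable fallback proposed in this message, the sketch below models the two options as a policy applied when the ring manager's control plane is considered dead. All names here are hypothetical and not taken from the patch series, and how the "dead" condition is detected (lease expiry, watchdog, ...) is left open, as it is in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fallback policy; neither enum exists in the patch series. */
enum mrp_fallback_mode {
	MRP_FALLBACK_DUMB_SWITCH,	/* auto-recovery: unblock the ring ports */
	MRP_FALLBACK_SAFE,		/* keep port states, wait for a higher layer */
};

enum mrp_port_state {
	MRP_PORT_BLOCKED,
	MRP_PORT_FORWARDING,
};

struct mrp_ring {
	enum mrp_port_state primary;
	enum mrp_port_state secondary;
	enum mrp_fallback_mode fallback;
};

/* Apply the configured fallback once the manager is considered dead. */
static inline void mrp_ring_on_manager_lost(struct mrp_ring *ring)
{
	assert(ring != NULL);
	if (ring->fallback == MRP_FALLBACK_DUMB_SWITCH) {
		/* Behave like a plain switch; a short broadcast storm is
		 * possible until another node takes over the manager role. */
		ring->primary = MRP_PORT_FORWARDING;
		ring->secondary = MRP_PORT_FORWARDING;
	}
	/* MRP_FALLBACK_SAFE: leave the ports as they are, so a blocked
	 * port stays blocked. */
}
```

The trade-off is the one described above: the 'dumb switch' mode restores connectivity at the cost of a possible temporary loop, while the safe mode avoids the loop but relies on something else to repair the ring.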

Andrew Lunn

2020-Jan-27 15:06 UTC

On Mon, Jan 27, 2020 at 03:26:38PM +0100, Jürgen Lambrecht wrote:
> On 1/27/20 12:04 PM, Allan W. Nielsen wrote:
>
> > How do you handle the 'headless chicken' scenario? User space tells
> > the port to start sending MRP_Test frames. It then dies. The hardware
>
> Andrew, I am a bit confused here - maybe I missed an email-thread, I'm sorry
> then.
>
> In previous emails you and others talked about hardware support to send packets
> (inside the switch). But somebody also talked about data-plane and
> control-plane (about STP in-kernel being a bad idea), and that data-plane is
> in-kernel, and control plane is a mrp-daemon (in user space).
> And in my mind, the "hardware" you mention is a frame-injector and can be both
> real hardware and a driver in the kernel.
>
> Do I see it right?

Hi Jürgen

It is still unclear where the MRP_Test frames should be generated,
forwarded and consumed: in the kernel or in user space.

The userspace RSTP daemon generates and consumes all the BPDUs in
userspace. But BPDUs are never forwarded. However MRP_Test frames are
forwarded by client nodes. Are the MRP_Test frames then part of the
data plane in a client?

What I think is clear is that the state machine is in user space.

For the rest, we are still exploring possibilities.

    Andrew
