Andrew Lunn
2022-Mar-17 14:19 UTC
[Bridge] [PATCH net-next 3/3] net: dsa: mv88e6xxx: mac-auth/MAB implementation
On Thu, Mar 17, 2022 at 09:52:15AM +0100, Hans Schultz wrote:
> On tor, mar 17, 2022 at 01:34, Vladimir Oltean <olteanv at gmail.com> wrote:
> > On Mon, Mar 14, 2022 at 11:46:51AM +0100, Hans Schultz wrote:
> >> >> @@ -396,6 +414,13 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
> >> >> "ATU miss violation for %pM portvec %x spid %d\n",
> >> >> entry.mac, entry.portvec, spid);
> >> >> chip->ports[spid].atu_miss_violation++;
> >> >> + if (mv88e6xxx_port_is_locked(chip, chip->ports[spid].port))
> >> >> + err = mv88e6xxx_switchdev_handle_atu_miss_violation(chip,
> >> >> + chip->ports[spid].port,
> >> >> + &entry,
> >> >> + fid);
> >> >
> >> > Do we want to suppress the ATU miss violation warnings if we're going to
> >> > notify the bridge, or is it better to keep them for some reason?
> >> > My logic is that they're part of normal operation, so suppressing makes
> >> > sense.
> >> >
> >>
> >> I have been seeing many ATU member violations after the miss violation is
> >> handled (using ping), and I think it could be considered to suppress the ATU member
> >> violations interrupts by setting the IgnoreWrongData bit for the
> >> port (sect 4.4.7). This would be something to do whenever a port is set in locked mode?
> >
> > So the first packet with a given MAC SA triggers an ATU miss violation
> > interrupt.
> >
> > You program that MAC SA into the ATU with a destination port mask of all
> > zeroes. This suppresses further ATU miss interrupts for this MAC SA, but
> > now generates ATU member violations, because the MAC SA _is_ present in
> > the ATU, but not towards the expected port (in fact, towards _no_ port).
> >
> > Especially if user space decides it doesn't want to authorize this MAC
> > SA, it really becomes a problem because this is now a vector for denial
> > of service, with every packet triggering an ATU member violation
> > interrupt.
> >
> > So your suggestion is to set the IgnoreWrongData bit on locked ports,
> > and this will suppress the actual member violation interrupts for
> > traffic coming from these ports.
> >
> > So if the user decides to unplug a previously authorized printer from
> > switch port 1 and move it to port 2, how is this handled? If there isn't
> > a mechanism in place to delete the locked FDB entry when the printer
> > goes away, then by setting IgnoreWrongData you're effectively also
> > suppressing migration notifications.
>
> I don't think such a scenario is so realistic, as changing port is not
> just something done casually, besides port 2 then must also be a locked
> port to have the same policy.

I think it is very realistic. It is also something which does not work
is going to cause a lot of confusion. People will blame the printer,
when in fact they should be blaming the switch. They will be rebooting
the printer, when in fact, they need to reboot the switch etc.

I expect there is a way to cleanly support this, you just need to
figure it out.

> The other aspect is that the user space daemon that authorizes catches
> the fdb add entry events and checks if it is a locked entry. So it will
> be up to said daemon to decide the policy, like remove the fdb entry
> after a timeout.
>
> >
> > Oh, btw, my question was: could you consider suppressing the _prints_ on
> > an ATU miss violation on a locked port?
>
> As there will only be such on the first packet, I think it should be
> logged and those prints serve that purpose, so I think it is best to
> keep the print.
> If in the future some tests or other can argue for suppressing the
> prints, it is an easy thing to do.

Please use a traffic generator and try to DOS one of your own
switches. Can you?

     Andrew
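
Editorial sketch on the print question: one way to keep a log line while
bounding its cost under a flood is interval/burst rate limiting, which is
the idea behind the kernel's *_ratelimited print helpers such as
dev_err_ratelimited(). The code below is a minimal userspace model of that
idea, not driver code. struct ratelimit, ratelimit_allow(),
simulate_flood() and the numbers used are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter: at most `burst` messages per `interval_ms` window. */
struct ratelimit {
    uint64_t interval_ms;      /* length of one window */
    unsigned int burst;        /* messages allowed per window */
    uint64_t window_start_ms;  /* start of the current window */
    unsigned int printed;      /* messages allowed in the current window */
    unsigned int missed;       /* messages dropped so far (all windows) */
};

/* Returns true if a message may be printed at time now_ms. */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now_ms)
{
    assert(rl->interval_ms > 0);

    /* Open a new window once the old one has expired. */
    if (now_ms - rl->window_start_ms >= rl->interval_ms) {
        rl->window_start_ms = now_ms;
        rl->printed = 0;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }

    rl->missed++;
    return false;
}

/*
 * Model a flood: one violation per millisecond, n violations in total,
 * starting at start_ms. Returns how many would actually be printed.
 */
static unsigned int simulate_flood(struct ratelimit *rl, uint64_t start_ms,
                                   unsigned int n)
{
    unsigned int allowed = 0;

    for (unsigned int i = 0; i < n; i++) {
        if (ratelimit_allow(rl, start_ms + i))
            allowed++;
    }
    return allowed;
}
```

The point of the model is that the number of printed lines depends only on
the window and burst, not on the packet rate. It bounds the log output
only; it does not reduce the number of interrupts taken.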
Vladimir Oltean
2022-Mar-17 15:36 UTC
[Bridge] [PATCH net-next 3/3] net: dsa: mv88e6xxx: mac-auth/MAB implementation
On Thu, Mar 17, 2022 at 03:19:46PM +0100, Andrew Lunn wrote:
> On Thu, Mar 17, 2022 at 09:52:15AM +0100, Hans Schultz wrote:
> > On tor, mar 17, 2022 at 01:34, Vladimir Oltean <olteanv at gmail.com> wrote:
> > > On Mon, Mar 14, 2022 at 11:46:51AM +0100, Hans Schultz wrote:
> > >> >> @@ -396,6 +414,13 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
> > >> >> "ATU miss violation for %pM portvec %x spid %d\n",
> > >> >> entry.mac, entry.portvec, spid);
> > >> >> chip->ports[spid].atu_miss_violation++;
> > >> >> + if (mv88e6xxx_port_is_locked(chip, chip->ports[spid].port))
> > >> >> + err = mv88e6xxx_switchdev_handle_atu_miss_violation(chip,
> > >> >> + chip->ports[spid].port,
> > >> >> + &entry,
> > >> >> + fid);
> > >> >
> > >> > Do we want to suppress the ATU miss violation warnings if we're going to
> > >> > notify the bridge, or is it better to keep them for some reason?
> > >> > My logic is that they're part of normal operation, so suppressing makes
> > >> > sense.
> > >> >
> > >>
> > >> I have been seeing many ATU member violations after the miss violation is
> > >> handled (using ping), and I think it could be considered to suppress the ATU member
> > >> violations interrupts by setting the IgnoreWrongData bit for the
> > >> port (sect 4.4.7). This would be something to do whenever a port is set in locked mode?
> > >
> > > So the first packet with a given MAC SA triggers an ATU miss violation
> > > interrupt.
> > >
> > > You program that MAC SA into the ATU with a destination port mask of all
> > > zeroes. This suppresses further ATU miss interrupts for this MAC SA, but
> > > now generates ATU member violations, because the MAC SA _is_ present in
> > > the ATU, but not towards the expected port (in fact, towards _no_ port).
> > >
> > > Especially if user space decides it doesn't want to authorize this MAC
> > > SA, it really becomes a problem because this is now a vector for denial
> > > of service, with every packet triggering an ATU member violation
> > > interrupt.
> > >
> > > So your suggestion is to set the IgnoreWrongData bit on locked ports,
> > > and this will suppress the actual member violation interrupts for
> > > traffic coming from these ports.
> > >
> > > So if the user decides to unplug a previously authorized printer from
> > > switch port 1 and move it to port 2, how is this handled? If there isn't
> > > a mechanism in place to delete the locked FDB entry when the printer
> > > goes away, then by setting IgnoreWrongData you're effectively also
> > > suppressing migration notifications.
> >
> > I don't think such a scenario is so realistic, as changing port is not
> > just something done casually, besides port 2 then must also be a locked
> > port to have the same policy.
>
> I think it is very realistic. It is also something which does not work
> is going to cause a lot of confusion. People will blame the printer,
> when in fact they should be blaming the switch. They will be rebooting
> the printer, when in fact, they need to reboot the switch etc.
>
> I expect there is a way to cleanly support this, you just need to
> figure it out.

Hans, why must port 2 also be a locked port? The FDB entry with no
destinations is present in the ATU, and static, why would just locked
ports match it?

> > The other aspect is that the user space daemon that authorizes catches
> > the fdb add entry events and checks if it is a locked entry. So it will
> > be up to said daemon to decide the policy, like remove the fdb entry
> > after a timeout.

When you say 'timeout', what is the moment when the timer starts
counting? The last reception of the user space daemon of a packet with
this MAC SA, or the moment when the FDB entry originally became
unlocked? I expect that once a device is authorized, and forwarding
towards the devices that it wants to talk to is handled in hardware,
that the CPU no longer receives packets from this device. In other
words, are you saying that you're going to break networking for the
printer every 5 minutes, as a keepalive measure?

I still think there should be a functional fast path for authorized
station migrations.

> > > Oh, btw, my question was: could you consider suppressing the _prints_ on
> > > an ATU miss violation on a locked port?
> >
> > As there will only be such on the first packet, I think it should be
> > logged and those prints serve that purpose, so I think it is best to
> > keep the print.
> > If in the future some tests or other can argue for suppressing the
> > prints, it is an easy thing to do.
>
> Please use a traffic generator and try to DOS one of your own
> switches. Can you?
>
> Andrew
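
Editorial sketch of the sequence discussed in this message: unknown MAC SA
on a locked port gives a miss violation, a static entry with an empty port
vector then turns further packets into member violations, and
IgnoreWrongData hides those, including the ones that would signal a
station move. This is a deliberately simplified model built only from the
description in the thread, not from the datasheet or the driver: every
entry is treated as static, normal learning is not modelled, and all names
(model_switch, atu_load(), classify_ingress(), the enum values) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ATU_SIZE 8

enum atu_event {
    ATU_NO_VIOLATION,
    ATU_MISS_VIOLATION,       /* SA unknown on a locked port */
    ATU_MEMBER_VIOLATION,     /* SA known, but not on the ingress port */
    ATU_VIOLATION_SUPPRESSED  /* member violation hidden by IgnoreWrongData */
};

struct atu_entry {
    bool valid;
    uint8_t mac[6];
    uint16_t portvec;         /* bit n set: entry points to port n */
};

struct model_switch {
    struct atu_entry atu[ATU_SIZE];
    uint16_t locked_ports;       /* bit n set: port n is locked */
    uint16_t ignore_wrong_data;  /* bit n set: IgnoreWrongData on port n */
};

static struct atu_entry *atu_find(struct model_switch *sw, const uint8_t mac[6])
{
    for (int i = 0; i < ATU_SIZE; i++) {
        if (sw->atu[i].valid && memcmp(sw->atu[i].mac, mac, 6) == 0)
            return &sw->atu[i];
    }
    return NULL;
}

/* Load (or update) a static entry; returns false if the table is full. */
static bool atu_load(struct model_switch *sw, const uint8_t mac[6],
                     uint16_t portvec)
{
    struct atu_entry *e = atu_find(sw, mac);

    for (int i = 0; !e && i < ATU_SIZE; i++) {
        if (!sw->atu[i].valid)
            e = &sw->atu[i];
    }
    if (!e)
        return false;

    e->valid = true;
    memcpy(e->mac, mac, 6);
    e->portvec = portvec;
    return true;
}

/* Classify one ingress frame by its source address. */
static enum atu_event classify_ingress(struct model_switch *sw, int port,
                                       const uint8_t sa[6])
{
    assert(port >= 0 && port < 16);

    uint16_t bit = (uint16_t)(1u << port);
    struct atu_entry *e = atu_find(sw, sa);

    if (!e)
        return (sw->locked_ports & bit) ? ATU_MISS_VIOLATION
                                        : ATU_NO_VIOLATION;
    if (e->portvec & bit)
        return ATU_NO_VIOLATION;
    if (sw->ignore_wrong_data & bit)
        return ATU_VIOLATION_SUPPRESSED;
    return ATU_MEMBER_VIOLATION;
}
```

In this model, an entry authorized on port 1 that shows up on port 2
yields ATU_MEMBER_VIOLATION regardless of whether port 2 is locked, and
ATU_VIOLATION_SUPPRESSED once IgnoreWrongData is set on port 2, which is
the lost migration notification raised above.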
Hans Schultz
2022-Mar-21 14:51 UTC
[Bridge] [PATCH net-next 3/3] net: dsa: mv88e6xxx: mac-auth/MAB implementation
On tor, mar 17, 2022 at 15:19, Andrew Lunn <andrew at lunn.ch> wrote:
> On Thu, Mar 17, 2022 at 09:52:15AM +0100, Hans Schultz wrote:
>> On tor, mar 17, 2022 at 01:34, Vladimir Oltean <olteanv at gmail.com> wrote:
>> > On Mon, Mar 14, 2022 at 11:46:51AM +0100, Hans Schultz wrote:
>> >> >> @@ -396,6 +414,13 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
>> >> >> "ATU miss violation for %pM portvec %x spid %d\n",
>> >> >> entry.mac, entry.portvec, spid);
>> >> >> chip->ports[spid].atu_miss_violation++;
>> >> >> + if (mv88e6xxx_port_is_locked(chip, chip->ports[spid].port))
>> >> >> + err = mv88e6xxx_switchdev_handle_atu_miss_violation(chip,
>> >> >> + chip->ports[spid].port,
>> >> >> + &entry,
>> >> >> + fid);
>> >> >
>> >> > Do we want to suppress the ATU miss violation warnings if we're going to
>> >> > notify the bridge, or is it better to keep them for some reason?
>> >> > My logic is that they're part of normal operation, so suppressing makes
>> >> > sense.
>> >> >
>> >>
>> >> I have been seeing many ATU member violations after the miss violation is
>> >> handled (using ping), and I think it could be considered to suppress the ATU member
>> >> violations interrupts by setting the IgnoreWrongData bit for the
>> >> port (sect 4.4.7). This would be something to do whenever a port is set in locked mode?
>> >
>> > So the first packet with a given MAC SA triggers an ATU miss violation
>> > interrupt.
>> >
>> > You program that MAC SA into the ATU with a destination port mask of all
>> > zeroes. This suppresses further ATU miss interrupts for this MAC SA, but
>> > now generates ATU member violations, because the MAC SA _is_ present in
>> > the ATU, but not towards the expected port (in fact, towards _no_ port).
>> >
>> > Especially if user space decides it doesn't want to authorize this MAC
>> > SA, it really becomes a problem because this is now a vector for denial
>> > of service, with every packet triggering an ATU member violation
>> > interrupt.
>> >
>> > So your suggestion is to set the IgnoreWrongData bit on locked ports,
>> > and this will suppress the actual member violation interrupts for
>> > traffic coming from these ports.
>> >
>> > So if the user decides to unplug a previously authorized printer from
>> > switch port 1 and move it to port 2, how is this handled? If there isn't
>> > a mechanism in place to delete the locked FDB entry when the printer
>> > goes away, then by setting IgnoreWrongData you're effectively also
>> > suppressing migration notifications.
>>
>> I don't think such a scenario is so realistic, as changing port is not
>> just something done casually, besides port 2 then must also be a locked
>> port to have the same policy.
>
> I think it is very realistic. It is also something which does not work
> is going to cause a lot of confusion. People will blame the printer,
> when in fact they should be blaming the switch. They will be rebooting
> the printer, when in fact, they need to reboot the switch etc.
>
> I expect there is a way to cleanly support this, you just need to
> figure it out.
>
>> The other aspect is that the user space daemon that authorizes catches
>> the fdb add entry events and checks if it is a locked entry. So it will
>> be up to said daemon to decide the policy, like remove the fdb entry
>> after a timeout.
>>
>> >
>> > Oh, btw, my question was: could you consider suppressing the _prints_ on
>> > an ATU miss violation on a locked port?
>>
>> As there will only be such on the first packet, I think it should be
>> logged and those prints serve that purpose, so I think it is best to
>> keep the print.
>> If in the future some tests or other can argue for suppressing the
>> prints, it is an easy thing to do.
>
> Please use a traffic generator and try to DOS one of your own
> switches. Can you?
>
> Andrew

Here is a trafgen report, where I sent packets to a locked port with
random SAs:

  42527020 packets outgoing
  3104472460 bytes outgoing
  329 sec, 989345 usec on CPU0 (5835746 packets)
  329 sec, 985243 usec on CPU1 (2119061 packets)
  329 sec, 997323 usec on CPU2 (5656546 packets)
  329 sec, 989475 usec on CPU3 (5617322 packets)
  330 sec, 5228 usec on CPU4 (6034671 packets)
  330 sec, 1603 usec on CPU5 (5833505 packets)
  329 sec, 989319 usec on CPU6 (5709841 packets)
  329 sec, 989294 usec on CPU7 (5720328 packets)

I could do 'bridge fdb show' after stopping the traffic, printing out a
very long list (minutes to print). The ATU was normal, so there is an
issue of the soft FDB locked entries not ageing out. I saw many reports
of suppressed IRQs in the kernel log.
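
Editorial sketch of the two problems this report points at: a software
table of locked entries that grows without bound under a random-SA flood,
and entries that never age out. The model below caps the table and ages
entries a fixed time after creation. It is one possible policy under
stated assumptions, not what the bridge or the driver does. The names
(locked_table, locked_add(), locked_age_out()), the tiny table size and
the choice to start the timer at creation time are all hypothetical;
the thread leaves open when such a timer should start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LOCKED 4  /* deliberately tiny so the cap is easy to see */

struct locked_entry {
    bool used;
    uint8_t mac[6];
    uint64_t created_ms;  /* assumption: the timer starts at creation */
};

struct locked_table {
    struct locked_entry e[MAX_LOCKED];
};

/*
 * Record a locked entry for mac. Returns true if the entry is present
 * afterwards, false if the table is full. An existing entry is not
 * refreshed, so a flooding station cannot keep its own entry alive.
 */
static bool locked_add(struct locked_table *t, const uint8_t mac[6],
                       uint64_t now_ms)
{
    struct locked_entry *free_slot = NULL;

    assert(t != NULL);

    for (size_t i = 0; i < MAX_LOCKED; i++) {
        if (t->e[i].used) {
            if (memcmp(t->e[i].mac, mac, 6) == 0)
                return true;
        } else if (!free_slot) {
            free_slot = &t->e[i];
        }
    }
    if (!free_slot)
        return false;  /* cap reached: drop instead of growing */

    free_slot->used = true;
    memcpy(free_slot->mac, mac, 6);
    free_slot->created_ms = now_ms;
    return true;
}

/* Remove entries at least timeout_ms old; returns how many were removed. */
static size_t locked_age_out(struct locked_table *t, uint64_t now_ms,
                             uint64_t timeout_ms)
{
    size_t removed = 0;

    assert(t != NULL);

    for (size_t i = 0; i < MAX_LOCKED; i++) {
        if (t->e[i].used && now_ms - t->e[i].created_ms >= timeout_ms) {
            t->e[i].used = false;
            removed++;
        }
    }
    return removed;
}
```

With a cap in place, the cost of listing the table after a flood is
bounded by the cap rather than by the number of distinct source addresses
seen, and the periodic sweep frees slots for stations that appear later.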