Vladimir Oltean
2022-Mar-23 11:21 UTC
[Bridge] [PATCH net-next 3/3] net: dsa: mv88e6xxx: mac-auth/MAB implementation
On Wed, Mar 23, 2022 at 11:57:16AM +0100, Hans Schultz wrote:
> >> >> Another issue I see, is that there is a deadlock or similar issue when
> >> >> receiving violations and running 'bridge fdb show' (it seemed that
> >> >> member violations also caused this, but not sure yet...), as the unit
> >> >> freezes, not to return...
> >> >
> >> > Have you enabled lockdep, debug atomic sleep, detect hung tasks, things
> >> > like that?
> >>
> >> I have now determined that it is the rtnl_lock() that causes the
> >> "deadlock". The doit() in rtnetlink.c is under rtnl_lock() and is what
> >> takes care of getting the fdb entries when running 'bridge fdb show'. In
> >> principle there should be no problem with this, but I don't know if some
> >> interrupt queue is getting jammed as they are blocked from rtnetlink.c?
> >
> > Sorry, I forgot to respond yesterday to this.
> > By any chance do you maybe have an AB/BA lock inversion, where from the
> > ATU interrupt handler you do mv88e6xxx_reg_lock() -> rtnl_lock(), while
> > from the port_fdb_dump() handler you do rtnl_lock() -> mv88e6xxx_reg_lock()?
>
> If I release the mv88e6xxx_reg_lock() before calling the handler, I need
> to get it again for the mv88e6xxx_g1_atu_loadpurge() call at least. But
> maybe the vtu_walk also needs the mv88e6xxx_reg_lock()?
> I could also just release the mv88e6xxx_reg_lock() before the
> call_switchdev_notifiers() call and reacquire it immediately after?

The cleanest way to go about this would be to have the
call_switchdev_notifiers() portion of the ATU interrupt handling at the
very end of mv88e6xxx_g1_atu_prob_irq_thread_fn(), with no hardware
access needed, and therefore no reg_lock() held.
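
To illustrate the lock ordering being discussed, here is a minimal userspace sketch. It is not driver code: pthread mutexes stand in for mv88e6xxx_reg_lock() and rtnl_lock(), and every name in it (struct atu_violation, read_violation(), notify_bridge(), irq_thread_fn(), fdb_dump()) is hypothetical. It only shows the shape of the idea: finish all register access first, release the register lock, and take the rtnl stand-in afterwards, so the only nesting that ever occurs is rtnl -> reg_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel locks; NOT the real mv88e6xxx or rtnetlink API. */
static pthread_mutex_t reg_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rtnl_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool reg_held;      /* true while reg_mutex is held (single-threaded demo) */
static int notifications;  /* number of events handed to the notifier */

struct atu_violation {     /* hypothetical snapshot of one ATU violation */
	uint8_t mac[6];
	uint16_t vid;
	int port;
};

static void reg_lock(void)
{
	pthread_mutex_lock(&reg_mutex);
	reg_held = true;
}

static void reg_unlock(void)
{
	reg_held = false;
	pthread_mutex_unlock(&reg_mutex);
}

/* Phase 1: everything that touches the hardware, under the register lock only. */
static void read_violation(struct atu_violation *v)
{
	reg_lock();
	/* Placeholder for reading the violation and loading the ATU entry. */
	v->mac[0] = 0x02;
	v->vid = 1;
	v->port = 3;
	reg_unlock();
}

/* Phase 2: notification under the rtnl stand-in, with no register access. */
static void notify_bridge(const struct atu_violation *v)
{
	assert(!reg_held);  /* never take rtnl while the register lock is held */
	pthread_mutex_lock(&rtnl_mutex);
	(void)v;            /* a real handler would pass the entry to the bridge */
	notifications++;
	pthread_mutex_unlock(&rtnl_mutex);
}

/* Analogue of the threaded IRQ handler: hardware first, notifier last. */
static int irq_thread_fn(void)
{
	struct atu_violation v = {0};

	read_violation(&v);
	notify_bridge(&v);
	return notifications;
}

/* Analogue of the fdb dump path: rtnl first, then the register lock. */
static int fdb_dump(void)
{
	int nested_ok;

	pthread_mutex_lock(&rtnl_mutex);
	reg_lock();
	nested_ok = reg_held;  /* the only nesting in the program: rtnl -> reg */
	reg_unlock();
	pthread_mutex_unlock(&rtnl_mutex);
	return nested_ok;
}
```

With this split, the handler path never holds both locks at once, so the reg_lock -> rtnl_lock order that would invert against the dump path cannot occur.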
Hans Schultz
2022-Mar-23 11:43 UTC
[Bridge] [PATCH net-next 3/3] net: dsa: mv88e6xxx: mac-auth/MAB implementation
On Wed, Mar 23, 2022 at 13:21, Vladimir Oltean <olteanv at gmail.com> wrote:
> On Wed, Mar 23, 2022 at 11:57:16AM +0100, Hans Schultz wrote:
>> >> >> Another issue I see, is that there is a deadlock or similar issue when
>> >> >> receiving violations and running 'bridge fdb show' (it seemed that
>> >> >> member violations also caused this, but not sure yet...), as the unit
>> >> >> freezes, not to return...
>> >> >
>> >> > Have you enabled lockdep, debug atomic sleep, detect hung tasks, things
>> >> > like that?
>> >>
>> >> I have now determined that it is the rtnl_lock() that causes the
>> >> "deadlock". The doit() in rtnetlink.c is under rtnl_lock() and is what
>> >> takes care of getting the fdb entries when running 'bridge fdb show'. In
>> >> principle there should be no problem with this, but I don't know if some
>> >> interrupt queue is getting jammed as they are blocked from rtnetlink.c?
>> >
>> > Sorry, I forgot to respond yesterday to this.
>> > By any chance do you maybe have an AB/BA lock inversion, where from the
>> > ATU interrupt handler you do mv88e6xxx_reg_lock() -> rtnl_lock(), while
>> > from the port_fdb_dump() handler you do rtnl_lock() -> mv88e6xxx_reg_lock()?
>>
>> If I release the mv88e6xxx_reg_lock() before calling the handler, I need
>> to get it again for the mv88e6xxx_g1_atu_loadpurge() call at least. But
>> maybe the vtu_walk also needs the mv88e6xxx_reg_lock()?
>> I could also just release the mv88e6xxx_reg_lock() before the
>> call_switchdev_notifiers() call and reacquire it immediately after?
>
> The cleanest way to go about this would be to have the call_switchdev_notifiers()
> portion of the ATU interrupt handling at the very end of mv88e6xxx_g1_atu_prob_irq_thread_fn(),
> with no hardware access needed, and therefore no reg_lock() held.

So something like?

	mv88e6xxx_reg_unlock(chip);
	rtnl_lock();
	err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE,
				       brport, &info.info, NULL);
	rtnl_unlock();
	mv88e6xxx_reg_lock(chip);