Linus Torvalds
2022-Oct-27 20:48 UTC
[Bridge] [RFC][PATCH v2 19/31] timers: net: Use del_timer_shutdown() before freeing timer
On Thu, Oct 27, 2022 at 1:34 PM Steven Rostedt <rostedt at goodmis.org> wrote:
>
> What about del_timer_try_shutdown(), that if it removes the timer, it sets
> the function to NULL (making it equivalent to a successful shutdown),
> otherwise it does nothing. Allowing the timer to be rearmed.

Sounds sane to me and should work, but as mentioned, I think the
networking people need to say "yeah" too.

And maybe that function can also disallow any future re-arming even
for the case where the timer couldn't be actively removed.

So any *currently* active timer wouldn't be waited for (either because
locking may make that a deadlock situation, or simply due to
performance issues), but at least it would guarantee that no new timer
activations can happen.

Because I do like the whole notion of "timer has been shutdown and
cannot be used as a timer any more without re-initializing it" being a
real state - even for a timer that may be "currently in flight".

So this all sounds very worthwhile to me, but I'm not surprised that
we have code that then knows about all the subtleties of "del_timer()
might still have a running timer" and actually take advantage of it
(where "advantage" is likely more of a "deal with the complexities"
rather than anything really positive ;)

And those existing subtle users might want particular semantics to at
least make said complexities easier.

Linus
Steven Rostedt
2022-Oct-27 21:07 UTC
[Bridge] [RFC][PATCH v2 19/31] timers: net: Use del_timer_shutdown() before freeing timer
On Thu, 27 Oct 2022 13:48:54 -0700 Linus Torvalds <torvalds at linux-foundation.org> wrote:

> On Thu, Oct 27, 2022 at 1:34 PM Steven Rostedt <rostedt at goodmis.org> wrote:
> >
> > What about del_timer_try_shutdown(), that if it removes the timer, it sets
> > the function to NULL (making it equivalent to a successful shutdown),
> > otherwise it does nothing. Allowing the timer to be rearmed.
>
> Sounds sane to me and should work, but as mentioned, I think the
> networking people need to say "yeah" too.
>
> And maybe that function can also disallow any future re-arming even
> for the case where the timer couldn't be actively removed.

Well, I think this current use case will break if we prevent the timer from being rearmed or run again if it's not found. But as you said, the networking folks need to confirm or deny it.

The fact that it does the sock_put() when it removes the timer makes me think that it can be called again, and we shouldn't prevent that from happening.

The debug code will let us know too, as it only "frees" it for freeing if it deactivated the timer and shut it down.

> So any *currently* active timer wouldn't be waited for (either because
> locking may make that a deadlock situation, or simply due to
> performance issues), but at least it would guarantee that no new timer
> activations can happen.
>
> Because I do like the whole notion of "timer has been shutdown and
> cannot be used as a timer any more without re-initializing it" being a
> real state - even for a timer that may be "currently in flight".
>
> So this all sounds very worthwhile to me, but I'm not surprised that
> we have code that then knows about all the subtleties of "del_timer()
> might still have a running timer" and actually take advantage of it
> (where "advantage" is likely more of a "deal with the complexities"
> rather than anything really positive ;)

Good to hear. This has been a thorn in our side as we keep hitting these crashes in the timer code that look like a timer was freed before it triggered.

> And those existing subtle users might want particular semantics to at
> least make said complexities easier.

Yeah, as someone told me recently, "If you let them play long enough without setting out the rules, they will take advantage of everything and it will be extremely hard to get them back in order".

-- Steve
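
The two semantics debated in this thread can be contrasted with a small user-space sketch. This is only a toy model under stated assumptions, not code from the patch series: `struct toy_timer` and every `toy_*` function are hypothetical names, the kernel's `struct timer_list` is not used, and no locking or concurrency is modeled. A timer whose `pending` flag is false stands in for both "idle" and "callback currently running".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a timer (hypothetical; NOT the kernel's struct timer_list). */
struct toy_timer {
    void (*function)(struct toy_timer *); /* NULL means "shut down" */
    bool pending;                         /* queued and not yet fired */
};

void toy_timer_init(struct toy_timer *t, void (*fn)(struct toy_timer *))
{
    assert(fn != NULL);
    t->function = fn;
    t->pending = false;
}

/* Arm the timer. Returns false if the timer has been shut down. */
bool toy_mod_timer(struct toy_timer *t)
{
    if (t->function == NULL)
        return false;
    t->pending = true;
    return true;
}

/*
 * Variant A, following Steven's description: if the timer was removed,
 * clear the function so it cannot be re-armed; otherwise do nothing,
 * which leaves the timer usable.
 */
bool toy_try_shutdown(struct toy_timer *t)
{
    if (!t->pending)
        return false;
    t->pending = false;
    t->function = NULL;
    return true;
}

/*
 * Variant B, following Linus's suggestion: block future re-arming even
 * when nothing was removed. It does not wait for a running callback.
 */
bool toy_try_shutdown_strict(struct toy_timer *t)
{
    bool removed = t->pending;

    t->pending = false;
    t->function = NULL;
    return removed;
}

/* Placeholder callback used only to initialize timers in examples. */
void toy_noop(struct toy_timer *t)
{
    (void)t;
}
```

In this model the variants differ only when the timer is not pending at the time of the call: after variant A the timer can still be armed with `toy_mod_timer()`, while after variant B it cannot until it is re-initialized. That is the case Steven expects the current networking user to depend on.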