Sterling Augustine via llvm-dev
2021-Apr-14 16:38 UTC
[llvm-dev] Eliminating global memory roots (or not) to help leak checkers
[Continuing discussion from https://reviews.llvm.org/D69428] Llvm is fairly conservative when eliminating global variables (or fields of such) that may point to dynamically allocated memory. This behavior is entirely to help leak checking tools such as Valgrind, Google's HeapLeakChecker, and LSAN, all of which treat memory that is reachable at exit as "not leaked", even though it will never be freed. Without these global variables to hold the pointer, the leak checkers can't determine that it is actually reachable, and will report a leak. Global variables that dynamically allocate memory but don't clean themselves up are fairly common in the wild, and various leak checkers have long not reported errors. This behavior was added all the way back in 2012 in https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html . https://reviews.llvm.org/D69428 removed this behavior, and I subsequently reverted it when many internal Google tests started failing, but I believe many other users who use leak checking will encounter errors when this hits more mainstream releases. So: What to do? Preventing a valid transformation (the global variables are never read and can be eliminated) to help the leak checkers leaves some performance and code size on the table. Just how much is unclear. On the other hand, having leak checkers suddenly start reporting failures where they didn't before also seems suboptimal. Cleaning this somewhat common scenario up is surprisingly difficult at the user level. Some possibilities: 1. Only do this at high optimization levels, say -O3. This would give aggressive users all the performance we can, but also make leak checkers report leaks sometimes, but not others. 2. Hide it behind a flag or configurable option. Users who care can set it as they prefer. Creates more confusing options, different testing matrices and such, but everyone can get the behaviour that they want. 3. Do it all the time, and users who encounter issues can clean up their code. Users get the most performance they possibly can, but have to clean up code or drop leak checking. Seems a little user hostile. Other possibilities?: -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210414/06b51762/attachment.html>
David Blaikie via llvm-dev
2021-Apr-14 17:02 UTC
[llvm-dev] Eliminating global memory roots (or not) to help leak checkers
What was the original motivation for making the change - given the gains are unclear? Might help frame what the right path forward is. If the goal is to experiment to see if making this change is sufficiently valuable to either regress heap leak checkers or create a divergence where this is configurable/appears only in certain modes - then I'd say having a flag (maybe even an internal/cc1 only flag) is the right first step, enabling data to be gathered to help inform the design discussion. On Wed, Apr 14, 2021 at 9:39 AM Sterling Augustine via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > [Continuing discussion from https://reviews.llvm.org/D69428] > > Llvm is fairly conservative when eliminating global variables (or fields of such) that may point to dynamically allocated memory. This behavior is entirely to help leak checking tools such as Valgrind, Google's HeapLeakChecker, and LSAN, all of which treat memory that is reachable at exit as "not leaked", even though it will never be freed. Without these global variables to hold the pointer, the leak checkers can't determine that it is actually reachable, and will report a leak. Global variables that dynamically allocate memory but don't clean themselves up are fairly common in the wild, and various leak checkers have long not reported errors. > > This behavior was added all the way back in 2012 in https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html. > > https://reviews.llvm.org/D69428 removed this behavior, and I subsequently reverted it when many internal Google tests started failing, but I believe many other users who use leak checking will encounter errors when this hits more mainstream releases. > > So: What to do? > > Preventing a valid transformation (the global variables are never read and can be eliminated) to help the leak checkers leaves some performance and code size on the table. Just how much is unclear. > > On the other hand, having leak checkers suddenly start reporting failures where they didn't before also seems suboptimal. Cleaning this somewhat common scenario up is surprisingly difficult at the user level. > > Some possibilities: > > 1. Only do this at high optimization levels, say -O3. This would give aggressive users all the performance we can, but also make leak checkers report leaks sometimes, but not others. > > 2. Hide it behind a flag or configurable option. Users who care can set it as they prefer. Creates more confusing options, different testing matrices and such, but everyone can get the behaviour that they want. > > 3. Do it all the time, and users who encounter issues can clean up their code. Users get the most performance they possibly can, but have to clean up code or drop leak checking. Seems a little user hostile. > > Other possibilities?: > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Nuno Lopes via llvm-dev
2021-Apr-14 18:03 UTC
[llvm-dev] Eliminating global memory roots (or not) to help leak checkers
Most (all?) leak checkers support suppression files. Isn’t that sufficient for your use case? Marking your leak roots with __attribute((used))__ is also an alternative. I understand that leaking memory on purpose happens because it’s expensive to clean it up. But reachable memory may well be a true leak. So flagging it as such is useful. None of us has data about the % of reachable memory that is a true leak, so it’s not possible to argue what’s user friendly/hostile. Programs that leak memory on purpose are often sophisticated. And sophisticated devs can handle a little bit of extra effort to hide those smarts I think. Nuno P.S.: The original patch went in almost a decade ago when the ecosystem was a bit less developed. It was always meant to be temporary. From: Sterling Augustine Sent: 14 April 2021 17:39 To: llvm-dev <llvm-dev at lists.llvm.org> Subject: [llvm-dev] Eliminating global memory roots (or not) to help leak checkers [Continuing discussion from https://reviews.llvm.org/D69428] Llvm is fairly conservative when eliminating global variables (or fields of such) that may point to dynamically allocated memory. This behavior is entirely to help leak checking tools such as Valgrind, Google's HeapLeakChecker, and LSAN, all of which treat memory that is reachable at exit as "not leaked", even though it will never be freed. Without these global variables to hold the pointer, the leak checkers can't determine that it is actually reachable, and will report a leak. Global variables that dynamically allocate memory but don't clean themselves up are fairly common in the wild, and various leak checkers have long not reported errors. This behavior was added all the way back in 2012 in https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html. https://reviews.llvm.org/D69428 removed this behavior, and I subsequently reverted it when many internal Google tests started failing, but I believe many other users who use leak checking will encounter errors when this hits more mainstream releases. So: What to do? Preventing a valid transformation (the global variables are never read and can be eliminated) to help the leak checkers leaves some performance and code size on the table. Just how much is unclear. On the other hand, having leak checkers suddenly start reporting failures where they didn't before also seems suboptimal. Cleaning this somewhat common scenario up is surprisingly difficult at the user level. Some possibilities: 1. Only do this at high optimization levels, say -O3. This would give aggressive users all the performance we can, but also make leak checkers report leaks sometimes, but not others. 2. Hide it behind a flag or configurable option. Users who care can set it as they prefer. Creates more confusing options, different testing matrices and such, but everyone can get the behaviour that they want. 3. Do it all the time, and users who encounter issues can clean up their code. Users get the most performance they possibly can, but have to clean up code or drop leak checking. Seems a little user hostile. Other possibilities?: -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210414/410869b1/attachment.html>
Vitaly Buka via llvm-dev
2021-Apr-14 21:52 UTC
[llvm-dev] Eliminating global memory roots (or not) to help leak checkers
On Wed, 14 Apr 2021 at 09:39, Sterling Augustine via llvm-dev < llvm-dev at lists.llvm.org> wrote:> [Continuing discussion from https://reviews.llvm.org/D69428] > > Llvm is fairly conservative when eliminating global variables (or fields > of such) that may point to dynamically allocated memory. This behavior is > entirely to help leak checking tools such as Valgrind, Google's > HeapLeakChecker, and LSAN, all of which treat memory that is reachable at > exit as "not leaked", even though it will never be freed. Without these > global variables to hold the pointer, the leak checkers can't determine > that it is actually reachable, and will report a leak. Global variables > that dynamically allocate memory but don't clean themselves up are fairly > common in the wild, and various leak checkers have long not reported errors. > > This behavior was added all the way back in 2012 in > https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html > . > > https://reviews.llvm.org/D69428 removed this behavior, and I subsequently > reverted it when many internal Google tests started failing, but I believe > many other users who use leak checking will encounter errors when this hits > more mainstream releases. > > So: What to do? > > Preventing a valid transformation (the global variables are never read and > can be eliminated) to help the leak checkers leaves some performance and > code size on the table. Just how much is unclear. > > On the other hand, having leak checkers suddenly start reporting failures > where they didn't before also seems suboptimal. Cleaning this somewhat > common scenario up is surprisingly difficult at the user level. > > Some possibilities: > > 1. Only do this at high optimization levels, say -O3. This would give > aggressive users all the performance we can, but also make leak checkers > report leaks sometimes, but not others. >This could be disabled for Asan by default as it very likely runs with lsan.> > 2. Hide it behind a flag or configurable option. Users who care can set it > as they prefer. Creates more confusing options, different testing matrices > and such, but everyone can get the behaviour that they want. > > 3. Do it all the time, and users who encounter issues can clean up their > code. Users get the most performance they possibly can, but have to clean > up code or drop leak checking. Seems a little user hostile. >I expect this requires significant cleanup effort, and not just in Google. It's quite a common pattern, but it would be nice to see some real data.> > Other possibilities?: >5. Maybe replace writes into such removed global with no-op callback which can be intercepted by LeakChecker? This will prevent other useful optimizations. _______________________________________________> LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210414/5f4f1e72/attachment.html>
Philip Reames via llvm-dev
2021-Apr-14 22:28 UTC
[llvm-dev] Eliminating global memory roots (or not) to help leak checkers
Don't really have an opinion on the question as asked, but want to point out an alternate framing. I will comment that the code being removed looks more than a bit fragile. From a very quick skim of the original code, it appears to be focused on DCE of globals. If we wanted to keep the "leak detector safe" semantic, but allow more aggressive optimization, we could re-imagine this as global SROA. We could deconstruct the struct/array/etc and keep only the fields which could potentially be allocation roots. We could also write an optimization which leverages the knowledge of the allocation root being otherwise unused to eliminate mallocs which are stored into them. I haven't fully thought that through, but it seems like we have quite a bit of room to optimize better without changing our handling for the leak detectors. Philip On 4/14/21 9:38 AM, Sterling Augustine via llvm-dev wrote:> [Continuing discussion from https://reviews.llvm.org/D69428 > <https://reviews.llvm.org/D69428>] > > Llvm is fairly conservative when eliminating global variables (or > fields of such) that may point to dynamically allocated memory. This > behavior is entirely to help leak checking tools such as Valgrind, > Google's HeapLeakChecker, and LSAN, all of which treat memory that is > reachable at exit as "not leaked", even though it will never be freed. > Without these global variables to hold the pointer, the leak checkers > can't determine that it is actually reachable, and will report a leak. > Global variables that dynamically allocate memory but don't clean > themselves up are fairly common in the wild, and various leak checkers > have long not reported errors. > > This behavior was added all the way back in 2012 in > https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html > <https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html>. > > https://reviews.llvm.org/D69428 <https://reviews.llvm.org/D69428> > removed this behavior, and I subsequently reverted it when many > internal Google tests started failing, but I believe many other users > who use leak checking will encounter errors when this hits more > mainstream releases. > > So: What to do? > > Preventing a valid transformation (the global variables are never read > and can be eliminated) to help the leak checkers leaves some > performance and code size on the table. Just how much is unclear. > > On the other hand, having leak checkers suddenly start reporting > failures where they didn't before also seems suboptimal. Cleaning this > somewhat common scenario up is surprisingly difficult at the user level. > > Some possibilities: > > 1. Only do this at high optimization levels, say -O3. This would give > aggressive users all the performance we can, but also make leak > checkers report leaks sometimes, but not others. > > 2. Hide it behind a flag or configurable option. Users who care can > set it as they prefer. Creates more confusing options, different > testing matrices and such, but everyone can get the behaviour that > they want. > > 3. Do it all the time, and users who encounter issues can clean up > their code. Users get the most performance they possibly can, but have > to clean up code or drop leak checking. Seems a little user hostile. > > Other possibilities?: > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210414/0382fd0d/attachment.html>
Chris Lattner via llvm-dev
2021-Apr-19 04:16 UTC
[llvm-dev] Eliminating global memory roots (or not) to help leak checkers
Hi Sterling, I agree with the others: there are better (and more robust ways) to disable leak checkers. This only addresses one narrow case. The code is also super verbose and fragile looking. This is also eliminating an optimization. I’d recommend dropping the code. -Chris> On Apr 14, 2021, at 9:38 AM, Sterling Augustine via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > [Continuing discussion from https://reviews.llvm.org/D69428 <https://reviews.llvm.org/D69428>] > > Llvm is fairly conservative when eliminating global variables (or fields of such) that may point to dynamically allocated memory. This behavior is entirely to help leak checking tools such as Valgrind, Google's HeapLeakChecker, and LSAN, all of which treat memory that is reachable at exit as "not leaked", even though it will never be freed. Without these global variables to hold the pointer, the leak checkers can't determine that it is actually reachable, and will report a leak. Global variables that dynamically allocate memory but don't clean themselves up are fairly common in the wild, and various leak checkers have long not reported errors. > > This behavior was added all the way back in 2012 in https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html <https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html>. > > https://reviews.llvm.org/D69428 <https://reviews.llvm.org/D69428> removed this behavior, and I subsequently reverted it when many internal Google tests started failing, but I believe many other users who use leak checking will encounter errors when this hits more mainstream releases. > > So: What to do? > > Preventing a valid transformation (the global variables are never read and can be eliminated) to help the leak checkers leaves some performance and code size on the table. Just how much is unclear. > > On the other hand, having leak checkers suddenly start reporting failures where they didn't before also seems suboptimal. Cleaning this somewhat common scenario up is surprisingly difficult at the user level. > > Some possibilities: > > 1. Only do this at high optimization levels, say -O3. This would give aggressive users all the performance we can, but also make leak checkers report leaks sometimes, but not others. > > 2. Hide it behind a flag or configurable option. Users who care can set it as they prefer. Creates more confusing options, different testing matrices and such, but everyone can get the behaviour that they want. > > 3. Do it all the time, and users who encounter issues can clean up their code. Users get the most performance they possibly can, but have to clean up code or drop leak checking. Seems a little user hostile. > > Other possibilities?: > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210418/2a7e00ae/attachment-0001.html>