Sterling Augustine via llvm-dev
2021-Apr-20 16:12 UTC
[llvm-dev] Eliminating global memory roots (or not) to help leak checkers
In order to understand how much benefit this change gives to code size, I built clang and related files with and without the patch, both with CMAKE_BUILD_TYPE=Release. clang itself gets about 0.4% smaller (from 145217124 to 144631078) lld gets about 1.85% smaller (from 101129181 to 99243810 bytes) Spot checking a few other random targets, in general, it seems that the benefits depend heavily on the coding style, but roughly, bigger the binary, the less benefit this brings. I suspect that a more aggressive pass as described by Philip Reames could capture a significant part of the benefit without sacrificing functionality. But that would require a more detailed study to be sure. On Mon, Apr 19, 2021 at 4:52 PM Sterling Augustine <saugustine at google.com> wrote:> There may be other ways to disable leak checkers, but they put a burden on > users that was not there before. Further, users who try leak checking with > and without optimization will get different answers. The bug report will > read: "clang at -O2 makes my program leak". And, as James notes, whether or > not you need to suppress the leak depends on whether or not the optimizer > does away with the variable. Subtle changes to the code that have nothing > to do with memory allocation will appear to add or fix leaks. That is not > something I would care to explain or document. > > Although the code could definitely be improved, the comments that it is > brittle or fragile seems overstated. Google uses this code without issue > against against many millions of lines of code every day, in > tens-of-thousands of targets, and has since 2012. It may be ugly, but it > does work. We are open to improving it, if it is only the ugliness that is > of concern. > > I am working on quantifying the benefit the change gives, but I won't have > results until tomorrow. > > On Mon, Apr 19, 2021 at 5:24 AM James Y Knight <jyknight at google.com> > wrote: > >> There is no problem (no leaks) in the code that users wrote, so adding >> code annotations (sanitizer suppression file, or attributes) is not a good >> solution to this issue. The problem is that this optimization introduces a >> "leak" (from the point of view of the leak checker), which wasn't there >> before. And in practice, this seems to cause a large number of false >> positives. >> >> This can apparently happen simply by having the code which reads from a >> global variable happen to be unused or optimized away in the current >> compilation. Users would be effectively randomly annotating variables to >> work around the compiler, not for any reason apparent in the code they >> wrote. >> >> I don't know what the right answer should be -- I've not looked into it >> at all. But I don't think it's correct to just dismiss this as >> not-a-problem, even if the current solution is not a good solution. >> >> >> On Mon, Apr 19, 2021, 12:16 AM Chris Lattner via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >>> Hi Sterling, >>> >>> I agree with the others: therIe are better (and more robust ways) to >>> disable leak checkers. This only addresses one narrow case. The code is >>> also super verbose and fragile looking. This is also eliminating an >>> optimization. >>> >>> I’d recommend dropping the code. >>> >>> -Chris >>> >>> On Apr 14, 2021, at 9:38 AM, Sterling Augustine via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>> [Continuing discussion from https://reviews.llvm.org/D69428] >>> >>> Llvm is fairly conservative when eliminating global variables (or fields >>> of such) that may point to dynamically allocated memory. This behavior is >>> entirely to help leak checking tools such as Valgrind, Google's >>> HeapLeakChecker, and LSAN, all of which treat memory that is reachable at >>> exit as "not leaked", even though it will never be freed. Without these >>> global variables to hold the pointer, the leak checkers can't determine >>> that it is actually reachable, and will report a leak. Global variables >>> that dynamically allocate memory but don't clean themselves up are fairly >>> common in the wild, and various leak checkers have long not reported errors. >>> >>> This behavior was added all the way back in 2012 in >>> https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html >>> . >>> >>> https://reviews.llvm.org/D69428 removed this behavior, and I >>> subsequently reverted it when many internal Google tests started failing, >>> but I believe many other users who use leak checking will encounter errors >>> when this hits more mainstream releases. >>> >>> So: What to do? >>> >>> Preventing a valid transformation (the global variables are never read >>> and can be eliminated) to help the leak checkers leaves some performance >>> and code size on the table. Just how much is unclear. >>> >>> On the other hand, having leak checkers suddenly start reporting >>> failures where they didn't before also seems suboptimal. Cleaning this >>> somewhat common scenario up is surprisingly difficult at the user level. >>> >>> Some possibilities: >>> >>> 1. Only do this at high optimization levels, say -O3. This would give >>> aggressive users all the performance we can, but also make leak checkers >>> report leaks sometimes, but not others. >>> >>> 2. Hide it behind a flag or configurable option. Users who care can set >>> it as they prefer. Creates more confusing options, different testing >>> matrices and such, but everyone can get the behaviour that they want. >>> >>> 3. Do it all the time, and users who encounter issues can clean up their >>> code. Users get the most performance they possibly can, but have to clean >>> up code or drop leak checking. Seems a little user hostile. >>> >>> Other possibilities?: >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210420/549508be/attachment.html>
Eric Christopher via llvm-dev
2021-Apr-20 16:26 UTC
[llvm-dev] Eliminating global memory roots (or not) to help leak checkers
+Chris Lattner <clattner at nondot.org> for the last two updates as well. I really don't think this optimization is getting the overall win that it's thought to and breaking roughly every kind of leak checker mechanism I've seen. -eric On Tue, Apr 20, 2021 at 12:13 PM Sterling Augustine via llvm-dev < llvm-dev at lists.llvm.org> wrote:> In order to understand how much benefit this change gives to code size, I > built clang and related files with and without the patch, both with > CMAKE_BUILD_TYPE=Release. > > clang itself gets about 0.4% smaller (from 145217124 to 144631078) > lld gets about 1.85% smaller (from 101129181 to 99243810 bytes) > > Spot checking a few other random targets, in general, it seems that the > benefits depend heavily on the coding style, but roughly, bigger the > binary, the less benefit this brings. > > I suspect that a more aggressive pass as described by Philip Reames could > capture a significant part of the benefit without sacrificing > functionality. But that would require a more detailed study to be sure. > > > On Mon, Apr 19, 2021 at 4:52 PM Sterling Augustine <saugustine at google.com> > wrote: > >> There may be other ways to disable leak checkers, but they put a burden >> on users that was not there before. Further, users who try leak checking >> with and without optimization will get different answers. The bug report >> will read: "clang at -O2 makes my program leak". And, as James notes, >> whether or not you need to suppress the leak depends on whether or not the >> optimizer does away with the variable. Subtle changes to the code that have >> nothing to do with memory allocation will appear to add or fix leaks. That >> is not something I would care to explain or document. >> >> Although the code could definitely be improved, the comments that it is >> brittle or fragile seems overstated. Google uses this code without issue >> against against many millions of lines of code every day, in >> tens-of-thousands of targets, and has since 2012. It may be ugly, but it >> does work. We are open to improving it, if it is only the ugliness that is >> of concern. >> >> I am working on quantifying the benefit the change gives, but I won't >> have results until tomorrow. >> >> On Mon, Apr 19, 2021 at 5:24 AM James Y Knight <jyknight at google.com> >> wrote: >> >>> There is no problem (no leaks) in the code that users wrote, so adding >>> code annotations (sanitizer suppression file, or attributes) is not a good >>> solution to this issue. The problem is that this optimization introduces a >>> "leak" (from the point of view of the leak checker), which wasn't there >>> before. And in practice, this seems to cause a large number of false >>> positives. >>> >>> This can apparently happen simply by having the code which reads from a >>> global variable happen to be unused or optimized away in the current >>> compilation. Users would be effectively randomly annotating variables >>> to work around the compiler, not for any reason apparent in the code they >>> wrote. >>> >>> I don't know what the right answer should be -- I've not looked into it >>> at all. But I don't think it's correct to just dismiss this as >>> not-a-problem, even if the current solution is not a good solution. >>> >>> >>> On Mon, Apr 19, 2021, 12:16 AM Chris Lattner via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>>> Hi Sterling, >>>> >>>> I agree with the others: therIe are better (and more robust ways) to >>>> disable leak checkers. This only addresses one narrow case. The code is >>>> also super verbose and fragile looking. This is also eliminating an >>>> optimization. >>>> >>>> I’d recommend dropping the code. >>>> >>>> -Chris >>>> >>>> On Apr 14, 2021, at 9:38 AM, Sterling Augustine via llvm-dev < >>>> llvm-dev at lists.llvm.org> wrote: >>>> >>>> [Continuing discussion from https://reviews.llvm.org/D69428] >>>> >>>> Llvm is fairly conservative when eliminating global variables (or >>>> fields of such) that may point to dynamically allocated memory. This >>>> behavior is entirely to help leak checking tools such as Valgrind, Google's >>>> HeapLeakChecker, and LSAN, all of which treat memory that is reachable at >>>> exit as "not leaked", even though it will never be freed. Without these >>>> global variables to hold the pointer, the leak checkers can't determine >>>> that it is actually reachable, and will report a leak. Global variables >>>> that dynamically allocate memory but don't clean themselves up are fairly >>>> common in the wild, and various leak checkers have long not reported errors. >>>> >>>> This behavior was added all the way back in 2012 in >>>> https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html >>>> . >>>> >>>> https://reviews.llvm.org/D69428 removed this behavior, and I >>>> subsequently reverted it when many internal Google tests started failing, >>>> but I believe many other users who use leak checking will encounter errors >>>> when this hits more mainstream releases. >>>> >>>> So: What to do? >>>> >>>> Preventing a valid transformation (the global variables are never read >>>> and can be eliminated) to help the leak checkers leaves some performance >>>> and code size on the table. Just how much is unclear. >>>> >>>> On the other hand, having leak checkers suddenly start reporting >>>> failures where they didn't before also seems suboptimal. Cleaning this >>>> somewhat common scenario up is surprisingly difficult at the user level. >>>> >>>> Some possibilities: >>>> >>>> 1. Only do this at high optimization levels, say -O3. This would give >>>> aggressive users all the performance we can, but also make leak checkers >>>> report leaks sometimes, but not others. >>>> >>>> 2. Hide it behind a flag or configurable option. Users who care can set >>>> it as they prefer. Creates more confusing options, different testing >>>> matrices and such, but everyone can get the behaviour that they want. >>>> >>>> 3. Do it all the time, and users who encounter issues can clean up >>>> their code. Users get the most performance they possibly can, but have to >>>> clean up code or drop leak checking. Seems a little user hostile. >>>> >>>> Other possibilities?: >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> llvm-dev at lists.llvm.org >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>>> >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> llvm-dev at lists.llvm.org >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>> _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210420/bbb5bc60/attachment.html>