thr3ads.net - llvm dev - [llvm-dev] Eliminating global memory roots (or not) to help leak checkers [Apr 2021]

If this information is useful, please help other people find it:
Share via:

Sterling Augustine via llvm-dev

2021-Apr-19 23:52 UTC

[llvm-dev] Eliminating global memory roots (or not) to help leak checkers

There may be other ways to disable leak checkers, but they put a burden on
users that was not there before. Further, users who try leak checking with
and without optimization will get different answers. The bug report will
read: "clang at -O2 makes my program leak". And, as James notes,
whether or
not you need to suppress the leak depends on whether or not the optimizer
does away with the variable. Subtle changes to the code that have nothing
to do with memory allocation will appear to add or fix leaks. That is not
something I would care to explain or document.

Although the code could definitely be improved, the comments that it is
brittle or fragile seems overstated. Google uses this code without issue
against against many millions of lines of code every day, in
tens-of-thousands of targets, and has since 2012. It may be ugly, but it
does work. We are open to improving it, if it is only the ugliness that is
of concern.

I am working on quantifying the benefit the change gives, but I won't have
results until tomorrow.

On Mon, Apr 19, 2021 at 5:24 AM James Y Knight <jyknight at google.com>
wrote:
> There is no problem (no leaks) in the code that users wrote, so adding
> code annotations (sanitizer suppression file, or attributes) is not a good
> solution to this issue. The problem is that this optimization introduces a
> "leak" (from the point of view of the leak checker), which
wasn't there
> before. And in practice, this seems to cause a large number of false
> positives.
>
> This can apparently happen simply by having the code which reads from a
> global variable happen to be unused or optimized away in the current
> compilation. Users would be effectively randomly annotating variables to
> work around the compiler, not for any reason apparent in the code they
> wrote.
>
> I don't know what the right answer should be -- I've not looked
into it at
> all. But I don't think it's correct to just dismiss this as
not-a-problem,
> even if the current solution is not a good solution.
>
>
> On Mon, Apr 19, 2021, 12:16 AM Chris Lattner via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi Sterling,
>>
>> I agree with the others: therIe are better (and more robust ways) to
>> disable leak checkers.  This only addresses one narrow case.  The code
is
>> also super verbose and fragile looking.  This is also eliminating an
>> optimization.
>>
>> I’d recommend dropping the code.
>>
>> -Chris
>>
>> On Apr 14, 2021, at 9:38 AM, Sterling Augustine via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> [Continuing discussion from https://reviews.llvm.org/D69428]
>>
>> Llvm is fairly conservative when eliminating global variables (or
fields
>> of such) that may point to dynamically allocated memory. This behavior
is
>> entirely to help leak checking tools such as Valgrind, Google's
>> HeapLeakChecker, and LSAN, all of which treat memory that is reachable
at
>> exit as "not leaked", even though it will never be freed.
Without these
>> global variables to hold the pointer, the leak checkers can't
determine
>> that it is actually reachable, and will report a leak. Global variables
>> that dynamically allocate memory but don't clean themselves up are
fairly
>> common in the wild, and various leak checkers have long not reported
errors.
>>
>> This behavior was added all the way back in 2012 in
>>
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html
>> .
>>
>> https://reviews.llvm.org/D69428 removed this behavior, and I
>> subsequently reverted it when many internal Google tests started
failing,
>> but I believe many other users who use leak checking will encounter
errors
>> when this hits more mainstream releases.
>>
>> So: What to do?
>>
>> Preventing a valid transformation (the global variables are never read
>> and can be eliminated) to help the leak checkers leaves some
performance
>> and code size on the table. Just how much is unclear.
>>
>> On the other hand, having leak checkers suddenly start reporting
failures
>> where they didn't before also seems suboptimal. Cleaning this
somewhat
>> common scenario up is surprisingly difficult at the user level.
>>
>> Some possibilities:
>>
>> 1. Only do this at high optimization levels, say -O3. This would give
>> aggressive users all the performance we can, but also make leak
checkers
>> report leaks sometimes, but not others.
>>
>> 2. Hide it behind a flag or configurable option. Users who care can set
>> it as they prefer. Creates more confusing options, different testing
>> matrices and such, but everyone can get the behaviour that they want.
>>
>> 3. Do it all the time, and users who encounter issues can clean up
their
>> code. Users get the most performance they possibly can, but have to
clean
>> up code or drop leak checking. Seems a little user hostile.
>>
>> Other possibilities?:
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210419/af1596d5/attachment.html>

Sterling Augustine via llvm-dev

2021-Apr-20 16:12 UTC

head link

[llvm-dev] Eliminating global memory roots (or not) to help leak checkers

In order to understand how much benefit this change gives to code size, I
built clang and related files with and without the patch, both with
CMAKE_BUILD_TYPE=Release.

clang itself gets about 0.4% smaller (from 145217124 to 144631078)
lld gets about 1.85% smaller (from 101129181 to 99243810 bytes)

Spot checking a few other random targets, in general, it seems that the
benefits depend heavily on the coding style, but roughly, bigger the
binary, the less benefit this brings.

I suspect that a more aggressive pass as described by Philip Reames could
capture a significant part of the benefit without sacrificing
functionality. But that would require a more detailed study to be sure.


On Mon, Apr 19, 2021 at 4:52 PM Sterling Augustine <saugustine at
google.com>
wrote:
> There may be other ways to disable leak checkers, but they put a burden on
> users that was not there before. Further, users who try leak checking with
> and without optimization will get different answers. The bug report will
> read: "clang at -O2 makes my program leak". And, as James notes,
whether or
> not you need to suppress the leak depends on whether or not the optimizer
> does away with the variable. Subtle changes to the code that have nothing
> to do with memory allocation will appear to add or fix leaks. That is not
> something I would care to explain or document.
>
> Although the code could definitely be improved, the comments that it is
> brittle or fragile seems overstated. Google uses this code without issue
> against against many millions of lines of code every day, in
> tens-of-thousands of targets, and has since 2012. It may be ugly, but it
> does work. We are open to improving it, if it is only the ugliness that is
> of concern.
>
> I am working on quantifying the benefit the change gives, but I won't
have
> results until tomorrow.
>
> On Mon, Apr 19, 2021 at 5:24 AM James Y Knight <jyknight at
google.com>
> wrote:
>
>> There is no problem (no leaks) in the code that users wrote, so adding
>> code annotations (sanitizer suppression file, or attributes) is not a
good
>> solution to this issue. The problem is that this optimization
introduces a
>> "leak" (from the point of view of the leak checker), which
wasn't there
>> before. And in practice, this seems to cause a large number of false
>> positives.
>>
>> This can apparently happen simply by having the code which reads from a
>> global variable happen to be unused or optimized away in the current
>> compilation. Users would be effectively randomly annotating variables
to
>> work around the compiler, not for any reason apparent in the code they
>> wrote.
>>
>> I don't know what the right answer should be -- I've not looked
into it
>> at all. But I don't think it's correct to just dismiss this as
>> not-a-problem, even if the current solution is not a good solution.
>>
>>
>> On Mon, Apr 19, 2021, 12:16 AM Chris Lattner via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi Sterling,
>>>
>>> I agree with the others: therIe are better (and more robust ways)
to
>>> disable leak checkers.  This only addresses one narrow case.  The
code is
>>> also super verbose and fragile looking.  This is also eliminating
an
>>> optimization.
>>>
>>> I’d recommend dropping the code.
>>>
>>> -Chris
>>>
>>> On Apr 14, 2021, at 9:38 AM, Sterling Augustine via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> [Continuing discussion from https://reviews.llvm.org/D69428]
>>>
>>> Llvm is fairly conservative when eliminating global variables (or
fields
>>> of such) that may point to dynamically allocated memory. This
behavior is
>>> entirely to help leak checking tools such as Valgrind, Google's
>>> HeapLeakChecker, and LSAN, all of which treat memory that is
reachable at
>>> exit as "not leaked", even though it will never be freed.
Without these
>>> global variables to hold the pointer, the leak checkers can't
determine
>>> that it is actually reachable, and will report a leak. Global
variables
>>> that dynamically allocate memory but don't clean themselves up
are fairly
>>> common in the wild, and various leak checkers have long not
reported errors.
>>>
>>> This behavior was added all the way back in 2012 in
>>>
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html
>>> .
>>>
>>> https://reviews.llvm.org/D69428 removed this behavior, and I
>>> subsequently reverted it when many internal Google tests started
failing,
>>> but I believe many other users who use leak checking will encounter
errors
>>> when this hits more mainstream releases.
>>>
>>> So: What to do?
>>>
>>> Preventing a valid transformation (the global variables are never
read
>>> and can be eliminated) to help the leak checkers leaves some
performance
>>> and code size on the table. Just how much is unclear.
>>>
>>> On the other hand, having leak checkers suddenly start reporting
>>> failures where they didn't before also seems suboptimal.
Cleaning this
>>> somewhat common scenario up is surprisingly difficult at the user
level.
>>>
>>> Some possibilities:
>>>
>>> 1. Only do this at high optimization levels, say -O3. This would
give
>>> aggressive users all the performance we can, but also make leak
checkers
>>> report leaks sometimes, but not others.
>>>
>>> 2. Hide it behind a flag or configurable option. Users who care can
set
>>> it as they prefer. Creates more confusing options, different
testing
>>> matrices and such, but everyone can get the behaviour that they
want.
>>>
>>> 3. Do it all the time, and users who encounter issues can clean up
their
>>> code. Users get the most performance they possibly can, but have to
clean
>>> up code or drop leak checking. Seems a little user hostile.
>>>
>>> Other possibilities?:
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210420/549508be/attachment.html>

llvm dev - Apr 2021 - Eliminating global memory roots (or not) to help leak checkers

[llvm-dev] Eliminating global memory roots (or not) to help leak checkers

[llvm-dev] Eliminating global memory roots (or not) to help leak checkers