thr3ads.net - llvm dev - [llvm-dev] Eliminating global memory roots (or not) to help leak checkers [Apr 2021]

If this information is useful, please help other people find it:
Share via:

Sterling Augustine via llvm-dev

2021-Apr-20 16:12 UTC

[llvm-dev] Eliminating global memory roots (or not) to help leak checkers

In order to understand how much benefit this change gives to code size, I
built clang and related files with and without the patch, both with
CMAKE_BUILD_TYPE=Release.

clang itself gets about 0.4% smaller (from 145217124 to 144631078)
lld gets about 1.85% smaller (from 101129181 to 99243810 bytes)

Spot checking a few other random targets, in general, it seems that the
benefits depend heavily on the coding style, but roughly, bigger the
binary, the less benefit this brings.

I suspect that a more aggressive pass as described by Philip Reames could
capture a significant part of the benefit without sacrificing
functionality. But that would require a more detailed study to be sure.


On Mon, Apr 19, 2021 at 4:52 PM Sterling Augustine <saugustine at
google.com>
wrote:
> There may be other ways to disable leak checkers, but they put a burden on
> users that was not there before. Further, users who try leak checking with
> and without optimization will get different answers. The bug report will
> read: "clang at -O2 makes my program leak". And, as James notes,
whether or
> not you need to suppress the leak depends on whether or not the optimizer
> does away with the variable. Subtle changes to the code that have nothing
> to do with memory allocation will appear to add or fix leaks. That is not
> something I would care to explain or document.
>
> Although the code could definitely be improved, the comments that it is
> brittle or fragile seems overstated. Google uses this code without issue
> against against many millions of lines of code every day, in
> tens-of-thousands of targets, and has since 2012. It may be ugly, but it
> does work. We are open to improving it, if it is only the ugliness that is
> of concern.
>
> I am working on quantifying the benefit the change gives, but I won't
have
> results until tomorrow.
>
> On Mon, Apr 19, 2021 at 5:24 AM James Y Knight <jyknight at
google.com>
> wrote:
>
>> There is no problem (no leaks) in the code that users wrote, so adding
>> code annotations (sanitizer suppression file, or attributes) is not a
good
>> solution to this issue. The problem is that this optimization
introduces a
>> "leak" (from the point of view of the leak checker), which
wasn't there
>> before. And in practice, this seems to cause a large number of false
>> positives.
>>
>> This can apparently happen simply by having the code which reads from a
>> global variable happen to be unused or optimized away in the current
>> compilation. Users would be effectively randomly annotating variables
to
>> work around the compiler, not for any reason apparent in the code they
>> wrote.
>>
>> I don't know what the right answer should be -- I've not looked
into it
>> at all. But I don't think it's correct to just dismiss this as
>> not-a-problem, even if the current solution is not a good solution.
>>
>>
>> On Mon, Apr 19, 2021, 12:16 AM Chris Lattner via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi Sterling,
>>>
>>> I agree with the others: therIe are better (and more robust ways)
to
>>> disable leak checkers.  This only addresses one narrow case.  The
code is
>>> also super verbose and fragile looking.  This is also eliminating
an
>>> optimization.
>>>
>>> I’d recommend dropping the code.
>>>
>>> -Chris
>>>
>>> On Apr 14, 2021, at 9:38 AM, Sterling Augustine via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> [Continuing discussion from https://reviews.llvm.org/D69428]
>>>
>>> Llvm is fairly conservative when eliminating global variables (or
fields
>>> of such) that may point to dynamically allocated memory. This
behavior is
>>> entirely to help leak checking tools such as Valgrind, Google's
>>> HeapLeakChecker, and LSAN, all of which treat memory that is
reachable at
>>> exit as "not leaked", even though it will never be freed.
Without these
>>> global variables to hold the pointer, the leak checkers can't
determine
>>> that it is actually reachable, and will report a leak. Global
variables
>>> that dynamically allocate memory but don't clean themselves up
are fairly
>>> common in the wild, and various leak checkers have long not
reported errors.
>>>
>>> This behavior was added all the way back in 2012 in
>>>
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html
>>> .
>>>
>>> https://reviews.llvm.org/D69428 removed this behavior, and I
>>> subsequently reverted it when many internal Google tests started
failing,
>>> but I believe many other users who use leak checking will encounter
errors
>>> when this hits more mainstream releases.
>>>
>>> So: What to do?
>>>
>>> Preventing a valid transformation (the global variables are never
read
>>> and can be eliminated) to help the leak checkers leaves some
performance
>>> and code size on the table. Just how much is unclear.
>>>
>>> On the other hand, having leak checkers suddenly start reporting
>>> failures where they didn't before also seems suboptimal.
Cleaning this
>>> somewhat common scenario up is surprisingly difficult at the user
level.
>>>
>>> Some possibilities:
>>>
>>> 1. Only do this at high optimization levels, say -O3. This would
give
>>> aggressive users all the performance we can, but also make leak
checkers
>>> report leaks sometimes, but not others.
>>>
>>> 2. Hide it behind a flag or configurable option. Users who care can
set
>>> it as they prefer. Creates more confusing options, different
testing
>>> matrices and such, but everyone can get the behaviour that they
want.
>>>
>>> 3. Do it all the time, and users who encounter issues can clean up
their
>>> code. Users get the most performance they possibly can, but have to
clean
>>> up code or drop leak checking. Seems a little user hostile.
>>>
>>> Other possibilities?:
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210420/549508be/attachment.html>

Eric Christopher via llvm-dev

2021-Apr-20 16:26 UTC

head link

[llvm-dev] Eliminating global memory roots (or not) to help leak checkers

+Chris Lattner <clattner at nondot.org> for the last two updates as well.
I
really don't think this optimization is getting the overall win that
it's
thought to and breaking roughly every kind of leak checker mechanism I've
seen.

-eric

On Tue, Apr 20, 2021 at 12:13 PM Sterling Augustine via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> In order to understand how much benefit this change gives to code size, I
> built clang and related files with and without the patch, both with
> CMAKE_BUILD_TYPE=Release.
>
> clang itself gets about 0.4% smaller (from 145217124 to 144631078)
> lld gets about 1.85% smaller (from 101129181 to 99243810 bytes)
>
> Spot checking a few other random targets, in general, it seems that the
> benefits depend heavily on the coding style, but roughly, bigger the
> binary, the less benefit this brings.
>
> I suspect that a more aggressive pass as described by Philip Reames could
> capture a significant part of the benefit without sacrificing
> functionality. But that would require a more detailed study to be sure.
>
>
> On Mon, Apr 19, 2021 at 4:52 PM Sterling Augustine <saugustine at
google.com>
> wrote:
>
>> There may be other ways to disable leak checkers, but they put a burden
>> on users that was not there before. Further, users who try leak
checking
>> with and without optimization will get different answers. The bug
report
>> will read: "clang at -O2 makes my program leak". And, as
James notes,
>> whether or not you need to suppress the leak depends on whether or not
the
>> optimizer does away with the variable. Subtle changes to the code that
have
>> nothing to do with memory allocation will appear to add or fix leaks.
That
>> is not something I would care to explain or document.
>>
>> Although the code could definitely be improved, the comments that it is
>> brittle or fragile seems overstated. Google uses this code without
issue
>> against against many millions of lines of code every day, in
>> tens-of-thousands of targets, and has since 2012. It may be ugly, but
it
>> does work. We are open to improving it, if it is only the ugliness that
is
>> of concern.
>>
>> I am working on quantifying the benefit the change gives, but I
won't
>> have results until tomorrow.
>>
>> On Mon, Apr 19, 2021 at 5:24 AM James Y Knight <jyknight at
google.com>
>> wrote:
>>
>>> There is no problem (no leaks) in the code that users wrote, so
adding
>>> code annotations (sanitizer suppression file, or attributes) is not
a good
>>> solution to this issue. The problem is that this optimization
introduces a
>>> "leak" (from the point of view of the leak checker),
which wasn't there
>>> before. And in practice, this seems to cause a large number of
false
>>> positives.
>>>
>>> This can apparently happen simply by having the code which reads
from a
>>> global variable happen to be unused or optimized away in the
current
>>> compilation. Users would be effectively randomly annotating
variables
>>> to work around the compiler, not for any reason apparent in the
code they
>>> wrote.
>>>
>>> I don't know what the right answer should be -- I've not
looked into it
>>> at all. But I don't think it's correct to just dismiss this
as
>>> not-a-problem, even if the current solution is not a good solution.
>>>
>>>
>>> On Mon, Apr 19, 2021, 12:16 AM Chris Lattner via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi Sterling,
>>>>
>>>> I agree with the others: therIe are better (and more robust
ways) to
>>>> disable leak checkers.  This only addresses one narrow case. 
The code is
>>>> also super verbose and fragile looking.  This is also
eliminating an
>>>> optimization.
>>>>
>>>> I’d recommend dropping the code.
>>>>
>>>> -Chris
>>>>
>>>> On Apr 14, 2021, at 9:38 AM, Sterling Augustine via llvm-dev
<
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> [Continuing discussion from https://reviews.llvm.org/D69428]
>>>>
>>>> Llvm is fairly conservative when eliminating global variables
(or
>>>> fields of such) that may point to dynamically allocated memory.
This
>>>> behavior is entirely to help leak checking tools such as
Valgrind, Google's
>>>> HeapLeakChecker, and LSAN, all of which treat memory that is
reachable at
>>>> exit as "not leaked", even though it will never be
freed. Without these
>>>> global variables to hold the pointer, the leak checkers
can't determine
>>>> that it is actually reachable, and will report a leak. Global
variables
>>>> that dynamically allocate memory but don't clean themselves
up are fairly
>>>> common in the wild, and various leak checkers have long not
reported errors.
>>>>
>>>> This behavior was added all the way back in 2012 in
>>>>
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html
>>>> .
>>>>
>>>> https://reviews.llvm.org/D69428 removed this behavior, and I
>>>> subsequently reverted it when many internal Google tests
started failing,
>>>> but I believe many other users who use leak checking will
encounter errors
>>>> when this hits more mainstream releases.
>>>>
>>>> So: What to do?
>>>>
>>>> Preventing a valid transformation (the global variables are
never read
>>>> and can be eliminated) to help the leak checkers leaves some
performance
>>>> and code size on the table. Just how much is unclear.
>>>>
>>>> On the other hand, having leak checkers suddenly start
reporting
>>>> failures where they didn't before also seems suboptimal.
Cleaning this
>>>> somewhat common scenario up is surprisingly difficult at the
user level.
>>>>
>>>> Some possibilities:
>>>>
>>>> 1. Only do this at high optimization levels, say -O3. This
would give
>>>> aggressive users all the performance we can, but also make leak
checkers
>>>> report leaks sometimes, but not others.
>>>>
>>>> 2. Hide it behind a flag or configurable option. Users who care
can set
>>>> it as they prefer. Creates more confusing options, different
testing
>>>> matrices and such, but everyone can get the behaviour that they
want.
>>>>
>>>> 3. Do it all the time, and users who encounter issues can clean
up
>>>> their code. Users get the most performance they possibly can,
but have to
>>>> clean up code or drop leak checking. Seems a little user
hostile.
>>>>
>>>> Other possibilities?:
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210420/bbb5bc60/attachment.html>

llvm dev - Apr 2021 - Eliminating global memory roots (or not) to help leak checkers

[llvm-dev] Eliminating global memory roots (or not) to help leak checkers

[llvm-dev] Eliminating global memory roots (or not) to help leak checkers