thr3ads.net - llvm dev - [llvm-dev] Clang -O0 performs optimizations that undermine dynamic bug-finding tools [Mar 2017]

Manuel Rigger via llvm-dev

2017-Mar-21 12:46 UTC

[llvm-dev] Clang -O0 performs optimizations that undermine dynamic bug-finding tools

Hi everyone,

I found that Clang -O0 performs optimizations that undermine dynamic
bug-finding tools.

First, most bug-finding tools such as ASan, Valgrind, Dr. Memory, Mudflap,
Purify and Safe Sulong (on which I am working) rely on detecting errors
while the program executes. They insert additional checks, either at
compile time or at run time, and these checks execute while the program is
running. For these tools to find errors, the errors must stay in the
program and must not be optimized away.
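
As a rough illustration of such an inserted check, the sketch below guards
an array read with an explicit bounds test. The helper name checked_load and
its length parameter are hypothetical; real tools work differently in detail
(ASan, for example, consults shadow memory instead of passing lengths
around), so this only shows the principle.

```c
#include <stddef.h>
#include <stdio.h>

/* A seven-element array, used here only as something to read from. */
int count[7] = {0, 0, 0, 0, 0, 0, 0};

/* Hypothetical stand-in for the check a tool inserts before a load.
 * Returns 1 and stores the element in *out when index is in bounds;
 * returns 0 and reports the access otherwise. */
int checked_load(const int *base, size_t length, size_t index, int *out) {
    if (index >= length) {
        fprintf(stderr, "out-of-bounds read: index %zu, length %zu\n",
                index, length);
        return 0;
    }
    *out = base[index];
    return 1;
}
```

Calling checked_load(count, 7, 7, &v) reports the access and returns 0
instead of reading past the array. If the access itself is removed from the
program, there is nothing left for such a check to guard.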

I think it is widely known that bugs are sometimes optimized away when
compiling with optimizations turned on (-O1, -O2, -O3), and that there is a
consensus that this is legitimate. However, I found that bugs are also
optimized away when compiling with -O0. For example, I recently opened a bug
report in the sanitizers GitHub repository [1] describing a case where ASan
did not find an out-of-bounds access (see below).

int count[7] = {0, 0, 0, 0, 0, 0, 0};

int main(int argc, char** args) {
    return count[7];
}

Note that Clang printed a warning and then optimized the invalid access
away (which is legitimate since it is UB). However, cases exist where no
warning is printed. For example, consider the following program:

#include <ctype.h>

int main() {
    isalnum(1000000);
    isalpha(1000000);
    iscntrl(1000000);
    isdigit(1000000);
    isgraph(1000000);
    islower(1000000);
    isprint(1000000);
    ispunct(1000000);
    isspace(1000000);
    isupper(1000000);
    isxdigit(1000000);
}

The glibc on my system implements these macros by calling __ctype_b_loc()
[2], which returns a lookup array that can be indexed by values between -128
and 255. Thus, I expected that, when compiling with -O0, the calls above
would result in out-of-bounds accesses that (at least in theory) could be
detected by bug-finding tools. However, Clang optimizes the calls away, so
bug-finding tools have no chance to find the out-of-bounds accesses. Note
that no warning is printed in this example.
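
To make the indexing range concrete, here is a small model of such a lookup
table. It assumes, as described above, that valid indices run from -128 to
255 (384 entries in total). The names model_table, model_init,
model_index_ok and model_isdigit are made up for illustration and are not
glibc's.

```c
/* Hypothetical model of the lookup table: 384 entries, addressed through
 * a pointer to entry 128 so that indices -128..255 are valid. */
enum { MODEL_MIN = -128, MODEL_MAX = 255, MODEL_SIZE = 384, MODEL_DIGIT = 1 };

unsigned short model_table[MODEL_SIZE];

/* Set the digit bit for '0'..'9' only; a real table carries more bits. */
void model_init(void) {
    for (int c = '0'; c <= '9'; c++)
        model_table[c - MODEL_MIN] |= MODEL_DIGIT;
}

/* Returns 1 if c can be used as an index without leaving the table. */
int model_index_ok(int c) {
    return c >= MODEL_MIN && c <= MODEL_MAX;
}

/* Checked variant of a macro-style lookup. Returns -1 for an index that
 * an unchecked lookup would read out of bounds. */
int model_isdigit(int c) {
    if (!model_index_ok(c))
        return -1;
    const unsigned short *center = model_table + 128;
    return (center[c] & MODEL_DIGIT) != 0;
}
```

In this model, an argument of 1000000 lies far outside the table, which is
the kind of access a bug-finding tool would be expected to flag.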

I think the calls are removed because __ctype_b_loc() is declared with
__attribute__((__const__)). When this attribute is used, Clang -O0 also
removes calls in other cases, for example the call to the function below.
Using pure instead of const as the attribute has the same effect.

#include <stdio.h>

int arr[10];

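/* Declared const: the compiler may assume the call has no side effects. */
void test(int index) __attribute__ ((__const__));

void test(int index) {
    printf("%d\n", arr[index]);  /* out of bounds for index >= 10 */
}

int main() {
    test(10000);  /* the call reported above as removed at -O0 */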
}
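
One thing worth checking when building test cases for bug-finding tools is
whether the call's result is used at all. The sketch below is a variant of
the example above under stated assumptions: it uses pure (since the function
reads the global arr), returns the loaded value, and consumes it in the
caller, so the call is not dead. The names load_elem and use_elem are
hypothetical, and how a particular Clang version treats this code would need
to be checked.

```c
int arr[10];

/* pure: no side effects, but may read global memory such as arr. */
int load_elem(int index) __attribute__((__pure__));

int load_elem(int index) {
    return arr[index];
}

/* The call's result feeds the return value, so the call is not dead. */
int use_elem(int index) {
    return load_elem(index) + 1;
}
```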

I have not yet found further cases, but I find it unsettling that even at
-O0 Clang optimizes away bugs, which dynamic bug-finding tools then can no
longer find. The cases that I presented exhibit undefined behavior. However,
following the "principle of least astonishment", I think that such errors
should be compiled in a way that lets bug-finding tools still detect them.

I have the following questions and suggestions:
- Is it known that Clang performs optimizations that hide program bugs,
even when compiling with -O0?
- Are there command line options to specify that no optimizations should be
performed? Until recently, I thought that -O0 had this effect.
- In any case, I would propose not performing optimizations at -O0, so that
dynamic bug-finding tools can find such bugs, or at least offering a flag
that turns off optimizations altogether.


[1] https://github.com/google/sanitizers/issues/773
[2] https://refspecs.linuxfoundation.org/LSB_2.0.1/LSB-Core/LSB-Core/baselib---ctype-b-loc.html

Reid Kleckner via llvm-dev

2017-Mar-21 16:41 UTC

I think deleting dead calls to __attribute__((const)) functions at -O0 is
just a bug. We need to inline always_inline functions at -O0, so we run the
inliner with a specially configured inline cost analysis. The generic
inliner deletes dead calls to functions with no side effects, so these
calls get deleted. We shouldn't do that. I filed
https://bugs.llvm.org//show_bug.cgi?id=32363 for this.
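
A minimal sketch of the situation described here, with hypothetical function
names: caller contains an always_inline call, which is what forces the
inliner to run at -O0, next to a call to a const function whose result is
unused. According to the explanation above, it is this second call that the
generic inliner would delete.

```c
/* Has to be inlined even at -O0 because of always_inline. */
static inline __attribute__((always_inline)) int twice(int x) {
    return 2 * x;
}

/* const: the result depends only on the argument. */
int square(int x) __attribute__((const));

int square(int x) {
    return x * x;
}

int caller(int x) {
    (void)square(x);   /* result unused: a "dead" call in the sense above */
    return twice(x);   /* call site that the -O0 inliner has to process */
}
```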


Daniel Berlin via llvm-dev

2017-Mar-21 16:47 UTC

> Following, I have the following questions/suggestions:
> - Is it known that Clang performs optimizations that hide program bugs,
> even when compiling with -O0?

Some, yes, some no.

> - Are there command line options to specify that no optimizations should
> be performed?

It is not possible to compile all code correctly without optimization,
interestingly enough.  It would be nice though.
In fact, I expect things like C++ constexpr make this significantly worse.

> Until recently, I thought that -O0 had this effect.
> - In each case, I would propose to not perform optimizations at -O0 to
> allow dynamic bug finding tools to find such bugs, or at least offer a flag
> to turn off optimizations altogether.

Again, this is impossible :)
For example, there are high profile things that depend on always_inline
functions not existing after inlining.
This inlining can definitely hide bugs (smashing call stacks, etc).
But we have to do it anyway.

So as a general statement, your proposal will not work.
If you revised it to "the minimum set of optimizations necessary for
correctness", it would be doable, but that set already conflicts in a
number of ways with "dynamic bug finding tools" :(

Manuel Rigger via llvm-dev

2017-Mar-21 21:14 UTC

2017-03-21 17:47 GMT+01:00 Daniel Berlin <dberlin at dberlin.org>:
>> Following, I have the following questions/suggestions:
>> - Is it known that Clang performs optimizations that hide program bugs,
>> even when compiling with -O0?
>
> Some, yes, some no.

I was not specific enough with this question. I can imagine that constant
folding at -O0 could hide signed integer overflow bugs and undefined
behavior in arithmetic expressions. However, I'm mainly interested in
memory errors (buffer overflows, NULL dereferences, double-free and
invalid-free errors, reads of uninitialized data, and others). Are there any
known cases where memory errors were optimized away at -O0, besides the ones
that I found?
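
For the arithmetic case, a run-time overflow check can be sketched as below.
It relies on the GCC/Clang builtin __builtin_add_overflow; the wrapper name
add_checked is hypothetical. If both operands were literals and the compiler
folded the expression, there would be no run-time addition left for such a
check to observe, which is the scenario imagined above.

```c
#include <limits.h>

/* Run-time check in the spirit of what a dynamic tool would perform.
 * Returns 1 and stores a + b in *sum, or returns 0 if the addition
 * would overflow int. */
int add_checked(int a, int b, int *sum) {
    return !__builtin_add_overflow(a, b, sum);
}
```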
>
>> - Are there command line options to specify that no optimizations
>> should be performed?
>
> It is not possible to compile all code correctly without optimization,
> interestingly enough.  It would be nice though.
> In fact, I expect things like C++ constexpr make this significantly worse.

I think that there is a difference between language semantics and how the
compiler implements them. For example, even though constexpr requires that a
value can be evaluated at compile time, a compiler is not required to
actually do so. I think that if optimizations in the compiler are required
to correctly implement a language construct, then that is an implementation
detail of the compiler and not really an optimization. I'm not sure whether
you were talking about this implementation level or in general. In general,
I'm not (yet) convinced that any code requires compiler optimizations to be
implemented correctly.

>> Until recently, I thought that -O0 had this effect.
>> - In each case, I would propose to not perform optimizations at -O0 to
>> allow dynamic bug finding tools to find such bugs, or at least offer a
>> flag to turn off optimizations altogether.
>
> Again, this is impossible :)
> For example, there are high profile things that depend on always_inline
> functions not existing after inlining.

IMO, always_inline functions that are inlined fall into the category of
language semantics. Anyway, I get your point that the line cannot always be
clearly drawn.

> This inlining can definitely hide bugs (smashing call stacks, etc).
> But we have to do it anyway.
>
Inline functions can hide bugs if the bug finding tool is based on a canary
approach. However, if the bug finding tool provides detection of
out-of-bounds accesses (which all tools that I consider to be bug finding
tools do) then out-of-bounds writes to the stack, such as overwriting the
return address, are detectable either way.
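
The difference between the two approaches can be sketched with a toy model.
Everything here is hypothetical: the frame layout, the canary bytes, and the
function names. The model only shows that a write which skips over the
canary passes a canary test but fails a bounds check.

```c
#include <string.h>

/* Hypothetical flat model of a stack frame: an 8-byte buffer, a 4-byte
 * canary, then 4 bytes standing in for a saved return address. */
enum { BUF_SIZE = 8, CANARY_OFF = 8, RET_OFF = 12, FRAME_SIZE = 16 };
const unsigned char CANARY[4] = {0xDE, 0xAD, 0xBE, 0xEF};

void frame_init(unsigned char frame[FRAME_SIZE]) {
    memset(frame, 0, FRAME_SIZE);
    memcpy(frame + CANARY_OFF, CANARY, sizeof CANARY);
}

/* Canary approach: only notices writes that changed the canary bytes. */
int canary_intact(const unsigned char frame[FRAME_SIZE]) {
    return memcmp(frame + CANARY_OFF, CANARY, sizeof CANARY) == 0;
}

/* Bounds-checking approach: rejects any write outside the buffer.
 * Returns 1 if the write was performed, 0 if it was flagged. */
int checked_write(unsigned char frame[FRAME_SIZE], int index,
                  unsigned char value) {
    if (index < 0 || index >= BUF_SIZE)
        return 0;
    frame[index] = value;
    return 1;
}
```

A direct store to frame[RET_OFF] leaves canary_intact() returning 1, whereas
the same store routed through checked_write() is flagged.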
>
> So as a general statement, your proposal will not work.
> If you revised it to "the minimum set of optimizations necessary for
> correctness", it would be doable, but that set already conflicts in a
> number of ways with "dynamic bug finding tools" :(
>
It certainly also depends on whether there is a consensus (is there one?)
on how Clang should behave when compiling with -O0. I see two different
ways here:

   - Clang should optimize code if it does not increase compilation time
   but has a clear run-time performance advantage.
   - Clang should refrain from performing any (or as many as possible)
   optimizations for debugging purposes and bug finding tools.

Manuel Rigger via llvm-dev

2017-Mar-24 13:21 UTC

2017-03-21 13:46 GMT+01:00 Manuel Rigger <rigger.manuel at gmail.com>:
> Hi everyone,
>
> I found that Clang -O0 performs optimizations that undermine dynamic
> bug-finding tools.
>
> First, most bug finding tools such as ASan, Valgrind, Dr. Memory, Mudflap,
> Purify and Safe Sulong (on which I am working) rely on detecting errors
> during the execution of the program. They either insert additional checks
> during compile-time or during run-time which are executed when the program
> is running. To find errors with these tools, it is necessary that these
> errors stay in the program and are not optimized away.
>
> I think it is widely known that bugs are sometimes optimized away when
> compiling with optimizations turned on (-O1, -O2, -O3), and that there is a
> consensus that this is legit. However, I found that also bugs are optimized
> away while compiling with -O0. For example, I recently opened a bug report
> on the LLVM sanitizers Github space [1] to describe a case where ASan did
> not find an out-of-bounds access (see below).
>
> int count[7] = {0, 0, 0, 0, 0, 0, 0};
>
> int main(int argc, char** args) {
>     return count[7];
> }
>
The LLVM IR produced by Clang still contains code for the undefined access:

@count = global [7 x i32] zeroinitializer, align 16

define i32 @main(i32 %argc, i8** %args) {
  ; ...
  %4 = load i32, i32* getelementptr inbounds ([7 x i32], [7 x i32]* @count, i64 1, i64 0), align 4
  ret i32 %4
}

Probably, the access is removed in the back end. Would it be reasonable to
not optimize the access away when compiling with -O0?

