Manuel Rigger via llvm-dev
2017-Mar-21 12:46 UTC
[llvm-dev] Clang -O0 performs optimizations that undermine dynamic bug-finding tools
Hi everyone,

I found that Clang -O0 performs optimizations that undermine dynamic bug-finding tools.

First, most bug-finding tools such as ASan, Valgrind, Dr. Memory, Mudflap, Purify and Safe Sulong (on which I am working) rely on detecting errors during the execution of the program. They insert additional checks, either at compile time or at run time, which are executed when the program is running. To find errors with these tools, it is necessary that these errors stay in the program and are not optimized away.

I think it is widely known that bugs are sometimes optimized away when compiling with optimizations turned on (-O1, -O2, -O3), and that there is a consensus that this is legitimate. However, I found that bugs are also optimized away when compiling with -O0. For example, I recently opened a bug report on the LLVM sanitizers GitHub space [1] to describe a case where ASan did not find an out-of-bounds access (see below).

    int count[7] = {0, 0, 0, 0, 0, 0, 0};

    int main(int argc, char** args) {
        return count[7];
    }

Note that Clang printed a warning and then optimized the invalid access away (which is legitimate since it is UB). However, cases exist where no warning is printed. For example, consider the following program:

    #include <ctype.h>

    int main() {
        isalnum(1000000);
        isalpha(1000000);
        iscntrl(1000000);
        isdigit(1000000);
        isgraph(1000000);
        islower(1000000);
        isprint(1000000);
        ispunct(1000000);
        isspace(1000000);
        isupper(1000000);
        isxdigit(1000000);
    }

The glibc (on my system) implements the macros by calling __ctype_b_loc() [2], which returns a lookup array that can be indexed by values between -128 and 255. Thus, I expected that, when compiling with -O0, the calls above would result in out-of-bounds accesses that (at least in theory) could be detected by bug-finding tools. However, Clang optimizes the calls away, so bug-finding tools have no chance to find the out-of-bounds accesses. Note that in this example no warning is printed.
I think the calls are removed since __ctype_b_loc() has an __attribute__ ((__const__)). When the attribute is used, Clang -O0 also removes calls in other instances, for example in the function below. Using pure instead of const as an attribute has the same effect.

    #include <stdio.h>

    int arr[10];

    void test() __attribute__ ((__const__));

    void test(int index) {
        printf("%d\n", arr[index]);
    }

    int main() {
        test(10000);
    }

I have not yet found further cases, but I find it unsettling that even when compiling with -O0 Clang optimizes bugs away that then can no longer be found by dynamic bug-finding tools. The cases that I presented exhibit undefined behavior. However, according to the "Principle of least astonishment", I think that the errors should be compiled in a way that lets bug-finding tools still detect them.

I have the following questions/suggestions:
- Is it known that Clang performs optimizations that hide program bugs, even when compiling with -O0?
- Are there command line options to specify that no optimizations should be performed? Until recently, I thought that -O0 had this effect.
- In any case, I would propose to not perform optimizations at -O0 to allow dynamic bug-finding tools to find such bugs, or at least to offer a flag to turn off optimizations altogether.

[1] https://github.com/google/sanitizers/issues/773
[2] https://refspecs.linuxfoundation.org/LSB_2.0.1/LSB-Core/LSB-Core/baselib---ctype-b-loc.html
Reid Kleckner via llvm-dev
2017-Mar-21 16:41 UTC
[llvm-dev] Clang -O0 performs optimizations that undermine dynamic bug-finding tools
I think deleting dead calls to __attribute__((const)) functions at -O0 is just a bug. We need to inline always_inline functions at -O0, so we run the inliner with a specially configured inline cost analysis. The generic inliner deletes dead calls to functions with no side effects, so these calls get deleted. We shouldn't do that.

I filed https://bugs.llvm.org//show_bug.cgi?id=32363 for this.
Daniel Berlin via llvm-dev
2017-Mar-21 16:47 UTC
[llvm-dev] Clang -O0 performs optimizations that undermine dynamic bug-finding tools
> Following, I have the following questions/suggestions:
> - Is it known that Clang performs optimizations that hide program bugs,
> even when compiling with -O0?

Some, yes, some no.

> - Are there command line options to specify that no optimizations should
> be performed?

It is not possible to compile all code correctly without optimization, interestingly enough. It would be nice though.
In fact, I expect things like C++ constexpr make this significantly worse.

> Until recently, I thought that -O0 had this effect.
> - In each case, I would propose to not perform optimizations at -O0 to
> allow dynamic bug finding tools to find such bugs, or at least offer a flag
> to turn off optimizations altogether.

Again, this is impossible :)

For example, there are high profile things that depend on always_inline functions not existing after inlining.
This inlining can definitely hide bugs (smashing call stacks, etc).
But we have to do it anyway.

So as a general statement, your proposal will not work.
If you revised it to "the minimum set of optimizations necessary for correctness", it would be doable, but that set already conflicts in a number of ways with "dynamic bug finding tools" :(
Manuel Rigger via llvm-dev
2017-Mar-21 21:14 UTC
[llvm-dev] Clang -O0 performs optimizations that undermine dynamic bug-finding tools
2017-03-21 17:47 GMT+01:00 Daniel Berlin <dberlin at dberlin.org>:

>> Following, I have the following questions/suggestions:
>> - Is it known that Clang performs optimizations that hide program bugs,
>> even when compiling with -O0?
>
> Some, yes, some no.

I was too unspecific about this question. I can imagine that constant folding at -O0 could hide signed integer overflow bugs and undefined behavior in arithmetic expressions. However, I'm mainly interested in memory errors (buffer overflows, NULL dereferences, double-free and invalid free errors, reads of uninitialized data and others). Are there any records of memory errors being optimized away at -O0, besides the ones that I found?

>> - Are there command line options to specify that no optimizations should
>> be performed?
>
> It is not possible to compile all code correctly without optimization,
> interestingly enough. It would be nice though.
> In fact, i expect things like C++ constexpr make this significantly worse.

I think that there is a difference between language semantics and how the compiler implements them. For example, even though constexpr requires that a value can be evaluated at compile time, a compiler is not required to actually do it. I think that if optimizations in the compiler are required to correctly implement a language construct, then it is an implementation detail of the compiler, or not an optimization. I'm not sure if you were talking about this implementation level or in general. In general, I'm not (yet) convinced that any code requires compiler optimizations to be implemented correctly.

>> Until recently, I thought that -O0 had this effect.
>> - In each case, I would propose to not perform optimizations at -O0 to
>> allow dynamic bug finding tools to find such bugs, or at least offer a flag
>> to turn off optimizations altogether.
>
> Again, this is impossible :)
> For example, there are high profile things that depend on always_inline
> functions not existing after inlining.

IMO, always_inline functions that are inlined fall into the category of language semantics. Anyway, I get your point that the line cannot always be clearly drawn.

> This inlining can definitely hide bugs (smashing call stacks, etc).
> But we have to do it anyway.

Inlined functions can hide bugs if the bug-finding tool is based on a canary approach. However, if the bug-finding tool provides detection of out-of-bounds accesses (which all tools that I consider to be bug-finding tools do), then out-of-bounds writes to the stack, such as overwriting the return address, are detectable either way.

> So as a general statement, your proposal will not work.
> If you revised it to "the minimum set of optimizations necessary for
> correctness", it would be doable, but that set already conflicts in a
> number of ways with "dynamic bug finding tools" :(

It certainly also depends on whether there is a consensus (is there one?) on how Clang should behave when compiling with -O0. I see two different ways here:
- Clang should optimize code if it does not increase compilation time but has a clear run-time performance advantage.
- Clang should refrain from performing any (or as many as possible) optimizations for debugging purposes and bug-finding tools.
Manuel Rigger via llvm-dev
2017-Mar-24 13:21 UTC
[llvm-dev] Clang -O0 performs optimizations that undermine dynamic bug-finding tools
2017-03-21 13:46 GMT+01:00 Manuel Rigger <rigger.manuel at gmail.com>:

> int count[7] = {0, 0, 0, 0, 0, 0, 0};
>
> int main(int argc, char** args) {
>   return count[7];
> }

The LLVM IR produced by Clang still contains code for the undefined access:

    @count = global [7 x i32] zeroinitializer, align 16

    define i32 @main(i32 %argc, i8** %args) {
      ; ...
      %4 = load i32, i32* getelementptr inbounds ([7 x i32], [7 x i32]* @count, i64 1, i64 0), align 4
      ret i32 %4
    }

Probably, the access is removed in the back end. Would it be reasonable to not optimize the access away when compiling with -O0?