Michael Kruse via llvm-dev
2019-Aug-02 01:35 UTC
[llvm-dev] [RFC] Stack overflow and optimizations
During the review of https://reviews.llvm.org/D59978 we got to the question whether we can optimize based on the assumption that stack overflow is undefined behavior. While I think it is according to the C++ standard, Windows Structured Exception Handling and signal handlers might allow programs to recover/exit gracefully. Concretely, the patch D59978 wants add the noreturn attribute to functions that recurse infinitly. Also, a function that unconditionally calls a function with a noreturn attribute can be deduced to be noreturn. Consider the following program: ``` #include <cstdlib> #include <cstdio> void overflow(); // Secret function that triggers a stack overflow int catchoverflow() { __try { overflow(); printf("Exception NOT caught\n"); return 0; } __except(true) { printf("Exception caught\n"); return 1; } return 2; } int main() { auto result = catchoverflow(); printf("Done execution result=%i\n", result); return EXIT_SUCCESS; } ``` An implementation of the overflow function could be: ``` void overflow() { overflow(); printf("x"); } ``` Compiled with msvc or clang-cl, this prints: ``` Exception caught Done execution result=1 ``` That is, even though overflow() does not return normally, catchoverflow() does. The documentation for the "noreturn" attribute currently says (https://llvm.org/docs/LangRef.html#function-attributes):> This function attribute indicates that the function never returns normally. This produces undefined behavior at runtime if the function ever does dynamically return.The motivation of marking catchoverflow() is that a program cannot reasonably expect to continue normal execution after a stack overflow happened, and shouldn't inhibit optimizations. In this RFC, I would like to clarify the following questions: 1. Does the "noreturn" attribute imply that the function is not returning via the unwind edge of an invoke instruction? 2. Can/should overflow() be marked noreturn? 3. Can/should catchoverflow() be marked noreturn? 4a. More generally, for the purpose of optimizations, can we assume that functions do not overflow? That is, is stack overflow is undefined behavior? 4b. If not undefined behavior, can we assume that if a stack overflow occurs, the program will be terminated? This would e.g. stop side-effect code to be moved before the overflowing call; otherwise it would be executed on overflow termination. How would we check whether a function can overflow the stack? Whatever the answers to these questions are, I think we should clarify what the function attributes noreturn, nounwind, willreturn mean. The most explicit way would be listing all the possible outcomes of calling a function and exclude possibilities per attribute. 1. Call returns to next instruction / invoke branches to "to" block. 2a. Call throws synchronous exception and stack is unwound to parent caller 2b. Invoke throws synchronous exception and branches to "unwind to" block 3a. Call throws asynchronous exception and stack is unwound to parent caller 3b. Invoke throws asynchronous exception and branches to "unwind to" block 4a. Call/invoke triggers thread/program termination and dtors/atexit funcs are called (e.g. by calling exit()) 4b. Call/invoke triggers thread/program immediate termination (e.g. by calling abort()) 5. Signal handler is executed and which kills the program 6. Immediate thread/program termination by external cause (e.g. by , `TerminateThread`, `kill -9`, kernel panic, power switch) Michael
Michael Kruse via llvm-dev
2019-Aug-02 02:27 UTC
[llvm-dev] [RFC] Stack overflow and optimizations
Two aspects that I forgot to mention in the RFC mail: First: In some situations, we already assume that stack overflows are unobservable behaviours. For instance, optimizations of stack variables cause the stack frames to be smaller, such that the stack overflow does not occur where it would have in an unoptimized program. Also, we optimize the function ``` void overflow() { overflow(); } ``` to ``` ; Function Attrs: nounwind readnone sspstrong uwtable define dso_local void @"?overflow@@YAXXZ"() local_unnamed_addr #0 { entry: ret void } ``` Using the assumption that a call overflow() does not return, we could optimize overflow() as defined in ``` void overflow() { overflow(); printf("x"); } ``` to ``` void overflow() { overflow(); } ``` i.e. to the same function (which does return). We currently do not apply this, but optimizations based on noreturn might change this. Second: Does the C++ "noexcept" specifier have an influence on whether the compiler can assume that a function does not overflow its stack? Michael Am Do., 1. Aug. 2019 um 20:35 Uhr schrieb Michael Kruse <llvmdev at meinersbur.de>:> > During the review of https://reviews.llvm.org/D59978 we got to the > question whether we can optimize based on the assumption that stack > overflow is undefined behavior. While I think it is according to the > C++ standard, Windows Structured Exception Handling and signal > handlers might allow programs to recover/exit gracefully. > > Concretely, the patch D59978 wants add the noreturn attribute to > functions that recurse infinitly. Also, a function that > unconditionally calls a function with a noreturn attribute can be > deduced to be noreturn. > > Consider the following program: > ``` > #include <cstdlib> > #include <cstdio> > > void overflow(); // Secret function that triggers a stack overflow > > int catchoverflow() { > __try { > overflow(); > printf("Exception NOT caught\n"); > return 0; > } __except(true) { > printf("Exception caught\n"); > return 1; > } > return 2; > } > > int main() { > auto result = catchoverflow(); > printf("Done execution result=%i\n", result); > return EXIT_SUCCESS; > } > ``` > An implementation of the overflow function could be: > ``` > void overflow() { > overflow(); > printf("x"); > } > ``` > > Compiled with msvc or clang-cl, this prints: > ``` > Exception caught > Done execution result=1 > ``` > That is, even though overflow() does not return normally, catchoverflow() does. > > The documentation for the "noreturn" attribute currently says > (https://llvm.org/docs/LangRef.html#function-attributes): > > This function attribute indicates that the function never returns normally. This produces undefined behavior at runtime if the function ever does dynamically return. > > The motivation of marking catchoverflow() is that a program cannot > reasonably expect to continue normal execution after a stack overflow > happened, and shouldn't inhibit optimizations. > > In this RFC, I would like to clarify the following questions: > 1. Does the "noreturn" attribute imply that the function is not > returning via the unwind edge of an invoke instruction? > 2. Can/should overflow() be marked noreturn? > 3. Can/should catchoverflow() be marked noreturn? > 4a. More generally, for the purpose of optimizations, can we assume > that functions do not overflow? That is, is stack overflow is > undefined behavior? > 4b. If not undefined behavior, can we assume that if a stack overflow > occurs, the program will be terminated? This would e.g. stop > side-effect code to be moved before the overflowing call; otherwise it > would be executed on overflow termination. How would we check whether > a function can overflow the stack? > > Whatever the answers to these questions are, I think we should clarify > what the function attributes noreturn, nounwind, willreturn mean. The > most explicit way would be listing all the possible outcomes of > calling a function and exclude possibilities per attribute. > > 1. Call returns to next instruction / invoke branches to "to" block. > 2a. Call throws synchronous exception and stack is unwound to parent caller > 2b. Invoke throws synchronous exception and branches to "unwind to" block > 3a. Call throws asynchronous exception and stack is unwound to parent caller > 3b. Invoke throws asynchronous exception and branches to "unwind to" block > 4a. Call/invoke triggers thread/program termination and dtors/atexit > funcs are called (e.g. by calling exit()) > 4b. Call/invoke triggers thread/program immediate termination (e.g. by > calling abort()) > 5. Signal handler is executed and which kills the program > 6. Immediate thread/program termination by external cause (e.g. by , > `TerminateThread`, `kill -9`, kernel panic, power switch) > > > Michael
Ralf Jung via llvm-dev
2019-Aug-10 16:54 UTC
[llvm-dev] [RFC] Stack overflow and optimizations
Hi Michael, Please keep in mind non-C/C++ frontends. For example, in Rust, we promise to avoid all undefined behavior in safe code. There is no reasonable compositional analysis that can statically detect stack overflows (I know safety-critical systems are subjected such analyses, but those could not reasonably be enforced on all Rust code -- most of them just forbid recursion, for example), so we defer to a dynamic check instead: we have a guard page for every thread, and we make sure that for big stack frames being pushed, every page covered by that frame is touched at least once (so that we will never skip the guard page). If LLVM considers stack overflows as UB, the guard page is not enough. Dynamic checks cannot guard against UB exploited by the compiler. So, in that case it would become basically impossible to compile a safe language to LLVM, as far as I can see.> Two aspects that I forgot to mention in the RFC mail: > > First: In some situations, we already assume that stack overflows are > unobservable behaviours. For instance, optimizations of stack > variables cause the stack frames to be smaller, such that the stack > overflow does not occur where it would have in an unoptimized program.That's fine. Basically, the abstract machine can be specified to *non-deterministically* trigger a stack overflow any time the stack grows. So essentially the running program has to be prepared that the stack might overflow any time, but at the same time cannot rely on *when* it overflows. This is a standard "trick" to formally handle resource exhaustion without having to specify exactly how much of that resource is available. This gives LLVM the freedom to grow or shrink stack frames. But this dos *not* mean that stack overflows are UB!> Also, we optimize the function > ``` > void overflow() { > overflow(); > } > ``` > to > ``` > ; Function Attrs: nounwind readnone sspstrong uwtable > define dso_local void @"?overflow@@YAXXZ"() local_unnamed_addr #0 { > entry: > ret void > } > ```This looks like an instance of the C++ forward progress assumption? (C AFAIK only has that assumption for [some] loops, not recursive functions, so arguably this is a miscompilation for C.) Kind regards, Ralf