Exception handling in LLVM is broken. It's as simple as that. We can simulate exception handling in most cases, but we cannot handle all cases. (For instance, SingleSource/UnitTests/ObjC/exceptions.m in our testsuite doesn't work on ARM at any optimization level above -O0.) And there's no way to coerce it to work with our current EH scheme.

We don't follow the exception handling ABI:

  http://www.codesourcery.com/public/cxx-abi/abi-eh.html

This has caused problems for at least one project I know of. Also, because we don't follow the ABI, our exception handling is slow (and people have noticed). We call _Unwind_Resume_or_Rethrow, which is expensive and unnecessary.

Inlining is a huge problem for our current EH scheme. Its inability to properly inline cleanups is the reason why I had to create the (very expensive) hack in DwarfEHPrepare. And in the case of SingleSource/UnitTests/ObjC/exceptions.m, it simply fails. The inlining code has to create "catch-alls" that throw and catch within the same function. To see an example of this, compile this simple code into LLVM IR:

  #include <iostream>

  struct A {
    ~A();
  };

  void bar();

  void foo() __attribute__((always_inline));
  void foo() {
    try {
      A a;
      bar();
    } catch (const char *c) {
      std::cout << "foo() catch value: " << c << "\n";
    }
  }

  int main() {
    try {
      foo();
    } catch (int i) {
      std::cout << "main() catch value: " << i << '\n';
    }
  }

The resulting IR is much larger than it needs to be, it has catch-alls, and it is very difficult to understand.

All of this is because the LLVM passes cannot properly reason about the exception handling code. The EH information resides in intrinsics, which may be located far from the `unwind' edge of the invoke they're associated with (this is resolved directly before CodeGen). So it's not always possible for the inlining pass, or any other pass, to have the knowledge it needs to modify the EH code in a sensible manner.
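For reference, the behavior the inliner has to preserve in the example above can be written out at the source level. The following is only a hand-inlined sketch, not what any LLVM pass emits: `bar(int kind)`, `trace`, and `run_inlined` are hypothetical names introduced here so the snippet can be compiled and checked on its own, and `bar` is given a body that throws a type chosen by the caller.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Event log so the order of cleanups and handlers can be inspected.
static std::vector<std::string> trace;

struct A {
    ~A() { trace.push_back("~A"); }   // the cleanup inside foo()'s try body
};

// Hypothetical stand-in for the external bar(): throws a type chosen by the caller.
void bar(int kind) {
    if (kind == 1) throw "text";      // const char*: matched by foo()'s handler
    if (kind == 2) throw 42;          // int: matched only by main()'s handler
}

// Roughly what main() looks like once foo() has been inlined by hand.
std::vector<std::string> run_inlined(int kind) {
    trace.clear();
    try {                             // main()'s region
        try {                         // foo()'s region, now nested in the caller
            A a;
            bar(kind);
        } catch (const char *c) {
            trace.push_back(std::string("foo caught ") + c);
        }
    } catch (int i) {
        trace.push_back("main caught " + std::to_string(i));
    }
    return trace;
}
```

In this sketch the destructor of `a` runs before either handler, and an `int` thrown by `bar` skips the inner handler and lands in the outer one. After inlining, a single throwing call is covered by two sets of catch clauses with a cleanup in between, which is the association the passes need to be able to see.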
If exception handling were to use native IR instructions, it would be easy for inlining and other passes to understand what's going on. And they would be able to modify the code in well-documented ways that would retain the correct EH semantics.

For all of the trouble it's causing us, exception handling is conceptually rather simple. A call within a section of code (called a `region', for lack of a better term) may throw an exception. When that occurs, execution continues at the catch handler. The existence of cleanups shouldn't complicate this. (They execute before the catch handler code, or not at all if it's C++ and there are no catch handlers on the stack.) All of the heavy lifting is done by external libraries -- the personality function and libunwind.

There's only one complication that I ran into when I was rewriting EH last year. The EH information needs to be available at two places in the code for code-gen to produce the correct EH tables. (Again, this isn't meant to be DWARF-specific, but it needs to support it.)

* At the throwing call -- We need it here because it's the origin of the exception, and it has the information of where we're coming from and the landing pad for the region containing the call, and

* At the landing pad, but after the cleanup code -- We need it here because this is where we generate a "jump table" (something like a switch statement) to go to a specific catch block.

Note that the cleanup code can be arbitrarily complex. This, coupled with the movement of the EH intrinsics, makes associating a particular set of catch blocks with a throwing call almost impossible (with our current scheme).

To summarize:

* Exception handling needs to be a first-class citizen of the LLVM IR in order for it to be understood and modified correctly by all passes.

* The information needed to generate correct EH tables needs to be available at more than just one point in the function.

-bw
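The two places where EH information is needed can be illustrated with a toy model. This is a sketch under loud assumptions: `Type`, `CallSite`, `select_handler`, `landing_pad`, and `unwind_from` are hypothetical names invented for illustration, and none of them mirror LLVM's data structures, the real personality function, or the actual EH table layout.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: these types do not mirror LLVM or the C++ ABI tables.
enum class Type { Int, CString, Other };

// Information needed at the throwing call: which landing pad covers the
// region, and which catch types that landing pad can dispatch to, in order.
struct CallSite {
    int landing_pad;                 // id of the landing pad for the region
    std::vector<Type> catch_types;   // catch clauses reachable from this call
};

// Stand-in for the personality function: returns a 1-based selector for the
// first matching catch clause, or 0 when only cleanups should run.
int select_handler(const CallSite &site, Type thrown) {
    for (std::size_t i = 0; i < site.catch_types.size(); ++i)
        if (site.catch_types[i] == thrown) return static_cast<int>(i) + 1;
    return 0;
}

// Stand-in for a landing pad: cleanups run first, then the selector drives a
// switch that plays the role of the "jump table".
std::string landing_pad(int selector, std::vector<std::string> &log) {
    log.push_back("cleanup");
    switch (selector) {
    case 1:  return "catch #1";
    case 2:  return "catch #2";
    default: return "resume unwinding";
    }
}

// Ties the two together: the call site supplies the catch list, and the
// landing pad consumes the selector after the cleanup code has run.
std::string unwind_from(const CallSite &site, Type thrown,
                        std::vector<std::string> &log) {
    log.push_back("enter pad " + std::to_string(site.landing_pad));
    return landing_pad(select_handler(site, thrown), log);
}
```

The point of the model is that the selector computed from the call site's catch list is only consumed after the cleanup code. If the cleanup is arbitrarily complex and the catch list is stored somewhere detached from the call, nothing in the code ties the `switch` back to the call that produced the selector.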
Pardon the basic question, but does this apply to clang, llvm-gcc, or both?

Thanks,
-David

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Bill Wendling
Sent: Tuesday, April 12, 2011 3:05 PM
To: llvmdev List
Subject: [LLVMdev] Exception Handling Problems

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Apr 12, 2011, at 5:48 PM, David Dunkle wrote:

> Pardon the basic question, but does this apply to clang, llvm-gcc, or
> both?

Yes.

-eric
On 12 April 2011 23:04, Bill Wendling <wendling at apple.com> wrote:

> * Exception handling needs to be a first-class citizen of the LLVM IR in order
> for it to be understood and modified correctly by all passes.

Agreed!

> * The information needed to generate correct EH tables needs to be available
> at more than just one point in the function.

Indeed, it needs to be consistent and reachable from multiple places, both code and unwind blocks. The unwind call graph must be a first-class citizen, it must be tightly coupled with the normal flow (to allow inlining), and the semantics must be clear, so passes won't destroy it easily.

However, since the C++ ABI is but one example of how to do EH and LLVM is language agnostic, I'm inclined to say that this is an impossible task. This is not to say that it can't be done, far from it, but that it won't be as clean as we'd hope for.

There are some things (like exception handling and bitfields) where, no matter how hard you try refactoring, it always ends up dirty. What we need is a clear set of premises (just like John has just made) that are language agnostic, and to follow them wholeheartedly.

We should only try to come up with a plan for the IR once those premises have been agreed on in a document in SVN.

cheers,
--renato