Exception handling in LLVM is broken. It's as simple as that.
We can simulate exception handling in most cases, but we cannot handle all
cases. (For instance, SingleSource/UnitTests/ObjC/exceptions.m in our testsuite
doesn't work on ARM at any optimization level above -O0.) And there's no
way to coerce it to work with our current EH scheme.
We don't follow the exception handling ABI:
http://www.codesourcery.com/public/cxx-abi/abi-eh.html
This has caused problems for at least one project I know of. Also, because we
don't follow the ABI, our exception handling is slow (and people have
noticed). We call _Unwind_Resume_or_Rethrow, which is expensive and unnecessary.
Inlining is a huge problem for our current EH scheme. Its inability to properly
inline cleanups is the reason why I had to create the (very expensive) hack in
DwarfEHPrepare. And in the case of SingleSource/UnitTests/ObjC/exceptions.m, it
simply fails. The inlining code has to create "catch-alls" that throw and
catch within the same function. To see an example of this, compile this simple
code into LLVM IR:
  #include <iostream>

  struct A {
    ~A();
  };

  void bar();
  void foo() __attribute__((always_inline));

  void foo() {
    try {
      A a;
      bar();
    } catch (const char *c) {
      std::cout << "foo() catch value: " << c << "\n";
    }
  }

  int main() {
    try {
      foo();
    } catch (int i) {
      std::cout << "main() catch value: " << i << '\n';
    }
  }
The resulting IR is much larger than it needs to be, it has catch-alls, and it
is very difficult to understand.
All of this is because the LLVM passes cannot properly reason about the
exception handling code. The EH information resides in intrinsics, which may be
located far from the `unwind' edge of the invoke they're associated with (this
is resolved directly before CodeGen). So it's not always possible for the
inlining pass, or any other pass, to have the knowledge it needs to modify the
EH code in a sensible manner.
If exception handling were to use native IR instructions, it would be easy for
inlining and other passes to understand what's going on. And they would be able
to modify the code in well-documented ways that would retain the correct EH
semantics.
For all of the trouble it's causing us, exception handling is conceptually
rather simple. A call within a section of code (called a `region', for lack of
a better term) may throw an exception. When that occurs, execution continues at
the catch handler. The existence of cleanups shouldn't complicate this. (They
execute before the catch handler code, or not at all if it's C++ and there are
no catch handlers on the stack.) All of the heavy lifting is done by external
libraries -- the personality function and libunwind.
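
To make that ordering concrete, here is a minimal C++ sketch. It is not taken
from the LLVM tree; the names Cleanup, may_throw, run_region and the events log
are made up for this illustration. It only shows the source-level behavior
described above: a call inside a region throws, the region's cleanup runs, and
only then does the catch handler execute.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log, used only to record the order of events.
static std::vector<std::string> events;

// The destructor plays the role of the region's cleanup code.
struct Cleanup {
  ~Cleanup() { events.push_back("cleanup"); }
};

// Stands in for a call inside the region that may throw.
void may_throw() {
  events.push_back("throw");
  throw "boom";  // the thrown object has type const char *
}

// Runs one region and returns the order in which things happened.
std::vector<std::string> run_region() {
  events.clear();
  try {
    Cleanup c;
    may_throw();
    events.push_back("not reached");
  } catch (const char *msg) {
    // By the time the handler runs, the cleanup has already executed.
    assert(events.back() == "cleanup");
    events.push_back(std::string("catch: ") + msg);
  }
  return events;
}
```

Calling run_region() from a small main() and printing the returned vector
should list the throw first, then the cleanup, then the catch handler.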
There's only one complication that I ran into when I was rewriting EH last
year. The EH information needs to be available at two places in the code for
code-gen to produce the correct EH tables. (Again, this isn't meant to be
DWARF-specific, but it needs to support it.)
* At the throwing call -- We need it here because it's the origin of the
  exception, and it has the information of where we're coming from and the
  landing pad for the region containing the call, and
* At the landing pad, but after the cleanup code -- We need it here because
  this is where we generate a "jump table" (something like a switch statement)
  to go to a specific catch block. Note that the cleanup code can be
  arbitrarily complex. This, coupled with the movement of the EH intrinsics,
  makes associating a particular set of catch blocks with a throwing call
  almost impossible (with our current scheme).
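
The two points above can be modeled with a toy C++ sketch. This is purely an
illustration under assumptions: CallSiteInfo, select_handler and
run_landing_pad are hypothetical names, not LLVM data structures or runtime
APIs, and integer type ids stand in for real type info. The point is only that
the same per-call information (which catch clauses apply) is consulted once for
the throwing call and again after the cleanup, where the switch-like dispatch
happens.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-call EH information (an assumption for this sketch).
struct CallSiteInfo {
  int landing_pad_id;               // where unwinding lands for this call
  std::vector<int> catch_type_ids;  // catch clauses of the enclosing region
};

// Stand-in for the personality function's decision: returns the 1-based
// index of the first matching catch clause, or 0 when nothing matches.
int select_handler(const CallSiteInfo &site, int thrown_type_id) {
  for (std::size_t i = 0; i < site.catch_type_ids.size(); ++i)
    if (site.catch_type_ids[i] == thrown_type_id)
      return static_cast<int>(i) + 1;
  return 0;
}

// Stand-in for the landing pad: cleanup first, then the "jump table".
std::string run_landing_pad(const CallSiteInfo &site, int thrown_type_id,
                            std::vector<std::string> &log) {
  int selector = select_handler(site, thrown_type_id);
  assert(selector >= 0 &&
         selector <= static_cast<int>(site.catch_type_ids.size()));

  log.push_back("cleanup");  // arbitrarily complex in real code

  // The dispatch still needs the catch list that belonged to the call.
  switch (selector) {
    case 1: return "catch #1";
    case 2: return "catch #2";
    default: return "resume unwinding";
  }
}
```

In this model the cleanup sits between the selection and the dispatch, which
is exactly the gap across which the real EH information has to stay associated
with the throwing call.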
To summarize:
* Exception handling needs to be a first-class citizen of the LLVM IR in order
  for it to be understood and modified correctly by all passes.
* The information needed to generate correct EH tables needs to be available
  at more than just one point in the function.
-bw
Pardon the basic question, but does this apply to clang, llvm-gcc, or
both?
Thanks,
-David
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
On Behalf Of Bill Wendling
Sent: Tuesday, April 12, 2011 3:05 PM
To: llvmdev List
Subject: [LLVMdev] Exception Handling Problems
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Apr 12, 2011, at 5:48 PM, David Dunkle wrote:
> Pardon the basic question, but does this apply to clang, llvm-gcc, or
> both?

Yes.

-eric
On 12 April 2011 23:04, Bill Wendling <wendling at apple.com> wrote:
> * Exception handling needs to be a first-class citizen of the LLVM IR in order
>   for it to be understood and modified correctly by all passes.

Agreed!

> * The information needed to generate correct EH tables needs to be available
>   at more than just one point in the function.

Indeed, it needs to be consistent and reachable from multiple places, both code
and unwind blocks. The unwind call graph must be a first-class citizen, it must
be tightly coupled with the normal flow (to allow inlining), and the semantics
must be clear, so passes won't destroy it easily.

However, since the C++ ABI is but one example of how to do EH and LLVM is
language agnostic, I'm inclined to say that this is an impossible task. This is
not to say that it can't be done, far from it, but that it won't be as clean as
we'd hope for. There are some things (like exception handling and bitfields)
where, no matter how hard you try refactoring, the result always ends up dirty.

What we need is a clear set of premises (just like the ones John has just made)
that are language agnostic, and to follow them wholeheartedly. We should only
try to come up with a plan for the IR once those premises have been agreed on
in a document in SVN.

cheers,
--renato