thr3ads.net - llvm dev - [LLVMdev] Exception Handling Proposal

Renato Golin

2011-May-17 22:16 UTC

[LLVMdev] Exception Handling Proposal - Second round

Hi all,

Following John's, Duncan's and Bill's proposals about exception
handling, I thought I'd summarise what has been discussed so far.

 ** The problems we're trying to solve are:

 P1. Different languages have different EH concepts and IR needs to be
agnostic (as possible) about that
 P2. Inlining and optimisations (currently) destroy the EH semantics
and produce code that can't unwind
 P3. Clutter in the IR representation of EH leads to unnecessary
complexity when optimising or inlining
 P4. The back-end should have a simple and unified representation on
which to build (different) EH tables


 ** The key-facts I've collected after re-reading all emails are:

 F1. There are different families of EH: zero-cost, SjLj etc and they
should have similar IR representations
 F2. Back-ends should know how to implement them or bail out (thus,
representation should be *clear*)
 F3. Optimisations should make sure unwinding and normal flow do not overlap
 F4. Avoid artificial impositions on basic-block behaviour and
dependency to simplify optimisations
 F5. We *must* keep the unwind actions and the order in which they
execute when inlining
 F6. Some instructions (such as divide in Java) can throw exceptions
without an explicit dispatch mechanism


There are two quasi-orthogonal proposals to change the EH mechanism:
 - Duncan Sands', regarding rules on how to protect the dispatch
mechanism (and preserve actions and their orders) when inlining or
optimising code, and
 - Bill Wendling's IR simplification using the "dispatch" mechanism to
better express unwinding flow and ease inlining and optimisations


 ** Proposal 1: Rules on how to protect the unwind flow (P2, F3, F4, F5)

Current LLVM inlining can create some unreachable blocks that get
optimised away (and shouldn't). Some languages demand that certain
clean-up areas must be executed, others that they must not. Some
libstdc++ code apparently relies on this implementation-defined
behaviour. To solve this problem, workarounds were coded to redirect
flow to catch-all regions, which created other problems, etc.

Instead of running around in circles, the following rules must be
observed when inlining/optimising:
 - When inlining a dispatch area, the inlined block must resume to the
inlinee's dispatch block
 - If using eh.selector, inlining should append actions to inlinee's
selector block
 - Optimisers should not remove unwind actions nor change their
control flow (unless semantics is preserved)
 - If we allow changes, we need to explicitly describe the semantics
or have one to rule them all
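
To make the "keep the actions and their order" rule concrete, here is a
minimal C++ sketch (not from the thread; Cleanup, callee, caller and trace
are hypothetical names). Whether or not callee() is inlined into caller(),
the observable order of unwind actions should stay the same: the callee's
clean-up runs first, then the handler, then the caller's own clean-up on the
normal path.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records the order in which unwind actions and handlers run.
static std::vector<std::string> trace;

// A local with a destructor gives its enclosing function a clean-up action.
struct Cleanup {
  std::string name;
  explicit Cleanup(std::string n) : name(std::move(n)) {}
  ~Cleanup() { trace.push_back("cleanup " + name); }
};

// Hypothetical callee: a natural candidate for inlining into caller().
inline void callee() {
  Cleanup c("callee");
  throw 42;  // unwinding starts here
}

void caller() {
  Cleanup c("caller");  // outside the try block, destroyed on normal exit
  try {
    callee();
  } catch (int) {
    trace.push_back("catch int");
  }
}
```

Calling caller() should leave trace as "cleanup callee", "catch int",
"cleanup caller"; an inliner that dropped or reordered any of these would
change the program's behaviour.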


 ** Proposal 2: Dispatch and basic-block markings (P3, P4, F5)

Replace eh.selector/eh.typeid with a dispatch mechanism that
explicitly lists the possible catch areas, filters and personality, and
that belongs to a basic block. That block needs a "landingpad" attribute
to help optimisations understand that it is special for EH (this might
not be strictly necessary).

The general syntax of the dispatch is:

lpad: landingpad
 %eh_ptr = tail call i8* @llvm.eh.exception()
 dispatch region label %lpad resume to label %unwind
   catches [
     %struct.__fundamental_type_info_pseudo* @_ZTIi, label %ch.int.main
   ]
   personality [i32 (...)* @__gxx_personality_v0]

This dispatch instruction is the last instruction in its block. It
explicitly belongs to that block ("region label %lpad") and resumes
unwinding to label %unwind. It catches only INT exceptions (whatever
that means in the source language), and the personality routine that
will interpret it at run time is __gxx_personality_v0.
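
For readers who want a source-level picture, the following C++ is a guess at
the kind of code that would lead to such a landing pad (the thread does not
show the original source, and catch_int_only is a hypothetical name). The
catch (int) clause corresponds to the @_ZTIi entry in the catches list;
anything else keeps unwinding, which is what "resume to label %unwind"
expresses.

```cpp
#include <cassert>

// Hypothetical source, assumed for illustration only.
int catch_int_only(bool throw_int) {
  try {
    if (throw_int)
      throw 42;   // type int: matches the typeinfo for int (@_ZTIi)
    throw 42L;    // type long: no match here, unwinding resumes outward
  } catch (int e) {  // plays the role of the int handler label
    return e;
  }
  return 0;  // not reached; keeps compilers quiet about missing returns
}
```

With throw_int set, the function returns 42; otherwise the long exception
leaves the function and must be caught by some outer handler.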

When optimising, passes should see the catch/clean-up blocks that are
dominated by the landing pad and keep their natural flow. When
inlining, they should be moved inside the inlinee and the "resume
label" should be the inlinee's dispatch landing pad, so the sequence
of actions (and the actions themselves) is kept intact.
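
The resume chaining can be pictured with a toy model. This is only a sketch
under my own assumptions, not part of the proposal: Catch, Dispatch and
select_handler are invented names, and type matching is reduced to comparing
typeinfo symbol names. Each dispatch lists its catches and points at the
dispatch it resumes to; inlining amounts to pointing the inlined dispatch's
resume at the dispatch that covers the call site.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of the proposed dispatch; every name here is illustrative.
struct Catch {
  std::string type_info;  // e.g. "_ZTIi" for int
  std::string handler;    // label of the catch block
};

struct Dispatch {
  std::vector<Catch> catches;
  const Dispatch* resume = nullptr;  // next dispatch to try; nullptr leaves the function
};

// Walks the resume chain in order and returns the first matching handler
// label, or "unwind" if no dispatch in the chain catches the thrown type.
std::string select_handler(const Dispatch* d, const std::string& thrown) {
  for (; d != nullptr; d = d->resume)
    for (const Catch& c : d->catches)
      if (c.type_info == thrown)
        return c.handler;
  return "unwind";
}
```

Because the chain is walked from the innermost dispatch outwards, an inlined
catch is still tried before the one around the call site, which is the
ordering the rules above ask inlining to preserve.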

The dispatch call can also be attached to the invoke instruction,
though there were some problems with clean-ups (Bill) and it may
clutter the IR by repeating the same dispatch for many invokes in one
single try block.

I see that %eh_ptr is not used by the dispatch. How does it know
the type of the exception thrown?


 ** What was not covered

P1/F1/F2: Are these changes EH-style agnostic? Does it at least work
for Dwarf AND SjLj using the same IR representation? Do we want that
to happen?

F6: If a div instruction inside a basic block without EH unwind
information throws an exception, how does the IR represent that? Do
we create an invoke to a fake function for every instruction that
could throw? Do we put the unwind information in the basic-block? In
the dispatch instruction (like we do for region label)?


 ** Amount of work to do

I reckon that both changes can be done at the same time. Current work
is being done in the ARM back-end to support EHABI, which should also
be orthogonal to those changes (Anton?).

The inlining changes can be done at any time, no need to change the IR
or anything and the changes can be reused by the second proposal later
on.

The problem is that, to change the IR representation, we need to
change all front-ends that deal with exception handling (clang,
llvm-gcc, ada, python etc), and make the back-end iteratively more
robust to accept the new format, but it'd be hard to quickly
deactivate the old format.

I've seen this thread show up and die a few times, and I'm not sure we
have any pressure to do this at any given time. Do we?

cheers,
--renato

Bill Wendling

2011-May-18 00:09 UTC

[LLVMdev] Exception Handling Proposal - Second round

Hi Renato,

Thanks for the summary. John and I have been working a lot on our proposal.
It's changed significantly since I wrote about it last. It encompasses a lot
of John's requirements and fixes the main issues. The key is getting enough
time to implement the ideas. As you can imagine, we're swamped here. But
this issue has not been dropped at all. :-)

I'm not ready yet to submit the proposal to the LLVM community – it's
still a bit rough. Some initial work seems to show that it's not bad and
will be easy to implement.

-bw


Andrew Trick

2011-May-18 02:10 UTC

[LLVMdev] Exception Handling Proposal - Second round

On May 17, 2011, at 3:16 PM, Renato Golin wrote:
> F6. Some instructions (such as divide in Java) can throw exceptions
> without an explicit dispatch mechanism

This sounds like something that came out of a brainstorming session then snuck
into the project requirements when it's really a separate issue. I think you
can safely ignore it.

Implicit exceptions at the Java bytecode level are independent of how the
compiler models trapping instructions. An optimizing compiler will lower
bytecode into an IR with explicit control flow--this does not inhibit
optimization, it facilitates optimization. Note that branches are no more a
barrier to optimization than implicit exceptions. The difference is that an
effective optimization pass already needs to handle branches regardless of the
source language or whether compiling with exceptions enabled. It may currently
be a weak aspect of a few LLVM passes, but that's much easier to fix than
modifying the IR.
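
As a rough illustration of "explicit control flow", here is a C++ sketch of
what a front-end might conceptually emit for a Java-style integer divide.
The name checked_div and the choice of std::domain_error are assumptions
made for the example; a real Java front-end would raise its own
ArithmeticException through its runtime.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical lowering: the implicit exception becomes an ordinary branch
// plus an explicit throw, so the divide itself can never trap.
// (The INT_MIN / -1 overflow case is deliberately left out of this sketch.)
int checked_div(int n, int d) {
  if (d == 0)
    throw std::domain_error("/ by zero");  // explicit, visible to every pass
  return n / d;                             // only reached when d != 0
}
```

The optimiser then sees nothing more exotic than a compare, a branch and a
throwing call, all of which it already has to handle.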

Some code generators may emit exceptional control flow as a trapping
instruction. This obviously requires runtime support but is otherwise language
independent. This is best done as late as possible (think instruction
scheduling) in a target specific manner. In practice, I've only seen it used
for implicit null checks to shave a cycle off some loads and squeeze a tiny
amount of performance out of benchmarks. In fact, hardware support for
*nontrapping* loads from page zero plus safe speculation would work much better.
In my opinion, implicit null checks are not worth the giant support overhead
typically caused by a runtime that tries to catch SIGSEGV. I wouldn't expect
a user-space signal handler to be used for integer divides, where there's no
performance benefit.

-Andy

Duncan Sands

2011-May-18 06:00 UTC

[LLVMdev] Exception Handling Proposal - Second round

Hi Renato,

>   ** The problems we're trying to solve are:

This reminds me that I promised to post my own list of EH problems...
I just need to find the time!

Ciao, Duncan.

Renato Golin

2011-May-18 08:34 UTC

[LLVMdev] Exception Handling Proposal - Second round

On 18 May 2011 03:10, Andrew Trick <atrick at apple.com> wrote:
> This sounds like something that came out of a brainstorming session then
> snuck into the project requirements when it's really a separate issue. I
> think you can safely ignore it.

Ah, great! It was something someone asked about and I've only heard
"don't worry about it" but never heard a full explanation, until now! ;)

Thanks!
--renato
