thr3ads.net - llvm dev - [LLVMdev] RFC: Exception Handling Proposal II [Nov 2010]

John McCall

2010-Nov-25 02:41 UTC

[LLVMdev] RFC: Exception Handling Proposal II

On Nov 24, 2010, at 5:07 PM, Bill Wendling wrote:
> On Nov 24, 2010, at 11:18 AM, John McCall wrote:
> 
>> On Nov 24, 2010, at 5:36 AM, Bill Wendling wrote:
>>> Ah ha! I think I had a different mental model than you did. Or at
>>> least I remembered things differently from the discussion. :-) For me,
>>> there is one dispatch per region, which is why I had the region number
>>> associated with the invokes as well as the "unwind to" edge coming from
>>> them. (See my response to Renato for a description.) I'll think more
>>> about your model...
>> 
>> Hmm. The difference between our models is actually in what we're
>> calling a region. In your model, adding a cleanup doesn't require a new
>> region; you just create a new landing pad which does the cleanup and then
>> branches to (I guess) the landing pad of its containing cleanup. So
>> landing pads are many-to-one with regions and dispatch instructions. In my
>> model, every independent landing pad is necessarily a region. So in my
>> model, there is also one dispatch per region, but there are more regions.
>> 
>> So, in this example:
>> struct A { A(); ~A() throw(); };
>> void foo() throw() {
>>   A x; 
>>   b();
>> }
>> 
>> Your model has this as follows, modulo syntax:
>> entry:
>>   %x = alloca %A
>>   invoke void @A::A(%A* %x) to label %succ0 unwind label %lp0 region 0
>> succ0:
>>   invoke void @b() to label %succ1 unwind label %lp1 region 0
>> succ1:
>>   call void @A::~A(%A* %x) nounwind
>>   ret void
>> lp0:
>>   call void @A::~A(%A* %x) nounwind
>>   br label %lp1
>> lp1:
>>   dispatch region 0 [filter i8* null]    # no resume edge
>> 
>> Whereas my model has this as follows:
>> entry:
>>   %x = alloca %A
>>   invoke void @A::A(%A* %x) to label %succ0 unwind label %lp0 region 0
>> succ0:
>>   invoke void @b() to label %succ1 unwind label %lp1 region 1
>> succ1:
>>   call void @A::~A(%A* %x) nounwind
>>   ret void
>> lp0:
>>   call void @A::~A(%A* %x) nounwind
>>   dispatch region 0 [], resume label %lp1
>> lp1:
>>   dispatch region 1 [filter i8* null]   # no resume edge
>> 
>> I think my model has some nice conceptual advantages; for one, it
>> gives you the constraint that only EH edges and dispatch instructions can
>> lead to landing pads, which I think will simplify what EH preparation has
>> to do. But I could be convinced.
> 
> Notice though that we would need to keep both the EH edge from the invoke
> and the region numbers, which you said was redundant (in your first email).
> :-) Consider if there were several cleanup landing pads leading in a chain,
> through successive dispatches, down to the last dispatch that decides which
> catch to execute:
> lp0:
>   call void @A::~A(%A* %x)
>   dispatch region 0 [], resume label %lp1
> lp1:
>   call void @B::~B(%B* %y)
>   dispatch region 1 [], resume label %lp2
> lp2:
>   call void @C::~C(%C* %z)
>   dispatch region 2 [], resume label %lp3
> lp3:
>   dispatch region 3 [filter i8* null]   # no resume edge
> 
> If we have inlining of any of those d'tor calls, we may lose the fact that
> the dispatch in, say, lp1 is a cleanup that lands onto the region 2
> dispatch in lp2.

What you mean is that, given a resume or invoke edge, we need to be able to find
the dispatch for the target region. There are ways to make that happen without
tagged edges; for example, you could make the landing pad a special subclass of
BasicBlock with a pointer to the dispatch, although that'd be a fairly
invasive change. Tagging the edges solves the problem for clients with a handle
on an edge; clients that want to go from (say) a dispatch to its landing pad(s)
will still have trouble.

> We know from experience that once this information is lost, it's *really*
> hard to get it back again. That's what DwarfEHPrepare.cpp is all about, and
> I want to get rid of that pass because it's a series of hacks to get around
> our poor EH support.

While I agree that the hacks need to go, there is always going to be some amount
of custom codegen for EH just to get the special data to flow properly from
landing pads to the eh.exception intrinsic and the dispatch instruction. My
design would give you some very powerful assumptions to work with to implement
that: both the intrinsic calls and the dispatch would always be dominated by
the region's landing pad, which would in turn only be reachable via specific
edges. I don't know how you're planning on implementing this without
those assumptions, but if you say you don't need them, that certainly
diminishes the appeal of my proposal.

> On a personal level, I'm not a big fan of forcing constraints on code when
> they aren't needed. We had problems in the past (with inlining into a
> cleanup part of an EH region) with enforcing the invariant that only unwind
> edges may jump to a landing pad. If we go back to the example above, if
> @C::~C() were inlined, it could come to pass that the dispatch is placed
> into a separate basic block and that the inlined code branches into that
> new basic block thus violating the constraint.

I'm not suggesting that the landing pad has to be the same block as the
block with the dispatch instruction. That's an obviously unreasonable
constraint; in fact, it wouldn't hold for the vast majority of C++
cleanups, since we generally have to invoke destructors so we can terminate if
they throw. Since, unlike present-day branches between cleanups, dispatch
edges will be initially opaque to the optimizer, I don't see much danger of
it producing invalid code from otherwise-valid transformations.

John.

Bill Wendling

2010-Nov-25 05:01 UTC

On Nov 24, 2010, at 6:41 PM, John McCall wrote:
> On Nov 24, 2010, at 5:07 PM, Bill Wendling wrote:
> 
>> On Nov 24, 2010, at 11:18 AM, John McCall wrote:
>> 
>>> I think my model has some nice conceptual advantages; for one, it
>>> gives you the constraint that only EH edges and dispatch instructions can
>>> lead to landing pads, which I think will simplify what EH preparation has
>>> to do. But I could be convinced.
>> 
>> Notice though that we would need to keep both the EH edge from the
>> invoke and the region numbers, which you said was redundant (in your first
>> email). :-) Consider if there were several cleanup landing pads leading in
>> a chain, through successive dispatches, down to the last dispatch that
>> decides which catch to execute:
> 
>> lp0:
>>  call void @A::~A(%A* %x)
>>  dispatch region 0 [], resume label %lp1
>> lp1:
>>  call void @B::~B(%B* %y)
>>  dispatch region 1 [], resume label %lp2
>> lp2:
>>  call void @C::~C(%C* %z)
>>  dispatch region 2 [], resume label %lp3
>> lp3:
>>  dispatch region 3 [filter i8* null]   # no resume edge
>> 
>> If we have inlining of any of those d'tor calls, we may lose the fact
>> that the dispatch in, say, lp1 is a cleanup that lands onto the region 2
>> dispatch in lp2.
> 
> What you mean is that, given a resume or invoke edge, we need to be able to
> find the dispatch for the target region. There are ways to make that happen
> without tagged edges; for example, you could make the landing pad a special
> subclass of BasicBlock with a pointer to the dispatch, although that'd be a
> fairly invasive change. Tagging the edges solves the problem for clients
> with a handle on an edge; clients that want to go from (say) a dispatch to
> its landing pad(s) will still have trouble.

It's not that troublesome. The dispatch would give you the region number.
All objects in the function with that region number will point to the landing
pad(s) for that region.

>> We know from experience that once this information is lost, it's *really*
>> hard to get it back again. That's what DwarfEHPrepare.cpp is all about,
>> and I want to get rid of that pass because it's a series of hacks to get
>> around our poor EH support.
> 
> While I agree that the hacks need to go, there is always going to be some
> amount of custom codegen for EH just to get the special data to flow
> properly from landing pads to the eh.exception intrinsic and the dispatch
> instruction. My design would give you some very powerful assumptions to
> work with to implement that: both the intrinsic calls and the dispatch
> would always be dominated by the region's landing pad, which would in turn
> only be reachable via specific edges. I don't know how you're planning on
> implementing this without those assumptions, but if you say you don't need
> them, that certainly diminishes the appeal of my proposal.

We actually have the "reachable via specific edges" check in our code
right now. But when we tried to allow exceptions to be marked as proper
"cleanups", the assumption was violated. So I'm wary of making
this same assumption twice.

But anyway, I think that I can gather the necessary information from the region
numbers and the invokes' "unwind to" edges to create the EH
tables. The only intrinsic call that should remain is the one that gets the EH
pointer. And that's only needed by the catch blocks.

Perhaps I'm missing something? :-)

>> On a personal level, I'm not a big fan of forcing constraints on code
>> when they aren't needed. We had problems in the past (with inlining into a
>> cleanup part of an EH region) with enforcing the invariant that only
>> unwind edges may jump to a landing pad. If we go back to the example
>> above, if @C::~C() were inlined, it could come to pass that the dispatch
>> is placed into a separate basic block and that the inlined code branches
>> into that new basic block thus violating the constraint.
> 
> I'm not suggesting that the landing pad has to be the same block as the
> block with the dispatch instruction. That's an obviously unreasonable
> constraint; in fact, it wouldn't hold for the vast majority of C++
> cleanups, since we generally have to invoke destructors so we can terminate
> if they throw. Since, unlike present-day branches between cleanups,
> dispatch edges will be initially opaque to the optimizer, I don't see much
> danger of it producing invalid code from otherwise-valid transformations.

Okay, I misread. But then we're back to a disconnect of information. If you
have lp0 jumping into lp1, how does it know that that's the dispatch for
region 1? We would have to implement the BasicBlock subclassing that you
mentioned above, because it's yet another piece of information that needs to
be tightly coupled between instructions. The subclassing of BasicBlock has some
issues which need to be thought out more. In particular, it requires augmenting
the IR with basic block attributes, defining semantics on those, supporting them
throughout the compiler, etc. It's a large change in and of itself, and
would be no longer orthogonal to the EH changes.

-bw

John McCall

2010-Nov-25 07:49 UTC

On Nov 24, 2010, at 9:01 PM, Bill Wendling wrote:
> On Nov 24, 2010, at 6:41 PM, John McCall wrote:
>> What you mean is that, given a resume or invoke edge, we need to be able
>> to find the dispatch for the target region. There are ways to make that
>> happen without tagged edges; for example, you could make the landing pad a
>> special subclass of BasicBlock with a pointer to the dispatch, although
>> that'd be a fairly invasive change. Tagging the edges solves the problem
>> for clients with a handle on an edge; clients that want to go from (say) a
>> dispatch to its landing pad(s) will still have trouble.
>
> It's not that troublesome. The dispatch would give you the region number.
> All objects in the function with that region number will point to the
> landing pad(s) for that region.

Well, so a linear scan of the function seems like trouble to me, but I
shouldn't worry too much about optimizing a theoretical client. :)
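
To make the two lookup directions concrete, here is a minimal C++ sketch. It is not LLVM code: EHEdge, DispatchTable, dispatchForEdge, and landingPadsForRegion are hypothetical names invented for illustration, and blocks are just strings. It assumes every EH edge carries a region tag, as in the tagged-edge variant discussed above.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical model, not LLVM's API: an EH edge (from an invoke, or from a
// dispatch's resume) tagged with the region it targets.
struct EHEdge {
  std::string from;       // block containing the invoke or dispatch
  std::string landingPad; // block the edge unwinds or resumes to
  int region;             // region tag carried on the edge
};

// Hypothetical side table: region number -> block holding its dispatch.
using DispatchTable = std::map<int, std::string>;

// Edge -> dispatch: a direct lookup, because the edge carries the region.
// Returns an empty string if no dispatch is recorded for that region.
std::string dispatchForEdge(const EHEdge &e, const DispatchTable &table) {
  auto it = table.find(e.region);
  return it == table.end() ? std::string() : it->second;
}

// Dispatch -> landing pads: nothing on the dispatch names its pads, so this
// sketch scans every tagged edge in the function (the linear scan).
std::vector<std::string> landingPadsForRegion(int region,
                                              const std::vector<EHEdge> &edges) {
  std::vector<std::string> pads;
  for (const EHEdge &e : edges) {
    if (e.region != region)
      continue;
    // Record each landing pad once, in first-seen order.
    if (std::find(pads.begin(), pads.end(), e.landingPad) == pads.end())
      pads.push_back(e.landingPad);
  }
  return pads;
}
```

Fed the edges from Bill's version of the foo() example (entry to lp0 and succ0 to lp1, both tagged region 0), the scan reports two landing pads for the single region. That many-to-one relationship is the subject of question 2 below.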

Anyway, I'd like to table this part of the discussion while we resolve the
other half.  We're arguing about two things:
1.  Whether edges leading to landing pads should also be tagged with the region.
This is, well, more important than a bike shed, but still insignificant relative
to:
2.  Whether a single region can have multiple landing pads.  This is actually a
pretty central question in the design.

Plus the answer to #2 may obviate the need for discussion on #1 anyway.

>>> We know from experience that once this information is lost, it's
>>> *really* hard to get it back again. That's what DwarfEHPrepare.cpp is all
>>> about, and I want to get rid of that pass because it's a series of hacks
>>> to get around our poor EH support.
>>
>> While I agree that the hacks need to go, there is always going to be some
>> amount of custom codegen for EH just to get the special data to flow
>> properly from landing pads to the eh.exception intrinsic and the dispatch
>> instruction. My design would give you some very powerful assumptions to
>> work with to implement that: both the intrinsic calls and the dispatch
>> would always be dominated by the region's landing pad, which would in turn
>> only be reachable via specific edges. I don't know how you're planning on
>> implementing this without those assumptions, but if you say you don't need
>> them, that certainly diminishes the appeal of my proposal.
> 
> We actually have the "reachable via specific edges" check in our code
> right now. But when we tried to allow exceptions to be marked as proper
> "cleanups", the assumption was violated. So I'm wary of making this same
> assumption twice.
>
> But anyway, I think that I can gather the necessary information from the
> region numbers and the invokes' "unwind to" edges to create the EH tables.
> The only intrinsic call that should remain is the one that gets the EH
> pointer. And that's only needed by the catch blocks.

Right, I agree that it's easy to assemble the EH tables for a given invoke
under any of these variants.  We don't need any new constraints to make this
easier.

What I'm talking about is the flow of data from the start of the landing pad
to two points:
1.  The (optional) call to @llvm.eh.exception.
2.  The dispatch instruction.
Specifically, in DWARF EH, the personality function writes these two values into
the frame somewhere — maybe into registers, maybe into the EH buffer, whatever —
and that information needs to flow to the appropriate intrinsic/instruction. 
You can't just stash it aside, because the cleanup might throw and catch an
exception between points A and B.  I'm really not sure how this is supposed
to work if there's no guaranteed relationship between the landing pad and
the intrinsic/dispatch (†).  The most sensible constraint — dominance — is
equivalent to saying that each region has at most one landing pad.
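
The "cleanup might throw and catch an exception" case can be seen at the source level with a small C++ sketch. This shows only the language behaviour, not how any of the proposed IR would be lowered; Cleanup, thrower, runDemo, and the events log are names made up for the illustration.

```cpp
#include <string>
#include <vector>

// Event log so the order of operations can be inspected afterwards.
static std::vector<std::string> events;

struct Cleanup {
  // The destructor throws and catches its own exception. Nothing escapes,
  // so this is legal even while another exception is unwinding the stack.
  ~Cleanup() {
    try {
      throw 1;
    } catch (int) {
      events.push_back("inner caught in cleanup");
    }
  }
};

void thrower() {
  Cleanup c;                  // destroyed while "outer" is unwinding
  throw std::string("outer"); // the exception the cleanup's landing pad sees
}

// Returns the payload that finally reaches the outer handler.
std::string runDemo() {
  events.clear();
  try {
    thrower();
  } catch (const std::string &s) {
    events.push_back("outer caught: " + s);
    return s;
  }
  return "";
}
```

A second, complete throw-and-catch runs between the point where the cleanup is entered and the point where unwinding of "outer" continues. Whatever the personality function handed to the cleanup for "outer" has to survive that inner exception, which is why it cannot simply be stashed in a single shared slot.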

(†) Technically, there can be no true guarantee because the dispatch doesn't
even need to be reachable from its landing pad.  For example:
  extern void foo();
  struct A { ~A() { throw 0; } };
  void test() { A a; foo(); }
After inlining, the cleanup landing pad in test() will contain a throw to a
terminating handler, and therefore the dispatch will not be reachable.  So any
constraint has to be something like "either the dispatch is unreachable or
it's dominated by the entry to the landing pad".  But it's actually
quite easy to write correct code to handle this case. :)
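
As a rough illustration of the "either unreachable or dominated" constraint, here is a sketch over a toy control-flow graph. The CFG type and the functions reaches and dispatchIsWellPlaced are hypothetical, and blocks are plain strings. Dominance is checked the naive way: the landing pad dominates the dispatch if removing the pad leaves no path from the function entry to the dispatch.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy CFG: block name -> successor block names (EH edges included).
using CFG = std::map<std::string, std::vector<std::string>>;

// Depth-first reachability from `from` to `to`, never entering `blocked`.
bool reaches(const CFG &g, const std::string &from, const std::string &to,
             const std::string &blocked = "") {
  if (from == blocked)
    return false;
  std::set<std::string> seen;
  std::vector<std::string> work{from};
  while (!work.empty()) {
    std::string b = work.back();
    work.pop_back();
    if (b == to)
      return true;
    if (!seen.insert(b).second)
      continue; // already visited
    auto it = g.find(b);
    if (it == g.end())
      continue; // block has no successors
    for (const std::string &s : it->second)
      if (s != blocked)
        work.push_back(s);
  }
  return false;
}

// True if the dispatch is unreachable from the function entry, or if every
// path from the entry to the dispatch passes through the landing pad.
bool dispatchIsWellPlaced(const CFG &g, const std::string &entry,
                          const std::string &pad, const std::string &dispatch) {
  if (!reaches(g, entry, dispatch))
    return true; // unreachable, e.g. the cleanup always terminates
  return !reaches(g, entry, dispatch, pad);
}
```

The explicit unreachable check is redundant with the second test, since an unreachable dispatch is trivially not reachable with the pad removed, but it is kept separate to mirror the wording of the constraint. A real implementation would use a dominator tree instead of a fresh graph walk per query.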

John.

Renato Golin

2010-Nov-25 10:44 UTC

On 25 November 2010 02:41, John McCall <rjmccall at apple.com> wrote:
> for example, you could make the landing pad a special subclass of
> BasicBlock with a pointer to the dispatch, although that'd be a fairly
> invasive change. Tagging the edges solves the problem for clients with a
> handle on an edge; clients that want to go from (say) a dispatch to its
> landing pad(s) will still have trouble.

Hi John,

I think this is inevitable. To be able to represent:

bb1: unwinds to %cleanup

one has to change the behaviour of basic blocks, and the best way to
do that is to create a sub-class for that special case.

That also must be used by every optimization that analyses the call
graph, dominance, inlining, etc.

cheers,
--renato
