thr3ads.net - llvm dev - [LLVMdev] RFC: Exception Handling Proposal II [Nov 2010]

John McCall

2010-Nov-24 12:58 UTC

[LLVMdev] RFC: Exception Handling Proposal II

On Nov 24, 2010, at 2:59 AM, Renato Golin wrote:
> If I got it right, the dispatch instruction will tell the
> instructions/calls to unwind to specific landing pads (cleanup areas,
> terminate), but the region number will encode try/catch areas, so that
> all those cleanup landing pads should ultimately end up in the catch
> area for that region.

Caveat: I'm speaking from what I remember of our discussions, which
is not necessarily what Bill is intending to propose;  that said, I'm
pretty confident that the design hasn't significantly changed.

A dispatch instruction is part of a landing pad, not part of the normal
instruction stream.  A dispatch is actually 1-1 with a specific landing
pad, and that pair of landing pad + dispatch instruction is basically
all a region is.  So the term is a bit misleading because it suggests
that the landing pad is directly associated in the IR with a range of
instructions, whereas in fact the current design is orthogonal to
the question of how you actually reach a landing pad in the first place.

For now, that's still via explicit invokes;  the invoke names the region
it unwinds to — Bill has it listing both the region number and the
landing pad block, which I think is redundant but harmless.

In my opinion, the most crucial property of the new design is that
it makes the chaining of regions explicit in the IR.  The "resume"
edge from a dispatch instruction always leads to either another
region or to a bit of code which re-enters the unwinder in some
opaque way.  When the inliner inlines a call in a protected region
(i.e. an invoke, for now), it just forwards the outermost resume
edges in the inlined function to the innermost region in the calling
function, potentially making the old code unreachable.  Frontends
are responsible for emitting regions and associated resume code
for which this preserves semantics.
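
A minimal sketch of that forwarding step, on a toy model rather than real LLVM data structures. The `Region` struct, its `resume` field, and `forwardOutermostResumes` are hypothetical names for illustration; a null `resume` stands for "re-enters the unwinder".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a region: one landing pad plus its dispatch.
struct Region {
    std::string name;
    Region* resume;  // next region on the resume edge; nullptr means the
                     // resume edge re-enters the unwinder
};

// Toy stand-in for what the inliner does at a protected call site: every
// outermost resume edge in the inlined body is redirected to the innermost
// region of the caller. Edges between callee regions are left alone.
void forwardOutermostResumes(const std::vector<Region*>& inlinedRegions,
                             Region* callerInnermost) {
    for (Region* r : inlinedRegions) {
        if (r->resume == nullptr) {
            r->resume = callerInnermost;
        }
    }
}
```

After the call, the callee's resume code that used to re-enter the unwinder has no incoming edge in this model, which corresponds to the old code becoming unreachable.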

So every landing pad actually has a stack of regions which
CodeGen has to examine to write out the unwind tables, but
it's easy to figure out that stack just by chasing links.
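
As a rough illustration of "chasing links", assuming each region records only where its resume edge goes. `Region` and `regionStack` are made-up names for this sketch, not part of the proposal.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: a region knows only its own resume target.
struct Region {
    std::string name;
    const Region* resume;  // nullptr: resume edge re-enters the unwinder
};

// Walk the resume links from a landing pad's region outwards and return the
// stack of regions, innermost first.
std::vector<std::string> regionStack(const Region* r) {
    std::vector<std::string> stack;
    for (; r != nullptr; r = r->resume) {
        stack.push_back(r->name);
    }
    return stack;
}
```

The walk is linear in the nesting depth, and no per-landing-pad table needs to be stored in this model.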

While I'm at it, there's another important property of dispatch —
it's undefined behavior to leave the function between landing
at a landing pad and reaching the dispatch.

> If that's so, how do you encode which landing pad is to be
> followed per region?
> 
> Consider the following code:
> 
> try {
>  Foo f();
>  f.run(); // can throw exception
>  Bar b();
>  b.run(); // can throw exception
>  Baz z();
>  z.run(); // can throw exception
> } catch (...) {
> }

I assume you don't mean these to be function declarations. :)

> The object 'f' is in a different cleanup area than 'b' which, in turn
> is in a different area than 'z'. These three regions should point to
> three different landing pads (or different offsets in the same landing
> pad), which (I believe) are encoded in IR by being declared after
> different dispatch instructions, all of which within the same region.

Nope.  Three regions, three landing pads, three dispatch instructions.
(actually four if Foo::Foo() can throw).  The Baz-destructing region
chains to the Bar-destructing region which chains to the Foo-destructing
region which chains to the catching region;  the first three are
cleanup-only.
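
The source-level behaviour that this chain has to reproduce can be checked with a small C++ sketch. The class bodies, the `trace` vector and `runExample` are assumptions added for illustration; the objects are declared without parentheses so that they are objects rather than function declarations.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Records the order in which the cleanups and the handler run.
static std::vector<std::string> trace;

struct Foo { ~Foo() { trace.push_back("~Foo"); } void run() {} };
struct Bar { ~Bar() { trace.push_back("~Bar"); } void run() {} };
struct Baz {
    ~Baz() { trace.push_back("~Baz"); }
    void run() { throw std::runtime_error("boom"); }  // only this call throws
};

std::vector<std::string> runExample() {
    trace.clear();
    try {
        Foo f;
        f.run();
        Bar b;
        b.run();
        Baz z;
        z.run();  // unwinding starts in the innermost (Baz) cleanup
    } catch (...) {
        trace.push_back("catch");  // reached only after all three cleanups
    }
    return trace;
}
```

Calling `runExample()` yields `~Baz`, `~Bar`, `~Foo`, `catch`, which is the same order as the region chain described above.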

> If that's so, why do you still have the invoke call? Why should you
> treat call-exceptions any differently than instruction-exceptions?

One of my favorite things about this design is that it's totally
independent of what exactly is allowed to throw.  I'm really not sure
how best to represent other throwing instructions, except that I'm
pretty confident that we don't want anything as heavyweight as
invoke.  There's a pretty broad range of possibilities — we could
make invoke-like instructions for all of them (ick...), or we could
tag individual instructions with regions, or we could mark basic
blocks as unwinding to particular places.  But we can wrestle
with that independently of deciding to adopt explicitly-chained
landing pads.

John.

Bill Wendling

2010-Nov-24 13:36 UTC

On Nov 24, 2010, at 4:58 AM, John McCall wrote:
>> If that's so, how do you encode which landing pad is to be
>> followed per region?
>> 
>> Consider the following code:
>> 
>> try {
>> Foo f();
>> f.run(); // can throw exception
>> Bar b();
>> b.run(); // can throw exception
>> Baz z();
>> z.run(); // can throw exception
>> } catch (...) {
>> }
> 
> I assume you don't mean these to be function declarations. :)
> 
>> The object 'f' is in a different cleanup area than 'b' which, in turn
>> is in a different area than 'z'. These three regions should point to
>> three different landing pads (or different offsets in the same landing
>> pad), which (I believe) are encoded in IR by being declared after
>> different dispatch instructions, all of which within the same region.
> 
> Nope.  Three regions, three landing pads, three dispatch instructions.
> (actually four if Foo::Foo() can throw).  The Baz-destructing region
> chains to the Bar-destructing region which chains to the Foo-destructing
> region which chains to the catching region;  the first three are
> cleanup-only.

Ah ha! I think I had a different mental model than you did. Or at least I
remembered things differently from the discussion. :-) For me, there is one
dispatch per region, which is why I had the region number associated with the
invokes as well as the "unwind to" edge coming from them. (See my
response to Renato for a description.) I'll think more about your model...

>> If that's so, why do you still have the invoke call? Why should you
>> treat call-exceptions any differently than instruction-exceptions?
> 
> One of my favorite things about this design is that it's totally
> independent of what exactly is allowed to throw.  I'm really not sure
> how best to represent other throwing instructions, except that I'm
> pretty confident that we don't want anything as heavyweight as
> invoke.  There's a pretty broad range of possibilities — we could
> make invoke-like instructions for all of them (ick...), or we could
> tag individual instructions with regions, or we could mark basic
> blocks as unwinding to particular places.  But we can wrestle
> with that independently of deciding to adopt explicitly-chained
> landing pads.

Exactly! :-)

-bw

Renato Golin

2010-Nov-24 13:52 UTC

On 24 November 2010 13:36, Bill Wendling <wendling at apple.com> wrote:
> Ah ha! I think I had a different mental model than you did. Or at least I
> remembered things differently from the discussion. :-) For me, there is one
> dispatch per region, which is why I had the region number associated with the
> invokes as well as the "unwind to" edge coming from them. (See my
> response to Renato for a description.) I'll think more about your model...

Having a central dispatch simplifies the front-end a bit (less global
chain info), and the region number already encodes which one you're
going to call when building the EH table, so I guess having multiple
dispatch blocks is redundant.

Does it make sense?

cheers,
--renato

John McCall

2010-Nov-24 19:18 UTC

On Nov 24, 2010, at 5:36 AM, Bill Wendling wrote:
> On Nov 24, 2010, at 4:58 AM, John McCall wrote:
>> 
>>> The object 'f' is in a different cleanup area than 'b' which, in turn
>>> is in a different area than 'z'. These three regions should point to
>>> three different landing pads (or different offsets in the same landing
>>> pad), which (I believe) are encoded in IR by being declared after
>>> different dispatch instructions, all of which within the same region.
>> 
>> Nope.  Three regions, three landing pads, three dispatch instructions.
>> (actually four if Foo::Foo() can throw).  The Baz-destructing region
>> chains to the Bar-destructing region which chains to the Foo-destructing
>> region which chains to the catching region;  the first three are
>> cleanup-only.
>> 
> Ah ha! I think I had a different mental model than you did. Or at least I
> remembered things differently from the discussion. :-) For me, there is one
> dispatch per region, which is why I had the region number associated with the
> invokes as well as the "unwind to" edge coming from them. (See my
> response to Renato for a description.) I'll think more about your model...

Hmm.  The difference between our models is actually in what we're calling a
region.  In your model, adding a cleanup doesn't require a new region;  you
just create a new landing pad which does the cleanup and then branches to (I
guess) the landing pad of its containing cleanup.  So landing pads are
many-to-one with regions and dispatch instructions.  In my model, every
independent landing pad is necessarily a region.  So in my model, there is also
one dispatch per region, but there are more regions.

So, in this example:
  struct A { A(); ~A() throw(); };
  void foo() throw() {
    A x; 
    b();
  }

Your model has this as follows, modulo syntax:
  entry:
    %x = alloca %A
    ; x is not constructed yet if A::A throws, so skip the cleanup
    invoke void @A::A(%A* %x) to label %succ0 unwind label %lp1 region 0
  succ0:
    ; x is live here, so unwind through the cleanup first
    invoke void @b() to label %succ1 unwind label %lp0 region 0
  succ1:
    call void @A::~A(%A* %x) nounwind
    ret void
  lp0:
    call void @A::~A(%A* %x) nounwind
    br label %lp1
  lp1:
    dispatch region 0 [filter i8* null]    ; no resume edge

Whereas my model has this as follows:
  entry:
    %x = alloca %A
    ; x is not constructed yet if A::A throws, so go straight to the filter
    invoke void @A::A(%A* %x) to label %succ0 unwind label %lp1 region 1
  succ0:
    ; x is live here, so unwind to the cleanup region
    invoke void @b() to label %succ1 unwind label %lp0 region 0
  succ1:
    call void @A::~A(%A* %x) nounwind
    ret void
  lp0:
    call void @A::~A(%A* %x) nounwind
    dispatch region 0 [], resume label %lp1
  lp1:
    dispatch region 1 [filter i8* null]   ; no resume edge

I think my model has some nice conceptual advantages;  for one, it gives you the
constraint that only EH edges and dispatch instructions can lead to landing
pads, which I think will simplify what EH preparation has to do.  But I could be
convinced.
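
To make that constraint concrete, here is a small sketch of the check it would allow, over a toy edge list rather than real IR. `EdgeKind`, `Edge` and `landingPadsOnlyReachedByEH` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// How control reaches the target block in this toy model.
enum class EdgeKind { Normal, Unwind, DispatchResume };

struct Edge {
    std::string from;
    std::string to;
    EdgeKind kind;
};

// True when every edge into a landing pad is either an EH (unwind) edge or
// the resume edge of a dispatch; a plain branch into a landing pad fails.
bool landingPadsOnlyReachedByEH(const std::vector<Edge>& edges,
                                const std::set<std::string>& landingPads) {
    for (const Edge& e : edges) {
        if (landingPads.count(e.to) != 0 && e.kind == EdgeKind::Normal) {
            return false;
        }
    }
    return true;
}
```

In the two listings above, the `br label %lp1` out of `lp0` would be a `Normal` edge into a landing pad and fail this check, while `resume label %lp1` would be a `DispatchResume` edge and pass.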

John.
