I was thinking about this last night, and I came up with a third alternative
which I think looks very promising. It’s basically a re-working of the previous
alternative to use the landingpad concept rather than arbitrary fake logic, but
it uses a single landing pad for the entire function that mimics the logic of
the personality function to dispatch unwinding calls and catch handlers.
I believe this is consistent with the semantics and spirit of the existing
landingpad mechanism, and it still has the properties that allow the required
.xdata information to be easily extracted from the IR. It will require a new
representation after the handlers have been outlined, but I have an idea for
that which I’ll send out later today.
If we go this way, the inlining code will need to be taught to merge the
landingpad of the inlined function, but I think that will be pretty easy.
So, here it is:
void test()
{
  try {
    Outer outer;
    try {
      Inner inner;
      do_inner_thing();
    } catch (int) {
      handle_int();
    }
  } catch (float) {
    handle_float();
  }
  keep_going();
}
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Original
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Function Attrs: uwtable
define void @_Z4testv() #0 {
entry:
%outer = alloca %class.Outer, align 1
%inner = alloca %class.Inner, align 1
call void @llvm.eh.setehstate(i32 0)
invoke void @_ZN5OuterC1Ev(%class.Outer* %outer)
to label %invoke.cont unwind label %lpad
invoke.cont:
call void @llvm.eh.setehstate(i32 1)
call void @llvm.eh.setehstate(i32 2)
invoke void @_ZN5InnerC1Ev(%class.Inner* %inner)
to label %invoke.cont1 unwind label %lpad
invoke.cont1:
call void @llvm.eh.setehstate(i32 3)
invoke void @_Z14do_inner_thingv()
to label %invoke.cont2 unwind label %lpad
invoke.cont2:
call void @llvm.eh.setehstate(i32 2)
invoke void @_ZN5InnerD1Ev(%class.Inner* %inner)
to label %invoke.cont3 unwind label %lpad
invoke.cont3:
call void @llvm.eh.setehstate(i32 1)
invoke void @_ZN5OuterD1Ev(%class.Outer* %outer)
to label %invoke.cont4 unwind label %lpad
invoke.cont4:
call void @llvm.eh.setehstate(i32 0)
call void @llvm.eh.setehstate(i32 -1)
call void @_Z10keep_goingv()
ret void
lpad:
%eh.vals = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)*
@__CxxFrameHandler3 to i8*)
cleanup
catch i8* bitcast (i8** @_ZTIi to i8*)
catch i8* bitcast (i8** @_ZTIf to i8*)
%eh.ptrs = extractvalue { i8*, i32 } %eh.vals, 0
%eh.sel = extractvalue { i8*, i32 } %eh.vals, 1
br label %unwind.handlers
unwind.handlers:
%unwind.state = call i32 @llvm.eh.getunwindstate(i8* %eh.ptrs)
br label %unwind.dispatch
unwind.dispatch:
%4 = icmp eq i32 %unwind.state, 1
br i1 %4, label %unwind.handler.1, label %unwind.dispatch.1
unwind.handler.1:
call void @_ZN5OuterD1Ev(%class.Outer* %outer)
resume { i8*, i32 } %eh.vals
unwind.dispatch.1:
%5 = icmp eq i32 %unwind.state, 3
br i1 %5, label %unwind.handler.3, label %catch.handlers
unwind.handler.3:
call void @_ZN5InnerD1Ev(%class.Inner* %inner)
resume { i8*, i32 } %eh.vals
catch.handlers:
%catch.state = call i32 @llvm.eh.getcatchstate(i8* %eh.ptrs)
br label %catch.dispatch.1
catch.dispatch.1:
%6 = icmp sge i32 %catch.state, 2
br i1 %6, label %catch.1.lower.true, label %catch.dispatch.2
catch.1.lower.true:
%7 = icmp sle i32 %catch.state, 3
br i1 %7, label %catch.1.state.matches, label %catch.dispatch.2
catch.1.state.matches:
%sel.1 = call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*))
%matches.1 = icmp eq i32 %sel.1, %eh.sel
br i1 %matches.1, label %catch.1.handler, label %catch.dispatch.2
catch.1.handler:
call void @llvm.eh.setehstate(i32 4)
call void @_Z10handle_intv()
br label %invoke.cont3
catch.dispatch.2:
%8 = icmp sge i32 %catch.state, 0
br i1 %8, label %catch.2.lower.true, label %catch.nomatch
catch.2.lower.true:
%9 = icmp sle i32 %catch.state, 4
br i1 %9, label %catch.2.state.matches, label %catch.nomatch
catch.2.state.matches:
%sel.2 = call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIf to i8*))
%matches.2 = icmp eq i32 %sel.2, %eh.sel
br i1 %matches.2, label %catch.2.handler, label %catch.nomatch
catch.2.handler:
call void @llvm.eh.setehstate(i32 5)
call void @_Z12handle_floatv()
br label %invoke.cont4
catch.nomatch:
resume { i8*, i32 } %eh.vals
}
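To make the control flow of the single landing pad easier to follow, here is a minimal C++ sketch that models the dispatch as an ordinary function. The enums, the function name dispatch, and passing the states in as plain integers are assumptions for illustration only; the state ranges and type checks are copied from the IR above.

```cpp
// Model of the single-landing-pad dispatch in the IR above.
// All names are illustrative; only the state numbers come from the IR.
enum class Thrown { Int, Float, Other };
enum class Dispatch { DtorOuter, DtorInner, HandleInt, HandleFloat, Resume };

Dispatch dispatch(int unwindState, int catchState, Thrown thrown) {
    // unwind.dispatch: run the cleanup that matches the unwind state, if any.
    if (unwindState == 1) return Dispatch::DtorOuter;
    if (unwindState == 3) return Dispatch::DtorInner;

    // catch.dispatch.1: the inner try covers states 2..3 and catches int.
    if (catchState >= 2 && catchState <= 3 && thrown == Thrown::Int)
        return Dispatch::HandleInt;

    // catch.dispatch.2: the outer try covers states 0..4 and catches float.
    if (catchState >= 0 && catchState <= 4 && thrown == Thrown::Float)
        return Dispatch::HandleFloat;

    // catch.nomatch: nothing in this frame handles the exception.
    return Dispatch::Resume;
}
```

For example, dispatch(0, 3, Thrown::Float) selects the outer handler: state 3 lies inside the outer try range, and the inner handler only accepts int.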
My original reply got stuck in llvmdev moderation (it hit the 100K limit!),
so I'm resending without reply context.
-------
Thanks, your explanation of the .xdata tables in terms of EH states makes a
lot of sense to me.
I have a few concerns about your new EH proposal, though.
1. If we number the EH states in the frontend, we will have to renumber
them during inlining. This isn't insurmountable, but seems like a design
weakness. The major motivation for using the @llvm.eh.typeid.for intrinsic
is to delay numbering the catch clauses until codegen, which is after
inlining. Honestly, we should consider lowering @llvm.eh.typeid.for during
EH preparation so that we can try forming a 'switch' in IR instead of a
series of conditional branches.
Reminds me of http://llvm.org/PR20300, which is a similar EH preparation
improvement I want to do.
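As a rough illustration of the renumbering cost in point 1, the sketch below shifts a callee's frontend-assigned states into a fresh range when it is inlined. The function name, the use of -1 for "no state", and the rule that the callee's -1 maps to the caller's state at the call site are all assumptions, not part of any proposal in this thread.

```cpp
#include <vector>

// Hypothetical helper: remap the EH states of an inlined callee so they do
// not collide with the caller's states. Assumes states are dense integers
// starting at 0 and that -1 means "no active state".
std::vector<int> renumberInlinedStates(const std::vector<int>& calleeStates,
                                       int callerStateCount,
                                       int stateAtCallSite) {
    std::vector<int> result;
    result.reserve(calleeStates.size());
    for (int s : calleeStates) {
        // Outside any callee scope, the enclosing caller scope applies.
        result.push_back(s < 0 ? stateAtCallSite : s + callerStateCount);
    }
    return result;
}
```

Every state-setting call in the inlined body would need a rewrite of this kind, which is the bookkeeping that delaying numbering until codegen avoids.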
2. Without the invoke instruction and accompanying landing pad, the IR has
a lot of implicit control flow. Without explicit control flow, we can't
promote allocas to SSA values, which LLVM basically requires before any
real optimization can begin. Consider this example:
int x = g();
try {
  f();
} catch (int) {
  x++; // use x as input and output
}
return x;
Today we will promote 'x' to values like this:
entry:
%x = call i32 @g()
invoke void @f() to label %ret unwind label %lpad
lpad:
landingpad ...
; elided EH dispatch, assume selector is for 'int'
%x1 = add i32 %x, 1
br label %ret
ret:
%x2 = phi i32 [ %x, %entry ], [ %x1, %lpad ]
ret i32 %x2
In your IR example, it looks like the control flow edge from 'call void
@_Z14do_inner_thingv' to the catch handler code comes at the end of the
function prior to the return.
How would you change it if there was an assignment of a variable like 'x'
before and after the call and a load of 'x' in the catch handlers? I don't
think we can do correct phi insertion on the CFG as written.
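A runnable version of the 'x' example from point 2 makes the data dependence concrete. The function name compute and the shouldThrow parameter are stand-ins for g() and f(), chosen only so that the snippet is self-contained.

```cpp
// The catch handler reads and writes x, so the value of x at the throwing
// call must reach the handler along an explicit unwind edge.
int compute(bool shouldThrow) {
    int x = 41;              // stands in for: int x = g();
    try {
        if (shouldThrow)     // stands in for: f();
            throw 1;
    } catch (int) {
        x++;                 // use x as input and output
    }
    return x;                // merges the normal-path and handler values
}
```

The returned value depends on which path was taken, which is exactly what the phi node expresses once 'x' is promoted out of memory.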
3. Ultimately, the explicit state setting intrinsics will be lowered out
and we will need to form the ip2state table. Unfortunately, bracketed
intrinsics aren't quite enough to recover the natural scoping of the source
program, which is something we've seen with @llvm.lifetime.start / end.
What should the backend do if it sees control flow like this?
bb0:
call void @llvm.eh.setehstate(i32 0)
br label %bb3
bb1:
call void @llvm.eh.setehstate(i32 2)
br label %bb3
bb3:
invoke void @do_something() ; Which EH state are we in?
We could establish IR rules that such join points need to reset the EH
state, but then we have to go and teach optimizers about it. It's basically
a no-IR-modifications version of labelling each BB with an unwind label,
which is something that's been proposed before more directly:
http://llvm.org/PR1269.
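The join-point problem in point 3 can be sketched as a small check over predecessor states. The function below is hypothetical: it takes the EH state that each predecessor block ends in and reports a state for the join block only when all predecessors agree.

```cpp
#include <optional>
#include <vector>

// Hypothetical check: the EH state on entry to a block is well defined only
// if every predecessor leaves the same state behind.
std::optional<int> stateAtJoin(const std::vector<int>& predExitStates) {
    if (predExitStates.empty()) return std::nullopt;
    int first = predExitStates.front();
    for (int s : predExitStates)
        if (s != first) return std::nullopt;  // conflicting states: ambiguous
    return first;
}
```

With the blocks above, bb0 exits in state 0 and bb1 in state 2, so stateAtJoin({0, 2}) yields no value for bb3.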
Personally, I think we could make this change to LLVM IR, but at a great
cost. Implicit EH control flow would open up a completely new class of bugs
in LLVM optimizers that doesn't exist today. Most LLVM hackers working on
optimizations that I talk to *REALLY* don't want to carry the burden of
implicit control flow. However, my coworkers may be biased, because
exceptions are banned in most settings here at Google.
Anyway, unless we go all the way and make the EH state a first class IR
construct, I feel like @llvm.eh.state imposes too many restrictions on IR
transformations. What happens if I reorder the BBs? Consider that MBB
placement happens very late, and is guided primarily by branch probability,
not source order.
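To see why block order matters, consider a sketch of how an ip2state-style table could be formed once blocks are laid out. The struct and function names are made up for illustration, and the layout is reduced to a list of (start offset, state) pairs; the assumption is that one table entry is emitted wherever the state changes in layout order.

```cpp
#include <utility>
#include <vector>

struct IpStateEntry {
    unsigned startOffset;  // first code offset covered by this entry
    int state;             // EH state active from that offset onward
};

// Hypothetical: blocks are (startOffset, state) pairs in final layout order.
std::vector<IpStateEntry> buildIpToState(
        const std::vector<std::pair<unsigned, int>>& layout) {
    std::vector<IpStateEntry> table;
    for (const auto& [offset, state] : layout) {
        // Emit an entry only when the state differs from the previous one.
        if (table.empty() || table.back().state != state)
            table.push_back({offset, state});
    }
    return table;
}
```

Laying out blocks with states 0, 0, 2 gives two entries, while the order 0, 2, 0 gives three, so a late reordering changes the table even though no block changed.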
--------
So, to attempt to address this, I think maybe we can go back to something
like your first proposal.
I continue to think that the right thing to do is to emit the Itanium-style
landingpads from the frontend, and transform the control flow into
something more table-like after optimizations. If we do things this way we
don't have to teach the middle-end to reason about new constructs like
@llvm.eh.state.
I propose that the preparation pass does all the outlining and removes all
the landing pad code. It will leave behind the landingpad instruction and a
call to an intrinsic (@llvm.eh.actions()) that lists the handlers and
handler types in the order that they need to run.
To preserve the structure of the CFG, the actions intrinsic will return an
i8* that will feed into an indirectbr terminator.
Similar to SjLj, SSA values live across the landing pad entrances and exits
will be demoted to stack allocations. Unlike SjLj, to allow access from
outlined landing pad code, the stack memory will be part of the
@llvm.frameallocate block.
Here's how _Z4testv would look after preparation:
define void @_Z4testv() #0 {
entry:
%frame_alloc = call i8* @llvm.frameallocate(i32 2)
%capture_block = bitcast i8* %frame_alloc to %captures._Z4testv*
%outer = getelementptr %captures._Z4testv* %capture_block, i32 0, i32 0
%inner = getelementptr %captures._Z4testv* %capture_block, i32 0, i32 1
invoke void @_ZN5OuterC1Ev(%struct.Outer* %outer)
to label %invoke.cont unwind label %lpad
invoke.cont: ; preds = %entry
invoke void @_ZN5InnerC1Ev(%struct.Inner* %inner)
to label %invoke.cont2 unwind label %lpad1
invoke.cont2: ; preds = %invoke.cont
invoke void @_Z14do_inner_thingv()
to label %invoke.cont4 unwind label %lpad3
invoke.cont4: ; preds = %invoke.cont2
invoke void @_ZN5InnerD1Ev(%struct.Inner* %inner)
to label %try.cont unwind label %lpad1
try.cont: ; preds = %invoke.cont4, %invoke.cont8
invoke void @_ZN5OuterD1Ev(%struct.Outer* %outer)
to label %try.cont19 unwind label %lpad
try.cont19: ; preds = %try.cont, %invoke.cont17
call void @_Z10keep_goingv()
ret void
lpad: ; preds = %try.cont, %entry
%0 = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)*
@__gxx_personality_v0 to i8*)
catch i8* bitcast (i8** @_ZTIf to i8*)
%recover = call i8* (...)* @llvm.eh.actions(
i32 1, i8* bitcast (i8** @_ZTIf to i8*), void (i8*, i8*)*
@catch_float)
indirectbr i8* %recover, [label %try.cont, label %try.cont19]
lpad1: ; preds = %invoke.cont4, %invoke.cont
%3 = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)*
@__gxx_personality_v0 to i8*)
cleanup
catch i8* bitcast (i8** @_ZTIi to i8*)
catch i8* bitcast (i8** @_ZTIf to i8*)
%recover1 = call i8* (...)* @llvm.eh.actions(
i32 1, i8* bitcast (i8** @_ZTIi to i8*), void (i8*, i8*)* @catch_int,
i32 0, i8* null, void (i8*, i8*)* @dtor_outer,
i32 2, i8* bitcast (i8** @_ZTIf to i8*), void (i8*, i8*)*
@catch_float)
indirectbr i8* %recover1, [label %try.cont, label %try.cont19]
lpad3: ; preds = %invoke.cont2
%6 = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)*
@__gxx_personality_v0 to i8*)
cleanup
catch i8* bitcast (i8** @_ZTIi to i8*)
catch i8* bitcast (i8** @_ZTIf to i8*)
%recover2 = call i8* (...)* @llvm.eh.actions(
i32 0, i8* null, void (i8*, i8*)* @dtor_inner,
i32 1, i8* bitcast (i8** @_ZTIi to i8*), void (i8*, i8*)* @catch_int,
i32 0, i8* null, void (i8*, i8*)* @dtor_outer,
i32 2, i8* bitcast (i8** @_ZTIf to i8*), void (i8*, i8*)*
@catch_float)
indirectbr i8* %recover2, [label %try.cont, label %try.cont19]
}
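One way to read the @llvm.eh.actions argument lists is as a sequence that gets walked in order: run each cleanup, and stop at the first catch whose type matches. The sketch below models that reading in C++. The EhAction struct, the string type names, and runActions are illustrative assumptions, and real two-phase unwinder behavior is simplified away.

```cpp
#include <string>
#include <vector>

struct EhAction {
    bool isCatch;          // false: cleanup, true: catch handler
    std::string type;      // type caught (unused for cleanups)
    std::string handler;   // name of the outlined handler
};

// Returns the handlers that would run, in order, for the thrown type.
std::vector<std::string> runActions(const std::vector<EhAction>& actions,
                                    const std::string& thrownType) {
    std::vector<std::string> ran;
    for (const EhAction& a : actions) {
        if (!a.isCatch) {
            ran.push_back(a.handler);      // cleanups always run
        } else if (a.type == thrownType) {
            ran.push_back(a.handler);      // a matching catch ends the walk
            break;
        }
    }
    return ran;
}
```

For the lpad3 list, a thrown float would run dtor_inner, dtor_outer, then catch_float; a thrown int would run dtor_inner and stop at catch_int.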
One issue is that I'm not sure how to catch a "float" exception thrown by
handle_int(). I'll have to think about that.
It's also not clear to me that we need to have the i32 selector in
@llvm.eh.actions. We need a way to distinguish between catch-all
(traditionally i8* null) and a cleanup. We could just use some other
constant like 'i8* inttoptr (i32 1 to i8*)' and make that the cleanup
sentinel.
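A small sketch of the sentinel idea, assuming the type-info pointer alone identifies the kind of action. The enum and function names are hypothetical; the pointer value 1 mirrors the 'inttoptr (i32 1 to i8*)' constant suggested above.

```cpp
#include <cstdint>

enum class ActionKind { Cleanup, CatchAll, TypedCatch };

// Hypothetical sentinel: the pointer value 1, which no real type info uses.
inline const void* cleanupSentinel() {
    return reinterpret_cast<const void*>(static_cast<std::uintptr_t>(1));
}

// Classify an action by its type-info pointer alone, with no i32 selector.
ActionKind classify(const void* typeInfo) {
    if (typeInfo == nullptr) return ActionKind::CatchAll;
    if (typeInfo == cleanupSentinel()) return ActionKind::Cleanup;
    return ActionKind::TypedCatch;
}
```

This keeps null free to mean catch-all, as it traditionally does, while still giving cleanups a distinct encoding.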
I think there are still issues here, like how to actually implement this
transform, but I'm going to hit send now and keep thinking. :)
Thanks, Reid. These are good points.
So I guess that does take us back to something more like my original proposal.
I like your suggestion of having some kind of “eh.actions” intrinsic to
represent the outlining rather than the extension to landingpad that I had
proposed. I was just working on something like that in conjunction with my
second alternative idea.
What I’d really like is to have the “eh.actions” intrinsic take a shape that
makes it really easy to construct the .xdata table directly from these calls.
As I think I mentioned in my original post, I think I have an idea for how to
reconstruct functionally correct eh states (at least for synchronous EH
purposes) from the invoke and landingpad instructions. I would like to
continue, as in my original proposal, limiting the unwind representations to
those that are unique to a given landing pad. I think with enough documentation
I can make that seem sensible.
I’ll start working on a revised proposal. Let me know if you have any more
solid ideas.
-Andy
From: Reid Kleckner [mailto:rnk at google.com]
Sent: Tuesday, January 27, 2015 11:24 AM
To: Kaylor, Andrew
Cc: Bataev, Alexey; Reid Kleckner (reid at kleckner.net); LLVM Developers
Mailing List; Anton Korobeynikov; Kreitzer, David L
Subject: Re: [LLVMdev] RFC: Native Windows C++ exception handling