(Moving this discussion on list as this could be of general interest.) My current work-in-progress implementation is attempting to map out the blocks used by a landing pad before it starts outlining. It creates a table of catch and cleanup handlers with the block at which each one starts. During outlining I intend to have another mechanism to check to see if we’ve already outlined the handler at the specified start block (which handles the case of handlers shared between landing pads) and if not starts the outlining process at that block. For cleanup handlers I’m treating any catch dispatch block (identified by looking for the eh_typeid_for+cmp+conditional branch pattern) as a stopping point for the cloning process. Here’s a snippet of comments explaining how I am doing the block mapping. // The landing pad sequence will have this basic shape: // // <cleanup handler> // <selector comparison> // <catch handler> // <cleanup handler> // <selector comparison> // <catch handler> // <cleanup handler> // ... // // Any of the cleanup slots may be absent. The cleanup slots may be occupied by // any arbitrary control flow, but all paths through the cleanup code must // eventually reach the next selector comparison and no path can skip to a // different selector comparisons, though some paths may terminate abnormally. // Therefore, we will use a depth first search from the start of any given // cleanup block and stop searching when we find the next selector comparison. // // The catch handlers may also have any control structure, but we are only // interested in the start of the catch handlers, so we don't need to actually // follow the flow of the catch handlers. The start of the catch handlers can // be located from the compare instructions, but they can be skipped in the // flow by following the contrary branch. // This has a flaw in that cleanup code could contain blocks that are shared with control flows outside the cleanup block and use phi-dependent branching. If the shared flow isn’t part of another cleanup handler this is only an efficiency problem as the non-cleanup flow will reach a landingpad or a function terminator before it hits a selector. However, if the shared block is shared with another cleanup handler, it could lead to a catch dispatch that has nothing to do with the current block. It turns out that the cloning code handles this problem of shared blocks really well. It just clones everything and prunes the unwanted code paths based on phi resolution and subsequent instruction simplification. I suppose I could clone the entire landing pad to do the block mapping and just discard the clone when I’m done, but that seems really horrific as a way of resolving this corner case problem. What I’d like is to have a way to determine whether the successors of a block are phi-dependent and if so find a way to limit the search to paths that result from predecessors I’ve already seen. I can do this easily enough for simple phi dependence (like constant phi values and a single comparison to choose a branch target) but I haven’t tried to see if the process I have in mind work would for arbitrarily complex cases. If you have any suggestions on this (like maybe a “isSuccessorPhiDependent()” function tucked away somewhere that I haven’t noticed), I’d love to hear them. -Andy From: Reid Kleckner [mailto:rnk at google.com] Sent: Friday, February 13, 2015 12:53 PM To: Kaylor, Andrew Cc: David Majnemer Subject: Re: C++ exception handling review One thing I'm curious about is how are we going to identify cleanup regions? Will we figure that out organically with an up front analysis, or what? -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150213/b2580b88/attachment.html>
On Fri, Feb 13, 2015 at 1:37 PM, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:> (Moving this discussion on list as this could be of general interest.) > > > > My current work-in-progress implementation is attempting to map out the > blocks used by a landing pad before it starts outlining. It creates a > table of catch and cleanup handlers with the block at which each one > starts. During outlining I intend to have another mechanism to check to > see if we’ve already outlined the handler at the specified start block > (which handles the case of handlers shared between landing pads) and if not > starts the outlining process at that block. For cleanup handlers I’m > treating any catch dispatch block (identified by looking for the > eh_typeid_for+cmp+conditional branch pattern) as a stopping point for the > cloning process. > > > > Here’s a snippet of comments explaining how I am doing the block mapping. > > > > // The landing pad sequence will have this basic shape: > > // > > // <cleanup handler> > > // <selector comparison> > > // <catch handler> > > // <cleanup handler> > > // <selector comparison> > > // <catch handler> > > // <cleanup handler> > > // ... > > // > > // Any of the cleanup slots may be absent. The cleanup slots may be > occupied by > > // any arbitrary control flow, but all paths through the cleanup code must > > // eventually reach the next selector comparison and no path can skip to a > > // different selector comparisons, though some paths may terminate > abnormally. > > // Therefore, we will use a depth first search from the start of any given > > // cleanup block and stop searching when we find the next selector > comparison. > > // > > // The catch handlers may also have any control structure, but we are only > > // interested in the start of the catch handlers, so we don't need to > actually > > // follow the flow of the catch handlers. The start of the catch handlers > can > > // be located from the compare instructions, but they can be skipped in the > > // flow by following the contrary branch. > > // > > > > This has a flaw in that cleanup code could contain blocks that are shared > with control flows outside the cleanup block and use phi-dependent > branching. If the shared flow isn’t part of another cleanup handler this > is only an efficiency problem as the non-cleanup flow will reach a > landingpad or a function terminator before it hits a selector. However, if > the shared block is shared with another cleanup handler, it could lead to a > catch dispatch that has nothing to do with the current block. > > > > It turns out that the cloning code handles this problem of shared blocks > really well. It just clones everything and prunes the unwanted code paths > based on phi resolution and subsequent instruction simplification. I > suppose I could clone the entire landing pad to do the block mapping and > just discard the clone when I’m done, but that seems really horrific as a > way of resolving this corner case problem. What I’d like is to have a way > to determine whether the successors of a block are phi-dependent and if so > find a way to limit the search to paths that result from predecessors I’ve > already seen. I can do this easily enough for simple phi dependence (like > constant phi values and a single comparison to choose a branch target) but > I haven’t tried to see if the process I have in mind work would for > arbitrarily complex cases. If you have any suggestions on this (like maybe > a “isSuccessorPhiDependent()” function tucked away somewhere that I haven’t > noticed), I’d love to hear them. >I think complex cases are unlikely and should be handled conservatively by cloning. A good example of the simple case is the way we lower __finally. Internally Clang has the invariant that it cannot double-emit local declarations twice while emitting a function. So for this test case, we had to make the exceptional cleanup path branch into normal path with a phi, and then have a conditional branch back over to the exceptional path. // C++: void f() { __try { might_crash(); } __finally { mylabel: // better not emit twice! int r; // better not emit twice! r = do_cleanup(); if (!r) goto mylabel; } } // Simplified IR: define void @f() { %abnormal_termination = alloca i8 invoke void @might_crash() to label %cont unwind label %lpad cont: store i8 0, i8* %abnormal_termination br label %__finally lpad: landingpad { i8*, i32 } personality i32 (...)* @__C_specific_handler cleanup store i8 1, i8* %abnormal_termination br label %__finally __finally: ; code simplified for brevity %r = call i1 @do_cleanup() br i1 %r, label %__finally, label %done done: %ab = load i8* %abnormal_termination %tobool = icmp eq i8 %ab, i8 0 br i1 %tobool, label %ret, label %resume ret: ret void resume: resume } You can see at -O0 we have these loads and stores, but at -O1 we'll have a phi-dependent branch that's really easy to analyze. It'd be good if the preparation pass could understand this much, and no more. :) -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150213/f00fac15/attachment.html>
I’m not sure I understand what you are suggesting in terms of how this should be
handled.
I agree that the cleanup code is unlikely to be complex, but obviously we still
need some way to handle any cases that unlikely but possible. I hadn’t even
thought about what might be in a __finally block. I was thinking about
destructors being inlined. Obviously that puts a practical limit on what will
be there, but not really a theoretical limit.
So I’m looking for an implementation that will handle the typical cases
efficiently but still not break if the very rare case turns up. It occurs to me
that maybe the depth first search I was planning is exactly wrong in this case.
Given a cleanup handler that goes into a shared block and can either branch back
out to a block that belongs to the landing pad or branch to a block at some
arbitrary point in the normal flow, the path we’re looking for is almost
certainly going to meet the selector comparison or a resume within a block or
two. So a breadth-first search would handle this case better and probably is
reasonable for the common case (which usually won’t have any branches at all).
There are two scenarios I have in mind here.
1) A shared block that might branch into normal code:
lpad:
landingpad …
cleanup
catch …
…
br label %shared.block
shared.block:
%dst = phi i1 [1, %normal.cont], [0, %lpad]
; do something that happens in cleanup and in regular code
…
br i1 %dst, label %normal.cont2, label %catch.dispatch
catch.dispatch:
That sort of PHI dependence is very easy to handle, but can we count on it
always being that simple? It seems like in theory there could be an arbitrary
use-def chain between the PHI node and the branch condition that we may or may
not be able to resolve at compile-time. I wouldn’t be happy with whoever
created such an atrocity, but I didn’t feel like we could just assume it can’t
happen.
If we follow the branch to “normal.cont2” it could conceivably take us through
the entire CFG for the function. Obviously we want to avoid that.
2) A block shared between unconnected landing pads
lpad1:
landingpad …
cleanup
catch <some type> …
…
br label %shared.block
lpad2:
landingpad …
cleanup
catch <some other type> …
…
br label %shared.block
shared.block:
%dst = phi i1 [1, %lpad1], [0, %lpad2]
; do something that happens in both cleanup handlers
…
br i1 %dst, label %lpad1.catch.dispatch, label %lpad2.catch.dispatch
lpad1.catch.dispatch:
…
lpad2.catch.dispatch:
…
I haven’t tried to devise code that would lead to this, but my intuition is that
this kind of sharing is more likely than the other and if it does happen it
could be problematic. Again, the PHI-dependence here is easy to sort out here
and it would be trivial to determine which branch we should follow. But if a
case arises where we can’t figure out which predecessor leads to which
successor, what I had planned just won’t work.
I’d like to believe that all PHI-dependent control flow that arises from block
merging will always be simple and that if anyone tried to do something more
complex someone would ask them not to, but I worry that you never know what is
going to happen over the course of many transformation passes. Like, imagine:
shared.block:
%foo = phi i1 [1, %lpad1], [0, %lpad2]
…
%fubar = call i1 @bar(i1 %foo)
…
br i1 %fubar, label %dispatch1, label %dispatch2
Am I overthinking this?
-Andy
From: Reid Kleckner [mailto:rnk at google.com]
Sent: Friday, February 13, 2015 2:06 PM
To: Kaylor, Andrew
Cc: David Majnemer; LLVM Developers Mailing List
Subject: Re: C++ exception handling
On Fri, Feb 13, 2015 at 1:37 PM, Kaylor, Andrew <andrew.kaylor at
intel.com<mailto:andrew.kaylor at intel.com>> wrote:
(Moving this discussion on list as this could be of general interest.)
My current work-in-progress implementation is attempting to map out the blocks
used by a landing pad before it starts outlining. It creates a table of catch
and cleanup handlers with the block at which each one starts. During outlining
I intend to have another mechanism to check to see if we’ve already outlined the
handler at the specified start block (which handles the case of handlers shared
between landing pads) and if not starts the outlining process at that block.
For cleanup handlers I’m treating any catch dispatch block (identified by
looking for the eh_typeid_for+cmp+conditional branch pattern) as a stopping
point for the cloning process.
Here’s a snippet of comments explaining how I am doing the block mapping.
// The landing pad sequence will have this basic shape:
//
// <cleanup handler>
// <selector comparison>
// <catch handler>
// <cleanup handler>
// <selector comparison>
// <catch handler>
// <cleanup handler>
// ...
//
// Any of the cleanup slots may be absent. The cleanup slots may be occupied by
// any arbitrary control flow, but all paths through the cleanup code must
// eventually reach the next selector comparison and no path can skip to a
// different selector comparisons, though some paths may terminate abnormally.
// Therefore, we will use a depth first search from the start of any given
// cleanup block and stop searching when we find the next selector comparison.
//
// The catch handlers may also have any control structure, but we are only
// interested in the start of the catch handlers, so we don't need to
actually
// follow the flow of the catch handlers. The start of the catch handlers can
// be located from the compare instructions, but they can be skipped in the
// flow by following the contrary branch.
//
This has a flaw in that cleanup code could contain blocks that are shared with
control flows outside the cleanup block and use phi-dependent branching. If the
shared flow isn’t part of another cleanup handler this is only an efficiency
problem as the non-cleanup flow will reach a landingpad or a function terminator
before it hits a selector. However, if the shared block is shared with another
cleanup handler, it could lead to a catch dispatch that has nothing to do with
the current block.
It turns out that the cloning code handles this problem of shared blocks really
well. It just clones everything and prunes the unwanted code paths based on phi
resolution and subsequent instruction simplification. I suppose I could clone
the entire landing pad to do the block mapping and just discard the clone when
I’m done, but that seems really horrific as a way of resolving this corner case
problem. What I’d like is to have a way to determine whether the successors of
a block are phi-dependent and if so find a way to limit the search to paths that
result from predecessors I’ve already seen. I can do this easily enough for
simple phi dependence (like constant phi values and a single comparison to
choose a branch target) but I haven’t tried to see if the process I have in mind
work would for arbitrarily complex cases. If you have any suggestions on this
(like maybe a “isSuccessorPhiDependent()” function tucked away somewhere that I
haven’t noticed), I’d love to hear them.
I think complex cases are unlikely and should be handled conservatively by
cloning. A good example of the simple case is the way we lower __finally.
Internally Clang has the invariant that it cannot double-emit local declarations
twice while emitting a function. So for this test case, we had to make the
exceptional cleanup path branch into normal path with a phi, and then have a
conditional branch back over to the exceptional path.
// C++:
void f() {
__try { might_crash(); }
__finally {
mylabel: // better not emit twice!
int r; // better not emit twice!
r = do_cleanup();
if (!r)
goto mylabel;
}
}
// Simplified IR:
define void @f() {
%abnormal_termination = alloca i8
invoke void @might_crash()
to label %cont unwind label %lpad
cont:
store i8 0, i8* %abnormal_termination
br label %__finally
lpad:
landingpad { i8*, i32 } personality i32 (...)* @__C_specific_handler
cleanup
store i8 1, i8* %abnormal_termination
br label %__finally
__finally: ; code simplified for brevity
%r = call i1 @do_cleanup()
br i1 %r, label %__finally, label %done
done:
%ab = load i8* %abnormal_termination
%tobool = icmp eq i8 %ab, i8 0
br i1 %tobool, label %ret, label %resume
ret:
ret void
resume:
resume
}
You can see at -O0 we have these loads and stores, but at -O1 we'll have a
phi-dependent branch that's really easy to analyze. It'd be good if the
preparation pass could understand this much, and no more. :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150213/474cdfe2/attachment.html>