thr3ads.net - llvm dev - [LLVMdev] RFC: New EH representation for MSVC compatibility [May 2015]


Reid Kleckner

2015-May-18 21:35 UTC

[LLVMdev] RFC: New EH representation for MSVC compatibility

On Mon, May 18, 2015 at 12:03 PM, Joseph Tremoulet <jotrem at microsoft.com> wrote:
>  Hi,
>
>
>
> Thanks for sending this out.  We're looking forward to seeing this come
> about, since we need funclet separation for LLILC as well (and I have
> cycles to spend on it, if that would be helpful).
>
>
>
> Some questions about the new proposal:
>
>
>
> - Do the new forms of resume have any implied read/write side-effects, or
> do they work just like a branch?  In particular, I'm wondering what
> prevents reordering a call across a resume.  Is this just something that
> code motion optimizations are expected to check for explicitly to avoid
> introducing UB per the "Executing such an invoke [or call] that does not
> transitively unwind to the correct catchend block has undefined behavior"
> rule?
>
Yes, crossing a resume from a catchblock ends the lifetime of the exception
object, so I'd say that's a "writes escaped memory" constraint. That said,
a resume after a cleanupblock doesn't, but I'm not sure it's worth having
this kind of fine-grained analysis. I'm OK teaching SimplifyCFG to combine
cleanupblocks and leaving it at that.

> - Does LLVM already have other examples of terminators that are glued to
> the top of their basic blocks, or will these be the first?  I ask because
> it's a somewhat nonstandard thing (a block in the CFG that can't have
> instructions added to it) that any code placement algorithms (PRE, PGO
> probe insertion, Phi elimination, RA spill/copy placement, etc.) may need
> to be adjusted for.  The adjustments aren't terrible (conceptually it's no
> worse than having unsplittable edges from each of the block's preds to each
> of its succs), but it's something to be aware of.
>
No, LLVM doesn't have anything like this yet. It does have unsplittable
critical edges, which can come from indirectbr and the unwind edge of an
invoke. I don't think it'll be too hard to teach transforms how to deal
with one more, but maybe that's unrealistic youthful optimism. :)

> - Since this will require auditing any code with special processing of
> resume instructions to make sure it handles the new resume forms correctly,
> I wonder if it might be helpful to give resume (or the new forms of it) a
> different name, since then it would be immediately clear which code
> has/hasn't been updated to the new model.
>
There aren't that many references to ResumeInst across LLVM, so I'm not too
scared. I'm not married to reusing 'resume', other candidate names include
'unwind' and 'continue', and I'd like more ideas.

> - Is the idea that a resume (of the sort that resumes normal execution)
> ends only one catch/cleanup, or that it can end any number of them?  Did
> you consider having it end a single one, and giving it a source that
> references (in a non-flow-edge-inducing way) the related catchend?  If you
> did that, then:
>
> + The code to find a funclet region could terminate with confidence when
> it reaches this sort of resume, and
>
> + Resumes which exit different catches would have different sources and
> thus couldn't be merged, reducing the need to undo tail-merging with code
> duplication in EH preparation (by blocking the tail-merging in the first
> place)
>
We already have something like this for cleanupblocks because the resume
target and unwind label of the cleanupblock must match. It isn't as strong
as having a reference to the catchblock itself, because tail merging could
kick in like you mention. Undoing this would be and currently is the job of
WinEHPrepare. I guess I felt like the extra representational complexity
wasn't worth the confidence that it would buy us.

> - What is the plan for cleanup/__finally code that may be executed on
> either normal paths or EH paths?  One could imagine a number of options
> here:
>
> + require the IR producer to duplicate code for EH vs non-EH paths
>
> + duplicate code for EH vs non-EH paths during EH preparation
>
> + use resume to exit these even on the non-EH paths; code doesn't need to
> be duplicated (but could and often would be as an optimization for
> hot/non-EH paths), and normal paths could call the funclet at the end of
> the day
>
> and it isn't clear to me which you're suggesting.  Requiring duplication
> can worst-case quadratically expand the code (in that if you have n levels
> of cleanup-inside-cleanup-inside-cleanup-…, and each cleanup has k code
> bytes outside the next-inner cleanup, after duplication you'll have k*n +
> k*(n-1) + … or O(k*n^2) bytes total [compared to k*n before duplication]),
> which I'd think could potentially be a problem in pathological inputs.
>
I want to have separate normal and exceptional codepaths, but at -O0 all
the cleanup work should be bundled up in a function that gets called from
both those paths.

Today, for C++ destructors, we emit two calls to the destructor: one on the
normal path and one on the EH path. For __finally, we outline the finally
body early in clang and emit two calls to it as before, but passing in the
frameaddress as an argument. I think this is a great place to be. It keeps
our -O0 code size small, simplifies the implementation, and allows us to
inline one or both call sites if we think it's profitable.
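
The worst-case growth raised in the quoted question above (k*n + k*(n-1) + … versus k*n) can be checked with a few lines of Python. This is only a sketch of the arithmetic as stated in the thread; the function names and the sample values of n and k are made up for illustration and do not model any real compiler output.

```python
def size_before(n: int, k: int) -> int:
    """Bytes of cleanup code with no duplication: n levels, k bytes each."""
    return k * n


def size_after(n: int, k: int) -> int:
    """The sum quoted in the thread: k*n + k*(n-1) + ... + k*1."""
    return sum(k * level for level in range(1, n + 1))


if __name__ == "__main__":
    k = 8  # hypothetical bytes per cleanup level
    for n in (1, 4, 16, 64):
        print(f"n={n:3d}  before={size_before(n, k):6d}  after={size_after(n, k):6d}")
```

The sum has the closed form k*n*(n+1)/2, which is where the O(k*n^2) bound comes from.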

Joseph Tremoulet

2015-May-19 03:40 UTC


> I want to have separate normal and exceptional codepaths

I assume you at least mean that's what you'll be doing in Clang's initial IR
generation.  Do you also mean to impose this as a restriction on other IR
generators, and as a property that IR transformations must preserve?  I.e.,
is this something that EH preparation can assume?

> For __finally, we outline the finally body early in clang and emit two
> calls to it as before, but passing in the frameaddress as an argument

But then you have to frameescape any __finally-referenced local before
optimization, and doesn't that defeat the purpose of delaying funclet
outlining to EH preparation?

> tail merging could kick in like you mention. Undoing this would be and
> currently is the job of WinEHPrepare. I guess I felt like the extra
> representational complexity wasn't worth the confidence that it would buy us

For one, it seems counterproductive to let tail merge think it can kick in
when it's doomed to be undone.

For another, if we're talking about a setup where EH paths might mingle with
non-EH paths but the funclet will only be invoked for the EH cases, then I
believe this would help the "pruning as many unreachable CFG edges as
possible" step be more effective -- after finding out which blocks are
reachable from a catch/cleanup, you could intersect that with the set of
blocks from which a corresponding resume can be reached.  Any funclet blocks
ending in condbr/switch that only wind up with one successor in the funclet
could then have their terminators rewritten as unconditional branches,
without needing to recover dataflow and chase constants through phis and
resolve compares/switches and all that.
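
The intersection described in the preceding paragraph can be sketched in Python on a toy CFG. This is a minimal illustration, not WinEHPrepare's algorithm: the CFG is a plain dict of successor lists, and the block names and helper functions (`reachable`, `funclet_blocks`, `prunable_branches`) are all hypothetical.

```python
from collections import defaultdict


def reachable(start, edges):
    """Return every node reachable from `start` by following `edges`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return seen


def funclet_blocks(succs, entry, resume):
    """Blocks reachable from `entry` that can also reach `resume`."""
    preds = defaultdict(list)
    for src, dsts in succs.items():
        for dst in dsts:
            preds[dst].append(src)
    return reachable(entry, succs) & reachable(resume, preds)


def prunable_branches(succs, members):
    """Map each multi-way branch in the funclet to its only in-funclet successor."""
    result = {}
    for block in members:
        targets = succs.get(block, [])
        inside = [t for t in targets if t in members]
        if len(targets) > 1 and len(inside) == 1:
            result[block] = inside[0]
    return result


# Toy CFG: cleanup code shared by an EH entry and a normal-path entry.
succs = {
    "cleanup": ["body"],         # EH entry
    "normal": ["body"],          # non-EH entry into the same code
    "body": ["check"],
    "check": ["resume", "ret"],  # conditional branch: EH resumes, normal returns
    "resume": [],
    "ret": [],
}

members = funclet_blocks(succs, "cleanup", "resume")
print(sorted(members))
print(prunable_branches(succs, members))
```

In this toy graph, "ret" is reachable from the cleanup entry but cannot reach the resume, so it drops out of the funclet, and the branch in "check" is left with a single in-funclet successor without any dataflow reasoning.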

> I'm not married to reusing 'resume', other candidate names include
> 'unwind' and 'continue', and I'd like more ideas

The first thing that comes to mind is 'endcatch/exitcatch', but to use that
you'd need to rename other things since it would be confusing vis-à-vis
catchendblock and lack symmetry with catchblock that isn't prefixed with
begin/enter.

You could consider 'filter' (or 'filterblock') for 'catchblock', since
conceptually it plays the role of a filter (typically one which consults type
information; I've seen such things called "typetestfilter" before).  Or
'dispatch'/'dispatchblock'/'exceptiondispatch'/'dispatchexception' (isn't
that what Clang names the blocks it creates for the explicit dispatch code?);
'catchendblock' would then be something like 'unwinddispatch' or
'continuedispatch' or 'resumedispatch' and the resume that returns to normal
execution could be 'exitdispatch' or 'exitcatch' or even 'uncatch'.

For the resumes that end cleanups, something like 'endcleanup' might work.

Names are hard…


Thanks
-Joseph



Reid Kleckner

2015-May-19 19:39 UTC


On Mon, May 18, 2015 at 8:40 PM, Joseph Tremoulet <jotrem at microsoft.com> wrote:
>  > I want to have separate normal and exceptional codepaths
>
> I assume you at least mean that's what you'll be doing in Clang's initial
> IR generation.  Do you also mean to impose this as a restriction on other
> IR generators, and as a property that IR transformations must preserve?
> I.e., is this something that EH preparation can assume?
>
EH preparation should not assume that each basic block lives in exactly one
funclet. People seem to get really upset when I suggest that we change the
language rules to maintain that invariant. :-)

Instead, the EH preparation pass is responsible for establishing that
invariant by duplicating blocks, but it should be *way* easier with real
basic block terminator instructions to mark handler transitions.
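
As a rough illustration of "establishing that invariant by duplicating blocks", the following Python sketch colors each block with the funclet entries that reach it and then clones any block shared by more than one. It is a simplified model under stated assumptions, not the actual EH preparation pass: the CFG is a dict of successor lists, the function entry is treated as one more funclet entry, and the names `color_blocks`, `clone_per_funclet`, and the `block@owner` naming scheme are invented for the example.

```python
def color_blocks(succs, entries):
    """Map each block to the set of funclet entries that can reach it.

    A walk from one entry stops at any other entry, because that block
    begins a different funclet.
    """
    colors = {}
    for entry in sorted(entries):
        stack, seen = [entry], set()
        while stack:
            block = stack.pop()
            if block in seen:
                continue
            if block != entry and block in entries:
                continue  # another funclet starts here; do not walk into it
            seen.add(block)
            colors.setdefault(block, set()).add(entry)
            stack.extend(succs.get(block, ()))
    return colors


def clone_per_funclet(succs, entries):
    """Return a CFG in which every block belongs to exactly one funclet.

    Each block is emitted once per owning entry, named "block@owner".
    Edges to a funclet entry always point at that entry's own copy.
    """
    colors = color_blocks(succs, entries)
    cloned = {}
    for block, owners in colors.items():
        for owner in sorted(owners):
            cloned[f"{block}@{owner}"] = [
                f"{t}@{t}" if t in entries else f"{t}@{owner}"
                for t in succs.get(block, ())
            ]
    return cloned


# Toy CFG: "shared" and "exit" are reachable from both the function entry
# and a cleanup entry, so each is duplicated.
succs = {
    "entry": ["shared"],
    "cleanup": ["shared"],
    "shared": ["exit"],
    "exit": [],
}
entries = {"entry", "cleanup"}

colors = color_blocks(succs, entries)
cloned = clone_per_funclet(succs, entries)
for name in sorted(cloned):
    print(name, "->", cloned[name])
```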

> > For __finally, we outline the finally body early in clang and emit two
> > calls to it as before, but passing in the frameaddress as an argument
>
> But then you have to frameescape any __finally-referenced local before
> optimization, and doesn't that defeat the purpose of delaying funclet
> outlining to EH preparation?
>
True. I'm not committed to this approach, I just want to preserve
functionality until we implement the de-commoning part of EH preparation,
which I figured we could do later, since it's one of the more obscure
corner cases.

Also, __finally is pretty rare, so it's not too crazy to teach the inliner
how to undo framerecover and let the optimization fall out that way.

> > tail merging could kick in like you mention. Undoing this would be and
> > currently is the job of WinEHPrepare. I guess I felt like the extra
> > representational complexity wasn't worth the confidence that it would
> > buy us
>
> For one, it seems counterproductive to let tail merge think it can kick in
> when it's doomed to be undone.
>
I'm only saying that the tail merging is legal, and not that it is
desirable. We can teach simplifycfg that tail merging 'resume' instructions
is a waste of time, for example.

> For another, if we're talking about a setup where EH paths might mingle
> with non-EH paths but the funclet will only be invoked for the EH cases,
> then I believe this would help the "pruning as many unreachable CFG
> edges as possible" step be more effective -- after finding out which
> blocks are reachable from a catch/cleanup, you could intersect that with
> the set of blocks from which a corresponding resume can be reached.  Any
> funclet blocks ending in condbr/switch that only wind up with one successor
> in the funclet could then have their terminators rewritten as unconditional
> branches, without needing to recover dataflow and chase constants through
> phis and resolve compares/switches and all that.
>
For discussion purposes, let's imagine that 'resume' takes a 'from' label
which must be an EH block (it starts with one of these new instructions).
The nice thing about using a label here is that, unlike most SSA values,
labels cannot be phi'd. Now tail merging will have to give up or insert an
i1 phi and a conditional branch on the resume instruction, which
realistically it won't since it's not a clear win.

Originally, I was thinking that this extra funclet cloning precision wasn't
worth it, because we'd arrange the frontend and optimizers to make it
unlikely, but I'm coming around to it. It mirrors the EH pointer value
required by Itanium EH.
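
The reason a 'from' label blocks tail merging can be shown with a small Python model. It is only a sketch of the rule described above (differing SSA values can be merged through a phi, differing labels cannot); it covers the "give up" outcome and not the alternative of an i1 phi plus a conditional branch. The `Operand` type, `try_tail_merge`, and the operand names are hypothetical and are not LLVM APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    kind: str  # "value" for an SSA value, "label" for a basic block label
    name: str


def try_tail_merge(a, b):
    """Return the phis needed to merge two operand lists, or None if illegal.

    Operands that already match need nothing. Differing SSA values get a
    phi. Differing labels make the merge impossible in this model.
    """
    if len(a) != len(b):
        return None
    phis = []
    for x, y in zip(a, b):
        if x == y:
            continue
        if x.kind == "label" or y.kind == "label":
            return None
        phis.append((x.name, y.name))
    return phis


# Itanium-style: the terminators carry an exception pointer SSA value.
itanium_a = [Operand("value", "%ehptr.a")]
itanium_b = [Operand("value", "%ehptr.b")]

# Hypothetical 'resume from label ...' form: the operand is a block label.
msvc_a = [Operand("label", "%catch.a")]
msvc_b = [Operand("label", "%catch.b")]

print(try_tail_merge(itanium_a, itanium_b))
print(try_tail_merge(msvc_a, msvc_b))
```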

> > I'm not married to reusing 'resume', other candidate names include
> > 'unwind' and 'continue', and I'd like more ideas
>
> The first thing that comes to mind is 'endcatch/exitcatch', but to use that
> you'd need to rename other things since it would be confusing vis-à-vis
> catchendblock and lack symmetry with catchblock that isn't prefixed with
> begin/enter.
>
> You could consider 'filter' (or 'filterblock') for 'catchblock', since
> conceptually it plays the role of a filter (typically one which consults
> type information; I've seen such things called "typetestfilter" before).
> Or 'dispatch'/'dispatchblock'/'exceptiondispatch'/'dispatchexception'
> (isn't that what Clang names the blocks it creates for the explicit
> dispatch code?); 'catchendblock' would then be something like
> 'unwinddispatch' or 'continuedispatch' or 'resumedispatch' and the resume
> that returns to normal execution could be 'exitdispatch' or 'exitcatch' or
> even 'uncatch'.
>
> For the resumes that end cleanups, something like 'endcleanup' might work.
>
> Names are hard…
>
'dispatch' might work for the instruction which transitions from a cleanup
to the next EH block. :) I don't really see how to work filter in,
especially since we've already used it for landingpad filters, which exist
to support exception specifications.

Chatting around the office, we came up with 'recover' and 'unwind' for
ending catch and cleanup blocks respectively.

A 'recover' instruction would end a catch block, and would target the block
where the exception is over. This instruction would modify memory, because
it destroys the exception object.

An 'unwind' instruction would end a cleanup block, and would target the
next action. I like this because I talk a lot about "unwind edges" in the
CFG, and a cleanup finishing feels like an unwind edge. I could also see
'dispatch' here.

Some possible new syntax:

recover from label %maycatch.int to label %endcatchbb
unwind from label %cleanup.obj to label %nextaction
unwind from label %cleanup.obj    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

For Itanium, tail merging is profitable and doable with phis, so we might
want to do this instead:

recover to label %endcatchbb
unwind i8* %ehptr to label %nextaction
unwind i8* %ehptr    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

Itanium requires threading the active exception pointer through to the next
action or _Unwind_Resume, so it passes along an SSA value. MSVC needs this
weird kind of control dependence instead. We could drop the 'from' token as
unnecessary, but I want to make it easier to reason about the textual
representation.