Bob Wilson

2014-Nov-18 01:22 UTC

[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR

I don’t know much about SEH and haven’t had time to really dig into this, but
the idea of outlining functions that need to know about the frame layout sounds
a bit scary. Is it really necessary?

I’m wondering if you can treat the cleanups and filter functions as portions of
the same function, instead of outlining them to separate functions. Can you
arrange to set up the base pointer on entry to one of those segments of code to
have the same value as when running the normal part of the function? If so, from
the code-gen point of view, doesn’t it just behave as if there is a large
dynamic alloca on the stack at that point (because the stack pointer is not
where it was when the function was previously running)? Are there other
constraints that prevent that from working?
> On Nov 13, 2014, at 5:28 PM, Reid Kleckner <rnk at google.com> wrote:
> 
> On Thu, Nov 13, 2014 at 5:19 PM, Kaylor, Andrew <andrew.kaylor at
intel.com <mailto:andrew.kaylor at intel.com>> wrote:
> I don’t really have a good enough feeling for the landingpad syntax yet to
comment on the most natural way to extend it, but creating a synthetic
cleanup function to call from the personality function is what I was thinking.
> 
> 
> Pretty much.
>  
> With the current (trunk +/- a couple of weeks) clang, compiling for an
“x86_64-pc-windows-msvc” target, I’m seeing a landingpad that looks like this:
> 
>  
> 
> lpad:                                             ; preds = %if.end, %if.then
>   %2 = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
>           cleanup
>   %3 = extractvalue { i8*, i32 } %2, 0
>   store i8* %3, i8** %exn.slot
>   %4 = extractvalue { i8*, i32 } %2, 1
>   store i32 %4, i32* %ehselector.slot
>   call void @"\01??1Bob@@QEAA@XZ"(%class.Bob* %bob) #3  ; Calling the destructor for a class named “Bob”
>   br label %eh.resume
> 
>  
> 
> Replacing __gxx_personality_v0 with the name of my custom personality
function (which has the SEH signature) and scrubbing out the terminate and
resume calls for the time being, I see my personality function being called
twice -- first for the C++ exception (Exception code == 0xe06d7363) and once for
the unwind.  So now I just need to figure out how to get a pointer to a cleanup
function into the DispatcherContext->HandlerData, which must be where the
extra stuff in the landingpad comes in, right?
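
The exception code quoted above is not arbitrary: it is 0xE0 in the top byte followed by the ASCII letters "msc". The sketch below rebuilds it and adds two small predicates a custom personality function might use to tell the two calls apart. The helper names are hypothetical, and it assumes the second call is the unwind pass, which Windows marks with the EXCEPTION_UNWINDING flag (0x2).

```cpp
#include <cassert>
#include <cstdint>

// EXCEPTION_UNWINDING bit of ExceptionFlags, as defined in the Windows SDK.
constexpr std::uint32_t kExceptionUnwinding = 0x2;

// Builds the MSVC C++ exception code: 0xE0 in the top byte, then "msc".
constexpr std::uint32_t msvcCxxExceptionCode() {
    return (0xE0u << 24) |
           (static_cast<std::uint32_t>('m') << 16) |
           (static_cast<std::uint32_t>('s') << 8) |
           static_cast<std::uint32_t>('c');
}

// Hypothetical helper: true when the exception code identifies a C++ throw.
constexpr bool isMsvcCxxException(std::uint32_t code) {
    return code == msvcCxxExceptionCode();
}

// Hypothetical helper: true on the second (unwind) call into the handler.
constexpr bool isUnwindPass(std::uint32_t exceptionFlags) {
    return (exceptionFlags & kExceptionUnwinding) != 0;
}

static_assert(msvcCxxExceptionCode() == 0xE06D7363u,
              "matches the exception code quoted in the message");
```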
> 
> 
> It's got some docs here:
> http://llvm.org/docs/ExceptionHandling.html#overview
<http://llvm.org/docs/ExceptionHandling.html#overview>
> 
> The two values are the exception pointer and the selector value. The
selector value is an artifact of the way we model the Itanium EH scheme, and you
can basically set it to zero if you only want to deal with cleanups for the time
being. The exception pointer is presumably pulled from the arguments to the
personality routine. Again, cleanups don't need it, so you can probably zero
it too.
> 
> Anyway, I think I’m making progress. :-)
> 
> 
> Nice! 
> 
> 
> -Andy
> 
>  
> 
>  
> 
> From: Reid Kleckner [mailto:rnk at google.com <mailto:rnk at
google.com>]
> Sent: Thursday, November 13, 2014 4:22 PM
> 
> 
> To: Kaylor, Andrew
> Cc: LLVM Developers Mailing List; John McCall
> Subject: Re: [LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM
IR
> 
>  
> 
> Focusing on cleanups is probably a good way to start. The trouble is that
your personality function can't just reset rsp and jump to the landing pad,
or it will trash the state of the unwinder that's still on the stack.
Everything in the landing pad basically has to be outlined. If the outlining
happens at the IR level, we need some way to represent that, and I don't
really have it nailed down.
> 
>  
> 
> Here's an idea, just to brainstorm:
> 
>  
> 
> define void @parent() {
>   invoke ... unwind to %lpad
>   ...
> lpad:
>   %eh_vals = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)* @__C_specific_handler to i8*)
>       cleanup
>       catch i8* @typeid1
>       catch i8* @typeid2
>   %label = call i8* (...)* @llvm.eh.outlined_handlers(
>       void (i8*, i8*)* @my_cleanup,
>       i8* @typeid1, i8* (i8*, i8*)* @my_catch1,
>       i8* @typeid2, i8* (i8*, i8*)* @my_catch2)
>   indirectbr i8* %label
>
> endcatch:
>   ...
> }
>
> define void @my_cleanup(i8*, i8*) {
>   ...
>   ret void ; unwinder will keep going for cleanups
> }
>
> define i8* @my_catch1(i8*, i8*) {
>   ret i8* blockaddress(@parent, %endcatch) ; merge back into normal flow at endcatch
> }
>
> define i8* @my_catch2(i8*, i8*) {
>   ret i8* blockaddress(@parent, %endcatch) ; merge back into normal flow at endcatch
> }
> 
>  
> 
> I guess @llvm.eh.outlined_handlers wouldn't be valid outside a landing
pad, and would only be introduced during CodeGenPrepare to allow the best
optimization of the handlers in the context of the parent function.
> 
>  
> 
> On Thu, Nov 13, 2014 at 3:25 PM, Kaylor, Andrew <andrew.kaylor at
intel.com <mailto:andrew.kaylor at intel.com>> wrote:
> 
> Thanks for the additional information.
> 
>  
> 
> Right now I’m experimenting with a mix of code compiled with MSVC and code
compiled with clang, trying to get a C++ exception thrown and caught by the
MSVC-compiled code across a function in the clang-compiled code.  My goal here
is to isolate a small part of what needs to be done in a way that lends itself
to tinkering.  I think this might lead me to the outlining of EH blocks that you
describe below.
> 
>  
> 
> If the clang code doesn’t have an exception handler (and it can’t since
clang won’t compile that right now) and doesn’t need to do any clean-up, this
works fine.  If the clang code does need to do cleanup, clang currently emits
the same landingpad stuff that it would emit for mingw and since I’m trying to
link with the MSVC environment I end up with unresolved externals.  So I’m
playing around with the clang-generated IR to see if I can turn it into
something that will handle the cleanup and let the exception pass.  I’ve got it
calling my custom SEH-style personality function and it’s trivial to get that to
let the exception pass without doing the cleanup.  Now I just need to figure out
how to get it to execute the cleanup code.
> 
>  
> 
> I haven’t spent a lot of time on this yet, so if this overlaps with what
you’ve been doing I can step back and approach it from a different direction. 
Otherwise, I’ll proceed and see if I can make use of your suggestions below with
regard to outlining, probably starting with manual changes to the IR that
simulate the process.
> 
>  
> 
> -Andy
> 
>  
> 
>  
> 
> From: Reid Kleckner [mailto:rnk at google.com <mailto:rnk at
google.com>]
> Sent: Thursday, November 13, 2014 11:51 AM
> To: Kaylor, Andrew
> Cc: LLVM Developers Mailing List; John McCall
> Subject: Re: [LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM
IR
> 
>  
> 
> Cool! Apologies for the following stream of consciousness brain dump...
> 
>  
> 
> On Wed, Nov 12, 2014 at 5:07 PM, Kaylor, Andrew <andrew.kaylor at
intel.com <mailto:andrew.kaylor at intel.com>> wrote:
> 
> Hi Reid,
> 
>  
> 
> I’ve been following your proposal, and I’d be interested in helping out if
I can.  My main interest right now is in enabling C++ exception handling in
clang for native (i.e. not mingw/cygwin) Windows targets (both 32-bit and
64-bit), but if I understand things correctly that will be closely related to
your SEH work under the hood.
> 
>  
> 
> Great! I agree, any changes to LLVM IR made to support SEH will also be
needed to support C++ exceptions on Windows, in particular the outlining.
> 
>  
> 
> In the current LLVM model, all the exception handling code lives in the
landing pad. The Windows unwinder doesn't actually return control to the
landingpad until very late. Instead, it creates new stack frames to invoke the
cleanup, catch handler (C++ EH only), or filter function (SEH only). This is why
we need to have outlining somewhere. The question is, where should we do it?
Personally, I want to do this on LLVM IR during CodeGenPrepare.
> 
>  
> 
> The major challenge that outlining anywhere presents is that now the
outlined code has to "know" something about the frame layout of the
function it was outlined from in order to access local variables. I think we can
add `i8* @llvm.eh.get_capture_block(i8* %function, i8* %parent_rbp)` and `void
@llvm.eh.set_capture_block(i8* %captures)` intrinsics to make this work. Any SSA
values or allocas captured by the outlined landing pad code will be demoted to
memory and stored in the capture block, and the layout will be encoded in a
struct used by the outlined handlers and the parent function. However, once you
do this, you cannot inline the IR without some heroics. It probably isn't
that important to be able to inline functions with try/catch, but a good acid
test for any new LLVM IR construct is "will it inline?", and this
construct fails. I think we can live with this construct as long as we only
introduce it after CodeGenPrepare.
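
To make the capture block idea concrete, here is a rough C++ analogy, not LLVM code. Locals that the handler needs are demoted into one struct whose layout both the parent and the outlined handler agree on, and the handler receives only a pointer to it. The names ParentCaptures, outlinedCleanup and parent are made up for illustration, and the direct call stands in for what the unwinder would do.

```cpp
#include <cassert>

// Hypothetical layout shared by the parent function and its outlined handlers.
struct ParentCaptures {
    int counter;       // stands in for an SSA value demoted to memory
    const char *name;  // stands in for a captured alloca
};

// Stand-in for an outlined cleanup. It never sees the parent's locals
// directly, only the capture block pointer it is handed.
void outlinedCleanup(void *captureBlock) {
    assert(captureBlock != nullptr);
    ParentCaptures *captures = static_cast<ParentCaptures *>(captureBlock);
    captures->counter += 1;
}

int parent() {
    // Everything the handler touches lives in the block, not in registers.
    ParentCaptures captures = {41, "bob"};
    // In the proposed scheme the unwinder would make this call on a new frame.
    outlinedCleanup(&captures);
    return captures.counter;
}
```

Once the locals live behind an opaque pointer like this, inlining parent into a caller would need to merge or rewrite the block layout, which is the inlining difficulty described above.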
> 
>  
> 
> The remaining wrinkle in the capture block scheme is stack realignment
prologues. In this case, we have three pointers to the stack: the SP, the base
pointer (esi/rbx), and the frame pointer (ebp/rbp). Is the capture block stored
at a known constant offset from ebp/rbp or esi/rbx? Or do we load and store a
dynamic offset saved somewhere near ebp/rbp? This needs study.
> 
>  
> 
> I’m still trying to get up to speed on what is and is not implemented, but
I think I’m starting to get a clear picture.  My understanding is that LLVM has
the necessary support to emit exception handling records that Windows will be
able to work with (for Win64 EH) but some work may be required to get the IR
properly wired up, and that there’s basically nothing in place to support Win32
EH and nothing in clang to generate the IR for either case.  Is that more or
less accurate?
> 
>  
> 
> We can emit valid pdata and xdata sections on Win64, and this supports
basic stack unwinding. On top of that, we currently follow mingw64 and use
Itanium-style LSDA tables and the __gxx_personality_seh0 personality function to
run EH handlers. This means the standard exception handling IR emitted by clang
and other frontends "just works" on Windows, and I want to keep it
that way. I think most of the changes should be on the LLVM side to lower the
standard EH IR down to something that is more compatible with MSVC EH.
> 
>  
> 
> I’ve been looking at the work Kai Nacke did in ldc to implement exception
handling there, but it isn’t clear to me yet how relevant that is to clang.
> 
>  
> 
> Can you tell me more about what your plans are?  Specifically, do you
intend to support both 32 and 64 bit targets?  And were you also planning to
work toward C++ exception handling support in clang once you had the general SEH
support in place?
> 
>  
> 
> I want to do Win64 first because it is easier and better documented, and
then look at 32-bit next. 32-bit SEH does things like "take the address of
a BB label from the middle of the parent function and 'call' it with a
special ebp value passed in", but that is basically equivalent to the Win64
way of doing things with a very special calling convention.
> 
>  
> 
> I know some people are also interested in ARM (WoA), which should be
similar to Win64, as it also uses pdata/xdata style unwind info.
> 
>  
> 
> Finally, and most importantly, what can I do to help?
> 
>  
> 
> I think there are some separable tasks here. 
> 
>  
> 
> The EH capture block intrinsics can probably be built in isolation from the
outlining. We can probably make `get_capture_block` work with the result of
`@llvm.frameaddress(i32 0)`. The inliner also has to be taught to avoid inlining
functions that set up a capture block.
> 
>  
> 
> Doing outlining will be similar to what `llvm::CloneAndPruneFunctionInto`
does, except it will start at the landing pad instead of the entry block.
Instead of mapping from parameters to arguments, the outliner would map the
selector to a constant and propagate that value forwards, pruning conditional
branches as it goes. The `resume` instruction would end outlining and become a
`ret`. Any cloned `ret` instructions are the result of cloning something that is
statically reachable but dynamically unreachable. We can transform them to
`unreachable` and run standard cleanup passes to propagate that backwards.
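
The outlining steps above can be sketched on a toy control-flow graph. This is a simplified model and does not use any LLVM API: Block, Term and outlineHandler are hypothetical names, and each conditional branch is assumed to compare the selector against a single constant. Starting from the landing pad, the walk folds selector comparisons for a fixed selector value, clones only the blocks that remain reachable, turns `resume` into `ret`, and turns a cloned `ret` into `unreachable`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class Term { Br, CondBrOnSelector, Resume, Ret, Unreachable };

struct Block {
    Term term;
    int selectorValue;       // constant compared against the selector
    std::string ifEqual;     // Br target, or edge taken when selector matches
    std::string ifNotEqual;  // edge taken when selector does not match
};

using Function = std::map<std::string, Block>;

// Clones the blocks reachable from landingPad for one known selector value.
Function outlineHandler(const Function &parent, const std::string &landingPad,
                        int selector) {
    Function outlined;
    std::vector<std::string> worklist{landingPad};
    while (!worklist.empty()) {
        std::string name = worklist.back();
        worklist.pop_back();
        if (outlined.count(name)) continue;  // already cloned
        assert(parent.count(name) && "branch target must exist");
        Block b = parent.at(name);
        switch (b.term) {
        case Term::CondBrOnSelector: {
            // The selector is a constant here, so only one edge survives.
            std::string target =
                (b.selectorValue == selector) ? b.ifEqual : b.ifNotEqual;
            b = Block{Term::Br, 0, target, ""};
            worklist.push_back(target);
            break;
        }
        case Term::Br:
            worklist.push_back(b.ifEqual);
            break;
        case Term::Resume:
            b.term = Term::Ret;  // resume ends the outlined handler
            break;
        case Term::Ret:
            b.term = Term::Unreachable;  // statically reachable only
            break;
        case Term::Unreachable:
            break;
        }
        outlined.emplace(name, b);
    }
    return outlined;
}
```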
> 
>  
> 
> 32-bit x86 EH will require installing an alloca onto the fs:00 chain of EH
handlers. I suppose this could be emitted during CodeGenPrepare as regular LLVM
IR instructions, since we have a way of writing `load/store fs:00` with address
space 257. This alloca should probably be the same as the capture block, since
it has to be at some known offset from ebp.
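
As a portable illustration of the fs:00 idea, the sketch below models the handler chain as a linked list whose head is a thread-local pointer (standing in for fs:[0]). The registration record is the first member of the block that also holds captured state, so a handler can recover that state from the record pointer alone. All names are hypothetical, and the handler signature is simplified to one argument; the real 32-bit handler takes more.

```cpp
#include <cassert>

// Simplified stand-in for a 32-bit SEH registration record.
struct RegistrationRecord {
    RegistrationRecord *next;
    void (*handler)(RegistrationRecord *);
};

// Stand-in for the fs:[0] slot that holds the head of the chain.
thread_local RegistrationRecord *chainHead = nullptr;

// The record sits at offset 0 of the block that also holds captured state.
struct ParentFrameBlock {
    RegistrationRecord reg;  // must stay the first member
    int cleanupsRun;
};

void parentHandler(RegistrationRecord *rec) {
    assert(rec != nullptr);
    // Valid because reg is the first member of a standard-layout struct.
    ParentFrameBlock *block = reinterpret_cast<ParentFrameBlock *>(rec);
    block->cleanupsRun += 1;
}

int runParent() {
    ParentFrameBlock block = {{chainHead, &parentHandler}, 0};
    chainHead = &block.reg;  // install on entry
    // Simulate the dispatcher walking the chain and calling each handler.
    for (RegistrationRecord *r = chainHead; r != nullptr; r = r->next) {
        r->handler(r);
    }
    chainHead = block.reg.next;  // uninstall on exit
    return block.cleanupsRun;
}
```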
> 
>  
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Reid Kleckner

2014-Nov-18 01:50 UTC

[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR

On Mon, Nov 17, 2014 at 5:22 PM, Bob Wilson <bob.wilson at apple.com>
wrote:
> I don’t know much about SEH and haven’t had time to really dig into this,
> but the idea of outlining functions that need to know about the frame
> layout sounds a bit scary. Is it really necessary?
>
> I’m wondering if you can treat the cleanups and filter functions as
> portions of the same function, instead of outlining them to separate
> functions. Can you arrange to set up the base pointer on entry to one of
> those segments of code to have the same value as when running the normal
> part of the function? If so, from the code-gen point of view, doesn’t it
> just behave as if there is a large dynamic alloca on the stack at that
> point (because the stack pointer is not where it was when the function was
> previously running)? Are there other constraints that prevent that from
> working?
>
The "big dynamic alloca" approach does work, at least conceptually. It's
more or less what MSVC does. They emit the normal code, then the epilogue,
then a special prologue that resets ebp/rbp, and then continue with normal
emission. Any local variables declared in the __except block are allocated
in the parent frame and are accessed via ebp. Any calls create new stack
adjustments to allocate new argument memory.

This approach sounds far scarier to me, personally, and will significantly
complicate a part of LLVM that is already poorly understood and hard to
hack on. I think adding a pair of intrinsics that can't be inlined will be
far less disruptive for the rest of LLVM. This is actually already the
status quo for SjLj exceptions, which introduce a number of uninlinable
intrinsic calls (although maybe SjLj is a bad precedent :).

The way I see it, it's just a question of how much frame layout information
you want to teach CodeGen to save. If we add the set_capture_block /
get_capture_block intrinsics, then we only need to save the frame offset of
*one* alloca. This is easy, we can throw it into a side table on
MachineModuleInfo. If we don't go this way, we need to save just the right
amount of CodeGen state to get stack offsets in some other function.
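
A minimal sketch of the "one alloca offset" bookkeeping, under loud assumptions: the side table is modelled as a plain map keyed by function name rather than anything on MachineModuleInfo, and captureBlockOffsets and getCaptureBlock are hypothetical names. The point is only that recovering the capture block needs nothing beyond the parent's frame pointer and one recorded offset.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical side table: one capture-block frame offset per parent function.
std::map<std::string, std::ptrdiff_t> captureBlockOffsets;

// Sketch of what a get_capture_block lowering could compute: the parent's
// frame pointer plus the single offset recorded for that function.
void *getCaptureBlock(const std::string &function, char *parentFramePointer) {
    assert(parentFramePointer != nullptr);
    return parentFramePointer + captureBlockOffsets.at(function);
}
```

For example, with an offset of -32 recorded for "parent" and a frame pointer 48 bytes into a buffer, the call returns the address 16 bytes into that buffer.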

Having a single combined MachineFunction also means that MI passes will
have to learn more about SEH. For example, we need to preserve the ordering
of basic blocks so that we don't end up with discontiguous regions of code.

Bob Wilson

2014-Nov-18 18:50 UTC

[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR

> On Nov 17, 2014, at 5:50 PM, Reid Kleckner <rnk at google.com> wrote:
> 
> On Mon, Nov 17, 2014 at 5:22 PM, Bob Wilson <bob.wilson at apple.com
<mailto:bob.wilson at apple.com>> wrote:
> I don’t know much about SEH and haven’t had time to really dig into this,
but the idea of outlining functions that need to know about the frame layout
sounds a bit scary. Is it really necessary?
> 
> I’m wondering if you can treat the cleanups and filter functions as
portions of the same function, instead of outlining them to separate functions.
Can you arrange to set up the base pointer on entry to one of those segments of
code to have the same value as when running the normal part of the function? If
so, from the code-gen point of view, doesn’t it just behave as if there is a
large dynamic alloca on the stack at that point (because the stack pointer is
not where it was when the function was previously running)? Are there other
constraints that prevent that from working?
> 
> The "big dynamic alloca" approach does work, at least
conceptually. It's more or less what MSVC does. They emit the normal code,
then the epilogue, then a special prologue that resets ebp/rbp, and then
continue with normal emission. Any local variables declared in the __except
block are allocated in the parent frame and are accessed via ebp. Any calls
create new stack adjustments to allocate new argument memory.
> 
> This approach sounds far scarier to me, personally, and will significantly
complicate a part of LLVM that is already poorly understood and hard to hack on.
I think adding a pair of intrinsics that can't be inlined will be far less
disruptive for the rest of LLVM. This is actually already the status quo for
SjLj exceptions, which introduce a number of uninlinable intrinsic calls
(although maybe SjLj is a bad precedent :).
> 
> The way I see it, it's just a question of how much frame layout
information you want to teach CodeGen to save. If we add the set_capture_block /
get_capture_block intrinsics, then we only need to save the frame offset of
*one* alloca. This is easy, we can throw it into a side table on
MachineModuleInfo. If we don't go this way, we need to save just the right
amount of CodeGen state to get stack offsets in some other function.

This is the only part that concerns me. Who keeps track of the layout of the
data inside that capture block? How do you know what local variables need to be
in the capture block? If the front-end needs to decide that, is that something
that fits easily into how clang works?

For DWARF EH and SjLj, the backend is responsible for handling most of the EH
work. It seems like it would be a more consistent design for SEH to do the same.
> 
> Having a single combined MachineFunction also means that MI passes will
have to learn more about SEH. For example, we need to preserve the ordering of
basic blocks so that we don't end up with discontiguous regions of code.

Yes, you would probably need to do that. It doesn’t seem like that would be
fundamentally difficult, but I haven’t thought through the details and I can
imagine that it would take a fair bit of work.
