Kaylor, Andrew
2014-Nov-25 23:09 UTC
[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR
> We should also think about how to call std::terminate when cleanup dtors
> throw. The current representation for Itanium is inefficient. As a
> strawman, I propose making @__clang_call_terminate an intrinsic:
> …

That sounds like a good starting point.

> Chandler expressed strong concerns about this design, however, as
> @llvm.eh.get_capture_block adds an ordering constraint on CodeGen. Once
> you add this intrinsic, we *have* to do frame layout of
> @_Z13do_some_thingRi *before* we can emit code for all the callers of
> @llvm.eh.get_capture_block. Today, this is easy, because module order
> defines emission order, but in the great glorious future, codegen will
> hopefully be parallelized, and then we've inflicted this horrible
> constraint on the innocent.
>
> His suggestion to break the ordering dependence was to lock down the
> frame offset of the capture block to always be some fixed offset known by
> the target (i.e. ebp - 4 on x86, if we like that).

Chandler probably has a better feel for this sort of thing than I do. I can’t think of a reason offhand why that wouldn’t work, but it makes me a little nervous.

What would that look like in the IR? Would we use the same intrinsics and just lower them to use the known location?

I’ll think about this, but for now I’m happy to just proceed with the belief that it’s a solvable problem either way.

> > For C++ exception handling, we need cleanup code that executes before
> > the catch handlers and cleanup code that executes in the case of
> > uncaught exceptions. I think both of these need to be outlined for the
> > MSVC environment. Do you think we need a stub handler to be inserted in
> > cases where no actual cleanup is performed?
>
> I think it's actually harder than that, once you consider nested trys:
>
>   void f() {
>     try {
>       Outer outer;
>       try {
>         Inner inner;
>         g();
>       } catch (int) {
>         // ~Inner gets run first
>       }
>     } catch (float) {
>       // ~Inner gets run first
>       // ~Outer gets run next
>     }
>     // uncaught exception? Run ~Inner then ~Outer.
>   }

I took a look at the IR that’s generated for this example. I see what you mean. So there is potentially cleanup code before and after every catch handler, right?

Do you happen to know offhand what that looks like in the .xdata for the _CxxFrameHandler3 function?

-Andy
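For readers following the nested-try example, the destructor ordering described in the comments can be observed with a small standalone program. This is only a sketch: the logging vector, the `run_f` wrapper, and the `throw_float` parameter on `g` are hypothetical additions for illustration and are not part of the code discussed in the thread. It shows standard C++ semantics, not how any particular EH personality implements them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which destructors and handlers run.
static std::vector<std::string> trace_log;

struct Outer { ~Outer() { trace_log.push_back("~Outer"); } };
struct Inner { ~Inner() { trace_log.push_back("~Inner"); } };

// Hypothetical stand-in for g() from the thread: always throws,
// either a float or an int depending on the flag.
static void g(bool throw_float) {
  if (throw_float) throw 1.0f;
  throw 1;
}

// Same shape as f() in the thread, with logging added to each handler.
static std::vector<std::string> run_f(bool throw_float) {
  trace_log.clear();
  try {
    Outer outer;
    try {
      Inner inner;
      g(throw_float);
    } catch (int) {
      // ~Inner has already run by the time control reaches here.
      trace_log.push_back("catch(int)");
    }
  } catch (float) {
    // ~Inner and then ~Outer have already run by the time control reaches here.
    trace_log.push_back("catch(float)");
  }
  return trace_log;
}
```

Calling `run_f(false)` exercises the inner handler, where only `inner` is destroyed before the handler and `outer` is destroyed afterwards on normal scope exit. Calling `run_f(true)` exercises the outer handler, where both cleanups run before the handler. The uncaught case is left out because it would terminate the program.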
Reid Kleckner
2014-Nov-26 01:27 UTC
[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR
On Tue, Nov 25, 2014 at 3:09 PM, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:

> > We should also think about how to call std::terminate when cleanup
> > dtors throw. The current representation for Itanium is inefficient. As
> > a strawman, I propose making @__clang_call_terminate an intrinsic:
> > …
>
> That sounds like a good starting point.
>
> > Chandler expressed strong concerns about this design, however, as
> > @llvm.eh.get_capture_block adds an ordering constraint on CodeGen. Once
> > you add this intrinsic, we *have* to do frame layout of
> > @_Z13do_some_thingRi *before* we can emit code for all the callers of
> > @llvm.eh.get_capture_block. Today, this is easy, because module order
> > defines emission order, but in the great glorious future, codegen will
> > hopefully be parallelized, and then we've inflicted this horrible
> > constraint on the innocent.
> >
> > His suggestion to break the ordering dependence was to lock down the
> > frame offset of the capture block to always be some fixed offset known
> > by the target (i.e. ebp - 4 on x86, if we like that).
>
> Chandler probably has a better feel for this sort of thing than I do. I
> can’t think of a reason offhand why that wouldn’t work, but it makes me a
> little nervous.
>
> What would that look like in the IR? Would we use the same intrinsics and
> just lower them to use the known location?

Chandler seems to be OK with get/set capture block, as long as the codegen ordering dependence can be removed. I think we can remove it by delaying the resolution of the frame offset to assembly time using an MCSymbolRef. It would look a lot like this kind of assembly:

  my_handler:
    push %rbp
    mov %rsp, %rbp
    lea Lframe_offset0(%rdx), %rax ; This is now the parent capture block
    ...
    retq

  parent_fn:
    push %rbp
    mov %rsp, %rbp
    push %rbx
    push %rdi
    subq $NN, %rsp
    Lframe_offset0 = X + 2 * 8 ; Two CSRs plus some offset into the main stack allocation

I guess I'll try to make that work.

> I’ll think about this, but for now I’m happy to just proceed with the
> belief that it’s a solvable problem either way.
>
> > > For C++ exception handling, we need cleanup code that executes before
> > > the catch handlers and cleanup code that executes in the case of
> > > uncaught exceptions. I think both of these need to be outlined for
> > > the MSVC environment. Do you think we need a stub handler to be
> > > inserted in cases where no actual cleanup is performed?
> >
> > I think it's actually harder than that, once you consider nested trys:
> >
> >   void f() {
> >     try {
> >       Outer outer;
> >       try {
> >         Inner inner;
> >         g();
> >       } catch (int) {
> >         // ~Inner gets run first
> >       }
> >     } catch (float) {
> >       // ~Inner gets run first
> >       // ~Outer gets run next
> >     }
> >     // uncaught exception? Run ~Inner then ~Outer.
> >   }
>
> I took a look at the IR that’s generated for this example. I see what you
> mean. So there is potentially cleanup code before and after every catch
> handler, right?
>
> Do you happen to know offhand what that looks like in the .xdata for the
> _CxxFrameHandler3 function?

I can't tell how the state tables arrange for the destructors to run in the right order, but they can accomplish this without duplicating the cleanup code into the outlined catch handler functions, which is nice.

I think we may be able to address this by emitting calls to start/stop intrinsics around EH cleanups, but that may inhibit optimizations.
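The idea of deferring the frame offset to assembly time can be illustrated with a toy model. This is a sketch under loud assumptions: `ToyAssembler`, `Fixup`, `frame_offset`, and `resolved_offset` are hypothetical names invented here, and none of this is LLVM or MC layer code. It only shows why a symbolic reference removes the ordering constraint: the handler records a use of `Lframe_offset0` without knowing its value, the parent defines it whenever its frame layout is done, and patching happens after both exist.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical, not LLVM or MC APIs.
struct Fixup {
  std::string symbol;  // e.g. "Lframe_offset0"
  std::size_t slot;    // index of the displacement operand to patch
};

struct ToyAssembler {
  std::vector<std::int64_t> displacements;  // 0 until patched
  std::vector<Fixup> fixups;
  std::map<std::string, std::int64_t> symbols;

  // Handler side: emit a use of the symbol without knowing its value.
  std::size_t emit_symbolic_use(const std::string &sym) {
    displacements.push_back(0);
    fixups.push_back({sym, displacements.size() - 1});
    return displacements.size() - 1;
  }

  // Parent side: frame layout is known, so the symbol gets a value.
  void define_symbol(const std::string &sym, std::int64_t value) {
    symbols[sym] = value;
  }

  // "Assembly time": patch every recorded use.
  // Returns false if any referenced symbol was never defined.
  bool resolve() {
    for (const Fixup &f : fixups) {
      auto it = symbols.find(f.symbol);
      if (it == symbols.end()) return false;
      displacements[f.slot] = it->second;
    }
    return true;
  }
};

// Mirrors `Lframe_offset0 = X + 2 * 8`: an offset X into the main stack
// allocation plus one 8-byte slot per pushed callee-saved register.
std::int64_t frame_offset(std::int64_t x, int num_csrs) {
  return x + num_csrs * 8;
}

// Emits handler and parent in either order and returns the patched offset.
std::int64_t resolved_offset(bool handler_first, std::int64_t x) {
  ToyAssembler as;
  std::size_t slot = 0;
  if (handler_first) {
    slot = as.emit_symbolic_use("Lframe_offset0");
    as.define_symbol("Lframe_offset0", frame_offset(x, 2));
  } else {
    as.define_symbol("Lframe_offset0", frame_offset(x, 2));
    slot = as.emit_symbolic_use("Lframe_offset0");
  }
  bool ok = as.resolve();
  assert(ok);
  (void)ok;
  return as.displacements[slot];
}
```

With a hypothetical X of 32, both emission orders yield the same patched displacement, which is the property the MCSymbolRef approach is after.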
Vadim Chugunov
2014-Dec-03 00:15 UTC
[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR
Hi Reid,

Is this design supposed to be able to cope with asynchronous exceptions? I am having trouble imagining how this would work without adding the ability to associate landing pads with scopes in LLVM IR.

Vadim
Reid Kleckner
2014-Dec-03 21:32 UTC
[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR
I went ahead and implemented @llvm.frameallocate in a patch here:
http://reviews.llvm.org/D6493

Andrew, do you have a WIP patch for outlining, or any lessons learned from attempting it? I think outlining is now the next step, so let me know if there's something you're actively working on so I can avoid duplicated effort. :)
Kaylor, Andrew
2014-Dec-03 22:13 UTC
[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR
Hi Reid,

I saw your patch but haven’t looked closely at it yet.

I do have a work in progress for the outlining. I expect to have something ready to share pretty soon, hopefully by the end of the week. It won’t be ready for prime time, as it’s making a whole lot of assumptions about the structure of the IR, but I think it will work with a sample IR file based on what you posted in your earlier SEH review. I expect that it will be a useful point of reference for discussion and that I should be able to quickly refactor it into something product-ready once we’ve ironed out the expectations as to what the incoming IR will look like and how flexible the heuristics for identifying regions to outline need to be.

One thing that I’ve discussed with a co-worker but haven’t explored in my implementation yet is the possibility of using CodeExtractor to do the outlining rather than basing it on the CloneFunction stuff. My current implementation is based on CloneAndPruneFunctionInto as you suggested, but I wondered if CodeExtractor might not be a better starting point. What do you think?

Also, at the moment I’m more or less ignoring the frame variable issue. My ValueMaterializer is just creating new allocas with the name I want. I think that will be easy enough to patch up once your llvm.frameallocate stuff is in place. The implication of this is that right now I’m not looking for live variables before I start outlining, I’m just picking them up as I go. It seems like that may need to change.

-Andy
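The point about looking for live variables before outlining can be made concrete with a toy model. This is a sketch only, assuming a straight-line region in which every definition precedes its uses; `ToyInst` and `live_ins` are hypothetical names and do not correspond to any LLVM class or to the work-in-progress patch mentioned above. It computes the values a region uses but does not define, which are the ones an outlined handler would have to reach through the parent's frame.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy instruction: defines at most one value (empty string means none)
// and uses zero or more values. Hypothetical, not an LLVM type.
struct ToyInst {
  std::string def;
  std::vector<std::string> uses;
};

// Returns the values used inside the region but defined outside it.
// Assumes the region is straight-line code, listed in execution order.
std::set<std::string> live_ins(const std::vector<ToyInst> &region) {
  std::set<std::string> defined;
  std::set<std::string> captured;
  for (const ToyInst &inst : region) {
    for (const std::string &u : inst.uses) {
      // A use with no earlier definition in the region comes from outside.
      if (defined.count(u) == 0) captured.insert(u);
    }
    if (!inst.def.empty()) defined.insert(inst.def);
  }
  return captured;
}
```

Computing this set up front, rather than materializing values one at a time while cloning, would let the frame allocation be sized before any handler is outlined. A real implementation would need to handle control flow within the region, which this sketch does not attempt.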