Joseph Tremoulet
2015-Feb-11 21:57 UTC
[LLVMdev] RFC: Native Windows C++ exception handling
Ah, ok. So if the outliner sees non-dispatch code in the landing pad area, it can find/create somewhere to put it and an appropriate eh.actions annotation to get an EH table generated that will ensure it gets executed appropriately at run-time (in this example, perform the add before invoking either handler); is that more or less the idea? That makes sense, thanks.

I have the same question about the post-outlining IR. To change the example to one where the bait won't get outlined, suppose you had

    int foo(int a, int b) {
      try {
        try {
          maybe_throw();
          return 0;
        } catch (int) {
          // some code here that gets outlined
        }
      L1:
        return a + b;
      } catch (float) {
        // some other code here that also gets outlined
      }
    L2:
      return (a + b) + 1;
    }

and suppose that nothing gets moved around before outlining. Then, after outlining, the landingpad will be followed by an eh.actions call and then an indirect branch that targets L1 and L2, correct?

Do we need to worry that a late codesize optimization might want to merge the adds by hoisting them up above the indirect branch? If that happened, wouldn't it get skipped over if an exception were raised?

Thanks
-Joseph

From: Kaylor, Andrew [mailto:andrew.kaylor at intel.com]
Sent: Wednesday, February 11, 2015 2:06 PM
To: Reid Kleckner; Joseph Tremoulet
Cc: Bataev, Alexey; Reid Kleckner (reid at kleckner.net); LLVM Developers Mailing List
Subject: RE: [LLVMdev] RFC: Native Windows C++ exception handling

It’s an interesting problem though. If an instruction is in the landing pad block but not inside a begincatch/endcatch pair it will be interpreted as cleanup code. I think that is OK, but it’s something we’ll need to be aware of.
For reference, Joseph’s first scenario will look like this:

    ; Function Attrs: uwtable
    define i32 @_Z3fooii(i32 %a, i32 %b) #0 {
    entry:
      <…snip…>
      store i32 %a, i32* %a.addr, align 4
      store i32 %b, i32* %b.addr, align 4
      invoke void @_Z11maybe_throwv()
              to label %invoke.cont unwind label %lpad

    lpad:                                             ; preds = %entry
      %4 = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*)
              catch i8* bitcast (i8** @_ZTIi to i8*)
              catch i8* bitcast (i8** @_ZTIf to i8*)
      %5 = extractvalue { i8*, i32 } %4, 0
      store i8* %5, i8** %exn.slot
      %6 = extractvalue { i8*, i32 } %4, 1
      store i32 %6, i32* %ehselector.slot
      %2 = load i32* %a.addr, align 4
      %3 = load i32* %b.addr, align 4
      %add = add nsw i32 %2, %3
      store i32 %add, i32* %x, align 4
      br label %catch.dispatch

    <…snip…>
    }

So in this scenario, even though the landingpad doesn’t have a cleanup clause, we’ll need to add one (better, the pass that sunk the add operation should have added one). But assuming we do move the add instructions into a cleanup handler, this should just work.

-Andy

From: Reid Kleckner [mailto:rnk at google.com]
Sent: Wednesday, February 11, 2015 10:46 AM
To: Joseph Tremoulet
Cc: Kaylor, Andrew; Bataev, Alexey; Reid Kleckner (reid at kleckner.net); LLVM Developers Mailing List
Subject: Re: [LLVMdev] RFC: Native Windows C++ exception handling

These are exactly the sorts of code transformations we want to allow by delaying the outlining until later. By keeping such code inlined in the parent function until after optimization, we enable a lot of core optimizations like SROA. For example, we should be able to completely eliminate wrappers like unique_ptr that would otherwise stay around because the pointer escapes to the destructor call that gets executed on exception.

On Tue, Feb 10, 2015 at 6:21 PM, Joseph Tremoulet <jotrem at microsoft.com> wrote:

Hi,

Sorry if I'm late to the party.
I'm curious whether you need to be concerned about code being moved into the landing pad area with this approach (i.e. whether "real code" might wind up in the eh.actions call's block which [if I'm following correctly] won't actually get executed at runtime). For example, given something like

    int foo(int a, int b) {
      int x;
      try {
        x = a + b;
        maybe_throw();
        return 0;
      } catch (int) {
        return x;
      } catch (float) {
        return x + 1;
      }
    }

Do you need to worry that something like partial deadcode elimination could want to move the definition of x down into the landing pad code (but not all the way down into the catch handlers)?

Or conversely, given something like

    int foo(int a, int b) {
      try {
        maybe_throw();
        return 0;
      } catch (int) {
        return a + b;
      } catch (float) {
        return (a + b) + 1;
      }
    }

Do you need to worry that something like very busy expressions could want to hoist the "a + b" computation up into the landing pad code (but not all the way up into the try block)?

Should those sorts of code motion be legal? If not, what's preventing them, and if so, how will the moved code be executed at runtime?

Thanks
-Joseph

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Kaylor, Andrew
Sent: Wednesday, January 28, 2015 7:49 PM
To: Reid Kleckner
Cc: Bataev, Alexey; Reid Kleckner (reid at kleckner.net); LLVM Developers Mailing List
Subject: Re: [LLVMdev] RFC: Native Windows C++ exception handling

Hi Reid,

I’ve worked through a few scenarios, and I think this is converging. I’m attaching a new example, which extends the one we’ve been working through to handle a few more cases.

I wasn’t sure what you intended the first i32 argument in an llvm.eh.actions entry to be.
I’ve been using it as a place to store the eh state that I’m associating with the action, but that’s kind of circular because I’m using the information in these tables to calculate that value. I’ll be able to do this calculation in the MSVCEHPrepare pass before these instructions are added, so I can put it there (it’s nice for readability), but the later pass that generates the tables (assuming we leave that in a later pass) will need information not represented in the action table (at least as I’ve been using it).

I’m not 100% sure that this is necessary, but it seemed like the way things ought to be. I’ll experiment more before anything gets set in stone.

In addition to the IP-to-state table that we’ve talked about, we also need an unwind table (which has each state, the state it transitions to when it expires and the unwind handler, if any) and a catch handler table (which has the range of states that can throw to the handlers, the state of the handlers and a map of types handled to their handlers). I’ve got examples of these in the attached example.

So I now have a firm plan for how to compute these tables from the outlined IR.

I think the algorithm you proposed for computing eh states doesn’t quite work. In particular, if multiple catch handlers get added by the same landing pad we’ll want to start by assuming that they have the same state. If they end up getting popped at different times then we’ll need to update the eh state of the one that gets popped first. Unfortunately my example doesn’t cover this case, but I worked through it and my new algorithm (based on yours but slightly tweaked) works for that case.

Also, unwind handlers get discrete states (they happen when a transition crosses the state, but in the .xdata tables they are represented with a single state). Catch handlers, on the other hand, do get a range.

Anyway, here’s what I’m doing (by hand so far, I don’t have it coded yet).

1. Start with an empty stack of actions and an empty master list of eh state.
2. Visit each landing pad in succession, processing the actions at that landing pad back to front.
3. Pop actions from the current stack that aren’t in the current landing pad’s action table.
   a. If a catch is popped that had been assumed to have the same state as a catch that isn’t being popped, its state number and all state numbers above it need to be incremented.
   b. When a catch is popped, the next available eh_state is assigned to its handler.
4. As an action is pushed to the current stack, it is assigned an eh_state in the master list.
   a. If the action was an unwind or if it was a catch after an unwind, the next available eh_state is incremented.
   b. If the action was a catch following a catch that was also just added, it gets the same eh_state as the previous catch.
5. When all landing pads have been handled, the remaining actions are popped and processed as above.

The “next” state for each eh_state can also be computed during the above process by observing the state of the action on the top of the current action stack when the action associated with the state is popped. In the case of catch handlers, I think the next state will always be the same as the next state of the corresponding catch action.

So, I think it makes sense to compute the unwind and catch tables during the MSVCEHPrepare pass, but I wasn’t sure how best to preserve the information once it was computed. Is it reasonable to stick this in metadata?

We can keep fine tuning this if you like, but I think it’s looking solid enough that I’m going to start revising my outlining patch to produce the results in the attached example.
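As a rough illustration of the push-time rule in step 4 only, here is a minimal C++ sketch. It is not code from the patch under discussion: the names ActionKind, PushedAction and assignStatesOnPush are hypothetical, and the sketch assumes that any pushed action not covered by rule 4b receives a fresh state. The renumbering in step 3a and the handler states in step 3b are deliberately left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the two action kinds discussed in the thread.
enum class ActionKind { Unwind, Catch };

struct PushedAction {
    ActionKind kind;
    int ehState;  // state number assigned when the action was pushed
};

// Assigns states to the actions pushed for ONE landing pad, in push order.
// Assumption: a catch pushed directly after another catch from the same
// landing pad shares that catch's state (rule 4b); every other action takes
// the next available state and advances the counter (rule 4a).
std::vector<PushedAction> assignStatesOnPush(
        const std::vector<ActionKind>& pushOrder, int& nextState) {
    assert(nextState >= 0);
    std::vector<PushedAction> result;
    for (ActionKind kind : pushOrder) {
        const bool followsNewCatch =
            kind == ActionKind::Catch && !result.empty() &&
            result.back().kind == ActionKind::Catch;
        if (followsNewCatch) {
            // Share the state of the catch that was just added.
            result.push_back({kind, result.back().ehState});
        } else {
            // Take a fresh state and advance the counter.
            result.push_back({kind, nextState});
            ++nextState;
        }
    }
    return result;
}
```

With a counter starting at 0, pushing an unwind followed by two catches yields states 0, 1, 1 and leaves the counter at 2 under these assumptions.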
-Andy

From: Reid Kleckner [mailto:rnk at google.com]
Sent: Tuesday, January 27, 2015 1:55 PM
To: Kaylor, Andrew
Cc: Bataev, Alexey; Reid Kleckner (reid at kleckner.net); LLVM Developers Mailing List; Anton Korobeynikov; Kreitzer, David L
Subject: Re: [LLVMdev] RFC: Native Windows C++ exception handling

On Tue, Jan 27, 2015 at 12:58 PM, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:

> Thanks, Reid. These are good points.
>
> So I guess that does take us back to something more like my original proposal.
>
> I like your suggestion of having some kind of “eh.actions” intrinsic to represent the outlining rather than the extension to landingpad that I had proposed. I was just working on something like that in conjunction with my second alternative idea.

Great!

> What I’d really like is to have the “eh.actions” intrinsic take a shape that makes it really easy to construct the .xdata table directly from these calls. As I think I mentioned in my original post, I think I have an idea for how to reconstruct functionally correct eh states (at least for synchronous EH purposes) from the invoke and landingpad instructions. I would like to continue, as in my original proposal, limiting the unwind representations to those that are unique to a given landing pad. I think with enough documentation I can make that seem sensible.

My thinking is that the "eh.actions" list can be transformed into a compact xdata table later, after we've done machine basic block layout. I think the algorithm will be something like

1. Input: already laid out MachineFunction
2. Call EHStreamer::computeCallSiteTable to populate a LandingPadInfo vector sorted by ascending PC values
4. Iterate the LandingPadInfos, comparing the action list of each landing pad with the previous landing pad, assuming an empty action list at function start and end.
5. Model the action list as a stack, and compute the common suffix of the landing pad action lists
6. Each new action pushed represents a new EH state number
7. Pushing a cleanup action adds a state table entry that transitions from the current EH state to the previous with a cleanup handler
8. Pushing a catch action adds a new unwind table entry with an open range from the current EH state to an unknown EH state. The state after catching is... ???
9. Popping a catch action closes an open unwind table range

So, I think the action list is at least not totally crazy. =)

> I’ll start working on a revised proposal. Let me know if you have any more solid ideas.
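Steps 4 and 5 of the list above (comparing successive action lists and treating them as a stack) can be sketched roughly as follows. This is only an illustration, not LLVM code: Action, ActionList, Transition and diffActionLists are hypothetical names, and the sketch assumes each list is ordered innermost action first, so that enclosing scopes shared by two landing pads appear as a common suffix.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of a landing pad's action list.
using Action = std::string;
// Assumed ordering: innermost action first, outermost last.
using ActionList = std::vector<Action>;

struct Transition {
    std::vector<Action> popped;  // actions leaving the stack, innermost first
    std::vector<Action> pushed;  // actions entering the stack, outermost first
};

// Number of trailing (outermost) actions that both lists share.
std::size_t commonSuffixLength(const ActionList& a, const ActionList& b) {
    std::size_t n = 0;
    while (n < a.size() && n < b.size() &&
           a[a.size() - 1 - n] == b[b.size() - 1 - n]) {
        ++n;
    }
    return n;
}

// Computes which actions are popped and which are pushed when moving from
// the previous landing pad's action list to the current one.
Transition diffActionLists(const ActionList& prev, const ActionList& cur) {
    const std::size_t keep = commonSuffixLength(prev, cur);
    assert(keep <= prev.size() && keep <= cur.size());
    const auto k = static_cast<std::ptrdiff_t>(keep);
    Transition t;
    // Everything in prev outside the shared suffix is popped.
    t.popped.assign(prev.begin(), prev.end() - k);
    // Everything in cur outside the shared suffix is pushed, outermost first.
    t.pushed.assign(cur.rbegin() + k, cur.rend());
    return t;
}
```

Passing an empty list as prev or cur models the "empty action list at function start and end" assumption from step 4. For example, moving from {"catch_int", "catch_float"} to {"catch_float"} pops only "catch_int" and pushes nothing.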
On Wed, Feb 11, 2015 at 1:57 PM, Joseph Tremoulet <jotrem at microsoft.com> wrote:

> Ah, ok. So if the outliner sees non-dispatch code in the landing pad
> area, it can find/create somewhere to put it and an appropriate eh.actions
> annotation to get an EH table generated that will ensure it gets executed
> appropriately at run-time (in this example, perform the add before invoking
> either handler); is that more or less the idea? That makes sense, thanks.

Yep. In the worst case, we could model code before landing pad dispatch as a cleanup handler, but I think the most common transforms are easily undone. Consider your example where a + b gets hoisted before the catch dispatch. Adds have no side effects, so we can freely sink them back down into the catch handler once we start outlining.

Things that are hard to move, like loads and stores to unknown memory locations, cannot be hoisted over the llvm.eh.begincatch() call in the first place. It should act as a memory barrier.

> I have the same question about the post-outlining IR. To change the
> example to one where the bait won't get outlined, suppose you had
>
>     int foo(int a, int b) {
>       try {
>         try {
>           maybe_throw();
>           return 0;
>         } catch (int) {
>           // some code here that gets outlined
>         }
>       L1:
>         return a + b;
>       } catch (float) {
>         // some other code here that also gets outlined
>       }
>     L2:
>       return (a + b) + 1;
>     }
>
> and suppose that nothing gets moved around before outlining. Then, after
> outlining, the landingpad will be followed by an eh.actions call and then
> an indirect branch that targets L1 and L2, correct?
>
> Do we need to worry that a late codesize optimization might want to merge
> the adds by hoisting them up above the indirect branch? If that happened,
> wouldn't it get skipped over if an exception were raised?

We'd have to hoist a + b to somewhere that dominates L1 and L2. I think the only BB in your program that dominates is the entry block.
In the IR, the fake 'indirectbr' instruction after the call to @llvm.eh.actions helps keep the CFG conservatively correct.
Joseph Tremoulet
2015-Feb-12 15:43 UTC
[LLVMdev] RFC: Native Windows C++ exception handling
> We'd have to hoist a + b to somewhere that dominates L1 and L2. I think the only BB in your program that dominates is the entry block

I don't follow. What path do you see from entry to either L1 or L2 that doesn't pass through the indirectbr? In order to reach either L1 or L2, the call to maybe_throw() must raise an exception (else we'd return 0 from foo), the exception must be caught by one of the two handlers (else we'd unwind out of foo), and one of the outlined handlers must have executed and returned. Don't those conditions correspond to the path from entry to the indirectbr?

To clarify, I'm not trying to assert there's a problem; I'm new to LLVM and trying to understand the model. I've worked before on a compiler that similarly used stand-in code to model control flow effected by the runtime/unwinder, and we had these issues with code motion. Our system was closed enough that we could just have the affected optimizations check for the relevant opcodes (we were doing the equivalent of using a special exitlandingpad terminator instead of indirectbr) and avoid pushing code across them. I don't know if a similar approach is appropriate in LLVM, or if there is (or should be) a way to annotate the block to indicate that it's one of these cases and new code inserted there would get skipped over at run-time, or if it's ok to just assume that those sorts of optimizations don't run after EH Preparation, or what.

Even if I am misunderstanding the shape of the flow graph in my example as you suggest, is there no cause for concern that some post-outlining pass might try to insert code into that block? Like maybe some sort of late profile instrumentation? It's great if there's not; I'm just trying to understand why not.
Thanks
-Joseph
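The dominance question raised above can be checked mechanically on a small model. The sketch below is an illustration only: it assumes the post-outlining CFG has the shape Joseph describes (entry invokes maybe_throw, the normal edge returns 0, and a single landing pad block ends in the indirectbr targeting L1 and L2). That graph, the Block names and computeDominators are all hypothetical and not taken from compiler output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// preds[b] lists the predecessor blocks of block b.
using Graph = std::vector<std::vector<int>>;

// Hypothetical block numbering for the assumed post-outlining CFG of foo.
enum Block { Entry = 0, InvokeCont = 1, LPad = 2, L1 = 3, L2 = 4 };

// Assumed edges: Entry->InvokeCont, Entry->LPad, LPad->L1, LPad->L2.
Graph hypotheticalFooPreds() {
    return {{}, {Entry}, {Entry}, {LPad}, {LPad}};
}

// Classic iterative dominator-set computation: dom[b][d] is true when block
// d dominates block b. Every non-entry block is assumed to be reachable.
std::vector<std::vector<bool>> computeDominators(const Graph& preds,
                                                 int entry) {
    const std::size_t n = preds.size();
    assert(entry >= 0 && static_cast<std::size_t>(entry) < n);
    std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
    dom[entry].assign(n, false);
    dom[entry][entry] = true;
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t b = 0; b < n; ++b) {
            if (static_cast<int>(b) == entry) continue;
            // Intersect the dominator sets of all predecessors of b.
            std::vector<bool> next(n, true);
            for (int p : preds[b]) {
                for (std::size_t d = 0; d < n; ++d) {
                    next[d] = next[d] && dom[p][d];
                }
            }
            next[b] = true;  // every block dominates itself
            if (next != dom[b]) {
                dom[b] = next;
                changed = true;
            }
        }
    }
    return dom;
}
```

On this assumed graph, computeDominators(hypotheticalFooPreds(), Entry) reports that both Entry and LPad dominate L1 and L2. A CFG with additional edges into L1 or L2 would give a different answer, so the result says nothing about IR shapes other than the one modeled here.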