Joseph Tremoulet
2015-Feb-12 15:43 UTC
[LLVMdev] RFC: Native Windows C++ exception handling
> We'd have to hoist a + b to somewhere that dominates L1 and L2. I think the
> only BB in your program that dominates is the entry block

I don't follow. What path do you see from entry to either L1 or L2 that doesn't pass through the indirectbr? In order to reach either L1 or L2, the call to maybe_throw() must raise an exception (else we'd return 0 from foo), the exception must be caught by one of the two handlers (else we'd unwind out of foo), and one of the outlined handlers must have executed and returned. Don't those conditions correspond to the path from entry to the indirectbr?

To clarify, I'm not trying to assert there's a problem; I'm new to LLVM and trying to understand the model. I've worked before on a compiler that similarly used stand-in code to model control flow effected by the runtime/unwinder, and we had these issues with code motion. Our system was closed enough that we could just have the affected optimizations check for the relevant opcodes (we were doing the equivalent of using a special exitlandingpad terminator instead of indirectbr) and avoid pushing code across them. I don't know if a similar approach is appropriate in LLVM, or if there is (or should be) a way to annotate the block to indicate that it's one of these cases and new code inserted there would get skipped over at run-time, or if it's ok to just assume that those sorts of optimizations don't run after EH Preparation, or what.

Even if I am misunderstanding the shape of the flow graph in my example as you suggest, is there no cause for concern that some post-outlining pass might try to insert code into that block? Like maybe some sort of late profile instrumentation? It's great if there's not; I'm just trying to understand why not.

Thanks
-Joseph

From: Reid Kleckner [mailto:rnk at google.com]
Sent: Wednesday, February 11, 2015 5:09 PM
To: Joseph Tremoulet
Cc: Kaylor, Andrew; Bataev, Alexey; Reid Kleckner (reid at kleckner.net); LLVM Developers Mailing List
Subject: Re: [LLVMdev] RFC: Native Windows C++ exception handling

On Wed, Feb 11, 2015 at 1:57 PM, Joseph Tremoulet <jotrem at microsoft.com> wrote:

> Ah, ok. So if the outliner sees non-dispatch code in the landing pad area, it can find/create somewhere to put it and an appropriate eh.actions annotation to get an EH table generated that will ensure it gets executed appropriately at run-time (in this example, perform the add before invoking either handler); is that more or less the idea? That makes sense, thanks.

Yep. In the worst case, we could model code before landing pad dispatch as a cleanup handler, but I think the most common transforms are easily undone. Consider your example where a + b gets hoisted before the catch dispatch. Adds have no side effects, so we can freely sink them back down into the catch handler once we start outlining.

Things that are hard to move, like loads and stores to unknown memory locations, cannot be hoisted over the llvm.eh.begincatch() call in the first place. It should act as a memory barrier.

> I have the same question about the post-outlining IR. To change the example to one where the bait won't get outlined, suppose you had
>
> int foo(int a, int b) {
>   try {
>     try {
>       maybe_throw();
>       return 0;
>     } catch (int) {
>       // some code here that gets outlined
>     }
>     L1:
>     return a + b;
>   } catch (float) {
>     // some other code here that also gets outlined
>   }
>   L2:
>   return (a + b) + 1;
> }
>
> and suppose that nothing gets moved around before outlining. Then, after outlining, the landingpad will be followed by an eh.actions call and then an indirect branch that targets L1 and L2, correct?
>
> Do we need to worry that a late codesize optimization might want to merge the adds by hoisting them up above the indirect branch? If that happened, wouldn't it get skipped over if an exception were raised?

We'd have to hoist a + b to somewhere that dominates L1 and L2. I think the only BB in your program that dominates is the entry block. In the IR, the fake 'indirectbr' instruction after the call to @llvm.eh.actions helps keep the CFG conservatively correct.
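For anyone who wants to run the quoted foo example, here is a minimal self-contained C++ sketch. The thread never defines maybe_throw(), so the Mode enum, the g_mode global, and the run() and check_foo() helpers are assumptions added purely for illustration. The L1 and L2 labels from the example are kept as comments.

```cpp
#include <cassert>

// Hypothetical stand-in for the thread's maybe_throw(): g_mode selects
// what, if anything, gets thrown. Not part of the original example.
enum class Mode { None, Int, Float };
static Mode g_mode = Mode::None;

void maybe_throw() {
    if (g_mode == Mode::Int) throw 1;
    if (g_mode == Mode::Float) throw 1.0f;
}

// Same shape as the example in the thread.
int foo(int a, int b) {
    try {
        try {
            maybe_throw();
            return 0;
        } catch (int) {
            // some code here that gets outlined
        }
        // L1: reached only after the int handler has run
        return a + b;
    } catch (float) {
        // some other code here that also gets outlined
    }
    // L2: reached only after the float handler has run
    return (a + b) + 1;
}

// Hypothetical helper: pick what maybe_throw() does, then call foo.
int run(Mode m, int a, int b) {
    g_mode = m;
    return foo(a, b);
}

// Exercises the three paths through foo.
void check_foo() {
    assert(run(Mode::None, 2, 3) == 0);   // no exception: return 0
    assert(run(Mode::Int, 2, 3) == 5);    // int caught, falls through to L1
    assert(run(Mode::Float, 2, 3) == 6);  // float caught, falls through to L2
}
```

Calling check_foo() from a main() exercises all three paths. Both L1 and L2 are reachable only when an exception was thrown and caught, which is the property the dominance question above turns on.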
Reid Kleckner
[LLVMdev] RFC: Native Windows C++ exception handling
You're right, we would hoist the add to someplace before the notional indirectbr, but that needs to work anyway. After looking more closely at the IR, we would probably hoist into the landingpad instead of into the entry block.

Here's the IR I have in my head after optimization and hoisting but before outlining:

define i32 @foo(i32 %a, i32 %b) {
entry:
  invoke void @maybe_throw()
      to label %ret unwind label %lpad
ret:
  ret i32 0 ; The return case
lpad:
  %ehvals = landingpad { i8*, i32 } personality ...
      catch ... ; typeinfo for int
      catch ... ; typeinfo for float
  %sum = add i32 %a, %b ; hoisted out from catch_int and catch_float via CSE or something
  ... ; dispatch on the selector, effectively doing a switch
catch_int:
  ret i32 %sum
catch_float:
  %s1 = add i32 %sum, 1
  ret i32 %s1
... ; resume
}

We have a couple of options during EH preparation:

1. Outline the add into a cleanup, store the result into the frameallocation, and access it in the catch handler
2. Sink the add back into the handlers, which we can do because it has no side effects (better)
3. Sink the add past the catch handlers and past the notional indirectbr into the blocks referenced by the handler return result

Number 3 is probably the best, and it would look like:

define i8* @int_handler(i8*, i8*) {
  ret i8* blockaddress(@foo, %catch_int)
}
define i8* @float_handler(i8*, i8*) {
  ret i8* blockaddress(@foo, %catch_float)
}
define i32 @foo(i32 %a, i32 %b) {
entry:
  invoke void @maybe_throw()
      to label %ret unwind label %lpad
ret:
  ret i32 0 ; The return case
lpad:
  %ehvals = landingpad { i8*, i32 } personality ...
      catch ... ; typeinfo for int
      catch ... ; typeinfo for float
  %rejoin_at = call i8* @llvm.eh.actions(... @int_handler, ... @float_handler, ...)
  indirectbr i8* %rejoin_at, [ label %catch_int, label %catch_float ]
catch_int:
  %s1 = add i32 %a, %b
  ret i32 %s1
catch_float:
  %s2 = add i32 %a, %b
  %s3 = add i32 %s2, 1
  ret i32 %s3
}

But even if we aren't smart enough to do #3, we could implement #1 by additional outlining. It's ugly though. =/

In practice, I don't think this situation will occur very often because the catch blocks look like this:

catch_int:
  call void @llvm.eh.begincatch(i8* %ehptr)
  ... ; user code
  call void @llvm.eh.endcatch()
  ... ; rejoin normal control

Anything we can hoist out of "user code" here can usually be sunk back in. If it's not trivial, like a hoisted store to an unescaped internal global, then we'll have to emit something that looks like a destructor cleanup.
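The point that the add can be sunk because it has no side effects can be checked on a toy model. The sketch below is plain C++, not LLVM code. The Thrown and Rejoin enums, the dispatch() function, and the two foo_* variants are hypothetical stand-ins for the thrown type, the block address returned by an outlined handler, the unwinder's handler selection, and the two IR shapes discussed in the thread. It only shows that the hoisted and sunk placements of the add compute the same results.

```cpp
#include <cassert>

// Toy model in plain C++; none of these names are LLVM API.
enum class Thrown { Int, Float };
enum class Rejoin { CatchInt, CatchFloat };

// Stand-ins for the outlined handlers: each reports where to rejoin,
// loosely mirroring a handler that returns a block address.
Rejoin int_handler() { return Rejoin::CatchInt; }
Rejoin float_handler() { return Rejoin::CatchFloat; }

// Stand-in for the handler selection performed at run time.
Rejoin dispatch(Thrown t) {
    return t == Thrown::Int ? int_handler() : float_handler();
}

// Shape before outlining: the add was hoisted ahead of the dispatch.
int foo_hoisted(int a, int b, Thrown t) {
    int sum = a + b;  // hoisted add
    switch (dispatch(t)) {
        case Rejoin::CatchInt:   return sum;
        case Rejoin::CatchFloat: return sum + 1;
    }
    assert(false && "unknown rejoin target");
    return 0;
}

// Shape of option 3: the add is sunk into each rejoin block.
int foo_sunk(int a, int b, Thrown t) {
    switch (dispatch(t)) {
        case Rejoin::CatchInt:   return a + b;
        case Rejoin::CatchFloat: return (a + b) + 1;
    }
    assert(false && "unknown rejoin target");
    return 0;
}
```

Because the add reads only a and b and writes nothing else, moving it to either side of the dispatch does not change what either rejoin block returns. A store to memory that a handler might read would not have this property, which matches the distinction drawn in the thread.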
Joseph Tremoulet
2015-Feb-13 15:11 UTC
[LLVMdev] RFC: Native Windows C++ exception handling
Yes, it seems like WinEHPrepare should always be able to move landing pad code to somewhere appropriate. Even if there's some involved code to handle the cases that force #1, it's great that the complexity is contained in WinEHPrepare.

I'm still confused about the apparent lack of constraints after WinEHPrepare. Can we simply require/assume that WinEHPrepare be run after any passes that may move/insert code into landing pads? Is that documented somewhere? For a concrete example, consider your snippet showing what the IR would look like after outlining with #3. What's to stop a later pass from redoing the hoist?

Thanks
-Joseph

From: Reid Kleckner [mailto:rnk at google.com]
Sent: Thursday, February 12, 2015 6:10 PM
To: Joseph Tremoulet
Cc: Kaylor, Andrew; Bataev, Alexey; Reid Kleckner (reid at kleckner.net); LLVM Developers Mailing List
Subject: Re: [LLVMdev] RFC: Native Windows C++ exception handling

> You're right, we would hoist the add to someplace before the notional indirectbr, but that needs to work anyway. [...]
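The question about a later pass redoing the hoist can be made concrete with a small dominator computation over the post-outlining CFG. The sketch below is a toy model in C++ and does not use LLVM's own dominator tree. The Block enum, toy_cfg_preds(), compute_dominators(), and dominates() are hypothetical names, and the edge list is an assumption read off the option 3 IR quoted in the thread: entry branches to ret and lpad, and the indirectbr in lpad targets catch_int and catch_float.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Blocks of the post-outlining @foo from the thread (toy model, not LLVM API).
enum Block : std::size_t { Entry, Ret, Lpad, CatchInt, CatchFloat, NumBlocks };

using PredList = std::vector<std::vector<std::size_t>>;

// Predecessor lists for the assumed CFG.
PredList toy_cfg_preds() {
    PredList preds(NumBlocks);
    preds[Ret].push_back(Entry);        // invoke: normal edge
    preds[Lpad].push_back(Entry);       // invoke: unwind edge
    preds[CatchInt].push_back(Lpad);    // indirectbr target
    preds[CatchFloat].push_back(Lpad);  // indirectbr target
    return preds;
}

// Iterative dominator sets: dom[n][d] is true when block d dominates block n.
std::vector<std::vector<bool>> compute_dominators(const PredList& preds,
                                                  std::size_t entry) {
    const std::size_t n = preds.size();
    std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
    dom[entry].assign(n, false);
    dom[entry][entry] = true;
    for (bool changed = true; changed;) {
        changed = false;
        for (std::size_t b = 0; b < n; ++b) {
            if (b == entry) continue;
            // Intersect the dominator sets of all predecessors, then add b.
            std::vector<bool> next(n, true);
            for (std::size_t p : preds[b])
                for (std::size_t d = 0; d < n; ++d)
                    next[d] = next[d] && dom[p][d];
            next[b] = true;
            if (next != dom[b]) {
                dom[b] = next;
                changed = true;
            }
        }
    }
    return dom;
}

// True when block d dominates block n in the toy CFG.
bool dominates(Block d, Block n) {
    assert(d < NumBlocks && n < NumBlocks);
    return compute_dominators(toy_cfg_preds(), Entry)[n][d];
}
```

In this model lpad dominates both catch_int and catch_float, so a transform that reasons only from the CFG would see lpad as a legal place to put a value used in both. The sketch says nothing about whether any existing LLVM pass would actually perform that hoist after WinEHPrepare; it only shows why the CFG alone does not rule it out.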