Sanjoy Das via llvm-dev
2016-Feb-18 00:41 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
Replies inline.

At a high level, it feels like we'll eventually need a new instruction
to represent the kind of control flow a guard entails (to be clear: we
should probably still start with an intrinsic) -- they are fairly
well-behaved, i.e. readonly, nounwind etc. as far as the immediate
"physical" caller is concerned, but not so as far as its callers'
callers are concerned.

On Wed, Feb 17, 2016 at 3:40 PM, Philip Reames <listmail at philipreames.com> wrote:
>> one very important difference -- `@llvm.guard_on(i1 <false>)` is well
>> defined (and not UB). `@llvm.guard_on` on a false predicate bails to
>> the interpreter and that is always safe (but slow), and so
>> `@llvm.guard_on(i1 false)` is basically a `noreturn` call that
>> unconditionally transitions the current compilation to the
>> interpreter.
>
> It's also worth noting that @llvm.guard_on(i1 true) is useful and well
> defined as well.  When lowering, such a guard simply disappears, but it
> can be useful to keep around in the IR during optimization.  It gives a
> well defined point for widening transforms to apply with a well known
> deoptimization state already available.

Yes!  Actually, I had exactly this in an earlier version of this
writeup, which I removed to make the (already long) RFC shorter.

> I'd suggest a small change to Sanjoy's declaration.  I think we should
> allow additional arguments to the guard, not just the condition.  What
> exactly those arguments mean would be up to the runtime, but various
> runtimes might want to provide additional arguments to the OSR
> mechanism.

We'll still have to make a call on the signature of the intrinsic (or
are you suggesting a varargs intrinsic)?

I suppose we could also have a family of intrinsics that take one
argument of variable type.
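For concreteness, a minimal sketch of the call site shape under
discussion (the intrinsic does not exist yet, so this is shape only;
the `"deopt"` operands are placeholder state, and the variant with
extra runtime arguments is hypothetical):

```
; Proposed form: a single i1 predicate, with a mandatory "deopt" operand
; bundle carrying the abstract interpreter state.
declare void @llvm.guard_on(i1)

define void @example(i32* %p, i1 %cond) {
entry:
  ; Bails to the interpreter if %cond is false; otherwise a no-op
  ; (modulo widening).  The "deopt" operands here are placeholder state.
  call void @llvm.guard_on(i1 %cond) [ "deopt"(i32* %p, i32 42) ]
  store i32 0, i32* %p
  ret void
}

; A variant along the lines suggested above would append runtime-defined
; OSR arguments after the condition, e.g. (hypothetical):
;   call void @llvm.guard_on(i1 %cond, i32 %bci, i8* %extra) [ "deopt"(...) ]
```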
>> Bailing out to the interpreter involves re-creating the state of the
>> interpreter frames as-if the compilee had been executing in the
>> interpreter all along.  This state is represented and maintained using
>> a `"deopt"` operand bundle attached to the call to `@llvm.guard_on`.
>> The verifier will reject calls to `@llvm.guard_on` without a `"deopt"`
>> operand bundle.
>
> This introduces a very subtle point.  The side exit only affects the
> *function* which contains the guard.  A caller of that function in the
> same module may be returned to by either the function itself, or the
> interpreter after running the continuation implied by the guard.  This
> introduces a complication for IPA/IPO; any guard (really, any side
> exit, of which guards are one form) has to be treated as a possible
> return point from the callee with an unknowable return value and memory
> state.

This is a really good point.  This has strong implications for the
guard's memory effects as well -- even though a guard can be seen as
readonly in its containing function, things that call the containing
function have to see the guard as read/write.  IOW @foo below is
read/write, even though v0 can be forwarded to v1:

```
int @foo(int cond) {
  int v0 = this->field;
  guard_on(cond);
  int v1 = this->field;
  return v0 + v1;
}
```

As you point out, we're also introducing a newish kind of control flow
here.  It is not fundamentally new, since longjmp does something
similar (but not quite the same).

I hate to say this, but perhaps we're really looking at (eventually) a
new instruction here, and not just a new intrinsic.

>> `@llvm.guard_on` cannot be `invoke`ed (that would be
>> meaningless anyway, since the method it would've "thrown" into is
>> about to go away).
>
> I disagree with this bit.  It needlessly complicates the inliner.
> Allowing an invoke of a guard which SimplifyCFG converts to calls just
> like it would a nothrow function seems much cleaner.

SGTM.

>> The observable behavior of `@llvm.guard_on` is specified as:
>>
>> ```
>> void @llvm.guard_on(i1 %pred) {
>>  entry:
>>   %unknown_cond = < unknown source >
>>   %cond = and i1 %unknown_cond, %pred
>>   br i1 %cond, label %left, label %right
>>
>>  left:
>>   call void @bail_to_interpreter() [ "deopt"() ] noreturn
>>   unreachable
>>
>>  right:
>>   ret void
>> }
>> ```
>>
>> So, precisely speaking, `@llvm.guard_on` is guaranteed to bail to the
>> interpreter if `%pred` is false, but it **may** bail to the
>> interpreter if `%pred` is true.  It is this bit that lets us soundly
>> widen `%pred`, since all we're doing is "refining" `< unknown source >`.
>
> Unless I'm misreading this, it looks like Sanjoy got the IR in the
> specification wrong.  The intrinsic is specified to side exit if the
> condition is false (so that it's true in the caller after the guard),
> not the other way around.  The text description appears correct.

Yes, and thanks for catching.  The branch should have been "br i1
%cond, label %right, label %left".
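To make the widening point above concrete, here is a rough sketch of
the kind of transform the `< unknown source >` refinement is meant to
license (assuming both predicates are available at the first guard;
the `i32 0` / `i32 1` deopt operands are placeholder state):

```
define void @widen_example(i1 %a, i1 %b) {
entry:
  ; Before widening (conceptually):
  ;   call void @llvm.guard_on(i1 %a) [ "deopt"(i32 0) ]
  ;   call void @llvm.guard_on(i1 %b) [ "deopt"(i32 1) ]
  ;
  ; After widening: %b is folded into the earlier guard (sound, since a
  ; guard may bail even when its predicate is true), and the later guard
  ; degenerates into a guard on true, which lowers to nothing but remains
  ; a well-defined widening point with deopt state attached.
  %c = and i1 %a, %b
  call void @llvm.guard_on(i1 %c) [ "deopt"(i32 0) ]
  call void @llvm.guard_on(i1 true) [ "deopt"(i32 1) ]
  ret void
}
```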
>> `@bail_to_interpreter` does not return to the current compilation, but
>> it returns to the `"deopt"` continuation that it has been given (once
>> inlined, the empty "deopt"() continuation will be fixed up to have the
>> right continuation).
>
> This "bail_to_interpreter" is the more general form of side exit I
> mentioned above.

How is it more general?

> As a starting point, I'd likely do this just before code gen prep with
> some custom sinking logic to pull instructions only used on the failing
> path into the newly introduced block.  Long term, we'd probably want to
> do the same thing over MI, but mucking with control flow at that layer
> is a bit more complicated.
>
> Rather than framing this as inlining, I'd frame it as expansion to a
> well known body (pretty much the one Sanjoy gives above).  The
> @bail_to_interpreter construct could be lowered directly to a function
> call to some well known symbol name (which JIT users can intercept and
> bind to whatever they want.)  Something like __llvm_side_exit seems
> like a reasonable choice.

SGTM.

>> with this one having the semantics that it always throws an exception
>> if `%predicate` fails.  Only the non-widening optimizations for
>> `@llvm.guard_on` will apply to `@llvm.exception_on`.
>
> Not sure we actually need this.  A valid implementation of the side
> exit handler (which would do the OSR for us) is to call a runtime
> routine which generates and throws the exception.  The only bit we
> might need is a distinction between widenable and non-widenable guards.

Yes, and that's the only distinction I'm trying to make here.

>> ## memory effects (unresolved)
>>
>> [I haven't come up with a good model for the memory effects of
>> `@llvm.guard_on`, suggestions are very welcome.]
>>
>> I'd really like to model `@llvm.guard_on` as a readonly function,
>> since it does not write to memory if it returns; and e.g. forwarding
>> loads across a call to `@llvm.guard_on` should be legal.
>>
>> However, I'm in a quandary around representing the "may never return"
>> aspect of `@llvm.guard_on`: I have to make it illegal to, say, hoist a
>> load from `%ptr` across a guard on `%ptr != null`.
>
> Modeling this as memory dependence just seems wrong.  We already have
> to model control dependence on functions which may throw.  I don't
> think there's anything new here.

I am trying to model this as control dependence, but the difficult bit
is to do that while still maintaining that the call does not clobber
any memory.  I'm worried that there may be reasons (practical or
theoretical) why "readonly" functions always have to terminate and be
nothrow.

> The only unusual bit is that we're going to want to teach AliasAnalysis
> that the guard doesn't write to any memory location (to allow
> forwarding) while still preserving the control dependence.

So you're saying that we model the guard as otherwise read/write (thus
sidestepping the readonly non-returning quandary) but teach
AliasAnalysis that it doesn't clobber any memory?  That would work.

We can also use the same tool to solve the "may return to its caller's
caller with arbitrary heap state" issue by teaching AA that a guard
does not alias with reads in its own (physical) function, but clobbers
the heap for other (physical) functions.

Notation: I'm differentiating between physical functions == functions
that create actual stack frames and inlined functions == logical Java
functions that don't create separate physical frames.  Inlining IR
from one java level function into another usually creates a physical
function that contains more than one logical function.

>> There are a couple
>> of ways I can think of dealing with this, none of them are both easy
>> and neat:
>>
>>  - State that since `@llvm.guard_on` could have had an infinite loop
>>    in it, it may never return.  Unfortunately, the LLVM IR has some
>>    rough edges on readonly infinite loops (since C++ disallows that),
>>    so LLVM will require some preparatory work before we can do this
>>    soundly.
>>
>>  - State that `@llvm.guard_on` can unwind, and thus has non-local
>>    control flow.  This can actually work (and is pretty close to
>>    the actual semantics), but is somewhat of a hack since
>>    `@llvm.guard_on` doesn't _really_ throw an exception.
>
> Er, careful.  Semantically, the guard *might* throw an exception.  It
> could be that's what the interpreter does when evaluating the
> continuation implied by the guard and any of our callers have to
> account for the fact the function which contains the guard might throw.
> The easiest way to ensure that is to model the guard call as possibly
> throwing.

Yes, it does not throw an exception into its own caller, but may throw
into its caller's caller.

-- Sanjoy
Philip Reames via llvm-dev
2016-Feb-18 04:53 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On 02/17/2016 04:41 PM, Sanjoy Das wrote:
> Replies inline.
>
> At a high level, it feels like we'll eventually need a new instruction
> to represent the kind of control flow a guard entails (to be clear: we
> should probably still start with an intrinsic) -- they are fairly
> well-behaved, i.e. readonly, nounwind etc. as far as the immediate
> "physical" caller is concerned, but not so as far as its callers'
> callers are concerned.

I think you're jumping ahead a bit here.  I'm not sure the semantics
are anywhere near as weird as you're framing them to be.  :)

> On Wed, Feb 17, 2016 at 3:40 PM, Philip Reames
> <listmail at philipreames.com> wrote:
>
>> I'd suggest a small change to Sanjoy's declaration.  I think we should
>> allow additional arguments to the guard, not just the condition.  What
>> exactly those arguments mean would be up to the runtime, but various
>> runtimes might want to provide additional arguments to the OSR
>> mechanism.
>
> We'll still have to make a call on the signature of the intrinsic (or
> are you suggesting a varargs intrinsic)?
>
> I suppose we could also have a family of intrinsics that take one
> argument of variable type.

I was proposing a varargs _intrinsic_.  (Not varargs as in C/C++, but
varargs as-in polymorphic over all static signatures.)

>>> Bailing out to the interpreter involves re-creating the state of the
>>> interpreter frames as-if the compilee had been executing in the
>>> interpreter all along.  This state is represented and maintained using
>>> a `"deopt"` operand bundle attached to the call to `@llvm.guard_on`.
>>> The verifier will reject calls to `@llvm.guard_on` without a `"deopt"`
>>> operand bundle.
>>
>> This introduces a very subtle point.  The side exit only affects the
>> *function* which contains the guard.  A caller of that function in the
>> same module may be returned to by either the function itself, or the
>> interpreter after running the continuation implied by the guard.  This
>> introduces a complication for IPA/IPO; any guard (really, any side
>> exit, of which guards are one form) has to be treated as a possible
>> return point from the callee with an unknowable return value and
>> memory state.
>
> This is a really good point.  This has strong implications for the
> guard's memory effects as well -- even though a guard can be seen as
> readonly in its containing function, things that call the containing
> function have to see the guard as read/write.  IOW @foo below is
> read/write, even though v0 can be forwarded to v1:
>
> ```
> int @foo(int cond) {
>   int v0 = this->field;
>   guard_on(cond);
>   int v1 = this->field;
>   return v0 + v1;
> }
> ```

Essentially, we'd be introducing an aliasing rule along the following:
"reads nothing on normal path, reads/writes world if guard is taken
(in which case, does not return)."  Yes, implementing that will be a
bit complicated, but I don't see this as a fundamental issue.

> As you point out, we're also introducing a newish kind of control flow
> here.  It is not fundamentally new, since longjmp does something
> similar (but not quite the same).
>
> I hate to say this, but perhaps we're really looking at (eventually) a
> new instruction here, and not just a new intrinsic.
>
>>> `@bail_to_interpreter` does not return to the current compilation, but
>>> it returns to the `"deopt"` continuation that it has been given (once
>>> inlined, the empty "deopt"() continuation will be fixed up to have the
>>> right continuation).
>> This "bail_to_interpreter" is a the more general form of side exit I >> mentioned above. > How is it more general?You can express a guard as a conditional branch to a @bail_to_interpreter construct. Without the @bail_to_interpreter (which is the thing which has those weird aliasing properties we're talking about), you're stuck. (In our tree, we've implemented exactly the "bail_to_interpreter" construct under a different name, marked as readonly with an exception in FunctionAttrs to resolve the IPO problem.) What the guard nodes add is a) conciseness, b) potentially better optimization, and c) the ability to widen. Thinking about it further, I think we should probably have started with proposing the @bail_to_interpreter construct, gotten that working fully upstream, then done the guard, but oh well. We can do both in parallel since we have a working implementation downstream of @b_to_i.> >>> ## memory effects (unresolved) >>> >>> [I haven't come up with a good model for the memory effects of >>> `@llvm.guard_on`, suggestions are very welcome.] >>> >>> I'd really like to model `@llvm.guard_on` as a readonly function, >>> since it does not write to memory if it returns; and e.g. forwarding >>> loads across a call to `@llvm.guard_on` should be legal. >>> >>> However, I'm in a quandary around representing the "may never return" >>> aspect of `@llvm.guard_on`: I have to make it illegal to, say, hoist a >>> load form `%ptr` across a guard on `%ptr != null`. >> Modeling this as memory dependence just seems wrong. We already have to >> model control dependence on functions which may throw. I don't think >> there's anything new here. > I am trying to model this as control dependence, but the difficult bit > is to do that while still maintaining that the call does not clobber > any memory. I'm worried that there may be reasons (practical or > theoretical) why we "readonly" functions always have to terminate and > be nothrow.I think we should not bother modeling the call as readonly. As we've seen with @llvm.assume, we can get something which is readonly in practice without it being readonly per se. :) And, as above, it's not clear that readonly is even a correct way to model a guard as all.> >> The only unusual bit is that we're going to want to teach AliasAnalysis that >> the guard does write to any memory location (to allow forwarding) while >> still preserving the control dependence. > So you're saying that we model the guard as otherwise read/write (thus > sidestepping the readonly non-returning quandary) but teach > AliasAnalysis that it doesn't clobber any memory? That would work.Yes. As we've done for @llvm.assume and a number of other one off cases. Once we've implemented the exact semantics we want, we can decide if there's a property lurking which we can extract as a new attribute.> > We can also use the same tool to solve the "may return to its caller's > caller with arbitrary heap state" issue by teaching AA that a guard > does not alias with reads in its own (physical) function, but clobbers > the heap for other (physical) functions.I'd have to think about this a bit further before agreeing, but on the surface this seems reasonable.> > Notation: I'm differentiating between physical functions == functions > that create actual stack frames and inlined functions == logical Java > functions that don't create separate physical frames. Inlining IR > from one java level function into another usually creates a physical > function that contains more than one logical function. 
>>> ## memory effects (unresolved)
>>>
>>> [I haven't come up with a good model for the memory effects of
>>> `@llvm.guard_on`, suggestions are very welcome.]
>>>
>>> I'd really like to model `@llvm.guard_on` as a readonly function,
>>> since it does not write to memory if it returns; and e.g. forwarding
>>> loads across a call to `@llvm.guard_on` should be legal.
>>>
>>> However, I'm in a quandary around representing the "may never return"
>>> aspect of `@llvm.guard_on`: I have to make it illegal to, say, hoist a
>>> load from `%ptr` across a guard on `%ptr != null`.
>>
>> Modeling this as memory dependence just seems wrong.  We already have
>> to model control dependence on functions which may throw.  I don't
>> think there's anything new here.
>
> I am trying to model this as control dependence, but the difficult bit
> is to do that while still maintaining that the call does not clobber
> any memory.  I'm worried that there may be reasons (practical or
> theoretical) why "readonly" functions always have to terminate and be
> nothrow.

I think we should not bother modeling the call as readonly.  As we've
seen with @llvm.assume, we can get something which is readonly in
practice without it being readonly per se.  :)  And, as above, it's
not clear that readonly is even a correct way to model a guard at all.

>> The only unusual bit is that we're going to want to teach AliasAnalysis
>> that the guard doesn't write to any memory location (to allow
>> forwarding) while still preserving the control dependence.
>
> So you're saying that we model the guard as otherwise read/write (thus
> sidestepping the readonly non-returning quandary) but teach
> AliasAnalysis that it doesn't clobber any memory?  That would work.

Yes.  As we've done for @llvm.assume and a number of other one off
cases.  Once we've implemented the exact semantics we want, we can
decide if there's a property lurking which we can extract as a new
attribute.

> We can also use the same tool to solve the "may return to its caller's
> caller with arbitrary heap state" issue by teaching AA that a guard
> does not alias with reads in its own (physical) function, but clobbers
> the heap for other (physical) functions.

I'd have to think about this a bit further before agreeing, but on the
surface this seems reasonable.

> Notation: I'm differentiating between physical functions == functions
> that create actual stack frames and inlined functions == logical Java
> functions that don't create separate physical frames.  Inlining IR
> from one java level function into another usually creates a physical
> function that contains more than one logical function.
>
>>> There are a couple
>>> of ways I can think of dealing with this, none of them are both easy
>>> and neat:
>>>
>>>  - State that since `@llvm.guard_on` could have had an infinite loop
>>>    in it, it may never return.  Unfortunately, the LLVM IR has some
>>>    rough edges on readonly infinite loops (since C++ disallows that),
>>>    so LLVM will require some preparatory work before we can do this
>>>    soundly.
>>>
>>>  - State that `@llvm.guard_on` can unwind, and thus has non-local
>>>    control flow.  This can actually work (and is pretty close to
>>>    the actual semantics), but is somewhat of a hack since
>>>    `@llvm.guard_on` doesn't _really_ throw an exception.
>>
>> Er, careful.  Semantically, the guard *might* throw an exception.  It
>> could be that's what the interpreter does when evaluating the
>> continuation implied by the guard and any of our callers have to
>> account for the fact the function which contains the guard might
>> throw.  The easiest way to ensure that is to model the guard call as
>> possibly throwing.
>
> Yes, it does not throw an exception into its own caller, but may throw
> into its caller's caller.
>
> -- Sanjoy
Sanjoy Das via llvm-dev
2016-Feb-18 05:59 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On Wed, Feb 17, 2016 at 8:53 PM, Philip Reames <listmail at philipreames.com> wrote:
> I think you're jumping ahead a bit here.  I'm not sure the semantics
> are anywhere near as weird as you're framing them to be.  :)

I now think this weirdness actually does not have anything to do with
guard_on or bail_to_interpreter; it has to do with deopt bundles
themselves.  Our notion that "deopt bundles are readonly" is broken to
begin with, and that is what's manifesting as the complication we're
seeing here.

Consider something like

```
declare @foo() readonly

def @bar() {
  call @foo() [ "deopt"(XXX) ]
}

def @baz() {
  call @bar() [ "deopt"(YYY) ]
}
```

Right now, according to the semantics of "deopt" operand bundles as in
the LangRef, every call site above is readonly.  However, it is
possible for @baz() to write to memory if @bar is deoptimized at the
call site with the call to @foo.

You could say that it isn't legal to mark @foo as readonly, since the
action of deoptimizing one's caller is not a readonly operation.  But
that doesn't work in cases like this:

```
global *ptr
declare @foo() readwrite

def @bar() {
  call @foo() [ "deopt"(XXX) ]
  *ptr = 42
}

def @baz() {
  call @bar() [ "deopt"(YYY) ]
  int v0 = *ptr
}
```

Naively, it looks like an inter-proc CSE can forward 42 to v0, but
that's unsound, since @bar could get deoptimized at the call to
@foo(), and then who knows what'll get written to *ptr.

My interpretation here is that we're not modeling the deopt
continuations correctly.  Above, the XXX continuation is a delimited
continuation that terminates at the boundary of @bar, and seen from
its caller, the memory effect (and any other effect) of @bar has to
take into account that the "remainder" of @bar() after @foo has
returned is either what it can see in the IR, or the XXX continuation
(which it //could// analyze in theory, but in practice is unlikely to).

This is kind of a bummer since what I said above directly contradicts
the "As long as the behavior of an operand bundle is describable
within these restrictions, LLVM does not need to have special
knowledge of the operand bundle to not miscompile programs containing
it." bit in the LangRef.  :(

> Essentially, we'd be introducing an aliasing rule along the following:
> "reads nothing on normal path, reads/writes world if guard is taken (in
> which case, does not return)."  Yes, implementing that will be a bit
> complicated, but I don't see this as a fundamental issue.

Yup, and this is a property of deopt operand bundles, not just guards.

>> How is it more general?
>
> You can express a guard as a conditional branch to a
> @bail_to_interpreter construct.  Without the @bail_to_interpreter
> (which is the thing which has those weird aliasing properties we're
> talking about), you're stuck.

I thought earlier you were suggesting bail_to_interpreter is more
general than side_exit (when I thought they were one and the same
thing), not that bail_to_interpreter is more general than guard.

Aside: theoretically, if you have @guard() as a primitive then
bail_to_interpreter is just @guard(false).

-- Sanjoy