Sanjoy Das via llvm-dev
2016-Feb-18 00:41 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
Replies inline.

At a high level, it feels like we'll eventually need a new instruction
to represent the kind of control flow a guard entails (to be clear: we
should probably still start with an intrinsic) -- they are fairly
well-behaved, i.e. readonly, nounwind etc. as far as the immediate
"physical" caller is concerned, but not so as far as its callers'
callers are concerned.

On Wed, Feb 17, 2016 at 3:40 PM, Philip Reames <listmail at philipreames.com> wrote:
>> one very important difference -- `@llvm.guard_on(i1 <false>)` is well
>> defined (and not UB). `@llvm.guard_on` on a false predicate bails to
>> the interpreter and that is always safe (but slow), and so
>> `@llvm.guard_on(i1 false)` is basically a `noreturn` call that
>> unconditionally transitions the current compilation to the
>> interpreter.
>
> It's also worth noting that @llvm.guard_on(i1 true) is useful and well
> defined as well.  When lowering, such a guard simply disappears, but it
> can be useful to keep around in the IR during optimization.  It gives a
> well defined point for widening transforms to apply with a well known
> deoptimization state already available.

Yes!  Actually, I had exactly this in an earlier version of this
writeup, which I removed to make the (already long) RFC shorter.

> I'd suggest a small change to Sanjoy's declaration.  I think we should
> allow additional arguments to the guard, not just the condition.  What
> exactly those arguments mean would be up to the runtime, but various
> runtimes might want to provide additional arguments to the OSR
> mechanism.

We'll still have to make a call on the signature of the intrinsic (or
are you suggesting a varargs intrinsic)?

I suppose we could also have a family of intrinsics that take one
argument of variable type.
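For concreteness, a minimal sketch of the call site shape under
discussion (the intrinsic does not exist yet, so this is shape only;
the `"deopt"` operands are placeholder state, and the variant with
extra runtime arguments is hypothetical):

```
; Proposed form: a single i1 predicate, with a mandatory "deopt" operand
; bundle carrying the abstract interpreter state.
declare void @llvm.guard_on(i1)

define void @example(i32* %p, i1 %cond) {
entry:
  ; Bails to the interpreter if %cond is false; otherwise a no-op
  ; (modulo widening).  The "deopt" operands here are placeholder state.
  call void @llvm.guard_on(i1 %cond) [ "deopt"(i32* %p, i32 42) ]
  store i32 0, i32* %p
  ret void
}

; A variant along the lines suggested above would append runtime-defined
; OSR arguments after the condition, e.g. (hypothetical):
;   call void @llvm.guard_on(i1 %cond, i32 %bci, i8* %extra) [ "deopt"(...) ]
```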
>> Bailing out to the interpreter involves re-creating the state of the
>> interpreter frames as-if the compilee had been executing in the
>> interpreter all along.  This state is represented and maintained using
>> a `"deopt"` operand bundle attached to the call to `@llvm.guard_on`.
>> The verifier will reject calls to `@llvm.guard_on` without a `"deopt"`
>> operand bundle.
>
> This introduces a very subtle point.  The side exit only affects the
> *function* which contains the guard.  A caller of that function in the
> same module may be returned to by either the function itself, or the
> interpreter after running the continuation implied by the guard.  This
> introduces a complication for IPA/IPO; any guard (really, any side
> exit, of which guards are one form) has to be treated as a possible
> return point from the callee with an unknowable return value and memory
> state.

This is a really good point.  This has strong implications for the
guard's memory effects as well -- even though a guard can be seen as
readonly in its containing function, things that call the containing
function have to see the guard as read/write.  IOW @foo below is
read/write, even though v0 can be forwarded to v1:

```
int @foo(int cond) {
  int v0 = this->field;
  guard_on(cond);
  int v1 = this->field;
  return v0 + v1;
}
```

As you point out, we're also introducing a newish kind of control flow
here.  It is not fundamentally new, since longjmp does something
similar (but not quite the same).

I hate to say this, but perhaps we're really looking at (eventually) a
new instruction here, and not just a new intrinsic.

>> `@llvm.guard_on` cannot be `invoke`ed (that would be
>> meaningless anyway, since the method it would've "thrown" into is
>> about to go away).
>
> I disagree with this bit.  It needlessly complicates the inliner.
> Allowing an invoke of a guard which SimplifyCFG converts to calls just
> like it would a nothrow function seems much cleaner.

SGTM.

>> The observable behavior of `@llvm.guard_on` is specified as:
>>
>> ```
>> void @llvm.guard_on(i1 %pred) {
>>  entry:
>>   %unknown_cond = < unknown source >
>>   %cond = and i1 %unknown_cond, %pred
>>   br i1 %cond, label %left, label %right
>>
>>  left:
>>   call void @bail_to_interpreter() [ "deopt"() ] noreturn
>>   unreachable
>>
>>  right:
>>   ret void
>> }
>> ```
>>
>> So, precisely speaking, `@llvm.guard_on` is guaranteed to bail to the
>> interpreter if `%pred` is false, but it **may** bail to the
>> interpreter if `%pred` is true.  It is this bit that lets us soundly
>> widen `%pred`, since all we're doing is "refining" `< unknown source >`.
>
> Unless I'm misreading this, it looks like Sanjoy got the IR in the
> specification wrong.  The intrinsic is specified to side exit if the
> condition is false (so that it's true in the caller after the guard),
> not the other way around.  The text description appears correct.

Yes, and thanks for catching.  The branch should have been "br i1
%cond, label %right, label %left".
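To make the widening point above concrete, here is a rough sketch of
the kind of transform the `< unknown source >` refinement is meant to
license (assuming both predicates are available at the first guard;
the `i32 0` / `i32 1` deopt operands are placeholder state):

```
define void @widen_example(i1 %a, i1 %b) {
entry:
  ; Before widening (conceptually):
  ;   call void @llvm.guard_on(i1 %a) [ "deopt"(i32 0) ]
  ;   call void @llvm.guard_on(i1 %b) [ "deopt"(i32 1) ]
  ;
  ; After widening: %b is folded into the earlier guard (sound, since a
  ; guard may bail even when its predicate is true), and the later guard
  ; degenerates into a guard on true, which lowers to nothing but remains
  ; a well-defined widening point with deopt state attached.
  %c = and i1 %a, %b
  call void @llvm.guard_on(i1 %c) [ "deopt"(i32 0) ]
  call void @llvm.guard_on(i1 true) [ "deopt"(i32 1) ]
  ret void
}
```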
>> `@bail_to_interpreter` does not return to the current compilation, but
>> it returns to the `"deopt"` continuation that it has been given (once
>> inlined, the empty "deopt"() continuation will be fixed up to have the
>> right continuation).
>
> This "bail_to_interpreter" is the more general form of side exit I
> mentioned above.

How is it more general?

> As a starting point, I'd likely do this just before code gen prep with
> some custom sinking logic to pull instructions only used on the failing
> path into the newly introduced block.  Long term, we'd probably want to
> do the same thing over MI, but mucking with control flow at that layer
> is a bit more complicated.
>
> Rather than framing this as inlining, I'd frame it as expansion to a
> well known body (pretty much the one Sanjoy gives above).  The
> @bail_to_interpreter construct could be lowered directly to a function
> call to some well known symbol name (which JIT users can intercept and
> bind to whatever they want.)  Something like __llvm_side_exit seems
> like a reasonable choice.

SGTM.

>> with this one having the semantics that it always throws an exception
>> if `%predicate` fails.  Only the non-widening optimizations for
>> `@llvm.guard_on` will apply to `@llvm.exception_on`.
>
> Not sure we actually need this.  A valid implementation of the side
> exit handler (which would do the OSR for us) is to call a runtime
> routine which generates and throws the exception.  The only bit we
> might need is a distinction between widenable and non-widenable guards.

Yes, and that's the only distinction I'm trying to make here.

>> ## memory effects (unresolved)
>>
>> [I haven't come up with a good model for the memory effects of
>> `@llvm.guard_on`, suggestions are very welcome.]
>>
>> I'd really like to model `@llvm.guard_on` as a readonly function,
>> since it does not write to memory if it returns; and e.g. forwarding
>> loads across a call to `@llvm.guard_on` should be legal.
>>
>> However, I'm in a quandary around representing the "may never return"
>> aspect of `@llvm.guard_on`: I have to make it illegal to, say, hoist a
>> load from `%ptr` across a guard on `%ptr != null`.
>
> Modeling this as memory dependence just seems wrong.  We already have
> to model control dependence on functions which may throw.  I don't
> think there's anything new here.

I am trying to model this as control dependence, but the difficult bit
is to do that while still maintaining that the call does not clobber
any memory.  I'm worried that there may be reasons (practical or
theoretical) why "readonly" functions always have to terminate and be
nothrow.

> The only unusual bit is that we're going to want to teach AliasAnalysis
> that the guard doesn't write to any memory location (to allow
> forwarding) while still preserving the control dependence.

So you're saying that we model the guard as otherwise read/write (thus
sidestepping the readonly non-returning quandary) but teach
AliasAnalysis that it doesn't clobber any memory?  That would work.

We can also use the same tool to solve the "may return to its caller's
caller with arbitrary heap state" issue by teaching AA that a guard
does not alias with reads in its own (physical) function, but clobbers
the heap for other (physical) functions.

Notation: I'm differentiating between physical functions == functions
that create actual stack frames and inlined functions == logical Java
functions that don't create separate physical frames.  Inlining IR
from one java level function into another usually creates a physical
function that contains more than one logical function.

>> There are a couple
>> of ways I can think of dealing with this, none of them are both easy
>> and neat:
>>
>>  - State that since `@llvm.guard_on` could have had an infinite loop
>>    in it, it may never return.  Unfortunately, the LLVM IR has some
>>    rough edges on readonly infinite loops (since C++ disallows that),
>>    so LLVM will require some preparatory work before we can do this
>>    soundly.
>>
>>  - State that `@llvm.guard_on` can unwind, and thus has non-local
>>    control flow.  This can actually work (and is pretty close to
>>    the actual semantics), but is somewhat of a hack since
>>    `@llvm.guard_on` doesn't _really_ throw an exception.
>
> Er, careful.  Semantically, the guard *might* throw an exception.  It
> could be that's what the interpreter does when evaluating the
> continuation implied by the guard and any of our callers have to
> account for the fact the function which contains the guard might throw.
> The easiest way to ensure that is to model the guard call as possibly
> throwing.

Yes, it does not throw an exception into its own caller, but may throw
into its caller's caller.

-- Sanjoy
Philip Reames via llvm-dev
2016-Feb-18 04:53 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On 02/17/2016 04:41 PM, Sanjoy Das wrote:
> Replies inline.
>
> At a high level, it feels like we'll eventually need a new instruction
> to represent the kind of control flow a guard entails (to be clear: we
> should probably still start with an intrinsic) -- they are fairly
> well-behaved, i.e. readonly, nounwind etc. as far as the immediate
> "physical" caller is concerned, but not so as far as its callers'
> callers are concerned.

I think you're jumping ahead a bit here.  I'm not sure the semantics
are anywhere near as weird as you're framing them to be.  :)

> On Wed, Feb 17, 2016 at 3:40 PM, Philip Reames
> <listmail at philipreames.com> wrote:
>
>> I'd suggest a small change to Sanjoy's declaration.  I think we should
>> allow additional arguments to the guard, not just the condition.  What
>> exactly those arguments mean would be up to the runtime, but various
>> runtimes might want to provide additional arguments to the OSR
>> mechanism.
>
> We'll still have to make a call on the signature of the intrinsic (or
> are you suggesting a varargs intrinsic)?
>
> I suppose we could also have a family of intrinsics that take one
> argument of variable type.

I was proposing a varargs _intrinsic_.  (Not varargs as in C/C++, but
varargs as-in polymorphic over all static signatures.)

>>> Bailing out to the interpreter involves re-creating the state of the
>>> interpreter frames as-if the compilee had been executing in the
>>> interpreter all along.  This state is represented and maintained using
>>> a `"deopt"` operand bundle attached to the call to `@llvm.guard_on`.
>>> The verifier will reject calls to `@llvm.guard_on` without a `"deopt"`
>>> operand bundle.
>>
>> This introduces a very subtle point.  The side exit only affects the
>> *function* which contains the guard.  A caller of that function in the
>> same module may be returned to by either the function itself, or the
>> interpreter after running the continuation implied by the guard.  This
>> introduces a complication for IPA/IPO; any guard (really, any side
>> exit, of which guards are one form) has to be treated as a possible
>> return point from the callee with an unknowable return value and
>> memory state.
>
> This is a really good point.  This has strong implications for the
> guard's memory effects as well -- even though a guard can be seen as
> readonly in its containing function, things that call the containing
> function have to see the guard as read/write.  IOW @foo below is
> read/write, even though v0 can be forwarded to v1:
>
> ```
> int @foo(int cond) {
>   int v0 = this->field;
>   guard_on(cond);
>   int v1 = this->field;
>   return v0 + v1;
> }
> ```

Essentially, we'd be introducing an aliasing rule along the following:
"reads nothing on normal path, reads/writes world if guard is taken
(in which case, does not return)."  Yes, implementing that will be a
bit complicated, but I don't see this as a fundamental issue.

> As you point out, we're also introducing a newish kind of control flow
> here.  It is not fundamentally new, since longjmp does something
> similar (but not quite the same).
>
> I hate to say this, but perhaps we're really looking at (eventually) a
> new instruction here, and not just a new intrinsic.
>
>>> `@bail_to_interpreter` does not return to the current compilation, but
>>> it returns to the `"deopt"` continuation that it has been given (once
>>> inlined, the empty "deopt"() continuation will be fixed up to have the
>>> right continuation).
>> This "bail_to_interpreter" is a the more general form of side exit I >> mentioned above. > How is it more general?You can express a guard as a conditional branch to a @bail_to_interpreter construct. Without the @bail_to_interpreter (which is the thing which has those weird aliasing properties we're talking about), you're stuck. (In our tree, we've implemented exactly the "bail_to_interpreter" construct under a different name, marked as readonly with an exception in FunctionAttrs to resolve the IPO problem.) What the guard nodes add is a) conciseness, b) potentially better optimization, and c) the ability to widen. Thinking about it further, I think we should probably have started with proposing the @bail_to_interpreter construct, gotten that working fully upstream, then done the guard, but oh well. We can do both in parallel since we have a working implementation downstream of @b_to_i.> >>> ## memory effects (unresolved) >>> >>> [I haven't come up with a good model for the memory effects of >>> `@llvm.guard_on`, suggestions are very welcome.] >>> >>> I'd really like to model `@llvm.guard_on` as a readonly function, >>> since it does not write to memory if it returns; and e.g. forwarding >>> loads across a call to `@llvm.guard_on` should be legal. >>> >>> However, I'm in a quandary around representing the "may never return" >>> aspect of `@llvm.guard_on`: I have to make it illegal to, say, hoist a >>> load form `%ptr` across a guard on `%ptr != null`. >> Modeling this as memory dependence just seems wrong. We already have to >> model control dependence on functions which may throw. I don't think >> there's anything new here. > I am trying to model this as control dependence, but the difficult bit > is to do that while still maintaining that the call does not clobber > any memory. I'm worried that there may be reasons (practical or > theoretical) why we "readonly" functions always have to terminate and > be nothrow.I think we should not bother modeling the call as readonly. As we've seen with @llvm.assume, we can get something which is readonly in practice without it being readonly per se. :) And, as above, it's not clear that readonly is even a correct way to model a guard as all.> >> The only unusual bit is that we're going to want to teach AliasAnalysis that >> the guard does write to any memory location (to allow forwarding) while >> still preserving the control dependence. > So you're saying that we model the guard as otherwise read/write (thus > sidestepping the readonly non-returning quandary) but teach > AliasAnalysis that it doesn't clobber any memory? That would work.Yes. As we've done for @llvm.assume and a number of other one off cases. Once we've implemented the exact semantics we want, we can decide if there's a property lurking which we can extract as a new attribute.> > We can also use the same tool to solve the "may return to its caller's > caller with arbitrary heap state" issue by teaching AA that a guard > does not alias with reads in its own (physical) function, but clobbers > the heap for other (physical) functions.I'd have to think about this a bit further before agreeing, but on the surface this seems reasonable.> > Notation: I'm differentiating between physical functions == functions > that create actual stack frames and inlined functions == logical Java > functions that don't create separate physical frames. Inlining IR > from one java level function into another usually creates a physical > function that contains more than one logical function. 
>>> ## memory effects (unresolved)
>>>
>>> [I haven't come up with a good model for the memory effects of
>>> `@llvm.guard_on`, suggestions are very welcome.]
>>>
>>> I'd really like to model `@llvm.guard_on` as a readonly function,
>>> since it does not write to memory if it returns; and e.g. forwarding
>>> loads across a call to `@llvm.guard_on` should be legal.
>>>
>>> However, I'm in a quandary around representing the "may never return"
>>> aspect of `@llvm.guard_on`: I have to make it illegal to, say, hoist a
>>> load from `%ptr` across a guard on `%ptr != null`.
>>
>> Modeling this as memory dependence just seems wrong.  We already have
>> to model control dependence on functions which may throw.  I don't
>> think there's anything new here.
>
> I am trying to model this as control dependence, but the difficult bit
> is to do that while still maintaining that the call does not clobber
> any memory.  I'm worried that there may be reasons (practical or
> theoretical) why "readonly" functions always have to terminate and be
> nothrow.

I think we should not bother modeling the call as readonly.  As we've
seen with @llvm.assume, we can get something which is readonly in
practice without it being readonly per se.  :)  And, as above, it's
not clear that readonly is even a correct way to model a guard at all.

>> The only unusual bit is that we're going to want to teach AliasAnalysis
>> that the guard doesn't write to any memory location (to allow
>> forwarding) while still preserving the control dependence.
>
> So you're saying that we model the guard as otherwise read/write (thus
> sidestepping the readonly non-returning quandary) but teach
> AliasAnalysis that it doesn't clobber any memory?  That would work.

Yes.  As we've done for @llvm.assume and a number of other one off
cases.  Once we've implemented the exact semantics we want, we can
decide if there's a property lurking which we can extract as a new
attribute.

> We can also use the same tool to solve the "may return to its caller's
> caller with arbitrary heap state" issue by teaching AA that a guard
> does not alias with reads in its own (physical) function, but clobbers
> the heap for other (physical) functions.

I'd have to think about this a bit further before agreeing, but on the
surface this seems reasonable.

> Notation: I'm differentiating between physical functions == functions
> that create actual stack frames and inlined functions == logical Java
> functions that don't create separate physical frames.  Inlining IR
> from one java level function into another usually creates a physical
> function that contains more than one logical function.
>
>>> There are a couple
>>> of ways I can think of dealing with this, none of them are both easy
>>> and neat:
>>>
>>>  - State that since `@llvm.guard_on` could have had an infinite loop
>>>    in it, it may never return.  Unfortunately, the LLVM IR has some
>>>    rough edges on readonly infinite loops (since C++ disallows that),
>>>    so LLVM will require some preparatory work before we can do this
>>>    soundly.
>>>
>>>  - State that `@llvm.guard_on` can unwind, and thus has non-local
>>>    control flow.  This can actually work (and is pretty close to
>>>    the actual semantics), but is somewhat of a hack since
>>>    `@llvm.guard_on` doesn't _really_ throw an exception.
>>
>> Er, careful.  Semantically, the guard *might* throw an exception.  It
>> could be that's what the interpreter does when evaluating the
>> continuation implied by the guard and any of our callers have to
>> account for the fact the function which contains the guard might
>> throw.  The easiest way to ensure that is to model the guard call as
>> possibly throwing.
>
> Yes, it does not throw an exception into its own caller, but may throw
> into its caller's caller.
>
> -- Sanjoy
Sanjoy Das via llvm-dev
2016-Feb-18 05:59 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On Wed, Feb 17, 2016 at 8:53 PM, Philip Reames <listmail at philipreames.com> wrote:
> I think you're jumping ahead a bit here.  I'm not sure the semantics
> are anywhere near as weird as you're framing them to be.  :)

I now think this weirdness actually does not have anything to do with
guard_on or bail_to_interpreter; it has to do with deopt bundles
themselves.  Our notion that "deopt bundles are readonly" is broken to
begin with, and that is what's manifesting as the complication we're
seeing here.

Consider something like

```
declare @foo() readonly

def @bar() {
  call @foo() [ "deopt"(XXX) ]
}

def @baz() {
  call @bar() [ "deopt"(YYY) ]
}
```

Right now, according to the semantics of "deopt" operand bundles as in
the LangRef, every call site above is readonly.  However, it is
possible for @baz() to write to memory if @bar is deoptimized at the
call site with the call to @foo.

You could say that it isn't legal to mark @foo as readonly, since the
action of deoptimizing one's caller is not a readonly operation.  But
that doesn't work in cases like this:

```
global *ptr
declare @foo() readwrite

def @bar() {
  call @foo() [ "deopt"(XXX) ]
  *ptr = 42
}

def @baz() {
  call @bar() [ "deopt"(YYY) ]
  int v0 = *ptr
}
```

Naively, it looks like an inter-proc CSE can forward 42 to v0, but
that's unsound, since @bar could get deoptimized at the call to
@foo(), and then who knows what'll get written to *ptr.

My interpretation here is that we're not modeling the deopt
continuations correctly.  Above, the XXX continuation is a delimited
continuation that terminates at the boundary of @bar, and seen from
its caller, the memory effect (and any other effect) of @bar has to
take into account that the "remainder" of @bar() after @foo has
returned is either what it can see in the IR, or the XXX continuation
(which it //could// analyze in theory, but in practice is unlikely to).

This is kind of a bummer since what I said above directly contradicts
the "As long as the behavior of an operand bundle is describable
within these restrictions, LLVM does not need to have special
knowledge of the operand bundle to not miscompile programs containing
it." bit in the LangRef.  :(

> Essentially, we'd be introducing an aliasing rule along the following:
> "reads nothing on normal path, reads/writes world if guard is taken (in
> which case, does not return)."  Yes, implementing that will be a bit
> complicated, but I don't see this as a fundamental issue.

Yup, and this is a property of deopt operand bundles, not just guards.

>> How is it more general?
>
> You can express a guard as a conditional branch to a
> @bail_to_interpreter construct.  Without the @bail_to_interpreter
> (which is the thing which has those weird aliasing properties we're
> talking about), you're stuck.

I thought earlier you were suggesting bail_to_interpreter is more
general than side_exit (when I thought they were one and the same
thing), not that bail_to_interpreter is more general than guard.

Aside: theoretically, if you have @guard() as a primitive then
bail_to_interpreter is just @guard(false).

-- Sanjoy