Sanjoy Das via llvm-dev
2016-Feb-18 05:59 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On Wed, Feb 17, 2016 at 8:53 PM, Philip Reames
<listmail at philipreames.com> wrote:
> I think you're jumping ahead a bit here. I'm not sure the semantics are
> anywhere near as weird as you're framing them to be. :)

I now think this weirdness actually does not have anything to do with
guard_on or bail_to_interpreter; it has to do with deopt bundles
themselves.  Our notion of "deopt bundles are readonly" is broken to
begin with, and that is what's manifesting as the complication we're
seeing here.

Consider something like

```
declare @foo() readonly
def @bar() { call @foo() [ "deopt"(XXX) ] }
def @baz() { call @bar() [ "deopt"(YYY) ] }
```

Right now, according to the semantics of "deopt" operand bundles as in
the LangRef, every call site above is readonly.  However, it is
possible for @baz() to write to memory if @bar is deoptimized at the
call site with the call to @foo.

You could say that it isn't legal to mark @foo as readonly, since the
action of deoptimizing one's caller is not a readonly operation.  But
that doesn't work in cases like this:

```
global *ptr
declare @foo() readwrite
def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
```

Naively, it looks like an inter-proc CSE can forward 42 to v0, but
that's unsound, since @bar could get deoptimized at the call to
@foo(), and then who knows what'll get written to *ptr.

My interpretation here is that we're not modeling the deopt
continuations correctly.  Above, the XXX continuation is a delimited
continuation that terminates at the boundary of @bar, and seen from
its caller, the memory effect (and any other effect) of @bar has to
take into account that the "remainder" of @bar() after @foo has
returned is either what it can see in the IR, or the XXX continuation
(which it //could// analyze in theory, but in practice is unlikely
to).

This is kind of a bummer, since what I said above directly contradicts
the "As long as the behavior of an operand bundle is describable
within these restrictions, LLVM does not need to have special
knowledge of the operand bundle to not miscompile programs containing
it." bit in the LangRef. :(

> Essentially, we'd be introducing an aliasing rule along the following:
> "reads nothing on normal path, reads/writes world if guard is taken (in
> which case, does not return)."  Yes, implementing that will be a bit
> complicated, but I don't see this as a fundamental issue.

Yup, and this is a property of deopt operand bundles, not just guards.

>> How is it more general?
>
> You can express a guard as a conditional branch to a @bail_to_interpreter
> construct.  Without the @bail_to_interpreter (which is the thing which has
> those weird aliasing properties we're talking about), you're stuck.

I thought earlier you were suggesting bail_to_interpreter is more
general than side_exit (when I thought they were one and the same
thing), not that bail_to_interpreter is more general than guard.

Aside: theoretically, if you have @guard() as a primitive then
bail_to_interpreter is just @guard(false).

-- Sanjoy
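A minimal IR-flavoured sketch of that aside, assuming hypothetical
@guard and @bail_to_interpreter declarations (the names, signatures,
and the empty "deopt" state are illustrative only, not the RFC's final
spelling):

```
declare void @guard(i1)               ; hypothetical: deoptimize unless the argument is true
declare void @bail_to_interpreter()   ; hypothetical: unconditional deoptimization, never returns

define void @with_guard(i1 %cond) {
entry:
  ; Guard form: leave compiled code iff %cond is false.
  call void @guard(i1 %cond) [ "deopt"() ]
  ret void
}

define void @with_branch(i1 %cond) {
entry:
  ; Expanded form: an explicit branch to an unconditional bail-out.
  ; Conversely, @bail_to_interpreter() is just @guard(i1 false).
  br i1 %cond, label %ok, label %deopt

deopt:
  call void @bail_to_interpreter() [ "deopt"() ]
  unreachable

ok:
  ret void
}
```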
Sanjoy Das via llvm-dev
2016-Feb-18 06:27 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
Some minor additions to what I said earlier:

On Wed, Feb 17, 2016 at 9:59 PM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:
> My interpretation here is that we're not modeling the deopt
> continuations correctly.  Above, the XXX continuation is a delimited
> continuation that terminates at the boundary of @bar, and seen from
> its caller, the memory effect (and any other effect) of @bar has to
> take into account that the "remainder" of @bar() after @foo has
> returned is either what it can see in the IR, or the XXX continuation
> (which it //could// analyze in theory, but in practice is unlikely
> to).

A related question is: down below, we're clearly allowed to forward 42
into v0 after inlining through the call to @bar (while before inlining
we weren't, as shown earlier) -- what changed?

```
global *ptr
declare @foo() readwrite
def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }

inlining ==>

global *ptr
declare @foo() readwrite
def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
def @baz() { call @foo() [ "deopt"(YYY XXX) ]; *ptr = 42; int v0 = *ptr }
```

What changed is that inlining composed the normal (non-deopt)
continuation in @baz() with the non-deopt continuation in @bar (and the
deopt continuation with the deopt continuation).  Thus the non-deopt
continuation in @baz no longer sees a merge of the final states of the
deopt and non-deopt continuations in @bar and can thus be less
pessimistic.

> This is kind of a bummer since what I said above directly contradicts
> the "As long as the behavior of an operand bundle is describable
> within these restrictions, LLVM does not need to have special
> knowledge of the operand bundle to not miscompile programs containing
> it." bit in the LangRef. :(

Now that I think about it, this isn't too bad.  This means "deopt"
operand bundles will need some special handling in IPO passes, but they
get that anyway in the inliner.

-- Sanjoy
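The post-inlining @baz, spelled out in (roughly) real IR, assuming for
illustration that XXX and YYY are simple i32 deopt values:

```
@ptr = global i32 0

declare void @foo()   ; assumed readwrite, as in the example above

define i32 @baz() {
entry:
  ; The caller's deopt state (YYY) is composed with the callee's (XXX)
  ; on the inlined call to @foo; the i32 constants stand in for them.
  call void @foo() [ "deopt"(i32 1, i32 2) ]
  ; The store and the load are now in the same non-deopt continuation,
  ; so forwarding 42 to %v0 is sound here, unlike before inlining.
  store i32 42, i32* @ptr
  %v0 = load i32, i32* @ptr
  ret i32 %v0
}
```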
Sanjoy Das via llvm-dev
2016-Feb-18 18:58 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
So, to summarize, the action item here is (let me know if you
disagree): I need to go and fix the semantics of deopt operand bundles
around IPO, and once that is done, the weirdness around guards being
readonly only in their immediate callers will no longer be an issue.

-- Sanjoy

On Wed, Feb 17, 2016 at 10:27 PM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:
> Some minor additions to what I said earlier:
>
> On Wed, Feb 17, 2016 at 9:59 PM, Sanjoy Das
> <sanjoy at playingwithpointers.com> wrote:
>> My interpretation here is that we're not modeling the deopt
>> continuations correctly.  Above, the XXX continuation is a delimited
>> continuation that terminates at the boundary of @bar, and seen from
>> its caller, the memory effect (and any other effect) of @bar has to
>> take into account that the "remainder" of @bar() after @foo has
>> returned is either what it can see in the IR, or the XXX continuation
>> (which it //could// analyze in theory, but in practice is unlikely
>> to).
>
> A related question is: down below, we're clearly allowed to forward 42
> into v0 after inlining through the call to @bar (while before inlining
> we weren't, as shown earlier) -- what changed?
>
> ```
> global *ptr
> declare @foo() readwrite
> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
> def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
>
> inlining ==>
>
> global *ptr
> declare @foo() readwrite
> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
> def @baz() { call @foo() [ "deopt"(YYY XXX) ]; *ptr = 42; int v0 = *ptr }
> ```
>
> What changed is that inlining composed the normal (non-deopt)
> continuation in @baz() with the non-deopt continuation in @bar (and
> the deopt continuation with the deopt continuation).  Thus the
> non-deopt continuation in @baz no longer sees a merge of the final
> states of the deopt and non-deopt continuations in @bar and can thus
> be less pessimistic.
>
>> This is kind of a bummer since what I said above directly contradicts
>> the "As long as the behavior of an operand bundle is describable
>> within these restrictions, LLVM does not need to have special
>> knowledge of the operand bundle to not miscompile programs containing
>> it." bit in the LangRef. :(
>
> Now that I think about it, this isn't too bad.  This means "deopt"
> operand bundles will need some special handling in IPO passes, but
> they get that anyway in the inliner.
>
> -- Sanjoy

--
Sanjoy Das
http://playingwithpointers.com
Philip Reames via llvm-dev
2016-Feb-22 19:03 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On 02/17/2016 09:59 PM, Sanjoy Das wrote:
> On Wed, Feb 17, 2016 at 8:53 PM, Philip Reames
> <listmail at philipreames.com> wrote:
>
>> I think you're jumping ahead a bit here. I'm not sure the semantics are
>> anywhere near as weird as you're framing them to be. :)
> I now think this weirdness actually does not have anything to do with
> guard_on or bail_to_interpreter; it has to do with deopt bundles
> themselves.  Our notion of "deopt bundles are readonly" is broken to
> begin with, and that is what's manifesting as the complication we're
> seeing here.
>
> Consider something like
>
> ```
> declare @foo() readonly
> def @bar() { call @foo() [ "deopt"(XXX) ] }
> def @baz() { call @bar() [ "deopt"(YYY) ] }
> ```
>
> Right now, according to the semantics of "deopt" operand bundles as in
> the LangRef, every call site above is readonly.  However, it is
> possible for @baz() to write to memory if @bar is deoptimized at the
> call site with the call to @foo.
>
> You could say that it isn't legal to mark @foo as readonly, since the
> action of deoptimizing one's caller is not a readonly operation.  But
> that doesn't work in cases like this:
>
> ```
> global *ptr
> declare @foo() readwrite
> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
> def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
> ```
>
> Naively, it looks like an inter-proc CSE can forward 42 to v0, but
> that's unsound, since @bar could get deoptimized at the call to
> @foo(), and then who knows what'll get written to *ptr.

Ok, I think this example does a good job of getting at the root issue.
You claim this is not legal, I claim it is. :)  Specifically, because
the use of the inferred information will never be executed in baz.
(see below)

Specifically, I think the problem here is that we're mixing a couple of
notions.  First, we've got the state required for the deoptimization to
occur (i.e. deopt information).  Second, we've got the actual
deoptimization mechanism.  Third, we've got the *policy* under which
deoptimization occurs.

The distinction between the latter two is subtle and important.  The
*mechanism* of exiting the callee and replacing it with an arbitrary
alternate implementation could absolutely break the deopt semantics as
you've pointed out.  The policy we actually use does not.
Specifically, we've got the following restrictions:
1) We only replace callees with more general versions of themselves.
Given we might be invalidating a speculative assumption, this could be
a *much* more general version which includes actions and control flow
that invalidate any attribute inference done over the callee.
2) We invalidate all callers of @foo which could have observed the
incorrect inference.  (This is required to preserve correctness.)

I think we probably need to separate out something to represent the
interposition/replacement semantics implied by invalidation
deoptimization.  In its most generic form, this would model the full
generality of the mechanism and thus prevent nearly all inference.  We
could then clearly express our *policy* as a restriction over that full
generality.

Another interesting case to consider:

global *ptr
declare @foo() readwrite
def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
def @baz() {
  v0 = 42;
  while (C) {
    call @bar() [ "deopt"(v0) ];
    int v0 = *ptr
  }
}

Could we end up deoptimizing with an incorrect deopt value for v0 based
on circular logic?  We can infer that v0 is always 42 in this example.
I claim that's legal precisely up to the point at which we deoptimize
@bar and @baz together.  If we deoptimized @bar, let @baz run another
loop iteration, then invalidated @baz, that would be incorrect.

> My interpretation here is that we're not modeling the deopt
> continuations correctly.  Above, the XXX continuation is a delimited
> continuation that terminates at the boundary of @bar, and seen from
> its caller, the memory effect (and any other effect) of @bar has to
> take into account that the "remainder" of @bar() after @foo has
> returned is either what it can see in the IR, or the XXX continuation
> (which it //could// analyze in theory, but in practice is unlikely
> to).
>
> This is kind of a bummer since what I said above directly contradicts
> the "As long as the behavior of an operand bundle is describable
> within these restrictions, LLVM does not need to have special
> knowledge of the operand bundle to not miscompile programs containing
> it." bit in the LangRef. :(

Per above, I think we're fine for invalidation deoptimization.

For side exits, the runtime function called can never be marked
readonly (or just about any other restricted semantics) precisely
because it can execute an arbitrary continuation.  In principle, we
could do bytecode inference to establish restricted semantics per call
site.

Philip
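For concreteness, the loop example above in (approximate) IR, with
illustrative names and a plain i32 deopt state; the circularity is that
the deopt value handed to @bar is the very value whose constancy (42)
was inferred from @bar's own store:

```
@ptr = global i32 0

declare void @bar()   ; as in the example: calls @foo [ "deopt"(XXX) ], then stores 42 to @ptr

define void @baz(i1 %C) {
entry:
  br label %header

header:
  ; Inference can conclude %v0 is always 42, since @bar always stores 42.
  %v0 = phi i32 [ 42, %entry ], [ %reload, %body ]
  br i1 %C, label %body, label %exit

body:
  call void @bar() [ "deopt"(i32 %v0) ]
  %reload = load i32, i32* @ptr
  br label %header

exit:
  ret void
}
```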
Sanjoy Das via llvm-dev
2016-Feb-22 20:15 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On Mon, Feb 22, 2016 at 11:03 AM, Philip Reames
<listmail at philipreames.com> wrote:
>> ```
>> global *ptr
>> declare @foo() readwrite
>> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
>> def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
>> ```
>>
>> Naively, it looks like an inter-proc CSE can forward 42 to v0, but
>> that's unsound, since @bar could get deoptimized at the call to
>> @foo(), and then who knows what'll get written to *ptr.
>
> Ok, I think this example does a good job of getting at the root issue.
> You claim this is not legal, I claim it is. :)  Specifically, because
> the use of the inferred information will never be executed in baz.
> (see below)
>
> Specifically, I think the problem here is that we're mixing a couple of
> notions.  First, we've got the state required for the deoptimization to
> occur (i.e. deopt information).  Second, we've got the actual
> deoptimization mechanism.  Third, we've got the *policy* under which
> deoptimization occurs.
>
> The distinction between the latter two is subtle and important.  The
> *mechanism* of exiting the callee and replacing it with an arbitrary
> alternate implementation could absolutely break the deopt semantics as
> you've pointed out.  The policy we actually use does not.
> Specifically, we've got the following restrictions:
> 1) We only replace callees with more general versions of themselves.
> Given we might be invalidating a speculative assumption, this could be
> a *much* more general version which includes actions and control flow
> that invalidate any attribute inference done over the callee.
> 2) We invalidate all callers of @foo which could have observed the
> incorrect inference.  (This is required to preserve correctness.)

Yes.  I think I was too dramatic when I claimed that the deoptimization
model in LLVM is "wrong" -- the real story is more on the lines of
"frontend authors need to be aware of some subtleties".

> I think we probably need to separate out something to represent the
> interposition/replacement semantics implied by invalidation
> deoptimization.  In its most generic form, this would model the full
> generality of the mechanism and thus prevent nearly all inference.  We
> could then clearly express our *policy* as a restriction over that full
> generality.

Yes.  LLVM already has a "mayBeOverridden" flag; we should just add a
function attribute, `interposable`, that makes `mayBeOverridden` return
true.

> Per above, I think we're fine for invalidation deoptimization.
>
> For side exits, the runtime function called can never be marked
> readonly (or just about any other restricted semantics) precisely
> because it can execute an arbitrary continuation.

The problem is a little easier with side exits, since with side exits,
we will either have them at the tail position, or have them followed by
an unreachable (so having them as read/write/may-unwind is not a
problem).

With guards, we have to solve a harder problem -- we don't want to mark
the guard as "can read and write all memory", since we'd like to
forward `val` to `val1` in the example below:

  int val = *ptr;
  guard_on(arbitrary condition)
  int val1 = *ptr;

But, as discussed earlier, we're probably okay if we mark guard_on as
read/write and use alias analysis to sneakily make it "practically
readonly".

-- Sanjoy
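The same pattern, sketched as IR with a hypothetical @guard_on
declaration (the intrinsic's actual name and signature in the RFC may
differ); even if the call is conservatively read/write, the goal is for
alias analysis to let the second load be forwarded from the first:

```
@ptr = global i32 0

declare void @guard_on(i1)   ; hypothetical; conservatively reads and writes memory

define i32 @f(i1 %cond) {
entry:
  %val = load i32, i32* @ptr
  call void @guard_on(i1 %cond) [ "deopt"() ]
  ; With a plain readwrite guard this load cannot be CSE'd with the one
  ; above; teaching AA that the guard only touches its deopt state makes
  ; it "practically readonly" and allows forwarding %val to %val1.
  %val1 = load i32, i32* @ptr
  %sum = add i32 %val, %val1
  ret i32 %sum
}
```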
Andrew Trick via llvm-dev
2016-Feb-23 05:43 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
> On Feb 22, 2016, at 11:03 AM, Philip Reames <listmail at philipreames.com> wrote:
>
>> ```
>> global *ptr
>> declare @foo() readwrite
>> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
>> def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
>> ```
>>
>> Naively, it looks like an inter-proc CSE can forward 42 to v0, but
>> that's unsound, since @bar could get deoptimized at the call to
>> @foo(), and then who knows what'll get written to *ptr.
> Ok, I think this example does a good job of getting at the root issue.
> You claim this is not legal, I claim it is. :)  Specifically, because
> the use of the inferred information will never be executed in baz.
> (see below)
>
> Specifically, I think the problem here is that we're mixing a couple of
> notions.  First, we've got the state required for the deoptimization to
> occur (i.e. deopt information).  Second, we've got the actual
> deoptimization mechanism.  Third, we've got the *policy* under which
> deoptimization occurs.
>
> The distinction between the latter two is subtle and important.  The
> *mechanism* of exiting the callee and replacing it with an arbitrary
> alternate implementation could absolutely break the deopt semantics as
> you've pointed out.  The policy we actually use does not.
> Specifically, we've got the following restrictions:
> 1) We only replace callees with more general versions of themselves.
> Given we might be invalidating a speculative assumption, this could be
> a *much* more general version which includes actions and control flow
> that invalidate any attribute inference done over the callee.
> 2) We invalidate all callers of @foo which could have observed the
> incorrect inference.  (This is required to preserve correctness.)
>
> I think we probably need to separate out something to represent the
> interposition/replacement semantics implied by invalidation
> deoptimization.  In its most generic form, this would model the full
> generality of the mechanism and thus prevent nearly all inference.  We
> could then clearly express our *policy* as a restriction over that full
> generality.
>
> Another interesting case to consider:
>
> global *ptr
> declare @foo() readwrite
> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
> def @baz() {
>   v0 = 42;
>   while (C) {
>     call @bar() [ "deopt"(v0) ];
>     int v0 = *ptr
>   }
> }
>
> Could we end up deoptimizing with an incorrect deopt value for v0 based
> on circular logic?  We can infer that v0 is always 42 in this example.
> I claim that's legal precisely up to the point at which we deoptimize
> @bar and @baz together.  If we deoptimized @bar, let @baz run another
> loop iteration, then invalidated @baz, that would be incorrect.

Wait a sec… it’s legal for the frontend to do the interprocedural CSE,
but not LLVM.  The frontend can guarantee that multiple functions can
be deoptimized as a unit, but LLVM can’t make that assumption.  As far
as it knows, @guard_on will resume in the immediate caller.

- Andy