Sanjoy Das via llvm-dev
2016-Feb-18 05:59 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On Wed, Feb 17, 2016 at 8:53 PM, Philip Reames
<listmail at philipreames.com> wrote:
> I think you're jumping ahead a bit here. I'm not sure the semantics are
> anywhere near as weird as you're framing them to be. :)

I now think this weirdness actually does not have anything to do with
guard_on or bail_to_interpreter; it has to do with deopt bundles
themselves.  Our notion of "deopt bundles are readonly" is broken to
begin with, and that is what's manifesting as the complication we're
seeing here.

Consider something like

```
declare @foo() readonly
def @bar() { call @foo() [ "deopt"(XXX) ] }
def @baz() { call @bar() [ "deopt"(YYY) ] }
```

Right now, according to the semantics of "deopt" operand bundles as in
the LangRef, every call site above is readonly.  However, it is
possible for @baz() to write to memory if @bar is deoptimized at the
call site with the call to @foo.

You could say that it isn't legal to mark @foo as readonly, since the
action of deoptimizing one's caller is not a readonly operation.  But
that doesn't work in cases like this:

```
global *ptr
declare @foo() readwrite
def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
```

Naively, it looks like an inter-proc CSE can forward 42 to v0, but
that's unsound, since @bar could get deoptimized at the call to
@foo(), and then who knows what'll get written to *ptr.

My interpretation here is that we're not modeling the deopt
continuations correctly.  Above, the XXX continuation is a delimited
continuation that terminates at the boundary of @bar, and seen from
its caller, the memory effect (and any other effect) of @bar has to
take into account that the "remainder" of @bar() after @foo has
returned is either what it can see in the IR, or the XXX continuation
(which it //could// analyze in theory, but in practice is unlikely
to).

This is kind of a bummer, since what I said above directly contradicts
the "As long as the behavior of an operand bundle is describable
within these restrictions, LLVM does not need to have special
knowledge of the operand bundle to not miscompile programs containing
it." bit in the LangRef. :(

> Essentially, we'd be introducing an aliasing rule along the following:
> "reads nothing on normal path, reads/writes world if guard is taken (in
> which case, does not return)."  Yes, implementing that will be a bit
> complicated, but I don't see this as a fundamental issue.

Yup, and this is a property of deopt operand bundles, not just guards.

>> How is it more general?
>
> You can express a guard as a conditional branch to a @bail_to_interpreter
> construct.  Without the @bail_to_interpreter (which is the thing which has
> those weird aliasing properties we're talking about), you're stuck.

I thought earlier you were suggesting bail_to_interpreter is more
general than side_exit (when I thought they were one and the same
thing), not that bail_to_interpreter is more general than guard.

Aside: theoretically, if you have @guard() as a primitive then
bail_to_interpreter is just @guard(false).

-- Sanjoy
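A minimal IR-flavoured sketch of that aside, assuming hypothetical
@guard and @bail_to_interpreter declarations (the names, signatures,
and the empty "deopt" state are illustrative only, not the RFC's final
spelling):

```
declare void @guard(i1)               ; hypothetical: deoptimize unless the argument is true
declare void @bail_to_interpreter()   ; hypothetical: unconditional deoptimization, never returns

define void @with_guard(i1 %cond) {
entry:
  ; Guard form: leave compiled code iff %cond is false.
  call void @guard(i1 %cond) [ "deopt"() ]
  ret void
}

define void @with_branch(i1 %cond) {
entry:
  ; Expanded form: an explicit branch to an unconditional bail-out.
  ; Conversely, @bail_to_interpreter() is just @guard(i1 false).
  br i1 %cond, label %ok, label %deopt

deopt:
  call void @bail_to_interpreter() [ "deopt"() ]
  unreachable

ok:
  ret void
}
```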
Sanjoy Das via llvm-dev
2016-Feb-18 06:27 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
Some minor additions to what I said earlier:

On Wed, Feb 17, 2016 at 9:59 PM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:
> My interpretation here is that we're not modeling the deopt
> continuations correctly.  Above, the XXX continuation is a delimited
> continuation that terminates at the boundary of @bar, and seen from
> its caller, the memory effect (and any other effect) of @bar has to
> take into account that the "remainder" of @bar() after @foo has
> returned is either what it can see in the IR, or the XXX continuation
> (which it //could// analyze in theory, but in practice is unlikely
> to).

A related question is: down below, we're clearly allowed to forward 42
into v0 after inlining through the call to @bar (while before inlining
we weren't, as shown earlier) -- what changed?

```
global *ptr
declare @foo() readwrite
def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }

inlining ==>

global *ptr
declare @foo() readwrite
def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
def @baz() { call @foo() [ "deopt"(YYY XXX) ]; *ptr = 42; int v0 = *ptr }
```

What changed is that inlining composed the normal (non-deopt)
continuation in @baz() with the non-deopt continuation in @bar (and the
deopt continuation with the deopt continuation).  Thus the non-deopt
continuation in @baz no longer sees a merge of the final states of the
deopt and non-deopt continuations in @bar and can thus be less
pessimistic.

> This is kind of a bummer since what I said above directly contradicts
> the "As long as the behavior of an operand bundle is describable
> within these restrictions, LLVM does not need to have special
> knowledge of the operand bundle to not miscompile programs containing
> it." bit in the LangRef. :(

Now that I think about it, this isn't too bad.  This means "deopt"
operand bundles will need some special handling in IPO passes, but they
get that anyway in the inliner.

-- Sanjoy
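The post-inlining @baz, spelled out in (roughly) real IR, assuming for
illustration that XXX and YYY are simple i32 deopt values:

```
@ptr = global i32 0

declare void @foo()   ; assumed readwrite, as in the example above

define i32 @baz() {
entry:
  ; The caller's deopt state (YYY) is composed with the callee's (XXX)
  ; on the inlined call to @foo; the i32 constants stand in for them.
  call void @foo() [ "deopt"(i32 1, i32 2) ]
  ; The store and the load are now in the same non-deopt continuation,
  ; so forwarding 42 to %v0 is sound here, unlike before inlining.
  store i32 42, i32* @ptr
  %v0 = load i32, i32* @ptr
  ret i32 %v0
}
```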
Sanjoy Das via llvm-dev
2016-Feb-18 18:58 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
So, to summarize, the action item here is (let me know if you
disagree): I need to go and fix the semantics of deopt operand bundles
around IPO, and once that is done, the weirdness around guards being
readonly only in their immediate callers will no longer be an issue.

-- Sanjoy

On Wed, Feb 17, 2016 at 10:27 PM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:
> Some minor additions to what I said earlier:
>
> On Wed, Feb 17, 2016 at 9:59 PM, Sanjoy Das
> <sanjoy at playingwithpointers.com> wrote:
>> My interpretation here is that we're not modeling the deopt
>> continuations correctly.  Above, the XXX continuation is a delimited
>> continuation that terminates at the boundary of @bar, and seen from
>> its caller, the memory effect (and any other effect) of @bar has to
>> take into account that the "remainder" of @bar() after @foo has
>> returned is either what it can see in the IR, or the XXX continuation
>> (which it //could// analyze in theory, but in practice is unlikely
>> to).
>
> A related question is: down below, we're clearly allowed to forward 42
> into v0 after inlining through the call to @bar (while before inlining
> we weren't, as shown earlier) -- what changed?
>
> ```
> global *ptr
> declare @foo() readwrite
> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
> def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
>
> inlining ==>
>
> global *ptr
> declare @foo() readwrite
> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
> def @baz() { call @foo() [ "deopt"(YYY XXX) ]; *ptr = 42; int v0 = *ptr }
> ```
>
> What changed is that inlining composed the normal (non-deopt)
> continuation in @baz() with the non-deopt continuation in @bar (and
> the deopt continuation with the deopt continuation).  Thus the
> non-deopt continuation in @baz no longer sees a merge of the final
> states of the deopt and non-deopt continuations in @bar and can thus
> be less pessimistic.
>
>> This is kind of a bummer since what I said above directly contradicts
>> the "As long as the behavior of an operand bundle is describable
>> within these restrictions, LLVM does not need to have special
>> knowledge of the operand bundle to not miscompile programs containing
>> it." bit in the LangRef. :(
>
> Now that I think about it, this isn't too bad.  This means "deopt"
> operand bundles will need some special handling in IPO passes, but
> they get that anyway in the inliner.
>
> -- Sanjoy

--
Sanjoy Das
http://playingwithpointers.com
Philip Reames via llvm-dev
2016-Feb-22 19:03 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On 02/17/2016 09:59 PM, Sanjoy Das wrote:
> On Wed, Feb 17, 2016 at 8:53 PM, Philip Reames
> <listmail at philipreames.com> wrote:
>
>> I think you're jumping ahead a bit here. I'm not sure the semantics are
>> anywhere near as weird as you're framing them to be. :)
> I now think this weirdness actually does not have anything to do with
> guard_on or bail_to_interpreter; it has to do with deopt bundles
> themselves.  Our notion of "deopt bundles are readonly" is broken to
> begin with, and that is what's manifesting as the complication we're
> seeing here.
>
> Consider something like
>
> ```
> declare @foo() readonly
> def @bar() { call @foo() [ "deopt"(XXX) ] }
> def @baz() { call @bar() [ "deopt"(YYY) ] }
> ```
>
> Right now, according to the semantics of "deopt" operand bundles as in
> the LangRef, every call site above is readonly.  However, it is
> possible for @baz() to write to memory if @bar is deoptimized at the
> call site with the call to @foo.
>
> You could say that it isn't legal to mark @foo as readonly, since the
> action of deoptimizing one's caller is not a readonly operation.  But
> that doesn't work in cases like this:
>
> ```
> global *ptr
> declare @foo() readwrite
> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
> def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
> ```
>
> Naively, it looks like an inter-proc CSE can forward 42 to v0, but
> that's unsound, since @bar could get deoptimized at the call to
> @foo(), and then who knows what'll get written to *ptr.

Ok, I think this example does a good job of getting at the root issue.
You claim this is not legal, I claim it is. :)  Specifically, because
the use of the inferred information will never be executed in baz.
(see below)

Specifically, I think the problem here is that we're mixing a couple of
notions.  First, we've got the state required for the deoptimization to
occur (i.e. deopt information).  Second, we've got the actual
deoptimization mechanism.  Third, we've got the *policy* under which
deoptimization occurs.

The distinction between the latter two is subtle and important.  The
*mechanism* of exiting the callee and replacing it with an arbitrary
alternate implementation could absolutely break the deopt semantics as
you've pointed out.  The policy we actually use does not.
Specifically, we've got the following restrictions:
1) We only replace callees with more general versions of themselves.
Given we might be invalidating a speculative assumption, this could be
a *much* more general version which includes actions and control flow
that invalidate any attribute inference done over the callee.
2) We invalidate all callers of @foo which could have observed the
incorrect inference.  (This is required to preserve correctness.)

I think we probably need to separate out something to represent the
interposition/replacement semantics implied by invalidation
deoptimization.  In its most generic form, this would model the full
generality of the mechanism and thus prevent nearly all inference.  We
could then clearly express our *policy* as a restriction over that full
generality.

Another interesting case to consider:

global *ptr
declare @foo() readwrite
def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
def @baz() {
  v0 = 42;
  while (C) {
    call @bar() [ "deopt"(v0) ];
    int v0 = *ptr
  }
}

Could we end up deoptimizing with an incorrect deopt value for v0 based
on circular logic?  We can infer that v0 is always 42 in this example.
I claim that's legal precisely up to the point at which we deoptimize
@bar and @baz together.  If we deoptimized @bar, let @baz run another
loop iteration, then invalidated @baz, that would be incorrect.

> My interpretation here is that we're not modeling the deopt
> continuations correctly.  Above, the XXX continuation is a delimited
> continuation that terminates at the boundary of @bar, and seen from
> its caller, the memory effect (and any other effect) of @bar has to
> take into account that the "remainder" of @bar() after @foo has
> returned is either what it can see in the IR, or the XXX continuation
> (which it //could// analyze in theory, but in practice is unlikely
> to).
>
> This is kind of a bummer since what I said above directly contradicts
> the "As long as the behavior of an operand bundle is describable
> within these restrictions, LLVM does not need to have special
> knowledge of the operand bundle to not miscompile programs containing
> it." bit in the LangRef. :(

Per above, I think we're fine for invalidation deoptimization.

For side exits, the runtime function called can never be marked
readonly (or just about any other restricted semantics) precisely
because it can execute an arbitrary continuation.  In principle, we
could do bytecode inference to establish restricted semantics per call
site.

Philip
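For concreteness, the loop example above in (approximate) IR, with
illustrative names and a plain i32 deopt state; the circularity is that
the deopt value handed to @bar is the very value whose constancy (42)
was inferred from @bar's own store:

```
@ptr = global i32 0

declare void @bar()   ; as in the example: calls @foo [ "deopt"(XXX) ], then stores 42 to @ptr

define void @baz(i1 %C) {
entry:
  br label %header

header:
  ; Inference can conclude %v0 is always 42, since @bar always stores 42.
  %v0 = phi i32 [ 42, %entry ], [ %reload, %body ]
  br i1 %C, label %body, label %exit

body:
  call void @bar() [ "deopt"(i32 %v0) ]
  %reload = load i32, i32* @ptr
  br label %header

exit:
  ret void
}
```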
Sanjoy Das via llvm-dev
2016-Feb-22 20:15 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On Mon, Feb 22, 2016 at 11:03 AM, Philip Reames
<listmail at philipreames.com> wrote:
>> ```
>> global *ptr
>> declare @foo() readwrite
>> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
>> def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
>> ```
>>
>> Naively, it looks like an inter-proc CSE can forward 42 to v0, but
>> that's unsound, since @bar could get deoptimized at the call to
>> @foo(), and then who knows what'll get written to *ptr.
>
> Ok, I think this example does a good job of getting at the root issue.
> You claim this is not legal, I claim it is. :)  Specifically, because
> the use of the inferred information will never be executed in baz.
> (see below)
>
> Specifically, I think the problem here is that we're mixing a couple of
> notions.  First, we've got the state required for the deoptimization to
> occur (i.e. deopt information).  Second, we've got the actual
> deoptimization mechanism.  Third, we've got the *policy* under which
> deoptimization occurs.
>
> The distinction between the latter two is subtle and important.  The
> *mechanism* of exiting the callee and replacing it with an arbitrary
> alternate implementation could absolutely break the deopt semantics as
> you've pointed out.  The policy we actually use does not.
> Specifically, we've got the following restrictions:
> 1) We only replace callees with more general versions of themselves.
> Given we might be invalidating a speculative assumption, this could be
> a *much* more general version which includes actions and control flow
> that invalidate any attribute inference done over the callee.
> 2) We invalidate all callers of @foo which could have observed the
> incorrect inference.  (This is required to preserve correctness.)

Yes.  I think I was too dramatic when I claimed that the deoptimization
model in LLVM is "wrong" -- the real story is more on the lines of
"frontend authors need to be aware of some subtleties".

> I think we probably need to separate out something to represent the
> interposition/replacement semantics implied by invalidation
> deoptimization.  In its most generic form, this would model the full
> generality of the mechanism and thus prevent nearly all inference.  We
> could then clearly express our *policy* as a restriction over that full
> generality.

Yes.  LLVM already has a "mayBeOverridden" flag; we should just add a
function attribute, `interposable`, that makes `mayBeOverridden` return
true.

> Per above, I think we're fine for invalidation deoptimization.
>
> For side exits, the runtime function called can never be marked
> readonly (or just about any other restricted semantics) precisely
> because it can execute an arbitrary continuation.

The problem is a little easier with side exits, since with side exits,
we will either have them at the tail position, or have them followed by
an unreachable (so having them as read/write/may-unwind is not a
problem).

With guards, we have to solve a harder problem -- we don't want to mark
the guard as "can read and write all memory", since we'd like to
forward `val` to `val1` in the example below:

  int val = *ptr;
  guard_on(arbitrary condition)
  int val1 = *ptr;

But, as discussed earlier, we're probably okay if we mark guard_on as
read/write and use alias analysis to sneakily make it "practically
readonly".

-- Sanjoy
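The same pattern, sketched as IR with a hypothetical @guard_on
declaration (the intrinsic's actual name and signature in the RFC may
differ); even if the call is conservatively read/write, the goal is for
alias analysis to let the second load be forwarded from the first:

```
@ptr = global i32 0

declare void @guard_on(i1)   ; hypothetical; conservatively reads and writes memory

define i32 @f(i1 %cond) {
entry:
  %val = load i32, i32* @ptr
  call void @guard_on(i1 %cond) [ "deopt"() ]
  ; With a plain readwrite guard this load cannot be CSE'd with the one
  ; above; teaching AA that the guard only touches its deopt state makes
  ; it "practically readonly" and allows forwarding %val to %val1.
  %val1 = load i32, i32* @ptr
  %sum = add i32 %val, %val1
  ret i32 %sum
}
```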
Andrew Trick via llvm-dev
2016-Feb-23 05:43 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
> On Feb 22, 2016, at 11:03 AM, Philip Reames <listmail at philipreames.com> wrote:
>
>> ```
>> global *ptr
>> declare @foo() readwrite
>> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
>> def @baz() { call @bar() [ "deopt"(YYY) ]; int v0 = *ptr }
>> ```
>>
>> Naively, it looks like an inter-proc CSE can forward 42 to v0, but
>> that's unsound, since @bar could get deoptimized at the call to
>> @foo(), and then who knows what'll get written to *ptr.
> Ok, I think this example does a good job of getting at the root issue.
> You claim this is not legal, I claim it is. :)  Specifically, because
> the use of the inferred information will never be executed in baz.
> (see below)
>
> Specifically, I think the problem here is that we're mixing a couple of
> notions.  First, we've got the state required for the deoptimization to
> occur (i.e. deopt information).  Second, we've got the actual
> deoptimization mechanism.  Third, we've got the *policy* under which
> deoptimization occurs.
>
> The distinction between the latter two is subtle and important.  The
> *mechanism* of exiting the callee and replacing it with an arbitrary
> alternate implementation could absolutely break the deopt semantics as
> you've pointed out.  The policy we actually use does not.
> Specifically, we've got the following restrictions:
> 1) We only replace callees with more general versions of themselves.
> Given we might be invalidating a speculative assumption, this could be
> a *much* more general version which includes actions and control flow
> that invalidate any attribute inference done over the callee.
> 2) We invalidate all callers of @foo which could have observed the
> incorrect inference.  (This is required to preserve correctness.)
>
> I think we probably need to separate out something to represent the
> interposition/replacement semantics implied by invalidation
> deoptimization.  In its most generic form, this would model the full
> generality of the mechanism and thus prevent nearly all inference.  We
> could then clearly express our *policy* as a restriction over that full
> generality.
>
> Another interesting case to consider:
>
> global *ptr
> declare @foo() readwrite
> def @bar() { call @foo() [ "deopt"(XXX) ]; *ptr = 42 }
> def @baz() {
>   v0 = 42;
>   while (C) {
>     call @bar() [ "deopt"(v0) ];
>     int v0 = *ptr
>   }
> }
>
> Could we end up deoptimizing with an incorrect deopt value for v0 based
> on circular logic?  We can infer that v0 is always 42 in this example.
> I claim that's legal precisely up to the point at which we deoptimize
> @bar and @baz together.  If we deoptimized @bar, let @baz run another
> loop iteration, then invalidated @baz, that would be incorrect.

Wait a sec… it’s legal for the frontend to do the interprocedural CSE,
but not LLVM.  The frontend can guarantee that multiple functions can
be deoptimized as a unit, but LLVM can’t make that assumption.  As far
as it knows, @guard_on will resume in the immediate caller.

- Andy