Sanjoy Das via llvm-dev
2016-Feb-23 06:26 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
Assuming everyone is on the same page, here's a rough high level agenda:


# step A: Introduce an `interposable` function attribute

We can bike shed on the name and the exact specification, but the
general idea is that you cannot do IPA / IPO over callsites calling
`interposable` functions without inlining them.  This attribute will
(usually) have to be used on function bodies that can deoptimize (e.g.
bodies that have a side exit / guard in them); but it also has more
general use cases.


# step B: Introduce a `side_exit` intrinsic

Specify an `@llvm.experimental.side_exit` intrinsic, polymorphic on the
return type:

 - Consumes a "deopt" continuation, and replaces the current physical
   stack frame with one or more interpreter frames (implementation
   provided by the runtime).
 - Calls to this intrinsic must be `musttail` (verifier will check this)
 - We'll have some minor logic in the inliner such that when inlining @f
   into @g in

     define i32 @f() {
       if (X) return side_exit() [ "deopt"(X) ];
       return i32 20;
     }

     define i64 @g() {
       if (Y) {
         r = f() [ "deopt"(Y) ];
         print(r);
       }
     }

   We get

     define i64 @g() {
       if (Y) {
         if (X) return side_exit() [ "deopt"(Y, X) ];
         print(20);
       }
     }

   and not

     define i64 @g() {
       if (Y) {
         r = X ? (side_exit() [ "deopt"(Y, X) ]) : 20;
         print(r);
       }
     }


# step C: Introduce a `guard_on` intrinsic

Will be based around what was discussed / is going to be discussed on
this thread.


(I think Philip was right in suggesting to split out a "step B" that
only introduces a `side_exit` intrinsic.  We *will* have to specify
them, since we'd like to optimize some after we've lowered guards into
explicit control flow, and for that we need a specification of side
exits.)


# aside: non-managed languages and guards

Chandler raised some points on IRC around making `guard_on` (and
possibly `side_exit`?) more generally applicable to unmanaged
languages; so we'd want to be careful to specify these in a way that
allows for implementations in unmanaged environments (by function
cloning, for instance).

-- Sanjoy
Sanjoy Das via llvm-dev
2016-Feb-23 06:31 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
I noticed this after sending, but the examples have some potential for
confusion -- the X in the deopt state has nothing specifically to do
with the X in the condition.

>
>     define i32 @f() {
>       if (X) return side_exit() [ "deopt"(X) ];
>       return i32 20;
>     }
>
>     define i64 @g() {
>       if (Y) {
>         r = f() [ "deopt"(Y) ];
>         print(r);
>       }
>
>   We get
>
>     define i64 @g() {
>       if (Y) {
>         if (X) return side_exit() [ "deopt"(Y, X) ];
>         print(20);
>       }
>     }
>
>   and not
>
>     define i64 @g() {
>       if (Y) {
>         r = X ? (side_exit() [ "deopt"(Y, X) ]) : 20;
>         print(r);
>       }

-- Sanjoy
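The deopt-state merge in the inlining example can be sketched outside of IR. The Python below is only an illustrative model, not LLVM code: `SideExitCall` and `inline_side_exit` are hypothetical names, the state entries are treated as opaque labels (they are unrelated to the branch conditions), and the caller-first ordering is read off the `"deopt"(Y, X)` result in the example rather than taken from any specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SideExitCall:
    """Hypothetical stand-in for a side_exit() call and its "deopt" operands."""
    deopt: Tuple[str, ...]


def inline_side_exit(call_site_deopt: Tuple[str, ...],
                     callee_exit: SideExitCall) -> SideExitCall:
    """Model inlining a callee's side exit through a call site.

    Assumption: the caller's deopt state is prepended to the callee's,
    matching "deopt"(Y) + "deopt"(X) -> "deopt"(Y, X) in the example.
    """
    return SideExitCall(deopt=tuple(call_site_deopt) + callee_exit.deopt)


# side_exit() [ "deopt"(X) ] inside @f
exit_in_f = SideExitCall(deopt=("X",))

# f() [ "deopt"(Y) ] is the call site in @g that gets inlined
exit_in_g = inline_side_exit(("Y",), exit_in_f)

print(exit_in_g.deopt)  # ('Y', 'X')
```

The model covers only the operand merge. The other half of the example, that the inlined side exit must return from @g directly instead of feeding `print(r)`, is a control-flow property and is not captured here.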
Andrew Trick via llvm-dev
2016-Feb-23 07:05 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
> On Feb 22, 2016, at 10:26 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> Assuming everyone is on the same page, here's a rough high level agenda:
>
>
> # step A: Introduce an `interposable` function attribute
>
> We can bike shed on the name and the exact specification, but the
> general idea is that you cannot do IPA / IPO over callsites calling
> `interposable` functions without inlining them.  This attribute will
> (usually) have to be used on function bodies that can deoptimize (e.g. has a
> side exit / guard it in); but also has more general use cases.

+1

> # step B: Introduce a `side_exit` intrinsic
>
> Specify an `@llvm.experimental.side_exit` intrinsic, polymorphic on the
> return type:

I didn’t know intrinsics could be polymorphic on the return type.

> - Consumes a "deopt" continuation, and replaces the current physical
>   stack frame with one or more interpreter frames (implementation
>   provided by the runtime).
> - Calls to this intrinsic must be `musttail` (verifier will check this)
> - We'll have some minor logic in the inliner such that when inlining @f into @g
>   in
>
>     define i32 @f() {
>       if (X) return side_exit() [ "deopt"(X) ];
>       return i32 20;
>     }
>
>     define i64 @g() {
>       if (Y) {
>         r = f() [ "deopt"(Y) ];
>         print(r);
>       }
>
>   We get
>
>     define i64 @g() {
>       if (Y) {
>         if (X) return side_exit() [ "deopt"(Y, X) ];
>         print(20);
>       }
>     }
>
>   and not
>
>     define i64 @g() {
>       if (Y) {
>         r = X ? (side_exit() [ "deopt"(Y, X) ]) : 20;
>         print(r);
>       }

I understand why you’re doing this: explicitly model the
resume-at-return path. But…

- It’s a bit awkward vs. side_exit(); unreachable, as evidenced by
  inlining.

- It would be nice to be able to model frequent OSR points as
  branch-to-unreachable because it may lead to better optimization,
  codegen, and compile time. I don’t think those are really fundamental
  problems though aside from adding a large number of return block
  users, but it may be work to find all of the small performance issues.

- Do you think this will make sense for all return argument
  conventions, including sret?

(I actually think this is a great approach, I’m just playing Devil’s
advocate here.)

> # step C: Introduce a `guard_on` intrinsic
>
> Will be based around what was discussed / is going to be discussed on
> this thread.
>
>
> (I think Philip was right in suggesting to split out a "step B" that
> only introduces a `side_exit` intrinsic.  We *will* have to specify
> them, since we'd like to optimize some after we've lowered guards into
> explicit control flow, and for that we need a specification of side
> exits.)

+1

-Andy

>
>
> # aside: non-managed languages and guards
>
> Chandler raised some points on IRC around making `guard_on` (and
> possibly `side_exit`?) more generally applicable to unmanaged
> languages; so we'd want to be careful to specify these in a way that
> allows for implementations in an unmanaged environments (by function
> cloning, for instance).
>
> -- Sanjoy
Chandler Carruth via llvm-dev
2016-Feb-23 07:18 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
I've not had time to really dig into all of this thread, but I wanted to
point out:

On Mon, Feb 22, 2016 at 10:27 PM Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Assuming everyone is on the same page, here's a rough high level agenda:
>
>
> # step A: Introduce an `interposable` function attribute
>
> We can bike shed on the name and the exact specification, but the
> general idea is that you cannot do IPA / IPO over callsites calling
> `interposable` functions without inlining them.  This attribute will
> (usually) have to be used on function bodies that can deoptimize (e.g. has
> a
> side exit / guard it in); but also has more general use cases.
>

Note that we already have this *exact* concept in the IR via linkage for
better or worse. I think it is really confusing as you are currently
describing it because it seems deeply overlapping with linkage, which is
where the whole interposition thing comes from, and yet you never mention
how it interacts with linkage at all. What does it mean to have a common
linkage function that lacks the interposable attribute? Or a LinkOnceODR
function that does have that attribute?

If the goal is to factor replaceability out of linkage, we should actually
factor it out rather than adding yet one more way to talk about this.

And generally, we need to be *really* careful adding function attributes.
Look at the challenges we had figuring out norecurse. Adding attributes
needs to be viewed as nearly as high cost as adding instructions,
substantially higher cost than intrinsics.

>
>
> # step B: Introduce a `side_exit` intrinsic
>
> Specify an `@llvm.experimental.side_exit` intrinsic, polymorphic on the
> return type:
>
>  - Consumes a "deopt" continuation, and replaces the current physical
>    stack frame with one or more interpreter frames (implementation
>    provided by the runtime).
>

I think it would be really helpful to work to describe these things in terms
of semantic contracts on the IR rather than in terms of implementation
strategies. For example, not all IR interacts with an interpreter, and so I
don't think we should use the term "interpreter" to specify the semantic
model exposed by the IR.

>  - Calls to this intrinsic must be `musttail` (verifier will check this)
>  - We'll have some minor logic in the inliner such that when inlining @f
> into @g
>    in
>
>      define i32 @f() {
>        if (X) return side_exit() [ "deopt"(X) ];
>        return i32 20;
>      }
>
>      define i64 @g() {
>        if (Y) {
>          r = f() [ "deopt"(Y) ];
>          print(r);
>        }
>
>    We get
>
>      define i64 @g() {
>        if (Y) {
>          if (X) return side_exit() [ "deopt"(Y, X) ];
>          print(20);
>        }
>      }
>
>    and not
>
>      define i64 @g() {
>        if (Y) {
>          r = X ? (side_exit() [ "deopt"(Y, X) ]) : 20;
>          print(r);
>        }
>
>
> # step C: Introduce a `guard_on` intrinsic
>
> Will be based around what was discussed / is going to be discussed on
> this thread.
>
>
> (I think Philip was right in suggesting to split out a "step B" that
> only introduces a `side_exit` intrinsic.  We *will* have to specify
> them, since we'd like to optimize some after we've lowered guards into
> explicit control flow, and for that we need a specification of side
> exits.)
>
>
> # aside: non-managed languages and guards
>
> Chandler raised some points on IRC around making `guard_on` (and
> possibly `side_exit`?) more generally applicable to unmanaged
> languages; so we'd want to be careful to specify these in a way that
> allows for implementations in an unmanaged environments (by function
> cloning, for instance).
>
> -- Sanjoy
Sanjoy Das via llvm-dev
2016-Feb-23 16:51 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On Mon, Feb 22, 2016 at 11:05 PM, Andrew Trick <atrick at apple.com> wrote:
>
> I didn’t know intrinsics could be polymorphic on the return type.

@llvm.experimental.gc.result is polymorphic on its return type, for
instance.

> I understand why you’re doing this: explicitly model the
> resume-at-return path. But…
>
> - It’s a bit awkward vs. side_exit(); unreachable, as evidenced by
>   inlining.

Part of the reason why we thought this scheme,
return-result-of-side-exit, was better than the
side-exit-then-unreachable scheme is that the former is more honest
about data flow; and that would prevent some nastiness around IPA.
But given that we're talking about introducing an `interposable`
attribute that prevents IPA, the side-exit-then-unreachable approach
sounds feasible now.

I need to see if there are other reasons for keeping the
return-result-of-side-exit variant; if not, I'll use the
side-exit-then-unreachable scheme.

> - It would be nice to be able to model frequent OSR points as
>   branch-to-unreachable because it may lead to better optimization,
>   codegen, and compile time.

Agreed.

> I don’t think those are really fundamental
> problems though aside from adding a large number of return block
> users

Return block users?  Does LLVM coalesce all `ret` instructions to a
single `ret PHI`?  I couldn't reproduce this in a small example IR.

> but it may be work to find all of the small performance issues.
>
> - Do you think this will make sense for all return argument
>   conventions, including sret?

I'm not very familiar with sret, but skimming the docs I don't see why
not.  But generally, the frontend will have to know to generate
@side_exits that are legal.

-- Sanjoy
Sanjoy Das via llvm-dev
2016-Feb-23 17:32 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
On Mon, Feb 22, 2016 at 11:18 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
>> # step A: Introduce an `interposable` function attribute
>>
>> We can bike shed on the name and the exact specification, but the
>> general idea is that you cannot do IPA / IPO over callsites calling
>> `interposable` functions without inlining them.  This attribute will
>> (usually) have to be used on function bodies that can deoptimize (e.g. has
>> a
>> side exit / guard it in); but also has more general use cases.
>
>
> Note that we already have this *exact* concept in the IR via linkage for
> better or worse. I think it is really confusing as you are currently

I was going to have a more detailed discussion on this in the (yet to
be started) review thread for `interposable`: we'd like to be able to
inline `interposable` functions.  The "interposition" can only happen
in physical function boundaries, so opt is allowed to do as much
IPA/IPO as it wants once it makes the physical function boundary go
away via inlining.  None of the linkage types seem to have this
property.

Part of the challenge here is to specify the attribute in a way that
allows inlining, but not IPA without inlining.  In fact, maybe it is
best to not call it "interposable" at all?

Actually, I think one of the problems we're trying to solve with
`interposable` is applicable to the available_externally linkage as
well.  Say we have

```
void foo() available_externally {
  %t0 = load atomic %ptr
  %t1 = load atomic %ptr
  if (%t0 != %t1) print("X");
}
void main() {
  foo();
  print("Y");
}
```

Now the possible behaviors of the above program are {print("X"),
print("Y")} or {print("Y")}.  But if we run opt then we have

```
void foo() available_externally readnone nounwind {
  ;; After CSE'ing the two loads and folding the condition
}
void main() {
  foo();
  print("Y");
}
```

and some generic reordering

```
void foo() available_externally readnone nounwind {
  ;; After CSE'ing the two loads and folding the condition
}
void main() {
  print("Y");
  foo();  // legal since we're moving a readnone nounwind function that
          // was guaranteed to execute (hence can't have UB)
}
```

Now if we do not inline @foo(), and instead re-link the call site in
@main to some non-optimized copy (or differently optimized copy) of
foo, then it is possible for the program to have the behavior
{print("Y"); print("X")}, which was disallowed in the earlier program.

In other words, opt refined the semantics of @foo() (i.e. reduced the
set of behaviors it may have) in ways that would make later
optimizations invalid if we de-refine the implementation of @foo().

Given this, I'd say we don't need a new attribute / linkage type, and
can add our restriction to the available_externally linkage.

> describing it because it seems deeply overlapping with linkage, which is
> where the whole interposition thing comes from, and yet you never mention
> how it interacts with linkage at all. What does it mean to have a common
> linkage function that lacks the interposable attribute? Or a LinkOnceODR
> function that does have that attribute?

What would you say about adding this as a new kind of linkage?  I was
trying to avoid doing that since the intended semantics of
GlobalValue::InterposableLinkage don't just describe what a linker
does, but also restrict what can be legally linked in (for the
can-inline-but-can't-IPA property to hold), but perhaps that's the
best way forward?

[Edit: I wrote this section before I wrote the available_externally
thing above.]

> If the goal is to factor replaceability out of linkage, we should actually
> factor it out rather than adding yet one more way to talk about this.
>
> And generally, we need to be *really* careful adding function attributes.
> Look at the challenges we had figuring out norecurse. Adding attributes
> needs to be viewed as nearly as high cost as adding instructions,
> substantially higher cost than intrinsics.

Only indirectly relevant to this discussion, but this is news to me --
my mental cost model was "attributes are easy to add and maintain", so
I didn't think too hard about alternatives.

> I think it would be really helpful to work to describe these things in terms
> of semantic contracts on the IR rather than in terms of implementation
> strategies. For example, not all IR interacts with an interpreter, and so I
> don't think we should use the term "interpreter" to specify the semantic
> model exposed by the IR.

That's what I was getting at by:

>> Chandler raised some points on IRC around making `guard_on` (and
>> possibly `side_exit`?) more generally applicable to unmanaged
>> languages; so we'd want to be careful to specify these in a way that
>> allows for implementations in an unmanaged environments (by function
>> cloning, for instance).

-- Sanjoy
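The available_externally argument in this message can be checked with a small behavior-set model. The Python below is a rough sketch under stated assumptions, not LLVM code: all function names are hypothetical, the two atomic loads are modeled as a pair of values that either agree or disagree (a racing store in between), and a "behavior" is the ordered tuple of strings printed.

```python
def foo_original(loads, out):
    # Unoptimized @foo: two atomic loads that may observe different values.
    t0, t1 = loads
    if t0 != t1:
        out.append("X")


def foo_refined(loads, out):
    # @foo after CSE'ing the loads and folding the condition: a no-op.
    pass


def main_original(foo, loads):
    out = []
    foo(loads, out)
    out.append("Y")
    return tuple(out)


def main_reordered(foo, loads):
    # @main after opt moved the (now readnone nounwind) call past print("Y").
    out = []
    out.append("Y")
    foo(loads, out)
    return tuple(out)


def behaviors(main, foo):
    # Assumption: only two load outcomes matter, equal or unequal.
    return {main(foo, loads) for loads in [(0, 0), (0, 1)]}


allowed = behaviors(main_original, foo_original)    # source program
optimized = behaviors(main_reordered, foo_refined)  # after opt, same @foo body
relinked = behaviors(main_reordered, foo_original)  # call site re-linked

print(optimized <= allowed)        # True
print(sorted(relinked - allowed))  # [('Y', 'X')]
```

In this model the optimized program is a refinement of the source program, but pairing the reordered @main with the unoptimized @foo produces the ordering ("Y", "X"), which the source program never allowed. That is the de-refinement hazard described above.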