thr3ads.net - llvm dev - [llvm-dev] RFC: Add guard intrinsics to LLVM [Feb 2016]

Sanjoy Das via llvm-dev

2016-Feb-23 06:26 UTC

[llvm-dev] RFC: Add guard intrinsics to LLVM

Assuming everyone is on the same page, here's a rough high level agenda:


# step A: Introduce an `interposable` function attribute

We can bikeshed on the name and the exact specification, but the
general idea is that you cannot do IPA / IPO over callsites calling
`interposable` functions without inlining them.  This attribute will
(usually) have to be used on functions whose bodies can deoptimize
(e.g. they have a side exit / guard in them); but it also has more
general use cases.


# step B: Introduce a `side_exit` intrinsic

Specify an `@llvm.experimental.side_exit` intrinsic, polymorphic on the
return type:

 - Consumes a "deopt" continuation, and replaces the current physical
   stack frame with one or more interpreter frames (implementation
   provided by the runtime).
 - Calls to this intrinsic must be `musttail` (verifier will check this)
 - We'll have some minor logic in the inliner such that when inlining
   @f into @g in

     define i32 @f() {
       if (X) return side_exit() [ "deopt"(X) ];
       return i32 20;
     }

     define i64 @g() {
       if (Y) {
         r = f() [ "deopt"(Y) ];
         print(r);
       }
     }

   We get

     define i64 @g() {
       if (Y) {
         if (X) return side_exit() [ "deopt"(Y, X) ];
         print(20);
       }
     }

   and not

     define i64 @g() {
       if (Y) {
         r = X ? (side_exit() [ "deopt"(Y, X) ]) : 20;
         print(r);
       }
     }
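
To make the difference between the two inlined forms concrete, here is a small Python model (not LLVM code). It assumes, following the description above, that a side exit hands every frame named in the deopt state to the runtime, which finishes them and produces the result of the outermost one. The names `side_exit`, `g_inlined_ok`, `g_inlined_bad` and the trace tuples are hypothetical and exist only for this illustration.

```python
def side_exit(deopt_state, trace):
    # Hypothetical runtime hook. Assumption: the runtime resumes every frame
    # named in deopt_state, runs them to completion (including any print that
    # was still pending in them) and returns the outermost frame's result.
    trace.append(("runtime finishes frames", deopt_state))
    return 0  # placeholder for @g's result


def g_inlined_ok(x, y, trace):
    # First form: the side exit is returned directly, so compiled code is done.
    if y:
        if x:
            return side_exit(("Y", "X"), trace)
        trace.append(("compiled print", 20))
    return 0


def g_inlined_bad(x, y, trace):
    # Second form: the side exit result is treated as @f's return value and
    # compiled code keeps running after the runtime already finished @g.
    if y:
        r = side_exit(("Y", "X"), trace) if x else 20
        trace.append(("compiled print", r))
    return 0


ok_trace, bad_trace = [], []
g_inlined_ok(True, True, ok_trace)
g_inlined_bad(True, True, bad_trace)
```

In this model the second form lets compiled code run its own print after the runtime has already completed @g, which is one way to read why the call has to stay in tail position after inlining.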


# step C: Introduce a `guard_on` intrinsic

Will be based around what was discussed / is going to be discussed on
this thread.


(I think Philip was right in suggesting to split out a "step B" that
only introduces a `side_exit` intrinsic.  We *will* have to specify
side exits, since we'd like to keep optimizing after we've lowered
guards into explicit control flow, and for that we need a
specification of side exits.)


# aside: non-managed languages and guards

Chandler raised some points on IRC around making `guard_on` (and
possibly `side_exit`?) more generally applicable to unmanaged
languages; so we'd want to be careful to specify these in a way that
allows for implementations in unmanaged environments (by function
cloning, for instance).

-- Sanjoy

Sanjoy Das via llvm-dev

2016-Feb-23 06:31 UTC

[llvm-dev] RFC: Add guard intrinsics to LLVM

I noticed this after sending, but the examples have some potential for
confusion -- the X in the deopt state has nothing specifically to do
with the X in the condition.
>
>      define i32 @f() {
>        if (X) return side_exit() [ "deopt"(X) ];
>        return i32 20;
>      }
>
>      define i64 @g() {
>        if (Y) {
>          r = f() [ "deopt"(Y) ];
>          print(r);
>      }
>
>    We get
>
>      define i64 @g() {
>        if (Y) {
>          if (X) return side_exit() [ "deopt"(Y, X) ];
>          print(20);
>        }
>      }
>
>    and not
>
>      define i64 @g() {
>        if (Y) {
>          r = X ? (side_exit() [ "deopt"(Y, X) ]) : 20;
>          print(r);
>      }
-- Sanjoy
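
The clarification above separates two things that share a letter in the quoted example: the branch condition and the deopt state. A minimal Python sketch of the state merging, with distinct names for the two, might look like this. It assumes the ordering shown in the example (caller state first, then callee state); `merge_deopt_state` and the state labels are hypothetical.

```python
def merge_deopt_state(caller_state, callee_state):
    # Deopt state attached to a side exit after inlining through a call site:
    # the caller's state comes first, followed by the callee's own state.
    return tuple(caller_state) + tuple(callee_state)


# The condition guarding the side exit is unrelated to the state it carries.
cond_in_f = True                      # plays the role of "if (X)"
state_of_f = ("f_state",)             # plays the role of "deopt"(X)
state_at_call_in_g = ("g_state",)     # plays the role of "deopt"(Y)

merged = merge_deopt_state(state_at_call_in_g, state_of_f)
```

Here `merged` corresponds to `"deopt"(Y, X)` in the inlined @g, regardless of what `cond_in_f` evaluates to.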

Andrew Trick via llvm-dev

2016-Feb-23 07:05 UTC

[llvm-dev] RFC: Add guard intrinsics to LLVM

> On Feb 22, 2016, at 10:26 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> 
> Assuming everyone is on the same page, here's a rough high level agenda:
> 
> 
> # step A: Introduce an `interposable` function attribute
> 
> We can bike shed on the name and the exact specification, but the
> general idea is that you cannot do IPA / IPO over callsites calling
> `interposable` functions without inlining them.  This attribute will
> (usually) have to be used on function bodies that can deoptimize (e.g. has a
> side exit / guard it in); but also has more general use cases.
+1
> # step B: Introduce a `side_exit` intrinsic
> 
> Specify an `@llvm.experimental.side_exit` intrinsic, polymorphic on the
> return type:
I didn’t know intrinsics could be polymorphic on the return type.
> - Consumes a "deopt" continuation, and replaces the current physical
>   stack frame with one or more interpreter frames (implementation
>   provided by the runtime).
> - Calls to this intrinsic must be `musttail` (verifier will check this)
> - We'll have some minor logic in the inliner such that when inlining @f
>   into @g in
> 
>     define i32 @f() {
>       if (X) return side_exit() [ "deopt"(X) ];
>       return i32 20;
>     }
> 
>     define i64 @g() {
>       if (Y) {
>         r = f() [ "deopt"(Y) ];
>         print(r);
>     }
> 
>   We get
> 
>     define i64 @g() {
>       if (Y) {
>         if (X) return side_exit() [ "deopt"(Y, X) ];
>         print(20);
>       }
>     }
> 
>   and not
> 
>     define i64 @g() {
>       if (Y) {
>         r = X ? (side_exit() [ "deopt"(Y, X) ]) : 20;
>         print(r);
>     }
I understand why you’re doing this: explicitly model the resume-at-return path.
But…

- It’s a bit awkward vs. side_exit(); unreachable, as evidenced by inlining.

- It would be nice to be able to model frequent OSR points as
branch-to-unreachable because it may lead to better optimization, codegen, and
compile time. I don’t think those are really fundamental problems though aside
from adding a large number of return block users, but it may be work to find all
of the small performance issues.

- Do you think this will make sense for all return argument conventions,
including sret?

(I actually think this is a great approach, I’m just playing Devil’s advocate
here.)
> # step C: Introduce a `guard_on` intrinsic
> 
> Will be based around what was discussed / is going to be discussed on
> this thread.
> 
> 
> (I think Philip was right in suggesting to split out a "step B" that
> only introduces a `side_exit` intrinsic.  We *will* have to specify
> them, since we'd like to optimize some after we've lowered guards into
> explicit control flow, and for that we need a specification of side
> exits.)
+1

-Andy
> 
> 
> # aside: non-managed languages and guards
> 
> Chandler raised some points on IRC around making `guard_on` (and
> possibly `side_exit`?) more generally applicable to unmanaged
> languages; so we'd want to be careful to specify these in a way that
> allows for implementations in an unmanaged environments (by function
> cloning, for instance).
> 
> -- Sanjoy

Chandler Carruth via llvm-dev

2016-Feb-23 07:18 UTC

[llvm-dev] RFC: Add guard intrinsics to LLVM

I've not had time to really dig into all of this thread, but I wanted to
point out:

On Mon, Feb 22, 2016 at 10:27 PM Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Assuming everyone is on the same page, here's a rough high level agenda:
>
>
> # step A: Introduce an `interposable` function attribute
>
> We can bike shed on the name and the exact specification, but the
> general idea is that you cannot do IPA / IPO over callsites calling
> `interposable` functions without inlining them.  This attribute will
> (usually) have to be used on function bodies that can deoptimize (e.g. has a
> side exit / guard it in); but also has more general use cases.
>
Note that we already have this *exact* concept in the IR via linkage for
better or worse. I think it is really confusing as you are currently
describing it because it seems deeply overlapping with linkage, which is
where the whole interposition thing comes from, and yet you never mention
how it interacts with linkage at all. What does it mean to have a common
linkage function that lacks the interposable attribute? Or a LinkOnceODR
function that does have that attribute?

If the goal is to factor replaceability out of linkage, we should actually
factor it out rather than adding yet one more way to talk about this.

And generally, we need to be *really* careful adding function attributes.
Look at the challenges we had figuring out norecurse. Adding attributes
needs to be viewed as nearly as high cost as adding instructions,
substantially higher cost than intrinsics.

>
>
> # step B: Introduce a `side_exit` intrinsic
>
> Specify an `@llvm.experimental.side_exit` intrinsic, polymorphic on the
> return type:
>
>  - Consumes a "deopt" continuation, and replaces the current physical
>    stack frame with one or more interpreter frames (implementation
>    provided by the runtime).
>
I think it would be really helpful to work to describe these things in
terms of semantic contracts on the IR rather than in terms of
implementation strategies. For example, not all IR interacts with an
interpreter, and so I don't think we should use the term "interpreter" to
specify the semantic model exposed by the IR.

>  - Calls to this intrinsic must be `musttail` (verifier will check this)
>  - We'll have some minor logic in the inliner such that when inlining @f
>    into @g in
>
>      define i32 @f() {
>        if (X) return side_exit() [ "deopt"(X) ];
>        return i32 20;
>      }
>
>      define i64 @g() {
>        if (Y) {
>          r = f() [ "deopt"(Y) ];
>          print(r);
>      }
>
>    We get
>
>      define i64 @g() {
>        if (Y) {
>          if (X) return side_exit() [ "deopt"(Y, X) ];
>          print(20);
>        }
>      }
>
>    and not
>
>      define i64 @g() {
>        if (Y) {
>          r = X ? (side_exit() [ "deopt"(Y, X) ]) : 20;
>          print(r);
>      }
>
>
> # step C: Introduce a `guard_on` intrinsic
>
> Will be based around what was discussed / is going to be discussed on
> this thread.
>
>
> (I think Philip was right in suggesting to split out a "step B" that
> only introduces a `side_exit` intrinsic.  We *will* have to specify
> them, since we'd like to optimize some after we've lowered guards into
> explicit control flow, and for that we need a specification of side
> exits.)
>
>
> # aside: non-managed languages and guards
>
> Chandler raised some points on IRC around making `guard_on` (and
> possibly `side_exit`?) more generally applicable to unmanaged
> languages; so we'd want to be careful to specify these in a way that
> allows for implementations in an unmanaged environments (by function
> cloning, for instance).
>
> -- Sanjoy

Sanjoy Das via llvm-dev

2016-Feb-23 16:51 UTC

[llvm-dev] RFC: Add guard intrinsics to LLVM

On Mon, Feb 22, 2016 at 11:05 PM, Andrew Trick <atrick at apple.com> wrote:
>
> I didn’t know intrinsics could be polymorphic on the return type.
@llvm.experimental.gc.result is polymorphic on its return type, for
instance.
> I understand why you’re doing this: explicitly model the
> resume-at-return path. But…
>
> - It’s a bit awkward vs. side_exit(); unreachable, as evidenced by
> inlining.
Part of the reason why we thought this scheme,
return-result-of-side-exit, was better than the
side-exit-then-unreachable scheme is that the former is more honest
about data flow; and that would prevent some nastiness around IPA. But
given that we're talking about introducing an `interposable` attribute
that prevents IPA, the side-exit-then-unreachable approach sounds
feasible now.  I need to see if there are other reasons for keeping
the return-result-of-side-exit variant; if not, I'll use the
side-exit-then-unreachable scheme.
> - It would be nice to be able to model frequent OSR points as
> branch-to-unreachable because it may lead to better optimization,
> codegen, and compile time.
Agreed.
> I don’t think those are really fundamental
> problems though aside from adding a large number of return block
> users
Return block users?  Does LLVM coalesce all `ret` instructions to a
single `ret PHI`?  I couldn't reproduce this in a small example IR.
> but it may be work to find all of the small performance issues.
>
> - Do you think this will make sense for all return argument
> conventions, including sret?
I'm not very familiar with sret, but skimming the docs I don't see why
not.  But generally, the frontend will have to know to generate
@side_exits that are legal.

-- Sanjoy

Sanjoy Das via llvm-dev

2016-Feb-23 17:32 UTC

[llvm-dev] RFC: Add guard intrinsics to LLVM

On Mon, Feb 22, 2016 at 11:18 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
>> # step A: Introduce an `interposable` function attribute
>>
>> We can bike shed on the name and the exact specification, but the
>> general idea is that you cannot do IPA / IPO over callsites calling
>> `interposable` functions without inlining them.  This attribute will
>> (usually) have to be used on function bodies that can deoptimize (e.g. has a
>> side exit / guard it in); but also has more general use cases.
>
>
> Note that we already have this *exact* concept in the IR via linkage for
> better or worse. I think it is really confusing as you are currently
I was going to have a more detailed discussion on this in the (yet to
be started) review thread for `interposable`: we'd like to be able to
inline `interposable` functions.  The "interposition" can only happen
at physical function boundaries, so opt is allowed to do as much
IPA/IPO as it wants once it makes the physical function boundary go
away via inlining.  None of the linkage types seem to have this property.

Part of the challenge here is to specify the attribute in a way that
allows inlining, but not IPA without inlining.  In fact, maybe it is
best to not call it "interposable" at all?
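
As a rough sketch of the "can inline, but cannot do IPA without inlining" property, the rule can be written as two predicates. This is only an illustration in Python under the assumptions stated in the comments; `Function`, `replaceable_at_boundary`, `may_inline` and `may_infer_from_body` are hypothetical names, not LLVM APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    name: str
    # Hypothetical stand-in for the proposed attribute: the definition behind
    # a physical call to this function may be swapped for another copy.
    replaceable_at_boundary: bool


def may_inline(callee: Function) -> bool:
    # The body at hand is one valid implementation, so copying it into the
    # caller is allowed whether or not the callee is replaceable.
    return True


def may_infer_from_body(callee: Function, call_was_inlined: bool) -> bool:
    # Facts derived from the body may be applied at a call site only if that
    # call can no longer be re-linked, i.e. the boundary is gone or fixed.
    return call_was_inlined or not callee.replaceable_at_boundary


foo = Function("foo", replaceable_at_boundary=True)
bar = Function("bar", replaceable_at_boundary=False)
```

With these definitions, `foo` can still be inlined, but nothing learned from its body may be used at a call to `foo` that remains a real call.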

Actually, I think one of the problems we're trying to solve with
`interposable` is applicable to the available_externally linkage as
well.  Say we have

```
void foo() available_externally {
  %t0 = load atomic %ptr
  %t1 = load atomic %ptr
  if (%t0 != %t1) print("X");
}
void main() {
  foo();
  print("Y");
}
```

Now the possible behaviors of the above program are {print("X");
print("Y")} or {print("Y")}.  But if we run opt then we have

```
void foo() available_externally readnone nounwind {
  ;; After CSE'ing the two loads and folding the condition
}
void main() {
  foo();
  print("Y");
}
```

and some generic reordering

```
void foo() available_externally readnone nounwind {
  ;; After CSE'ing the two loads and folding the condition
}
void main() {
  print("Y");
  foo();  // legal since we're moving a readnone nounwind function that
          // was guaranteed to execute (hence can't have UB)
}
```

Now if we do not inline @foo(), and instead re-link the call site in
@main to some non-optimized copy (or differently optimized copy) of
foo, then it is possible for the program to have the behavior
{print("Y"); print("X")}, which was disallowed in the earlier
program.

In other words, opt refined the semantics of @foo() (i.e. reduced the
set of behaviors it may have) in ways that would make later
optimizations invalid if we de-refine the implementation of @foo().
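
The argument above can be checked by enumerating behaviors in a small Python model. It is a sketch only: the two atomic loads are modeled as two independent values, and the function names (`foo_unrefined`, `foo_refined`, `main_original`, `main_reordered`) are hypothetical labels for the three program versions discussed.

```python
from itertools import product


def foo_unrefined(t0, t1):
    # Original @foo: the two atomic loads may observe different values.
    return ["X"] if t0 != t1 else []


def foo_refined():
    # @foo after CSE of the loads: the comparison folds away, nothing prints.
    return []


def main_original(foo_output):
    # foo(); print("Y")
    return tuple(foo_output + ["Y"])


def main_reordered(foo_output):
    # print("Y"); foo()
    return tuple(["Y"] + foo_output)


loads = list(product([0, 1], repeat=2))

# Behaviors the source program allows.
allowed = {main_original(foo_unrefined(a, b)) for a, b in loads}
# Reordered @main calling the refined @foo: still within the allowed set.
consistent = {main_reordered(foo_refined())}
# Reordered @main re-linked against an unrefined @foo.
relinked = {main_reordered(foo_unrefined(a, b)) for a, b in loads}
```

In this model `relinked` contains the sequence Y then X, which is not in `allowed`, while `consistent` stays inside it.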

Given this, I'd say we don't need a new attribute / linkage type, and
can add our restriction to the available_externally linkage.
> describing it because it seems deeply overlapping with linkage, which is
> where the whole interposition thing comes from, and yet you never mention
> how it interacts with linkage at all. What does it mean to have a common
> linkage function that lacks the interposable attribute? Or a LinkOnceODR
> function that does have that attribute?
What would you say about adding this as a new kind of linkage?  I was
trying to avoid doing that since the intended semantics of
GlobalValue::InterposableLinkage don't just describe what a linker
does, but also restrict what can be legally linked in (for the
can-inline-but-can't-IPA property to hold), but perhaps that's the
best way forward?

[Edit: I wrote this section before I wrote the available_externally
thing above.]
> If the goal is to factor replaceability out of linkage, we should actually
> factor it out rather than adding yet one more way to talk about this.
>
> And generally, we need to be *really* careful adding function attributes.
> Look at the challenges we had figuring out norecurse. Adding attributes
> needs to be viewed as nearly as high cost as adding instructions,
> substantially higher cost than intrinsics.
Only indirectly relevant to this discussion, but this is news to me --
my mental cost model was "attributes are easy to add and maintain", so
I didn't think too hard about alternatives.
> I think it would be really helpful to work to describe these things in terms
> of semantic contracts on the IR rather than in terms of implementation
> strategies. For example, not all IR interacts with an interpreter, and so I
> don't think we should use the term "interpreter" to specify the semantic
> model exposed by the IR.
That's what I was getting at by:
>> Chandler raised some points on IRC around making `guard_on` (and
>> possibly `side_exit`?) more generally applicable to unmanaged
>> languages; so we'd want to be careful to specify these in a way that
>> allows for implementations in an unmanaged environments (by function
>> cloning, for instance).
-- Sanjoy
