thr3ads.net - llvm dev - [llvm-dev] RFC: Add "operand bundles" to calls and invokes [Aug 2015]

Sanjoy Das via llvm-dev

2015-Aug-10 03:32 UTC

[llvm-dev] RFC: Add "operand bundles" to calls and invokes

We'd like to propose a scheme to attach "operand bundles" to call and
invoke instructions.  This is based on the offline discussion mentioned in
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html.

# Motivation & Definition

Our motivation behind this is to track the state required for
deoptimization (described briefly later) through the LLVM pipeline as
a first-class IR citizen.  We want to do this in a way that is
generally useful.

An "operand bundle" is a set of SSA values (called "bundle operands")
tagged with a string (called the "bundle tag").  One or more of such
bundles may be attached to a call or an invoke.  The intended use of
these values is to support "frame introspection"-like functionality
for managed languages.


# Abstract Syntax

The syntax of a call instruction will be changed to look like this:

<result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]
    <fnptrval>(<function args>)  [operand_bundle*] [fn attrs]

where operand_bundle = tag '('[ value ] (',' value )* ')'
      value = normal SSA values
      tag = "< some name >"

In other words, after the function arguments we now have an optional
list of operand bundles of the form `"< bundle tag >"(bundle
attributes, values...)`.  There can be more than one operand bundle in
a call.  Two operand bundles in the same call instruction cannot have
the same tag.
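
As a rough illustration of the grammar above (not part of the proposal
itself), here is a minimal Python sketch that parses a bundle list and
enforces the unique-tag rule as stated in this RFC.  The function name
`parse_operand_bundles` and the `allow_duplicate_tags` switch are
hypothetical, and values are treated as opaque text without nested
parentheses.

```python
import re

# Hypothetical helper: matches one `"tag"(v1, v2, ...)` group in the
# syntax proposed in this RFC.  Values may not contain parentheses.
_BUNDLE = re.compile(r'\s*"([^"]*)"\(([^()]*)\)')


def parse_operand_bundles(text, allow_duplicate_tags=False):
    """Parse '"a"(i32 1) "b"()' into [("a", ["i32 1"]), ("b", [])]."""
    bundles, seen, pos = [], set(), 0
    while text[pos:].strip():
        match = _BUNDLE.match(text, pos)
        if match is None:
            raise ValueError(f"malformed operand bundle at offset {pos}")
        tag = match.group(1)
        body = match.group(2).strip()
        values = [v.strip() for v in body.split(",")] if body else []
        if any(not v for v in values):
            raise ValueError(f"empty value in bundle {tag!r}")
        # The RFC as written forbids two bundles with the same tag on one call.
        if tag in seen and not allow_duplicate_tags:
            raise ValueError(f"duplicate bundle tag {tag!r}")
        seen.add(tag)
        bundles.append((tag, values))
        pos = match.end()
    return bundles
```

Calling it on `'"some-bundle"(i32 42) "debug"(i32 100)'` yields two
(tag, values) pairs; repeating a tag raises `ValueError` unless
`allow_duplicate_tags=True` is passed.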

We'd do something similar for invokes.  I'll omit the invoke syntax
from this RFC to keep things brief.

An example:

    define i32 @f(i32 %x) {
     entry:
      %t = add i32 %x, 1
      ret i32 %t
    }

    define void @g(i16 %val, i8* %ptr) {
     entry:
      call i32 @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)
      call i32 @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)
      ret void
    }

Note 1: Operand bundles are *not* part of a function's signature, and
a given function may be called from multiple places with different
kinds of operand bundles.  This reflects the fact that the operand
bundles are conceptually a part of the *call*, not the callee being
dispatched to.

Note 2: There may be tag specific requirements not mentioned here.
E.g. we may add a rule in the future that says operand bundles with
the tag `"integer-id"` may only contain exactly one constant integer.


# IR Semantics

Bundle operands (SSA values part of some operand bundle) are normal
SSA values.  They need to dominate the call or invoke instruction
they're being passed into and can be optimized as usual.  For
instance, LLVM is allowed (and strongly encouraged!) to PRE / LICM a
load feeding into an operand bundle if legal.

Operand bundles are characterized by the `"< bundle tag >"` string
associated with them.

The overall strategy is:

 1. The semantics are as conservative as is reasonable for operand
    bundles with tags that LLVM does not have a special understanding
    of.  This way LLVM does not miscompile code by default.

 2. LLVM understands the semantics of operand bundles with certain
    specific tags more precisely, and can optimize them better.

This RFC talks mainly about (1).  We will discuss (2) as we add smarts
to LLVM about specific kinds of operand bundles.

The IR-level semantics of an operand bundle with an arbitrary tag are:

 1. The bundle operands passed in to a call escape in unknown ways
    before transferring control to the callee.  For instance:

      declare void @opaque_runtime_fn()

      define void @f(i32* %v) { }

      define i32 @g() {
        %t = i32* @malloc(...)
        ;; "unknown" is a tag LLVM does not have any special knowledge of
        call void @f(i32* %t) "unknown"(i32* %t)

        store i32 42, i32* %t
        call void @opaque_runtime_fn();
        ret (load i32, i32* %t)
      }

    Normally (without the `"unknown"` bundle) it would be okay to
    optimize `@g` to return `42`.  But the `"unknown"` operand bundle
    escapes `%t`, and the call to `@opaque_runtime_fn` can therefore
    modify the location pointed to by `%t`.

 2. Calls and invokes with operand bundles have unknown read / write
    effect on the heap on entry and exit (even if the call target is
    `readnone` or `readonly`).  For instance:

      define void @f(i32* %v) { }

      define i32 @g() {
        %t = i32* @malloc(...)
        %t.unescaped = i32* @malloc(...)
        ;; "unknown" is a tag LLVM does not have any special knowledge of
        call void @f(i32* %t) "unknown"(i32* %t)
        ret (load i32, i32* %t)
      }

    Normally it would be okay to optimize `@g` to return `undef`, but
    the `"unknown"` bundle potentially clobbers `%t`.  Note that it
    clobbers `%t` only because it was *also escaped* by the
    `"unknown"` operand bundle -- it does not clobber `%t.unescaped`
    because it isn't reachable from the heap yet.

    However, it is okay to optimize

      define void @f(i32* %v) {
        store i32 10, i32* %v
        print(load i32, i32* %v)
      }

      define void @g() {
        %t = ...
        ;; "unknown" is a tag LLVM does not have any special knowledge of
        call void @f(i32* %t) "unknown"()
      }

    to

      define void @f(i32* %v) {
        store i32 10, i32* %v
        print(10)
      }

      define void @g() {
        %t = ...
        call void @f(i32* %t) "unknown"()
      }

    The arbitrary heap clobbering only happens on the boundaries of
    the call operation, and therefore we can still do store-load
    forwarding *within* `@f`.
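
To make the two rules above concrete, here is a small Python sketch of
the question "may this call clobber that pointer at its boundary?".
It is only a toy model under the assumptions stated in this RFC: the
name `may_be_clobbered` is made up, pointers are plain strings, and
whatever the callee body itself does is ignored.

```python
def may_be_clobbered(pointer, escaped_before, bundle_operands,
                     has_unknown_bundle):
    """Return True if the call boundary may write through `pointer`.

    Toy model: an operand bundle with an unknown tag escapes its operands
    and then has arbitrary read/write effects on everything that has
    escaped.  Memory that has not escaped stays untouched.
    """
    if not has_unknown_bundle:
        # Without such a bundle, this model attributes no boundary effects.
        return False
    reachable = set(escaped_before) | set(bundle_operands)
    return pointer in reachable
```

For the second `@g` above, `%t` is in the bundle and so may be
clobbered, while `%t.unescaped` may not.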

Since we haven't specified any "pure" LLVM way of accessing the
contents of operand bundles, the client is required to model such
accesses as calls to opaque functions (or inline assembly).  This
ensures that things like IPSCCP work as intended.  E.g. it is legal to
optimize

   define i32 @f(i32* %v) { ret i32 10 }

   define void @g() {
     %t = i32* @malloc(...)
     %v = call i32 @f(i32* %t) "unknown"(i32* %t)
     print(%v)
   }

to

   define i32 @f(i32* %v) { ret i32 10 }

   define void @g() {
     %t = i32* @malloc(...)
     %v = call i32 @f(i32* %t) "unknown"(i32* %t)
     print(10)
   }

LLVM won't generally be able to inline through calls and invokes with
operand bundles -- the inliner does not know what to replace the
arbitrary heap accesses implied on function entry and exit with.
However, we intend to teach the inliner to inline through calls /
invokes with some specific kinds of operand bundles.


# Lowering

The lowering strategy will be special cased for each bundle tag.
There won't be any "generic" lowering strategy -- `llc` is expected to
abort if it sees an operand bundle that it does not understand.

There is no requirement that the operand bundles actually make it to
the backend.  Rewriting the operand bundles into "vanilla" LLVM IR at
some point in the pipeline (instead of teaching codegen to lower them)
is a perfectly reasonable lowering strategy.
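
The "one lowering per tag, abort otherwise" policy could be pictured
roughly like this.  It is a Python sketch, not LLVM code: `LOWERINGS`,
`lower_deopt` and `lower_bundle` are hypothetical names, and the
`"deopt"` handler is a placeholder that just packages its operands.

```python
def lower_deopt(operands):
    # Placeholder lowering: a real one would record these values in a form
    # the runtime can read back at this call site.
    return ("deopt-record", tuple(operands))


# One entry per tag the backend understands; there is no generic fallback.
LOWERINGS = {"deopt": lower_deopt}


def lower_bundle(tag, operands):
    """Dispatch on the bundle tag and refuse tags with no known lowering."""
    handler = LOWERINGS.get(tag)
    if handler is None:
        raise NotImplementedError(
            f"no lowering for operand bundle tag {tag!r}")
    return handler(operands)
```

A pass that rewrites a bundle into "vanilla" IR earlier in the pipeline
would simply mean that tag never reaches this dispatch.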


# Example use cases

A couple of usage scenarios are very briefly described below:

## Deoptimization

This is our motivating use case.  Some managed environments expect to
be able to discover the state of the abstract virtual machine at
specific call sites.  LLVM will be able to support this requirement by
attaching a `"deopt"` operand bundle containing the state of the
abstract virtual machine (as a vector of SSA values) at the appropriate
call sites.  There is a straightforward way to extend the inliner to
work with `"deopt"` operand bundles.

`"deopt"` operand bundles will not have to be as pessimistic about
heap effects as the general "unknown operand bundle" case -- they only
imply a read from the entire heap on function entry or function exit,
depending on what kind of deoptimization state we're interested in.
They also don't imply escaping semantics.


## Value injection

By passing in one or more `alloca`s to an `"injectable-value"` tagged
operand bundle, languages can allow the runtime to overwrite the
values of specific variables, while still preserving a significant
amount of optimization potential.



Thoughts?
-- Sanjoy

David Majnemer via llvm-dev

2015-Aug-11 04:38 UTC

On Sun, Aug 9, 2015 at 11:32 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> Thoughts?
>
This seems like a pretty useful, generic call-site annotation mechanism.  I
believe that this has immediate application outside of the context of GC.

Our exception handling personality routine has a desire to know whether
some code is inside a specific try or catch.  We can feed the value coming
out of our EH pad back into the call-site, making it very clear which EH
pad the call-site is associated with.

> -- Sanjoy

Sanjoy Das via llvm-dev

2015-Aug-11 05:10 UTC

> This seems like a pretty useful, generic call-site annotation mechanism.  I
> believe that this has immediate application outside of the context of GC.
As supporting evidence, let me say that we're not using this for GC
either :).  We will use it to support deoptimization [1][2][3].  We will
continue to support precise relocating garbage collection using
statepoints.

I can go into some detail on how we plan to use this for
deoptimization if you're interested; I left out most of the deopt-specific
bits to avoid cluttering up the main proposal.

[1]: http://www.philipreames.com/Blog/2015/05/20/deoptimization-terminology/
[2]: http://www.oracle.com/technetwork/java/whitepaper-135217.html#dynamic
[3]: https://blog.indutny.com/a.deoptimize-me-not

-- Sanjoy

Philip Reames via llvm-dev

2015-Aug-12 19:24 UTC

On 08/09/2015 08:32 PM, Sanjoy Das wrote:
> We'd like to propose a scheme to attach "operand bundles" to call and
> invoke instructions.  This is based on the offline discussion
> mentioned in
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html.

I'm (obviously) in support of the overall proposal.  :)  A few details
below.

>
> # Motivation & Definition
>
> Our motivation behind this is to track the state required for
> deoptimization (described briefly later) through the LLVM pipeline as
> a first-class IR citizen.  We want to do this in a way that is
> generally useful.
>
> An "operand bundle" is a set of SSA values (called "bundle operands")
> tagged with a string (called the "bundle tag").  One or more of such
> bundles may be attached to a call or an invoke.  The intended use of
> these values is to support "frame introspection"-like functionality
> for managed languages.
>
>
> # Abstract Syntax
>
> The syntax of a call instruction will be changed to look like this:
>
> <result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]
>      <fnptrval>(<function args>)  [operand_bundle*] [fn attrs]
>
> where operand_bundle = tag '('[ value ] (',' value )* ')'
>        value = normal SSA values
>        tag = "< some name >"

tag needs to be "some string name" or <future keyword>.  We also need to
be clear about what the compatibility guarantees are. If I remember
correctly, we discussed something along the following lines:
- string bundle names are entirely version locked to particular revision 
of LLVM.  They are for experimentation and incremental development.  
There is no attempt to forward serialize them.  In particular, using a 
string name which is out of sync with the version of LLVM can result in 
miscompiles.
- keyword bundle names become first class parts of the IR, they are 
forward serialized, and fully supported.  Obviously, getting an 
experimental string bundle name promoted to a first class keyword bundle 
will require broad discussion and buy in.

We were deliberately trying to parallel the de facto policy around
attributes vs string-attributes.

>
> In other words, after the function arguments we now have an optional
> list of operand bundles of the form `"< bundle tag >"(bundle
> attributes, values...)`.  There can be more than one operand bundle in
> a call.  Two operand bundles in the same call instruction cannot have
> the same tag.

I don't think we need that last sentence.  It should be up to the bundle
implementation if that's legal or not.  I don't have a strong preference
here and we could easily relax this later.

>
> We'd do something similar for invokes.  I'll omit the invoke syntax
> from this RFC to keep things brief.
>
> An example:
>
>      define i32 @f(i32 %x) {
>       entry:
>        %t = add i32 %x, 1
>        ret i32 %t
>      }
>
>      define void @g(i16 %val, i8* %ptr) {
>       entry:
>        call void @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)
>        call void @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)
>      }
>
> Note 1: Operand bundles are *not* part of a function's signature, and
> a given function may be called from multiple places with different
> kinds of operand bundles.  This reflects the fact that the operand
> bundles are conceptually a part of the *call*, not the callee being
> dispatched to.
>
> Note 2: There may be tag specific requirements not mentioned here.
> E.g. we may add a rule in the future that says operand bundles with
> the tag `"integer-id"` may only contain exactly one constant integer.
>
>
> # IR Semantics
>
> Bundle operands (SSA values part of some operand bundle) are normal
> SSA values.  They need to dominate the call or invoke instruction
> they're being passed into and can be optimized as usual.  For
> instance, LLVM is allowed (and strongly encouraged!) to PRE / LICM a
> load feeding into an operand bundle if legal.
>
> Operand bundles are characterized by the `"< bundle tag >"` string
> associated with them.
>
> The overall strategy is:
>
>   1. The semantics are as conservative as is reasonable for operand
>      bundles with tags that LLVM does not have a special understanding
>      of.  This way LLVM does not miscompile code by default.
>
>   2. LLVM understands the semantics of operand bundles with certain
>      specific tags more precisely, and can optimize them better.
>
> This RFC talks mainly about (1).  We will discuss (2) as we add smarts
> to LLVM about specific kinds of operand bundles.
>
> The IR-level semantics of an operand bundle with an arbitrary tag are:
>
>   1. The bundle operands passed in to a call escape in unknown ways
>      before transferring control to the callee.  For instance:
>
>        declare void @opaque_runtime_fn()
>
>        define void @f(i32* %v) { }
>
>        define i32 @g() {
>          %t = i32* @malloc(...)
>          ;; "unknown" is a tag LLVM does not have any special knowledge of
>          call void @f(i32* %t) "unknown"(i32* %t)
>
>          store i32 42, i32* %t
>          call void @opaque_runtime_fn();
>          ret (load i32, i32* %t)
>        }
>
>      Normally (without the `"unknown"` bundle) it would be okay to
>      optimize `@g` to return `42`.  But the `"unknown"` operand bundle
>      escapes `%t`, and the call to `@opaque_runtime_fn` can therefore
>      modify the location pointed to by `%t`.
>
>   2. Calls and invokes with operand bundles have unknown read / write
>      effect on the heap on entry and exit (even if the call target is
>      `readnone` or `readonly`).  For instance:

I don't think we actually need this.  I think it would be perfectly fine
to require the frontend to ensure that the called function is not readonly
if it being readonly would be problematic for the call site.  I'm not
really opposed to this generalization - I could see it being useful -
but I'm worried about the amount of work involved.  A *lot* of the
optimizer assumes that attributes on a call site are strictly less
conservative than the underlying function. Changing that could have a
long bug tail.  I'd rather defer that work until someone defines an
operand bundle type which requires it.  The motivating example
(deoptimization) doesn't seem to require this.

>
>        define void @f(i32* %v) { }
>
>        define i32 @g() {
>          %t = i32* @malloc(...)
>          %t.unescaped = i32* @malloc(...)
>          ;; "unknown" is a tag LLVM does not have any special knowledge of
>          call void @f(i32* %t) "unknown"(i32* %t)
>          ret (load i32, i32* %t)
>        }
>
>      Normally it would be okay to optimize `@g` to return `undef`, but
>      the `"unknown"` bundle potentially clobbers `%t`.  Note that it
>      clobbers `%t` only because it was *also escaped* by the
>      `"unknown"` operand bundle -- it does not clobber `%t.unescaped`
>      because it isn't reachable from the heap yet.
>
>      However, it is okay to optimize
>
>        define void @f(i32* %v) {
>          store i32 10, i32* %v
>          print(load i32, i32* %v)
>        }
>
>        define void @g() {
>          %t = ...
>          ;; "unknown" is a tag LLVM does not have any special knowledge of
>          call void @f(i32* %t) "unknown"()
>        }
>
>      to
>
>        define void @f(i32* %v) {
>          store i32 10, i32* %v
>          print(10)
>        }
>
>        define void @g() {
>          %t = ...
>          call void @f(i32* %t) "unknown"()
>        }
>
>      The arbitrary heap clobbering only happens on the boundaries of
>      the call operation, and therefore we can still do store-load
>      forwarding *within* `@f`.
>
> Since we haven't specified any "pure" LLVM way of accessing the
> contents of operand bundles, the client is required to model such
> accesses as calls to opaque functions (or inline assembly).

I'm a bit confused by this section.  By "client" do you mean frontend?
And what are you trying to allow in the second sentence? The first
sentence seems sufficient.

> This
> ensures that things like IPSCCP work as intended.  E.g. it is legal to
> optimize
>
>     define i32 @f(i32* %v) { ret i32 10 }
>
>     define void @g() {
>       %t = i32* @malloc(...)
>       %v = call i32 @f(i32* %t) "unknown"(i32* %t)
>       print(%v)
>     }
>
> to
>
>     define i32 @f(i32* %v) { ret i32 10 }
>
>     define void @g() {
>       %t = i32* @malloc(...)
>       %v = call i32 @f(i32* %t) "unknown"(i32* %t)
>       print(10)
>     }

To say this differently, an operand bundle at a call site can not change
the implementation of the called function.  This is not a mechanism for
function interposition.

>
> LLVM won't generally be able to inline through calls and invokes with
> operand bundles -- the inliner does not know what to replace the
> arbitrary heap accesses implied on function entry and exit with.
> However, we intend to teach the inliner to inline through calls /
> invokes with some specific kinds of operand bundles.
>
>
> # Lowering
>
> The lowering strategy will be special cased for each bundle tag.
> There won't be any "generic" lowering strategy -- `llc` is expected to
> abort if it sees an operand bundle that it does not understand.
>
> There is no requirement that the operand bundles actually make it to
> the backend.  Rewriting the operand bundles into "vanilla" LLVM IR at
> some point in the pipeline (instead of teaching codegen to lower them)
> is a perfectly reasonable lowering strategy.
>
>
> # Example use cases
>
> A couple of usage scenarios are very briefly described below:
>
> ## Deoptimization
>
> This is our motivating use case.  Some managed environments expect to
> be able to discover the state of the abstract virtual machine at
> specific call sites.  LLVM will be able to support this requirement by
> attaching a `"deopt"` operand bundle containing the state of the
> abstract virtual machine (as a vector of SSA values) at the appropriate
> call sites.  There is a straightforward way to extend the inliner to
> work with `"deopt"` operand bundles.
>
> `"deopt"` operand bundles will not have to be as pessimistic about
> heap effects as the general "unknown operand bundle" case -- they only
> imply a read from the entire heap on function entry or function exit,
> depending on what kind of deoptimization state we're interested in.
> They also don't imply escaping semantics.

An alternate framing here which would remove the attribute case I was
worried about would be to separate the memory and abstract state
semantics of deoptimization.  If the deopt bundle only described the
abstract state and it was up to the frontend to ensure the callee was at
least readonly, we wouldn't need to model memory in the deopt bundle.  I
think that's a much better starting place.

>
>
> ## Value injection
>
> By passing in one or more `alloca`s to an `"injectable-value"` tagged
> operand bundle, languages can allow the runtime to overwrite the
> values of specific variables, while still preserving a significant
> amount of optimization potential.

To be clear, this was intended to model use cases like Python's ability
to inject values into caller frames.

>
>
>
> Thoughts?
> -- Sanjoy

Sanjoy Das via llvm-dev

2015-Aug-12 21:58 UTC

> tag needs to be "some string name" or <future keyword>.  We also need to be
> clear about what the compatibility guarantees are. If I remember correctly,
> we discussed something along the following:
> - string bundle names are entirely version locked to particular revision of
> LLVM.  They are for experimentation and incremental development.  There is
> no attempt to forward serialize them.  In particular, using a string name
> which is out of sync with the version of LLVM can result in miscompiles.
> - keyword bundle names become first class parts of the IR, they are forward
> serialized, and fully supported.  Obviously, getting an experimental string
> bundle name promoted to a first class keyword bundle will require broad
> discussion and buy in.
>
> We were deliberately trying to parallel the de facto policy around attributes
> vs string-attributes.

Agreed.

>> In other words, after the function arguments we now have an optional
>> list of operand bundles of the form `"< bundle tag >"(bundle
>> attributes, values...)`.  There can be more than one operand bundle in
>> a call.  Two operand bundles in the same call instruction cannot have
>> the same tag.
>
> I don't think we need that last sentence.  It should be up to the bundle
> implementation if that's legal or not.  I don't have a strong preference
> here and we could easily relax this later.

I'll remove the restriction.  I think it is reasonable to have this
decided per bundle type, as you suggested.

>>   2. Calls and invokes with operand bundles have unknown read / write
>>      effect on the heap on entry and exit (even if the call target is
>>      `readnone` or `readonly`).  For instance:
>
> I don't think we actually need this.  I think it would be perfectly fine to
> require the frontend to ensure that the called function is not readonly if it
> being readonly would be problematic for the call site.  I'm not really
> opposed to this generalization - I could see it being useful - but I'm
> worried about the amount of work involved.  A *lot* of the optimizer assumes
> that attributes on a call site are strictly less conservative than the
> underlying function. Changing that could have a long bug tail.  I'd rather
> defer that work until someone defines an operand bundle type which requires
> it.  The motivating example (deoptimization) doesn't seem to require this.

If we're doing late poll placement and if certain functions are
"frameless" in the abstract machine, then we will need this for
deoptimization.

The case I'm thinking of is:

  define void @foo() {
   ;; Can be just about any kind of uncounted loop that is readnone
   entry:
    br label %inf_loop

   inf_loop:
    br label %inf_loop
  }

  define void @caller() {
   entry:
    store i32 42, i32* @global
    call void @foo() "deopt"(i32 100)
    store i32 46, i32* @global
    ret void
  }

Right now `@foo` is `readnone`, so the first store of `i32 42` can be
DSE'ed.  However, if we insert a poll inside `@foo` later, that will
have to be given a JVM state, which we cannot do anymore since a store
that would have been done by the abstract machine has been elided.
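
To make the hazard concrete, here is a rough Python model of the
example above.  It is only a sketch: `run_caller`, `observed_at_poll`
and the dict standing in for the heap are made up for illustration and
are not LLVM APIs.  The late-inserted poll is modelled as a callback
that snapshots the heap, and `elide_dead_store` plays the role of DSE
removing the first store because `@foo` looks `readnone`.  For
simplicity the model lets `@foo` return, unlike the infinite loop
above.

```python
def run_caller(elide_dead_store, poll):
    """Model of @caller: store 42, call @foo (which polls), store 46."""
    heap = {"global": 0}
    if not elide_dead_store:
        heap["global"] = 42  # the store DSE removes when @foo is readnone
    poll(dict(heap))         # poll placed inside @foo after optimization
    heap["global"] = 46
    return heap


def observed_at_poll(elide_dead_store):
    """Return the value of @global that the runtime sees at the poll."""
    snapshots = []
    run_caller(elide_dead_store, snapshots.append)
    return snapshots[0]["global"]


if __name__ == "__main__":
    print(observed_at_poll(elide_dead_store=False))
    print(observed_at_poll(elide_dead_store=True))
```

Both variants finish with `@global` equal to 46, so the store really is
dead as far as the compiled code is concerned.  Only the poll can tell
the difference: it sees 42 without the elision and the initial 0 with
it.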

[ moved here, because this is related ]

>> `"deopt"` operand bundles will not have to be as pessimistic about
>> heap effects as the general "unknown operand bundle" case -- they
>> only imply a read from the entire heap on function entry or function
>> exit, depending on what kind of deoptimization state we're
>> interested in.  They also don't imply escaping semantics.
>
> An alternate framing here which would remove the attribute case I was
> worried about would be to separate the memory and abstract state
> semantics of deoptimization.  If the deopt bundle only described the
> abstract state and it was up to the frontend to ensure the callee was
> at least readonly, we wouldn't need to model memory in the deopt
> bundle.  I think that's a much better starting place.

Semantically, I think we need the state of the heap to be consistent
at method call boundaries, not within a method boundary.  For
instance, consider this:

  ;; @global is 0 to start with

  define void @f() readonly {
    ;; do whatever
    call read_only_safepoint_poll() readonly "deopt"( ... deopt state local to @f ...)
  }

  define void @g() {
    call void @f() "deopt"( ... deopt state local to @g ...)
    if (*@global == 42) { side_effect(); }
    store i32 42, i32* @global
  }

If we do not have the reads-everything-on-exit property, then this is
a valid transform:

  define void @f() readonly {
    ;; do whatever
    call read_only_safepoint_poll() readonly "deopt"( ... deopt state local to @f ...)
    if (*@global == 42) { side_effect(); }
    store i32 42, i32* @global
  }

  define void @g() {
    call void @f() "deopt"( ... deopt state local to @g ...)
  }

If we *don't* inline `@f` into `@g`, and `@f` wants to deoptimize `@g`
(and only `@g`) after halting the thread at
`read_only_safepoint_poll`, we're in trouble.  `@f` will execute the
store to `@global` before returning, and the deoptimized `@g` will
call `side_effect` when it shouldn't have.  (Note: I put the `if
(*@global == 42)` to make the problem more obvious, but in practice I
think doing the same store twice is also problematic).  Another way to
state this is that even though the state of the heap was consistent at
the call to `read_only_safepoint_poll`, it will not be consistent when
`@f` returns.  Therefore we cannot use a "deopt `@g` on return with
vmstate xyz" scheme, unless we model the operand bundle as reading the
entire heap on return of `@f` (this would force the state of the heap
to be consistent at the point where we actually use the vmstate).
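
The scenario above can be sketched as a small Python model.  This is an
illustration under stated assumptions, not how any runtime is actually
implemented: all function names and the dict heap are hypothetical, and
`resume_g_in_interpreter` stands for the abstract machine continuing
`@g` from its vmstate once `@f` returns.

```python
def f_original(heap, log):
    """Compiled @f before the transform: only the poll, no heap writes."""


def f_transformed(heap, log):
    """Compiled @f after the tail of @g has been sunk into it."""
    if heap["global"] == 42:
        log.append("side_effect")
    heap["global"] = 42


def resume_g_in_interpreter(heap, log):
    """Deoptimized @g: the abstract machine runs the code after the call."""
    if heap["global"] == 42:
        log.append("side_effect")
    heap["global"] = 42


def run_with_deopt_on_return(compiled_f):
    """@g is marked for deopt at the poll; compiled @f still runs to its end."""
    heap, log = {"global": 0}, []
    compiled_f(heap, log)
    resume_g_in_interpreter(heap, log)
    return log
```

With `f_original` the log stays empty, as the abstract machine expects.
With `f_transformed` the store to `@global` has already happened by the
time the interpreter resumes `@g`, so `side_effect` is logged even
though `@global` started at 0.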

There is an analogous case where we have to model the deopt operand
bundle as reads-everything-on-entry: the case where we deoptimize on
entry.  IOW, something like this:

  ; @global starts off as 0

  define void @side_exit() readonly {
    call void @deoptimize_my_caller()
    return
  }

  define void @store_field(ref) {
   (*@global)++;
 lbl:
   if (ref == nullptr) {
     call void @side_exit() ;; vm_state = at label lbl
     unreachable
   } else {
     ref->field = 42;
   }
  }

could be transformed to

  define void @side_exit() readonly {
    (*@global)++;
    call void @deoptimize_my_caller()
    return
  }

  define void @store_field(ref) {
 lbl:
   if (ref == nullptr) {
     call void @side_exit() ;; vm_state = at label lbl
     unreachable
   } else {
     (*@global)++;
     ref->field = 42;
   }
  }

Now if `ref` is null and we do not inline `@side_exit` then we will
end up incrementing `@global` twice.

In practice I think we can work around these issues by marking
`@side_exit` and `@f` as external, so that inter-procedural code
motion does not happen, but:

 a. That would be a workaround; the semantic issues will still exist.
 b. LLVM is still free to specialize external functions.

As a meta point, I think the right way to view operand bundles is as
something that *happens* before and after a call / invoke, not as a
set of values being passed around.  For that reason, do you think they
should be renamed to be something else?

>> Since we haven't specified any "pure" LLVM way of accessing the
>> contents of operand bundles, the client is required to model such
>> accesses as calls to opaque functions (or inline assembly).
>
> I'm a bit confused by this section.  By "client" do you mean
> frontend?  And what are you trying to allow in the second sentence?
> The first sentence seems sufficient.
>
>> This ensures that things like IPSCCP work as intended.  E.g. it is
>> legal to optimize
>
> To say this differently, an operand bundle at a call site can not
> change the implementation of the called function.  This is not a
> mechanism for function interposition.

I was really trying to say "whatever the optimizer directly
understands about the IR is correct", so you're right, this is about
disallowing arbitrary function interposition.

-- Sanjoy

Hal Finkel via llvm-dev

2015-Aug-19 08:52 UTC

----- Original Message -----
> From: "David Majnemer" <david.majnemer at gmail.com>
> To: "Sanjoy Das" <sanjoy at playingwithpointers.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip Reames"
> <listmail at philipreames.com>, "Chandler Carruth"
> <chandlerc at gmail.com>, "Nick Lewycky" <nlewycky at google.com>,
> "Hal Finkel" <hfinkel at anl.gov>, "Chen Li" <meloli87 at gmail.com>,
> "Russell Hadley" <rhadley at microsoft.com>, "Kevin Modzelewski"
> <kmod at dropbox.com>, "Swaroop Sridhar"
> <Swaroop.Sridhar at microsoft.com>, rudi at dropbox.com, "Pat Gavlin"
> <pagavlin at microsoft.com>, "Joseph Tremoulet"
> <jotrem at microsoft.com>, "Reid Kleckner" <rnk at google.com>
> Sent: Monday, August 10, 2015 11:38:32 PM
> Subject: Re: RFC: Add "operand bundles" to calls and invokes
>
> On Sun, Aug 9, 2015 at 11:32 PM, Sanjoy Das <
> sanjoy at playingwithpointers.com > wrote:
> > We'd like to propose a scheme to attach "operand bundles" to call
> > and invoke instructions. This is based on the offline discussion
> > mentioned in
> > http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html .
> >
> > # Motivation & Definition
> >
> > Our motivation behind this is to track the state required for
> > deoptimization (described briefly later) through the LLVM pipeline
> > as a first-class IR citizen. We want to do this in a way that is
> > generally useful.
> >
> > An "operand bundle" is a set of SSA values (called "bundle
> > operands") tagged with a string (called the "bundle tag"). One or
> > more of such bundles may be attached to a call or an invoke. The
> > intended use of these values is to support "frame
> > introspection"-like functionality for managed languages.
> >
> > # Abstract Syntax
> >
> > The syntax of a call instruction will be changed to look like this:
> >
> > <result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]
> >     <fnptrval>(<function args>) [operand_bundle*] [fn attrs]
> >
> > where operand_bundle = tag '('[ value ] (',' value )* ')'
> >       value = normal SSA values
> >       tag = "< some name >"
> >
> > In other words, after the function arguments we now have an
> > optional list of operand bundles of the form
> > `"< bundle tag >"(bundle attributes, values...)`. There can be more
> > than one operand bundle in a call. Two operand bundles in the same
> > call instruction cannot have the same tag.
> >
> > We'd do something similar for invokes. I'll omit the invoke syntax
> > from this RFC to keep things brief.
> >
> > An example:
> >
> >     define i32 @f(i32 %x) {
> >      entry:
> >       %t = add i32 %x, 1
> >       ret i32 %t
> >     }
> >
> >     define void @g(i16 %val, i8* %ptr) {
> >      entry:
> >       call void @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)
> >       call void @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)
> >     }
> >
> > Note 1: Operand bundles are *not* part of a function's signature,
> > and a given function may be called from multiple places with
> > different kinds of operand bundles. This reflects the fact that the
> > operand bundles are conceptually a part of the *call*, not the
> > callee being dispatched to.
> >
> > Note 2: There may be tag specific requirements not mentioned here.
> > E.g. we may add a rule in the future that says operand bundles with
> > the tag `"integer-id"` may only contain exactly one constant
> > integer.
> >
> > # IR Semantics
> >
> > Bundle operands (SSA values part of some operand bundle) are normal
> > SSA values. They need to dominate the call or invoke instruction
> > they're being passed into and can be optimized as usual. For
> > instance, LLVM is allowed (and strongly encouraged!) to PRE / LICM
> > a load feeding into an operand bundle if legal.
> >
> > Operand bundles are characterized by the `"< bundle tag >"` string
> > associated with them.
> >
> > The overall strategy is:
> >
> >  1. The semantics are as conservative as is reasonable for operand
> >     bundles with tags that LLVM does not have a special
> >     understanding of. This way LLVM does not miscompile code by
> >     default.
> >
> >  2. LLVM understands the semantics of operand bundles with certain
> >     specific tags more precisely, and can optimize them better.
> >
> > This RFC talks mainly about (1). We will discuss (2) as we add
> > smarts to LLVM about specific kinds of operand bundles.
> >
> > The IR-level semantics of an operand bundle with an arbitrary tag
> > are:
> >
> >  1. The bundle operands passed in to a call escape in unknown ways
> >     before transferring control to the callee. For instance:
> >
> >       declare void @opaque_runtime_fn()
> >
> >       define void @f(i32* %v) { }
> >
> >       define i32 @g() {
> >         %t = i32* @malloc(...)
> >         ;; "unknown" is a tag LLVM does not have any special knowledge of
> >         call void @f(i32* %t) "unknown"(i32* %t)
> >         store i32 42, i32* %t
> >         call void @opaque_runtime_fn();
> >         ret (load i32, i32* %t)
> >       }
> >
> >     Normally (without the `"unknown"` bundle) it would be okay to
> >     optimize `@g` to return `42`. But the `"unknown"` operand bundle
> >     escapes `%t`, and the call to `@opaque_runtime_fn` can therefore
> >     modify the location pointed to by `%t`.
> >
> >  2. Calls and invokes with operand bundles have unknown read / write
> >     effect on the heap on entry and exit (even if the call target is
> >     `readnone` or `readonly`). For instance:
> >
> >       define void @f(i32* %v) { }
> >
> >       define i32 @g() {
> >         %t = i32* @malloc(...)
> >         %t.unescaped = i32* @malloc(...)
> >         ;; "unknown" is a tag LLVM does not have any special knowledge of
> >         call void @f(i32* %t) "unknown"(i32* %t)
> >         ret (load i32, i32* %t)
> >       }
> >
> >     Normally it would be okay to optimize `@g` to return `undef`, but
> >     the `"unknown"` bundle potentially clobbers `%t`. Note that it
> >     clobbers `%t` only because it was *also escaped* by the
> >     `"unknown"` operand bundle -- it does not clobber `%t.unescaped`
> >     because it isn't reachable from the heap yet.
> >
> >     However, it is okay to optimize
> >
> >       define void @f(i32* %v) {
> >         store i32 10, i32* %v
> >         print(load i32, i32* %v)
> >       }
> >
> >       define void @g() {
> >         %t = ...
> >         ;; "unknown" is a tag LLVM does not have any special knowledge of
> >         call void @f(i32* %t) "unknown"()
> >       }
> >
> >     to
> >
> >       define void @f(i32* %v) {
> >         store i32 10, i32* %v
> >         print(10)
> >       }
> >
> >       define void @g() {
> >         %t = ...
> >         call void @f(i32* %t) "unknown"()
> >       }
> >
> >     The arbitrary heap clobbering only happens on the boundaries of
> >     the call operation, and therefore we can still do store-load
> >     forwarding *within* `@f`.
> >
> > Since we haven't specified any "pure" LLVM way of accessing the
> > contents of operand bundles, the client is required to model such
> > accesses as calls to opaque functions (or inline assembly). This
> > ensures that things like IPSCCP work as intended. E.g. it is legal
> > to optimize
> >
> >     define i32 @f(i32* %v) { ret i32 10 }
> >
> >     define void @g() {
> >       %t = i32* @malloc(...)
> >       %v = call i32 @f(i32* %t) "unknown"(i32* %t)
> >       print(%v)
> >     }
> >
> > to
> >
> >     define i32 @f(i32* %v) { ret i32 10 }
> >
> >     define void @g() {
> >       %t = i32* @malloc(...)
> >       %v = call i32 @f(i32* %t) "unknown"(i32* %t)
> >       print(10)
> >     }
> >
> > LLVM won't generally be able to inline through calls and invokes
> > with operand bundles -- the inliner does not know what to replace
> > the arbitrary heap accesses implied on function entry and exit with.
> > However, we intend to teach the inliner to inline through calls /
> > invokes with some specific kinds of operand bundles.
> >
> > # Lowering
> >
> > The lowering strategy will be special cased for each bundle tag.
> > There won't be any "generic" lowering strategy -- `llc` is expected
> > to abort if it sees an operand bundle that it does not understand.
> >
> > There is no requirement that the operand bundles actually make it
> > to the backend. Rewriting the operand bundles into "vanilla" LLVM IR
> > at some point in the pipeline (instead of teaching codegen to lower
> > them) is a perfectly reasonable lowering strategy.
> >
> > # Example use cases
> >
> > A couple of usage scenarios are very briefly described below:
> >
> > ## Deoptimization
> >
> > This is our motivating use case. Some managed environments expect
> > to be able to discover the state of the abstract virtual machine at
> > specific call sites. LLVM will be able to support this requirement
> > by attaching a `"deopt"` operand bundle containing the state of the
> > abstract virtual machine (as a vector of SSA values) at the
> > appropriate call sites. There is a straightforward way to extend
> > the inliner to work with `"deopt"` operand bundles.
> >
> > `"deopt"` operand bundles will not have to be as pessimistic about
> > heap effects as the general "unknown operand bundle" case -- they
> > only imply a read from the entire heap on function entry or function
> > exit, depending on what kind of deoptimization state we're
> > interested in. They also don't imply escaping semantics.
> >
> > ## Value injection
> >
> > By passing in one or more `alloca`s to an `"injectable-value"`
> > tagged operand bundle, languages can allow the runtime to overwrite
> > the values of specific variables, while still preserving a
> > significant amount of optimization potential.
> >
> > Thoughts?
>
> This seems like a pretty useful, generic, call-site annotation
> mechanism.

Agreed. It seems like these would be useful for our existing patchpoints too
(to record the live values for the associated stack map, instead of using
extra intrinsic arguments for them).

-Hal 
> I believe that this has immediate application outside of the context
> of GC.
> Our exception handling personality routine has a desire to know
> whether some code is inside a specific try or catch. We can feed the
> value coming out of our EH pad back into the call-site, making it
> very clear which EH pad the call-site is associated with.
> > -- Sanjoy
> 
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 