Sanjoy Das via llvm-dev
2015-Aug-10 03:32 UTC
[llvm-dev] RFC: Add "operand bundles" to calls and invokes
We'd like to propose a scheme to attach "operand bundles" to call and invoke instructions. This is based on the offline discussion mentioned in http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html.

# Motivation & Definition

Our motivation behind this is to track the state required for deoptimization (described briefly later) through the LLVM pipeline as a first-class IR citizen. We want to do this in a way that is generally useful.

An "operand bundle" is a set of SSA values (called "bundle operands") tagged with a string (called the "bundle tag"). One or more such bundles may be attached to a call or an invoke. The intended use of these values is to support "frame introspection"-like functionality for managed languages.

# Abstract Syntax

The syntax of a call instruction will be changed to look like this:

    <result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]
        <fnptrval>(<function args>) [operand_bundle*] [fn attrs]

    where operand_bundle = tag '('[ value ] (',' value )* ')'
          value = normal SSA values
          tag = "< some name >"

In other words, after the function arguments we now have an optional list of operand bundles of the form `"< bundle tag >"(bundle attributes, values...)`. There can be more than one operand bundle in a call. Two operand bundles in the same call instruction cannot have the same tag.

We'd do something similar for invokes. I'll omit the invoke syntax from this RFC to keep things brief.

An example:

    define i32 @f(i32 %x) {
    entry:
      %t = add i32 %x, 1
      ret i32 %t
    }

    define void @g(i16 %val, i8* %ptr) {
    entry:
      call i32 @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)
      call i32 @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)
      ret void
    }

Note 1: Operand bundles are *not* part of a function's signature, and a given function may be called from multiple places with different kinds of operand bundles. This reflects the fact that the operand bundles are conceptually a part of the *call*, not the callee being dispatched to.

Note 2: There may be tag specific requirements not mentioned here. E.g. we may add a rule in the future that says operand bundles with the tag `"integer-id"` may only contain exactly one constant integer.

# IR Semantics

Bundle operands (SSA values part of some operand bundle) are normal SSA values. They need to dominate the call or invoke instruction they're being passed into and can be optimized as usual. For instance, LLVM is allowed (and strongly encouraged!) to PRE / LICM a load feeding into an operand bundle if legal.

Operand bundles are characterized by the `"< bundle tag >"` string associated with them.

The overall strategy is:

 1. The semantics are as conservative as is reasonable for operand bundles with tags that LLVM does not have a special understanding of. This way LLVM does not miscompile code by default.

 2. LLVM understands the semantics of operand bundles with certain specific tags more precisely, and can optimize them better.

This RFC talks mainly about (1). We will discuss (2) as we add smarts to LLVM about specific kinds of operand bundles.

The IR-level semantics of an operand bundle with an arbitrary tag are:

 1. The bundle operands passed in to a call escape in unknown ways before transferring control to the callee. For instance:

        declare void @opaque_runtime_fn()

        define void @f(i32* %v) { }

        define i32 @g() {
          %t = i32* @malloc(...)
          ;; "unknown" is a tag LLVM does not have any special knowledge of
          call void @f(i32* %t) "unknown"(i32* %t)

          store i32 42, i32* %t
          call void @opaque_runtime_fn();
          ret (load i32, i32* %t)
        }

    Normally (without the `"unknown"` bundle) it would be okay to optimize `@g` to return `42`. But the `"unknown"` operand bundle escapes `%t`, and the call to `@opaque_runtime_fn` can therefore modify the location pointed to by `%t`.

 2. Calls and invokes with operand bundles have unknown read / write effect on the heap on entry and exit (even if the call target is `readnone` or `readonly`).
    For instance:

        define void @f(i32* %v) { }

        define i32 @g() {
          %t = i32* @malloc(...)
          %t.unescaped = i32* @malloc(...)
          ;; "unknown" is a tag LLVM does not have any special knowledge of
          call void @f(i32* %t) "unknown"(i32* %t)
          ret (load i32, i32* %t)
        }

    Normally it would be okay to optimize `@g` to return `undef`, but the `"unknown"` bundle potentially clobbers `%t`. Note that it clobbers `%t` only because it was *also escaped* by the `"unknown"` operand bundle -- it does not clobber `%t.unescaped` because it isn't reachable from the heap yet.

    However, it is okay to optimize

        define void @f(i32* %v) {
          store i32 10, i32* %v
          print(load i32, i32* %v)
        }

        define void @g() {
          %t = ...
          ;; "unknown" is a tag LLVM does not have any special knowledge of
          call void @f(i32* %t) "unknown"()
        }

    to

        define void @f(i32* %v) {
          store i32 10, i32* %v
          print(10)
        }

        define void @g() {
          %t = ...
          call void @f(i32* %t) "unknown"()
        }

    The arbitrary heap clobbering only happens on the boundaries of the call operation, and therefore we can still do store-load forwarding *within* `@f`.

Since we haven't specified any "pure" LLVM way of accessing the contents of operand bundles, the client is required to model such accesses as calls to opaque functions (or inline assembly). This ensures that things like IPSCCP work as intended. E.g. it is legal to optimize

    define i32 @f(i32* %v) { ret i32 10 }

    define void @g() {
      %t = i32* @malloc(...)
      %v = call i32 @f(i32* %t) "unknown"(i32* %t)
      print(%v)
    }

to

    define i32 @f(i32* %v) { ret i32 10 }

    define void @g() {
      %t = i32* @malloc(...)
      %v = call i32 @f(i32* %t) "unknown"(i32* %t)
      print(10)
    }

LLVM won't generally be able to inline through calls and invokes with operand bundles -- the inliner does not know what to replace the arbitrary heap accesses implied on function entry and exit with. However, we intend to teach the inliner to inline through calls / invokes with some specific kinds of operand bundles.
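
The two conservative rules above (bundle operands escape, and a call carrying a bundle clobbers escaped memory at its boundaries) can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how LLVM's alias analysis is implemented: the names `Store`, `Call`, and `value_at_load` are hypothetical, ordinary call arguments are ignored (as if the callee body were visible and harmless, like the empty `@f` above), and pointers are plain strings.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Store:
    ptr: str
    value: int


@dataclass
class Call:
    # None means "no operand bundle"; () models an empty bundle such as "unknown"().
    bundle_operands: Optional[Tuple[str, ...]] = None
    # True when the optimizer cannot see the callee body (e.g. @opaque_runtime_fn).
    opaque: bool = False


def value_at_load(events, ptr):
    """Return the constant a final load of `ptr` may be folded to, or None."""
    escaped = set()  # pointers that unknown code may reach
    known = {}       # ptr -> last stored constant that is still trustworthy
    for ev in events:
        if isinstance(ev, Store):
            known[ev.ptr] = ev.value
            continue
        has_bundle = ev.bundle_operands is not None
        if has_bundle:
            # Rule 1: bundle operands escape before control transfers.
            escaped.update(ev.bundle_operands)
        if has_bundle or ev.opaque:
            # Rule 2 (and ordinary opaque calls): escaped memory may be rewritten.
            for p in escaped:
                known.pop(p, None)
    return known.get(ptr)


# First example from the RFC: the bundle escapes %t, so the later opaque call
# blocks folding the load to 42.
with_bundle = [Call(bundle_operands=("%t",)), Store("%t", 42), Call(opaque=True)]
without_bundle = [Call(), Store("%t", 42), Call(opaque=True)]
print(value_at_load(with_bundle, "%t"))
print(value_at_load(without_bundle, "%t"))
```

In this model a pointer that never appears in a bundle (like `%t.unescaped`) keeps its known value across any number of bundle-carrying calls, which matches the RFC's remark that only escaped memory is clobbered.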

# Lowering

The lowering strategy will be special cased for each bundle tag. There won't be any "generic" lowering strategy -- `llc` is expected to abort if it sees an operand bundle that it does not understand.

There is no requirement that the operand bundles actually make it to the backend. Rewriting the operand bundles into "vanilla" LLVM IR at some point in the pipeline (instead of teaching codegen to lower them) is a perfectly reasonable lowering strategy.

# Example use cases

A couple of usage scenarios are very briefly described below:

## Deoptimization

This is our motivating use case. Some managed environments expect to be able to discover the state of the abstract virtual machine at specific call sites. LLVM will be able to support this requirement by attaching a `"deopt"` operand bundle containing the state of the abstract virtual machine (as a vector of SSA values) at the appropriate call sites. There is a straightforward way to extend the inliner to work with `"deopt"` operand bundles.

`"deopt"` operand bundles will not have to be as pessimistic about heap effects as the general "unknown operand bundle" case -- they only imply a read from the entire heap on function entry or function exit, depending on what kind of deoptimization state we're interested in. They also don't imply escaping semantics.

## Value injection

By passing in one or more `alloca`s to an `"injectable-value"` tagged operand bundle, languages can allow the runtime to overwrite the values of specific variables, while still preserving a significant amount of optimization potential.

Thoughts?

-- Sanjoy
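
For readers who want to experiment with the abstract syntax proposed in this RFC, here is a minimal sketch of a parser for the `operand_bundle*` part of a call. It is an illustration only and not LLVM's parser: the function name `parse_bundles` and the `allow_duplicate_tags` flag are hypothetical, values are kept as raw strings, and values that themselves contain parentheses are not supported.

```python
import re

# One bundle: a quoted tag followed by a parenthesized, comma-separated value list.
BUNDLE_RE = re.compile(r'"(?P<tag>[^"]*)"\s*\((?P<values>[^()]*)\)')


def parse_bundles(text, allow_duplicate_tags=False):
    """Parse a string such as '"a"(i32 1) "b"()' into [(tag, [values...]), ...]."""
    text = text.strip()
    bundles = []
    pos = 0
    while pos < len(text):
        m = BUNDLE_RE.match(text, pos)
        if m is None:
            raise ValueError(f"malformed operand bundle at offset {pos}")
        values = [v.strip() for v in m.group("values").split(",") if v.strip()]
        bundles.append((m.group("tag"), values))
        pos = m.end()
        # Skip whitespace between consecutive bundles.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    tags = [tag for tag, _ in bundles]
    # The RFC as posted forbids two bundles with the same tag on one call.
    if not allow_duplicate_tags and len(set(tags)) != len(tags):
        raise ValueError("duplicate bundle tag")
    return bundles


print(parse_bundles('"some-bundle"(i32 42) "debug"(i32 100)'))
```

Because tags are plain strings, a caller can look up the bundle it understands by tag and ignore (or reject) the rest, which mirrors the RFC's point that semantics and lowering are decided per tag.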
David Majnemer via llvm-dev
2015-Aug-11 04:38 UTC
[llvm-dev] RFC: Add "operand bundles" to calls and invokes
On Sun, Aug 9, 2015 at 11:32 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:

> We'd like to propose a scheme to attach "operand bundles" to call and
> invoke instructions. This is based on the offline discussion
> mentioned in
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html.
>
> [...]
>
> Thoughts?

This seems like a pretty useful, generic call-site annotation mechanism. I believe that this has immediate application outside of the context of GC.

Our exception handling personality routine has a desire to know whether some code is inside a specific try or catch. We can feed the value coming out of our EH pad back into the call-site, making it very clear which EH pad the call-site is associated with.
Sanjoy Das via llvm-dev
2015-Aug-11 05:10 UTC
[llvm-dev] RFC: Add "operand bundles" to calls and invokes
> This seems like a pretty useful, generic call-site annotation mechanism. I
> believe that this has immediate application outside of the context of GC.

As supporting evidence, let me say that we're not using this for GC either :). We will use it to support deoptimization [1][2][3]. We will continue to support precise relocating garbage collection using statepoints.

I can go into some detail on how we plan to use this for deoptimization if you're interested; I left out most of the deopt-specific bits to avoid cluttering up the main proposal.

[1]: http://www.philipreames.com/Blog/2015/05/20/deoptimization-terminology/
[2]: http://www.oracle.com/technetwork/java/whitepaper-135217.html#dynamic
[3]: https://blog.indutny.com/a.deoptimize-me-not

-- Sanjoy
Philip Reames via llvm-dev
2015-Aug-12 19:24 UTC
[llvm-dev] RFC: Add "operand bundles" to calls and invokes
On 08/09/2015 08:32 PM, Sanjoy Das wrote:

> We'd like to propose a scheme to attach "operand bundles" to call and
> invoke instructions. This is based on the offline discussion
> mentioned in
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html.

I'm (obviously) in support of the overall proposal. :) A few details below.

> [...]
>
> # Abstract Syntax
>
> The syntax of a call instruction will be changed to look like this:
>
>     <result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]
>         <fnptrval>(<function args>) [operand_bundle*] [fn attrs]
>
>     where operand_bundle = tag '('[ value ] (',' value )* ')'
>           value = normal SSA values
>           tag = "< some name >"

tag needs to be "some string name" or <future keyword>. We also need to be clear about what the compatibility guarantees are. If I remember correctly, we discussed something along the following lines:

- String bundle names are entirely version locked to a particular revision of LLVM. They are for experimentation and incremental development. There is no attempt to forward serialize them. In particular, using a string name which is out of sync with the version of LLVM can result in miscompiles.
- Keyword bundle names become first class parts of the IR, they are forward serialized, and fully supported. Obviously, getting an experimental string bundle name promoted to a first class keyword bundle will require broad discussion and buy-in.

We were deliberately trying to parallel the de facto policy around attributes vs string-attributes.

> In other words, after the function arguments we now have an optional
> list of operand bundles of the form `"< bundle tag >"(bundle
> attributes, values...)`. There can be more than one operand bundle in
> a call. Two operand bundles in the same call instruction cannot have
> the same tag.

I don't think we need that last sentence. It should be up to the bundle implementation if that's legal or not. I don't have a strong preference here and we could easily relax this later.

> [...]
>
> 2. Calls and invokes with operand bundles have unknown read / write
>    effect on the heap on entry and exit (even if the call target is
>    `readnone` or `readonly`). For instance:

I don't think we actually need this. I think it would be perfectly fine to require that the frontend ensure that the called function is not readonly if it being readonly would be problematic for the call site. I'm not really opposed to this generalization - I could see it being useful - but I'm worried about the amount of work involved. A *lot* of the optimizer assumes that attributes on a call site are strictly less conservative than the underlying function. Changing that could have a long bug tail. I'd rather defer that work until someone defines an operand bundle type which requires it. The motivating example (deoptimization) doesn't seem to require this.

> [...]
>
> Since we haven't specified any "pure" LLVM way of accessing the
> contents of operand bundles, the client is required to model such
> accesses as calls to opaque functions (or inline assembly).

I'm a bit confused by this section. By "client" do you mean frontend? And what are you trying to allow in the second sentence? The first sentence seems sufficient.

> This ensures that things like IPSCCP work as intended. E.g. it is legal to
> optimize
>
>     define i32 @f(i32* %v) { ret i32 10 }
>
>     define void @g() {
>       %t = i32* @malloc(...)
>       %v = call i32 @f(i32* %t) "unknown"(i32* %t)
>       print(%v)
>     }
>
> to
>
>     define i32 @f(i32* %v) { ret i32 10 }
>
>     define void @g() {
>       %t = i32* @malloc(...)
>       %v = call i32 @f(i32* %t) "unknown"(i32* %t)
>       print(10)
>     }

To say this differently, an operand bundle at a call site cannot change the implementation of the called function. This is not a mechanism for function interposition.

> [...]
>
> `"deopt"` operand bundles will not have to be as pessimistic about
> heap effects as the general "unknown operand bundle" case -- they only
> imply a read from the entire heap on function entry or function exit,
> depending on what kind of deoptimization state we're interested in.
> They also don't imply escaping semantics.

An alternate framing here which would remove the attribute case I was worried about would be to separate the memory and abstract state semantics of deoptimization. If the deopt bundle only described the abstract state and it was up to the frontend to ensure the callee was at least readonly, we wouldn't need to model memory in the deopt bundle. I think that's a much better starting place.

> ## Value injection
>
> By passing in one or more `alloca`s to an `"injectable-value"` tagged
> operand bundle, languages can allow the runtime to overwrite the
> values of specific variables, while still preserving a significant
> amount of optimization potential.

To be clear, this was intended to model use cases like Python's ability to inject values into caller frames.

> Thoughts?
> -- Sanjoy
Sanjoy Das via llvm-dev
2015-Aug-12 21:58 UTC
[llvm-dev] RFC: Add "operand bundles" to calls and invokes
> tag needs to be "some string name" or <future keyword>. We also need to be > clear about what the compatibility guarantees are. If I remember correctly, > we discussed something along the following: > - string bundle names are entirely version locked to particular revision of > LLVM. They are for experimentation and incremental development. There is > no attempt to forward serialize them. In particular, using a string name > which is out of sync with the version of LLVM can result in miscompiles. > - keyword bundle names become first class parts of the IR, they are forward > serialized, and fully supported. Obviously, getting an experimental string > bundle name promoted to a first class keyword bundle will require broad > discussion and buy in. > > We were deliberately trying to parallel the defacto policy around attributes > vs string-attributes.Agreed.>> In other words, after the function arguments we now have an optional >> list of operand bundles of the form `"< bundle tag >"(bundle >> attributes, values...)`. There can be more than one operand bundle in >> a call. Two operand bundles in the same call instruction cannot have >> the same tag. > > I don't think we need that last sentence. It should be up to the bundle > implementation if that's legal or not. I don't have a strong preference > here and we could easily relax this later.I'll remove the restriction. I think it is reasonable to have this decided per bundle type, as you suggested.>> 2. Calls and invokes with operand bundles have unknown read / write >> effect on the heap on entry and exit (even if the call target is >> `readnone` or `readonly`). For instance: > > I don't think we actually need this. I think it would be perfectly fine to > require the frontend ensure that the called function is not readonly if it > being readonly would be problematic for the call site. I'm not really > opposed to this generalization - I could see it being useful - but I'm > worried about the amount of work involved. 
A *lot* of the optimizer assumes > that attributes on a call site strictly less conservative than the > underlying function. Changing that could have a long bug tail. I'd rather > defer that work until someone defines an operand bundle type which requires > it. The motivating example (deoptimization) doesn't seem to require this.If we're doing late poll placement and if certain functions are "frameless" in the abstract machine, then we will need this for deoptimization. The case I'm thinking of is: define void @foo() { ;; Can be just about any kind of uncounted loop that is readnone entry: br label %inf_loop inf_loop: br label %inf_loop } define void @caller() { entry: store i32 42, i32* @global call void @foo() "deopt"(i32 100) store i32 46, i32* @global ret void } Right now `@foo` is `readnone`, so the first store of `i32 42` can be DSE'ed. However, if we insert a poll inside `@foo` later, that will have to be given a JVM state, which we cannot do anymore since a store that would have been done by the abstract machine has been elided. [ moved here, because this is related ]>> `"deopt"` operand bundles will not have to be as pessimistic about >> heap effects as the general "unknown operand bundle" case -- they only >> imply a read from the entire heap on function entry or function exit, >> depending on what kind of deoptimization state we're interested in. >> They also don't imply escaping semantics. > > An alternate framing here which would remove the attribute case I was > worried about about would be to separate the memory and abstract state > semantics of deoptimization. If the deopt bundle only described the > abstract state and it was up to the frontend to ensure the callee was at > least readonly, we wouldn't need to model memory in the deopt bundle. I > think that's a much better starting place.Semantically, I think we need the state of the heap to be consistent at method call boundaries, not within a method boundary. 
For instance, consider this: ;; @global is 0 to start with define void @f() readonly { ;; do whatever call read_only_safepoint_poll() readonly "deopt"( ... deopt state local to @f ...) } define void @g() { call void @f() "deopt"( ... deopt state local to @g ...) if (*@global == 42) { side_effect(); } store i32 42, i32* @global } If we do not have the reads-everything-on-exit property, then this is a valid transform: define void @f() readonly { ;; do whatever call read_only_safepoint_poll() readonly "deopt"( ... deopt state local to @f ...) if (*@global == 42) { side_effect(); } store i32 42, i32* @global } define void @g() { call void @f() "deopt"( ... deopt state local to @g ...) } If we *don't* inline `@f` into `@g`, and `@f` wants to deoptimize `@g` (and only `@g`) after halting the thread at `read_only_safepoint_poll`, we're in trouble. `@f` will execute the store to `@global` before returning, and the deoptimized `@g` will call `side_effect` when it shouldn't have. (Note: I put the `if (*@global == 42)` to make the problem more obvious, but in practice I think doing the same store twice is also problematic). Another way to state this is that even though the state of the heap was consistent at the call to `read_only_safepoint_poll`, it will not be consistent when `@f` returns. Therefore we cannot use a "deopt `@g` on return with vmstate xyz" scheme, unless we model the operand bundle as reading the entire heap on return of `@f` (this would force the state of the heap to be consistent at the point where we actually use the vmstate). There is an analogous case where we have to model the deopt operand bundle as reads-everything-on-entry: if we have cases where we deoptimize on entry. 
IOW, something like this:

  ; @global starts off as 0

  define void @side_exit() readonly {
    call void @deoptimize_my_caller()
    return
  }

  define void @store_field(ref) {
    (*@global)++;
   lbl:
    if (ref == nullptr) {
      call void @side_exit() ;; vm_state = at label lbl
      unreachable
    } else {
      ref->field = 42;
    }
  }

could be transformed to

  define void @side_exit() readonly {
    (*@global)++;
    call void @deoptimize_my_caller()
    return
  }

  define void @store_field(ref) {
   lbl:
    if (ref == nullptr) {
      call void @side_exit() ;; vm_state = at label lbl
      unreachable
    } else {
      (*@global)++;
      ref->field = 42;
    }
  }

Now if `ref` is null and we do not inline `@side_exit` then we will
end up incrementing `@global` twice.

In practice I think we can work around these issues by marking
`@side_exit` and `@f` as external, so that inter-procedural code
motion does not happen, but

 a. That would be a workaround; the semantic issues will still exist.
 b. LLVM is still free to specialize external functions.

As a meta point, I think the right way to view operand bundles is as
something that *happens* before and after a call / invoke, not as a
set of values being passed around.  For that reason, do you think they
should be renamed to be something else?

>> Since we haven't specified any "pure" LLVM way of accessing the
>> contents of operand bundles, the client is required to model such
>> accesses as calls to opaque functions (or inline assembly).
>
> I'm a bit confused by this section.  By "client" do you mean frontend?  And
> what are you trying to allow in the second sentence?  The first sentence
> seems sufficient.
>>
>> This
>> ensures that things like IPSCCP work as intended.  E.g. it is legal to
>> optimize
>>
> To say this differently, an operand bundle at a call site can not change the
> implementation of the called function.
> This is not a mechanism for function
> interposition.

I was really trying to say "whatever the optimizer directly
understands about the IR is correct", so you're right, this is about
disallowing arbitrary function interposition.

-- Sanjoy
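
As a rough illustration of the reads-everything-on-exit hazard described in the message above (the tail of `@g` being sunk into `@f`, after which `@f` deoptimizes `@g`), here is a toy Python model. It is only a sketch and not LLVM behavior: the dictionary used as the heap, the `log` list, and every function name are assumptions made for illustration.

```python
# Toy model of the store-sinking hazard. All names are hypothetical;
# "heap" stands in for @global and "log" records calls to side_effect().

def tail_of_g(heap, log):
    # The code that follows the call to @f in the original @g:
    #   if (*@global == 42) { side_effect(); }
    #   store i32 42, i32* @global
    if heap["global"] == 42:
        log.append("side_effect")
    heap["global"] = 42


def run_original():
    # Original program: @f only polls (no observable effect here),
    # then @g runs its own tail exactly once.
    heap, log = {"global": 0}, []
    tail_of_g(heap, log)
    return heap, log


def run_transformed_with_deopt():
    # Transformed program: the tail of @g was sunk into @f.
    # @f decides at its poll to deoptimize @g, but @f itself keeps
    # running compiled code, so the sunk tail executes inside @f.
    heap, log = {"global": 0}, []
    tail_of_g(heap, log)
    # @f returns; the deoptimized @g resumes in the interpreter at
    # "just after the call to @f" and runs the same tail again.
    tail_of_g(heap, log)
    return heap, log


if __name__ == "__main__":
    print(run_original()[1])
    print(run_transformed_with_deopt()[1])
```

In this model the original program never records a `side_effect` call, while the transformed program records one, because the interpreter re-runs the tail against a heap that the compiled `@f` has already modified.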
Hal Finkel via llvm-dev
2015-Aug-19 08:52 UTC
[llvm-dev] RFC: Add "operand bundles" to calls and invokes
----- Original Message -----
> From: "David Majnemer" <david.majnemer at gmail.com>
> To: "Sanjoy Das" <sanjoy at playingwithpointers.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip Reames"
> <listmail at philipreames.com>, "Chandler Carruth"
> <chandlerc at gmail.com>, "Nick Lewycky" <nlewycky at google.com>, "Hal
> Finkel" <hfinkel at anl.gov>, "Chen Li" <meloli87 at gmail.com>, "Russell
> Hadley" <rhadley at microsoft.com>, "Kevin Modzelewski"
> <kmod at dropbox.com>, "Swaroop Sridhar"
> <Swaroop.Sridhar at microsoft.com>, rudi at dropbox.com, "Pat Gavlin"
> <pagavlin at microsoft.com>, "Joseph Tremoulet" <jotrem at microsoft.com>,
> "Reid Kleckner" <rnk at google.com>
> Sent: Monday, August 10, 2015 11:38:32 PM
> Subject: Re: RFC: Add "operand bundles" to calls and invokes
>
> On Sun, Aug 9, 2015 at 11:32 PM, Sanjoy Das
> <sanjoy at playingwithpointers.com> wrote:
>
> > We'd like to propose a scheme to attach "operand bundles" to call and
> > invoke instructions.  This is based on the offline discussion
> > mentioned in
> > http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html .
> >
> > # Motivation & Definition
> >
> > Our motivation behind this is to track the state required for
> > deoptimization (described briefly later) through the LLVM pipeline as
> > a first-class IR citizen.  We want to do this in a way that is
> > generally useful.
> >
> > An "operand bundle" is a set of SSA values (called "bundle operands")
> > tagged with a string (called the "bundle tag").  One or more of such
> > bundles may be attached to a call or an invoke.  The intended use of
> > these values is to support "frame introspection"-like functionality
> > for managed languages.
> >
> > # Abstract Syntax
> >
> > The syntax of a call instruction will be changed to look like this:
> >
> >   <result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]
> >       <fnptrval>(<function args>) [operand_bundle*] [fn attrs]
> >
> >   where operand_bundle = tag '('[ value ] (',' value )* ')'
> >         value = normal SSA values
> >         tag = "< some name >"
> >
> > In other words, after the function arguments we now have an optional
> > list of operand bundles of the form `"< bundle tag >"(bundle
> > attributes, values...)`.  There can be more than one operand bundle in
> > a call.  Two operand bundles in the same call instruction cannot have
> > the same tag.
> >
> > We'd do something similar for invokes.  I'll omit the invoke syntax
> > from this RFC to keep things brief.
> >
> > An example:
> >
> >   define i32 @f(i32 %x) {
> >   entry:
> >     %t = add i32 %x, 1
> >     ret i32 %t
> >   }
> >
> >   define void @g(i16 %val, i8* %ptr) {
> >   entry:
> >     call void @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)
> >     call void @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)
> >   }
> >
> > Note 1: Operand bundles are *not* part of a function's signature, and
> > a given function may be called from multiple places with different
> > kinds of operand bundles.  This reflects the fact that the operand
> > bundles are conceptually a part of the *call*, not the callee being
> > dispatched to.
> >
> > Note 2: There may be tag specific requirements not mentioned here.
> > E.g. we may add a rule in the future that says operand bundles with
> > the tag `"integer-id"` may only contain exactly one constant integer.
> >
> > # IR Semantics
> >
> > Bundle operands (SSA values part of some operand bundle) are normal
> > SSA values.  They need to dominate the call or invoke instruction
> > they're being passed into and can be optimized as usual.  For
> > instance, LLVM is allowed (and strongly encouraged!)
> > to PRE / LICM a
> > load feeding into an operand bundle if legal.
> >
> > Operand bundles are characterized by the `"< bundle tag >"` string
> > associated with them.
> >
> > The overall strategy is:
> >
> >  1. The semantics are as conservative as is reasonable for operand
> >     bundles with tags that LLVM does not have a special understanding
> >     of.  This way LLVM does not miscompile code by default.
> >
> >  2. LLVM understands the semantics of operand bundles with certain
> >     specific tags more precisely, and can optimize them better.
> >
> > This RFC talks mainly about (1).  We will discuss (2) as we add smarts
> > to LLVM about specific kinds of operand bundles.
> >
> > The IR-level semantics of an operand bundle with an arbitrary tag are:
> >
> >  1. The bundle operands passed in to a call escape in unknown ways
> >     before transferring control to the callee.  For instance:
> >
> >       declare void @opaque_runtime_fn()
> >
> >       define void @f(i32* %v) { }
> >
> >       define i32 @g() {
> >         %t = i32* @malloc(...)
> >         ;; "unknown" is a tag LLVM does not have any special knowledge of
> >         call void @f(i32* %t) "unknown"(i32* %t)
> >
> >         store i32 42, i32* %t
> >         call void @opaque_runtime_fn();
> >         ret (load i32, i32* %t)
> >       }
> >
> >     Normally (without the `"unknown"` bundle) it would be okay to
> >     optimize `@g` to return `42`.  But the `"unknown"` operand bundle
> >     escapes `%t`, and the call to `@opaque_runtime_fn` can therefore
> >     modify the location pointed to by `%t`.
> >
> >  2. Calls and invokes with operand bundles have unknown read / write
> >     effect on the heap on entry and exit (even if the call target is
> >     `readnone` or `readonly`).  For instance:
> >
> >       define void @f(i32* %v) { }
> >
> >       define i32 @g() {
> >         %t = i32* @malloc(...)
> >         %t.unescaped = i32* @malloc(...)
> >         ;; "unknown" is a tag LLVM does not have any special knowledge of
> >         call void @f(i32* %t) "unknown"(i32* %t)
> >         ret (load i32, i32* %t)
> >       }
> >
> >     Normally it would be okay to optimize `@g` to return `undef`, but
> >     the `"unknown"` bundle potentially clobbers `%t`.  Note that it
> >     clobbers `%t` only because it was *also escaped* by the
> >     `"unknown"` operand bundle -- it does not clobber `%t.unescaped`
> >     because it isn't reachable from the heap yet.
> >
> >     However, it is okay to optimize
> >
> >       define void @f(i32* %v) {
> >         store i32 10, i32* %v
> >         print(load i32, i32* %v)
> >       }
> >
> >       define void @g() {
> >         %t = ...
> >         ;; "unknown" is a tag LLVM does not have any special knowledge of
> >         call void @f(i32* %t) "unknown"()
> >       }
> >
> >     to
> >
> >       define void @f(i32* %v) {
> >         store i32 10, i32* %v
> >         print(10)
> >       }
> >
> >       define void @g() {
> >         %t = ...
> >         call void @f(i32* %t) "unknown"()
> >       }
> >
> >     The arbitrary heap clobbering only happens on the boundaries of
> >     the call operation, and therefore we can still do store-load
> >     forwarding *within* `@f`.
> >
> > Since we haven't specified any "pure" LLVM way of accessing the
> > contents of operand bundles, the client is required to model such
> > accesses as calls to opaque functions (or inline assembly).  This
> > ensures that things like IPSCCP work as intended.  E.g. it is legal to
> > optimize
> >
> >   define i32 @f(i32* %v) { ret i32 10 }
> >
> >   define void @g() {
> >     %t = i32* @malloc(...)
> >     %v = call i32 @f(i32* %t) "unknown"(i32* %t)
> >     print(%v)
> >   }
> >
> > to
> >
> >   define i32 @f(i32* %v) { ret i32 10 }
> >
> >   define void @g() {
> >     %t = i32* @malloc(...)
> >     %v = call i32 @f(i32* %t) "unknown"(i32* %t)
> >     print(10)
> >   }
> >
> > LLVM won't generally be able to inline through calls and invokes with
> > operand bundles -- the inliner does not know what to replace the
> > arbitrary heap accesses implied on function entry and exit with.
> > However, we intend to teach the inliner to inline through calls /
> > invokes with some specific kinds of operand bundles.
> >
> > # Lowering
> >
> > The lowering strategy will be special cased for each bundle tag.
> > There won't be any "generic" lowering strategy -- `llc` is expected to
> > abort if it sees an operand bundle that it does not understand.
> >
> > There is no requirement that the operand bundles actually make it to
> > the backend.  Rewriting the operand bundles into "vanilla" LLVM IR at
> > some point in the pipeline (instead of teaching codegen to lower them)
> > is a perfectly reasonable lowering strategy.
> >
> > # Example use cases
> >
> > A couple of usage scenarios are very briefly described below:
> >
> > ## Deoptimization
> >
> > This is our motivating use case.  Some managed environments expect to
> > be able to discover the state of the abstract virtual machine at
> > specific call sites.  LLVM will be able to support this requirement by
> > attaching a `"deopt"` operand bundle containing the state of the
> > abstract virtual machine (as a vector of SSA values) at the
> > appropriate call sites.  There is a straightforward way to extend the
> > inliner to work with `"deopt"` operand bundles.
> >
> > `"deopt"` operand bundles will not have to be as pessimistic about
> > heap effects as the general "unknown operand bundle" case -- they only
> > imply a read from the entire heap on function entry or function exit,
> > depending on what kind of deoptimization state we're interested in.
> > They also don't imply escaping semantics.
> >
> > ## Value injection
> >
> > By passing in one or more `alloca`s to an `"injectable-value"` tagged
> > operand bundle, languages can allow the runtime to overwrite the
> > values of specific variables, while still preserving a significant
> > amount of optimization potential.
> >
> > Thoughts?
>
> This seems like a pretty useful, generic, call-site annotation mechanism.

Agreed. It seems like these would be useful for our existing
patchpoints too (to record the live values for the associated stack
map, instead of using extra intrinsic arguments for them).

 -Hal

> I believe that this has immediate application outside of the context
> of GC.
>
> Our exception handling personality routine has a desire to know
> whether some code is inside a specific try or catch. We can feed the
> value coming out of our EH pad back into the call-site, making it
> very clear which EH pad the call-site is associated with.
>
> > -- Sanjoy

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory