Artur Pilipenko via llvm-dev
2021-Jan-11 23:13 UTC
[llvm-dev] [RFC] Introduce the `!nocapture` metadata and "nocapture_use" operand bundle
> On Jan 11, 2021, at 2:40 PM, Johannes Doerfert <johannesdoerfert at gmail.com> wrote:
>
> Hi Artur,
>
> On 1/11/21 4:25 PM, Artur Pilipenko wrote:
>> I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.
>
> Correct.
>
>> So, as a workaround you introduce a use which implicitly reads and writes.
>
> Correct, for now. We could add "readonly"/"writeonly" etc. later on.
>
>> But this might be a more general problem. For example:
>>
>> a = new ...
>> store a, ptr, !nocapture
>> a' = load ptr
>> ; Now you have 2 pointers to the same object (a' and a) which BasicAA considers as no aliasing.
>> v1 = load a
>> store 5, a'
>> v2 = load a
>>
>> We would happily replace v2 with v1 even though the memory was clobbered by the store through a'.
>
> Right. But that is not strictly speaking a problem. You can build things with the annotation
> that are nonsensical, though, that is nothing new. Especially if you employ the annotations
> alone you might not find a good use case, see https://reviews.llvm.org/D93189#2485826 .

My concern here is that we miscompile a program which is seemingly correct. None of the users of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that the pointer stored is not captured in the sense that all uses of the pointer are explicitly marked otherwise"

a) What do you mean by "uses of the pointer" here? Is it uses of the pointer value stored by the annotated instruction? Is it uses of the memory modified by the store?

b) Does the example above violate this statement somehow?

Basically, what am I doing wrong that I get a miscompile on this example?
Artur

> Note that we do not inline a call with an "unknown" operand bundle, so there is no fear we
> accidentally produce such a situation as you pointed out. A "proper" version of the example
> would be:
>
> ```
> a = new
> store a, ptr, !nocapture
> call foo(ptr, a) !nocapture_use(a)
>
> void foo(arg_ptr, arg_a) {
>   a' = load arg_ptr
>   v1 = load arg_a
>   ...
> }
> ```
> which should be OK.
>
> Does that make sense?
>
> ~ Johannes
>
>> Artur
>>
>>> On Jan 7, 2021, at 4:20 PM, Johannes Doerfert via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>
>>> TL;DR: A pointer stored in memory is not necessarily captured, let's add a way to express this.
>>>
>>> Phab: https://reviews.llvm.org/D93189
>>>
>>> --- Commit Message / Rationale ---
>>>
>>> Runtime functions, as well as regular functions, might require a pointer
>>> to be passed in memory even though the memory is simply a means to pass
>>> (multiple) arguments. That is, the indirection through memory is only
>>> used on the call edge and not otherwise relevant. However, such pointers
>>> are currently assumed to escape as soon as they are stored in memory
>>> even if the callee only reloads them and uses them in a "non-escaping" way.
>>> Generally, storing a pointer might not cause it to escape if all "uses of
>>> the memory" it is stored to have the "nocapture" property.
>>>
>>> To allow optimizations in the presence of pointers stored to memory we
>>> introduce two new IR extensions. `!nocapture` metadata on stores and
>>> "nocapture_use" operand bundles for call(base) instructions. The former
>>> ensures that the store can be ignored for the purpose of escape
>>> analysis. The latter indicates that a call is using a pointer value
>>> but not capturing it. This is important as the call might still read
>>> or write the pointer and since the passing of the pointer through
>>> memory is not considered "capturing" with the "nocapture" metadata,
>>> we need to otherwise indicate the potential read/write.
>>>
>>> As an example use case where we can deduce `!nocapture` metadata,
>>> consider the following code:
>>>
>>> ```
>>> struct Payload {
>>>   int *a;
>>>   double *b;
>>> };
>>>
>>> int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
>>>                    void *(*start_routine) (void *), void *arg);
>>>
>>> int use(double);
>>>
>>> void fn(void *v) {
>>>   Payload *p = (Payload*)(v);
>>>   // Load the pointers from the payload and then dereference them,
>>>   // this will not capture the pointers.
>>>   int *a = p->a;
>>>   double *b = p->b;
>>>   *a = use(*b);
>>> }
>>>
>>> void foo(int *a, double *b) {
>>>   Payload p = {a, b};
>>>   pthread_create(..., &fn, &p);
>>> }
>>> ```
>>>
>>> Given the usage of the payload struct in `fn` we can conclude neither
>>> `a` nor `b` are captured in `foo`, however we could not express this
>>> fact "locally" before. That is, we can deduce and annotate it for the
>>> arguments `a` and `b` but only since there is no other use (later on).
>>> Similarly, if the callee would not be known, we were not able to
>>> describe the "nocapture" behavior of the API.
>>>
>>> A follow up patch will introduce `!nocapture` metadata to stores
>>> generated during OpenMP lowering. This will, among other things, fix
>>> PR48475. I generally expect us to find more APIs that could benefit from
>>> the annotation in addition to the deduction we can do if we see the callee.
>>>
>>> ---
>>>
>>> As always, feedback is welcome. Feel free to look at the phab patch as well.
>>>
>>> Thanks,
>>> Johannes
>>>
>>>
>>> --
>>> ──────────
>>> ∽ Johannes
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
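
For readers following the aliasing concern raised in this message, here is a minimal C++ sketch of the same load/store sequence at source level. It does not use `!nocapture`, which has no C++ spelling; `slot` and `reload_and_clobber` are hypothetical names standing in for `ptr` and for the example as a whole. It only shows the behavior a correct compilation has to preserve: the second load of `a` must observe the store made through the reloaded pointer.

```cpp
#include <cassert>
#include <utility>

// Source-level analogue of the IR example in this message.
// Returns the two loaded values {v1, v2}.
std::pair<int, int> reload_and_clobber() {
    int *a = new int(1);   // a = new ...
    int *slot = nullptr;   // hypothetical memory location standing in for `ptr`
    int **ptr = &slot;

    *ptr = a;              // store a, ptr   (the store that would carry !nocapture)
    int *a2 = *ptr;        // a' = load ptr  (a second name for the same object)
    assert(a2 == a);       // both pointers refer to the same allocation

    int v1 = *a;           // v1 = load a
    *a2 = 5;               // store 5, a'
    int v2 = *a;           // v2 = load a, must see the store through a'

    delete a;
    return {v1, v2};
}
```

Calling `reload_and_clobber()` yields the pair (1, 5). Replacing `v2` with `v1` would yield (1, 1) instead, which is the outcome this message is asking about.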
Johannes Doerfert via llvm-dev
2021-Jan-11 23:46 UTC
[llvm-dev] [RFC] Introduce the `!nocapture` metadata and "nocapture_use" operand bundle
On 1/11/21 5:13 PM, Artur Pilipenko wrote:
>
>> On Jan 11, 2021, at 2:40 PM, Johannes Doerfert <johannesdoerfert at gmail.com> wrote:
>>
>> Hi Artur,
>>
>> On 1/11/21 4:25 PM, Artur Pilipenko wrote:
>>> I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.
>> Correct.
>>
>>
>>> So, as a workaround you introduce a use which implicitly reads and writes.
>> Correct, for now. We could add "readonly"/"writeonly" etc. later on.
>>
>>
>>> But this might be a more general problem. For example:
>>>
>>> a = new ...
>>> store a, ptr, !nocapture
>>> a' = load ptr
>>> ; Now you have 2 pointers to the same object (a' and a) which BasicAA considers as no aliasing.
>>> v1 = load a
>>> store 5, a'
>>> v2 = load a
>>>
>>> We would happily replace v2 with v1 even though the memory was clobbered by the store through a'.
>> Right. But that is not strictly speaking a problem. You can build things with the annotation
>> that are nonsensical, though, that is nothing new. Especially if you employ the annotations
>> alone you might not find a good use case, see https://reviews.llvm.org/D93189#2485826 .
> My concern here is that we miscompile a program which is seemingly correct. None of the users
> of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.
>
> It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
> "``!nocapture`` metadata on the instruction tells the optimizer that the pointer
> stored is not captured in the sense that all uses of the pointer are explicitly
> marked otherwise"
>
> a) What do you mean by "uses of the pointer" here? Is it uses of the pointer value stored by the
> annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer.
So if you never load the pointer from the location you stored it to using `!nocapture`, there are no uses and "all uses" are explicitly marked. If you do load it, you should make sure the use is "explicitly marked otherwise" because you do not get a "dependence edge" from the `store %a %ptr !nocapture` to `%a' = load %ptr`.

In your example, that explicit use is missing. So you load `ptr` but that instruction is not annotated with an explicit use of `a`. Now, this could actually be OK, depending on the use, but it is unlikely to be what you want. If we had operand bundles on loads you could do: `%a' = load %ptr ["nocapture_use"(%a)]`, which would correspond to the intended use `call @f(%ptr) ["nocapture_use"(%a)]`. That way you would be able to communicate `%a` through memory (here `%ptr`) without causing it to be captured. (This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use" but I resisted so far as it would be more complex. On the other hand, it would make it clear that usage of only one of them is, so far, discouraged since it can easily lead to unexpected results.

> b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated, as stated right now. Using the annotation changes the program semantics; if that change is not what you wanted, that is a problem but not a miscompile (IMHO).

>
> Basically, what am I doing wrong that I get a miscompile on this example?

You don't get the clobber because you did not explicitly mark the use of `%a` in `%a'`.

WDYT?

~ Johannes

>
> Artur
>
>> Note that we do not inline a call with an "unknown" operand bundle, so there is no fear we
>> accidentally produce such a situation as you pointed out. A "proper" version of the example
>> would be:
>>
>> ```
>> a = new
>> store a, ptr, !nocapture
>> call foo(ptr, a) !nocapture_use(a)
>>
>> void foo(arg_ptr, arg_a) {
>>   a' = load arg_ptr
>>   v1 = load arg_a
>>   ...
>> }
>> ```
>> which should be OK.
>>
>> Does that make sense?
>>
>> ~ Johannes
>>
>>
>>> Artur
>>>
>>>> On Jan 7, 2021, at 4:20 PM, Johannes Doerfert via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> TL;DR: A pointer stored in memory is not necessarily captured, let's add a way to express this.
>>>>
>>>> Phab: https://reviews.llvm.org/D93189
>>>>
>>>> --- Commit Message / Rationale ---
>>>>
>>>> Runtime functions, as well as regular functions, might require a pointer
>>>> to be passed in memory even though the memory is simply a means to pass
>>>> (multiple) arguments. That is, the indirection through memory is only
>>>> used on the call edge and not otherwise relevant. However, such pointers
>>>> are currently assumed to escape as soon as they are stored in memory
>>>> even if the callee only reloads them and uses them in a "non-escaping" way.
>>>> Generally, storing a pointer might not cause it to escape if all "uses of
>>>> the memory" it is stored to have the "nocapture" property.
>>>>
>>>> To allow optimizations in the presence of pointers stored to memory we
>>>> introduce two new IR extensions. `!nocapture` metadata on stores and
>>>> "nocapture_use" operand bundles for call(base) instructions. The former
>>>> ensures that the store can be ignored for the purpose of escape
>>>> analysis. The latter indicates that a call is using a pointer value
>>>> but not capturing it. This is important as the call might still read
>>>> or write the pointer and since the passing of the pointer through
>>>> memory is not considered "capturing" with the "nocapture" metadata,
>>>> we need to otherwise indicate the potential read/write.
>>>>
>>>> As an example use case where we can deduce `!nocapture` metadata,
>>>> consider the following code:
>>>>
>>>> ```
>>>> struct Payload {
>>>>   int *a;
>>>>   double *b;
>>>> };
>>>>
>>>> int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
>>>>                    void *(*start_routine) (void *), void *arg);
>>>>
>>>> int use(double);
>>>>
>>>> void fn(void *v) {
>>>>   Payload *p = (Payload*)(v);
>>>>   // Load the pointers from the payload and then dereference them,
>>>>   // this will not capture the pointers.
>>>>   int *a = p->a;
>>>>   double *b = p->b;
>>>>   *a = use(*b);
>>>> }
>>>>
>>>> void foo(int *a, double *b) {
>>>>   Payload p = {a, b};
>>>>   pthread_create(..., &fn, &p);
>>>> }
>>>> ```
>>>>
>>>> Given the usage of the payload struct in `fn` we can conclude neither
>>>> `a` nor `b` are captured in `foo`, however we could not express this
>>>> fact "locally" before. That is, we can deduce and annotate it for the
>>>> arguments `a` and `b` but only since there is no other use (later on).
>>>> Similarly, if the callee would not be known, we were not able to
>>>> describe the "nocapture" behavior of the API.
>>>>
>>>> A follow up patch will introduce `!nocapture` metadata to stores
>>>> generated during OpenMP lowering. This will, among other things, fix
>>>> PR48475. I generally expect us to find more APIs that could benefit from
>>>> the annotation in addition to the deduction we can do if we see the callee.
>>>>
>>>> ---
>>>>
>>>> As always, feedback is welcome. Feel free to look at the phab patch as well.
>>>>
>>>> Thanks,
>>>> Johannes
>>>>
>>>>
>>>> --
>>>> ──────────
>>>> ∽ Johannes
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
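
The call-edge pattern that the RFC and the "proper" version of the example describe can be sketched in plain C++ as follows. This is an illustration under assumptions, not code from the patch: `run_callback` is a hypothetical stand-in for a runtime entry point such as `pthread_create` and simply invokes the callback synchronously, the body of `use` is made up so the example can run, and `foo` returns a value only to make the result observable. The pointers `a` and `b` are stored into `p` solely to get them across the call, and `fn` reloads and dereferences them without keeping them anywhere.

```cpp
#include <cassert>

struct Payload {
    int *a;
    double *b;
};

// Hypothetical runtime entry point: forwards `arg` to `fn` and retains nothing.
void run_callback(void (*fn)(void *), void *arg) { fn(arg); }

// Made-up body so the sketch is runnable; the RFC only declares `use`.
int use(double d) { return static_cast<int>(d) + 1; }

void fn(void *v) {
    Payload *p = static_cast<Payload *>(v);
    assert(p != nullptr);
    // Reload the pointers from the payload and dereference them;
    // neither pointer is stored anywhere else, so neither is captured.
    int *a = p->a;
    double *b = p->b;
    *a = use(*b);
}

int foo(int *a, double *b) {
    Payload p = {a, b};     // pointers written to memory only to pass them along
    run_callback(&fn, &p);  // the callee reads and writes through a and b
    return *a;              // returned only so the effect can be checked
}
```

With `int x = 0; double y = 41.0;`, `foo(&x, &y)` returns 42 and leaves `x == 42`. As I read the RFC, the stores that initialize `p` are the ones that would carry `!nocapture`, and the call is where a "nocapture_use" bundle naming the stored pointers would go, so that the read of `*b` and the write to `*a` stay visible to alias analysis.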