thr3ads.net - llvm dev - [llvm-dev] [RFC] Introduce the `!nocapture` metadata and "nocapture

If this information is useful, please help other people find it:
Share via:

Johannes Doerfert via llvm-dev

2021-Jan-19 00:09 UTC

[llvm-dev] [RFC] Introduce the `!nocapture` metadata and "nocapture_use" operand bundle

On 1/12/21 8:41 PM, Artur Pilipenko wrote:>> On Jan 11, 2021, at 9:44 PM, Johannes Doerfert<johannesdoerfert at
gmail.com>  wrote:
>>
>> The email formatting is somewhat broken. Might be my client.
>> I try to answer inline, if I missed something or there is any other
problem, let me know.
>>
>>
>> On 1/11/21 6:39 PM, Artur Pilipenko wrote:
>>> On Jan 11, 2021, at 3:46 PM, Johannes Doerfert <johannesdoerfert
at gmail.com<mailto:johannesdoerfert at gmail.com>> wrote:
>>>
>>>
>>> On 1/11/21 5:13 PM, Artur Pilipenko wrote:
>>>
>>> On Jan 11, 2021, at 2:40 PM, Johannes Doerfert <johannesdoerfert
at gmail.com<mailto:johannesdoerfert at gmail.com>> wrote:
>>>
>>> Hi Artur,
>>>
>>> On 1/11/21 4:25 PM, Artur Pilipenko wrote:
>>> I'm a bit confused with nocapture_use. I guess you need this
because without it BasicAA would assume that the pointer is not accessed by the
call at all.
>>> Correct.
>>>
>>>
>>>   So, as a workaround you introduce a use which implicitly reads
and writes.
>>> Correct, for now. We could add
"readonly"/"writeonly" etc. later on.
>>>
>>>
>>> But this might be a more general problem. For example:
>>>
>>> a = new ...
>>> store a, ptr, !nocapture
>>> a' = load ptr
>>> ; Now you have 2 pointers to the same object (a' and a ) which
BasicAA considers as no aliasing.
>>> v1 = load a
>>> store 5, a'
>>> v2 = load a
>>>
>>> We would happily replace v2 with v1 even though the memory was
clobbered by the store through a’.
>>> Right. But that is not strictly speaking a problem. You can build
things with the annotation
>>> that are nonsensical, though, that is nothing new. Especially if
you employ the annotations
>>> alone you might not find a good use case,
seehttps://reviews.llvm.org/D93189#2485826  .
>>> My concern here is that we miscompile a program which is seemingly
correct. None of the users
>>> of pointer a escape the pointer. So, I assume it should be correct
to mark the store as !nocapture.
>>>
>>> It looks like you assume a more restrictive definition for
!nocapture. The proposed lang ref says:
>>> "``!nocapture`` metadata on the instruction tells the
optimizer that the pointer
>>> stored is not captured in the sense that all uses of the pointer
are explicitly
>>> marked otherwise”
>>>
>>> a) What do you mean by "uses of the pointer” here? Is it uses
of the pointer value stored by the
>>> Annotated instruction? Is it uses of the memory modified by the
store?
>>>
>>> It is uses of the stored pointer. So if you never load the pointer
from the location
>>> you stored it using `!nocapture`, there are no uses and "all
uses" are explicitly
>>> marked. If you do load it, you should make sure the use is
"explicitly marked otherwise"
>>> because you do not get a "dependence edge" from the
`store %a %ptr !nocapture` to `%a' = load %ptr`.
>>> In your example, that explicit use is missing. So you load `ptr`
but that instruction is
>>> not annotated with an explicit use of `a`. Now, this could actually
be OK, depending on the use,
>>> but unlikely what you want.
>>>
>>> If we would have operand bundles on loads you could do: `%a' =
load %ptr ["nocapture_use"(%a)]`,
>>> which would correspond to the intended use `call @f(%ptr)
["nocapture_use"(%a)]`. That way you would
>>> be able to communicate `%a` through memory (here `%ptr`) without
causing it to be captured.
>>> (This assumes you ensured `%a'` is not captured.)
>>>
>>> I think we could require `!nocapture` to be used with
"nocapture_use" but I resisted so far
>>> as it would be more complex.
>>> On the other hand, it would make it clear that usage of only one of
them is, so far,
>>> discouraged since it can easily lead to unexpected results.
>>> So, basically every use of the value loaded from memory which was
previously stored to as !nocapture
>>> needs to have an annotation indicating that this is an alias of the
original pointer.
>> Not necessarily an alias but a use of the original pointer. More below.
> In general case we might have a pointer which is not the original pointer
but an alias for it.
>
> E.g. a select between the original pointer and some other value:
> a = new ...
> store a, ptr, !nocapture
> a' = load ptr
> s = select c, a', some other ptr
> v1 = load s
>
> Or a derived pointer based off the original pointer:
> a = new ...
> store a, ptr, !nocapture
> a' = load ptr
> gep = gep a', offset
> v1 = load get
>>> Do we need to annotate things like geps/bitcasts?
>>>
>>> What if the use is a phi or a select?
>>> a = new ...
>>> store a, ptr, !nocapture
>>> a' = load ptr
>>> s = select c, a', some other ptr ; do we annotate the select?
>>> v1 = load s ; or this load?
>>>
>>> It looks like currently we don’t have the means to annotate the
uses of the loaded value. We might
>>> need to prohibit loads of !nocapture-stored values altogether (if
this is a load in the same function as
>>> the nocapture store).
>> Right, we could do that.
>>
>>
>>> b) Does the example above violate this statement somehow?
>>>
>>> So far, there is no violation per se possible. The semantics cannot
be violated,
>>> as stated right now. Using the annotation changes the program
semantic, if that change is not
>>> what you wanted, that is a problem but not a miscompile (IMHO).
>>> The way I’m looking at this is I have a program without !nocapture
metadata and operand bundles
>>> and I want to derive it using some analysis.
>>> (We actually have an escape analysis capable of handling object
graphs.
>>>
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf
>>> https://www.youtube.com/watch?v=WHiU2-h_kRM
>>> We are looking into ways we can communicate facts this analysis
computes with the rest of the
>>> optimizer. But we were looking into noalias and alias.scope
metadata for such purpose.)
>>> So, I don’t want to change the behavior using the metadata, I want
to derive it and for that I need
>>> to understand the semantic it implies.
>>>
>>> BTW, in your motivating example, what are the benefits you expect
from the nocapture property?
>> You can mark the pointers in the caller nocapture, with all the
benefits that might bring.
>> For a more direct example, see this
thread:https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
>> Basically, we pass a bunch of pointers to an API. Since we need to pass
an unknown number, we
>> do it with an array of void* and a size. The runtime will only load the
pointers and not cause
>> them to escape, though it looks like they could. The same problem
appears for memory latency hiding
>> when you do an OpenMP target map, basically a bulk device memcpy. Since
you do more than one, you
>> pass a bunch of pointers via a void* array, which causes them to
escape. This causes spurious aliases
>> that we know cannot exist, though the runtime is not (supposed to be)
linked statically and you need
>> domain knowledge to deal with this. I split the OpenMP usage code from
the patch but the initial version
>> still had it:https://reviews.llvm.org/D93189?id=311479  I'll clean
it up and put it on phab again soon.
> In general we want to have a way to express the results of some more
powerful escape analysis in the IR
> so CaptureTracking/BasicAA can take advantage of this analysis. In your
case this analysis is some domain
> specific knowledge, in our case it is a downstream analysis pass.
>
> We can easily express basic facts like capture/nocapture using attributes
and metadata. Things get more
> difficult when we need to express aliasing properties. For example, your
case when the call modifies an
> otherwise unescaped pointer. While the proposed solution works for the
motivational example, I’m afraid
> it’s not really extensible for the general case.
>
> First, we would need to have the ability to attach operand bundles to
instructions of any kind. Second, there
> are open questions, like how do we treat uses which are not a memory
operations? How do we deal with
> aliases? These things are not addressed in the current proposal.
>
> The best idea I had so far regarding expressing the results of our EA in
the IR was to use noalias and
> alias.scope metadata. Essentially every unescaped object can be assigned to
a separate scope and every
> memory operation can be marked as noalias wrt the scopes we know it doesn’t
touch.
>
> In fact, the scheme you propose where every user of a !nocapture pointer is
annotated with the original
> pointer resembles the alias scopes and noalias metadata. The difference is
alias scopes and noalias use
> Indirection through metadata while your proposal relies on operand bundles
to directly link the original
> pointer.
>
> Do you think you can apply something like this to you problem?
There is a fundamental difference with the proposal here and alias 
scopes that is not
"indirection through metadata". `!nocapture` +
`"nocapture_use"` are
independent of the
surrounding. This is crucial because it avoids the scaling problem that 
comes with the
bidirectional approach of `noalias.scopes` + `!noalias` metadata. As 
part of the Fortran
alias discussions and the AA monthly working group, we have seen how 
problematic large
amounts of `noalias` metadata can be. You basically start listing edges 
in your
alias/dependence graph at some point. We are looking at alternative 
solutions but those
are so far unrelated to he use case that is discussed here.

>> Given an escape analysis you might be interested in the same use case.
A pointer is stored but you can
>> see from the uses of the stored-to location that it does not escape. As
you pointed out, if those uses
>> are non-calls we cannot attach "nocapture_use" for now, which
is a problem. On the other hand, if they
>> are in the same function, we could forward the pointer and eliminate
the load, or the memory is already
>> aliasing something and it is unclear how much we can deduce anyway.
> The unfortunately property of this approach is that we can’t simply compute
the analysis and annotate the
> IR with the results. First we need to shape the IR in the way so we can
attach the results of the analysis.
> This kind of shaping (store load forwarding to eliminate extra aliases) may
require the results of this analysis
> itself (this is assuming we want to reuse some of the existing LLVM
transforms to do store load forwarding).
You seem to argue this approach is not a good fit for your analysis, 
which I agree. That seems
to be not a problem on it's own. Do you have an alternative proposal 
that would subsume this
approach?


Partially related: Maybe you want to give a short presentation in our 
next AA call about your
analysis and the ideas you have in this space:

https://docs.google.com/document/d/1ybwEKDVtIbhIhK50qYtwKsL50K-NvB6LfuBsfepBZ9Y/edit?usp=sharing


~ Johannes


> Artur
>> If we would want to use `!nocapture`
>> more fine-grained we could try to come up with a way to tie it to a
"nocapture_use", but that would
>> certainly make it more complicated.
>> ~ Johannes
>>
>>
>>> Artur
>>>
>>>
>>>
>>> Basically, what am I doing wrong that I get a miscompile on this
example?
>>>
>>> You don't get the clobber because you did not explicitly mark
the use of `%a` in `%a'`.
>>>
>>> WDYT?
>>>
>>> ~ Johannes
>>>
>>>
>>>
>>> Artur
>>>
>>> Note that we do not inline a call with an "unkown"
operand bundle, so there is no fear we
>>> accidentally produce such a situation as you pointed out. A
"proper" version of the example
>>> would be:
>>>
>>> ```
>>> a = new
>>> store a, ptr, !nocapture
>>> call foo(ptr, a) !nocapture_use(a)
>>>
>>> void foo(arg_ptr. arg_a) {
>>>    a' = load arg_ptr
>>>    v1 = load arg_a
>>>   ...
>>> }
>>> ```
>>> which should be OK.
>>>
>>> Does that make sense?
>>>
>>> ~ Johannes
>>>
>>>
>>> Artur
>>>
>>> On Jan 7, 2021, at 4:20 PM, Johannes Doerfert via llvm-dev
<llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>>
wrote:
>>>
>>> TL;DR: A pointer stored in memory is not necessarily captured,
let's add a way to express this.
>>>
>>> Phab:https://reviews.llvm.org/D93189
>>>
>>> --- Commit Message / Rational ---
>>>
>>> Runtime functions, as well as regular functions, might require a
pointer
>>> to be passed in memory even though the memory is simply a means to
pass
>>> (multiple) arguments. That is, the indirection through memory is
only
>>> used on the call edge and not otherwise relevant. However, such
pointers
>>> are currently assumed to escape as soon as they are stored in
memory
>>> even if the callee only reloads them and use them in a
"non-escaping" way.
>>> Generally, storing a pointer might not cause it to escape if all
"uses of
>>> the memory" it is stored to all have the "nocapture"
property.
>>>
>>> To allow optimizations in the presence of pointers stored to memory
we
>>> introduce two new IR extensions. `!nocapture` metadata on stores
and
>>> "nocapture_use" operand bundles for call(base)
instructions. The former
>>> ensures that the store can be ignored for the purpose of escape
>>> analysis. The latter indicates that a call is using a pointer value
>>> but not capturing it. This is important as the call might still
read
>>> or write the pointer and since the passing of the pointer through
>>> memory is not considered "capturing" with the
"nocapture" metadata,
>>> we need to otherwise indicate the potential read/write.
>>>
>>> As an example use case where we can deduce `!nocapture` metadata,
>>> consider the following code:
>>>
>>> ```
>>> struct Payload {
>>>   int *a;
>>>   double *b;
>>> };
>>>
>>> int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
>>>                     void *(*start_routine) (void *), void *arg);
>>>
>>> int use(double);
>>>
>>> void fn(void *v) {
>>>   Payload *p = (Payload*)(v);
>>>   // Load the pointers from the payload and then dereference them,
>>>   // this will not capture the pointers.
>>>   int *a = p->a;
>>>   double *b = p->b;
>>>   *a = use(*b);
>>> }
>>>
>>> void foo(int *a, double *b) {
>>>   Payload p = {a, b};
>>>   pthread_create(..., &fn, &p);
>>> }
>>> ```
>>>
>>> Given the usage of the payload struct in `fn` we can conclude
neither
>>> `a` nor `b` in are captured in `foo`, however we could not express
this
>>> fact "locally" before. That is, we can deduce and
annotate it for the
>>> arguments `a` and `b` but only since there is no other use (later
on).
>>> Similarly, if the callee would not be known, we were not able to
>>> describe the "nocapture" behavior of the API.
>>>
>>> A follow up patch will introduce `!nocapture` metadata to stores
>>> generated during OpenMP lowering. This will, among other things,
fix
>>> PR48475. I generally expect us to find more APIs that could benefit
from
>>> the annotation in addition to the deduction we can do if we see the
callee.
>>>
>>> ---
>>>
>>> As always, feedback is welcome. Feel free to look at the phab patch
as well.
>>>
>>> Thanks,
>>>   Johannes
>>>
>>>
>>> --
>>> ──────────
>>> ∽ Johannes
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>

Artur Pilipenko via llvm-dev

2021-Jan-20 22:07 UTC

head link

[llvm-dev] [RFC] Introduce the `!nocapture` metadata and "nocapture_use" operand bundle

On Jan 18, 2021, at 4:09 PM, Johannes Doerfert <johannesdoerfert at
gmail.com<mailto:johannesdoerfert at gmail.com>> wrote:

On 1/12/21 8:41 PM, Artur Pilipenko wrote:
On Jan 11, 2021, at 9:44 PM, Johannes Doerfert<johannesdoerfert at
gmail.com<mailto:johannesdoerfert at gmail.com>>  wrote:

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other problem, let
me know.


On 1/11/21 6:39 PM, Artur Pilipenko wrote:
On Jan 11, 2021, at 3:46 PM, Johannes Doerfert <johannesdoerfert at
gmail.com<mailto:johannesdoerfert at gmail.com><mailto:johannesdoerfert
at gmail.com>> wrote:


On 1/11/21 5:13 PM, Artur Pilipenko wrote:

On Jan 11, 2021, at 2:40 PM, Johannes Doerfert <johannesdoerfert at
gmail.com<mailto:johannesdoerfert at gmail.com><mailto:johannesdoerfert
at gmail.com>> wrote:

Hi Artur,

On 1/11/21 4:25 PM, Artur Pilipenko wrote:
I'm a bit confused with nocapture_use. I guess you need this because without
it BasicAA would assume that the pointer is not accessed by the call at all.
Correct.


 So, as a workaround you introduce a use which implicitly reads and writes.
Correct, for now. We could add "readonly"/"writeonly" etc.
later on.


But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA
considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the
store through a’.
Right. But that is not strictly speaking a problem. You can build things with
the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the
annotations
alone you might not find a good use case,
seehttps://reviews.llvm.org/D93189#2485826  .
My concern here is that we miscompile a program which is seemingly correct. None
of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the
store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The
proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that the
pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the
pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer from the
location
you stored it using `!nocapture`, there are no uses and "all uses" are
explicitly
marked. If you do load it, you should make sure the use is "explicitly
marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr
!nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but that
instruction is
not annotated with an explicit use of `a`. Now, this could actually be OK,
depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load %ptr
["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr)
["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without causing it to
be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use"
but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of them is, so
far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was previously
stored to as !nocapture
needs to have an annotation indicating that this is an alias of the original
pointer.
Not necessarily an alias but a use of the original pointer. More below.
In general case we might have a pointer which is not the original pointer but an
alias for it.

E.g. a select between the original pointer and some other value:
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr
v1 = load s

Or a derived pointer based off the original pointer:
a = new ...
store a, ptr, !nocapture
a' = load ptr
gep = gep a', offset
v1 = load get
Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses of the
loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if this is a load
in the same function as
the nocapture store).
Right, we could do that.


b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated,
as stated right now. Using the annotation changes the program semantic, if that
change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture metadata and
operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf
https://www.youtube.com/watch?v=WHiU2-h_kRM
We are looking into ways we can communicate facts this analysis computes with
the rest of the
optimizer. But we were looking into noalias and alias.scope metadata for such
purpose.)
So, I don’t want to change the behavior using the metadata, I want to derive it
and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect from the
nocapture property?
You can mark the pointers in the caller nocapture, with all the benefits that
might bring.
For a more direct example, see this
thread:https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
Basically, we pass a bunch of pointers to an API. Since we need to pass an
unknown number, we
do it with an array of void* and a size. The runtime will only load the pointers
and not cause
them to escape, though it looks like they could. The same problem appears for
memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy. Since you do
more than one, you
pass a bunch of pointers via a void* array, which causes them to escape. This
causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be) linked
statically and you need
domain knowledge to deal with this. I split the OpenMP usage code from the patch
but the initial version
still had it:https://reviews.llvm.org/D93189?id=311479  I'll clean it up and
put it on phab again soon.
In general we want to have a way to express the results of some more powerful
escape analysis in the IR
so CaptureTracking/BasicAA can take advantage of this analysis. In your case
this analysis is some domain
specific knowledge, in our case it is a downstream analysis pass.

We can easily express basic facts like capture/nocapture using attributes and
metadata. Things get more
difficult when we need to express aliasing properties. For example, your case
when the call modifies an
otherwise unescaped pointer. While the proposed solution works for the
motivational example, I’m afraid
it’s not really extensible for the general case.

First, we would need to have the ability to attach operand bundles to
instructions of any kind. Second, there
are open questions, like how do we treat uses which are not a memory operations?
How do we deal with
aliases? These things are not addressed in the current proposal.

The best idea I had so far regarding expressing the results of our EA in the IR
was to use noalias and
alias.scope metadata. Essentially every unescaped object can be assigned to a
separate scope and every
memory operation can be marked as noalias wrt the scopes we know it doesn’t
touch.

In fact, the scheme you propose where every user of a !nocapture pointer is
annotated with the original
pointer resembles the alias scopes and noalias metadata. The difference is alias
scopes and noalias use
Indirection through metadata while your proposal relies on operand bundles to
directly link the original
pointer.

Do you think you can apply something like this to you problem?

There is a fundamental difference with the proposal here and alias scopes that
is not
"indirection through metadata". `!nocapture` +
`"nocapture_use"` are independent of the
surrounding. This is crucial because it avoids the scaling problem that comes
with the
bidirectional approach of `noalias.scopes` + `!noalias` metadata. As part of the
Fortran
alias discussions and the AA monthly working group, we have seen how problematic
large
amounts of `noalias` metadata can be. You basically start listing edges in your
alias/dependence graph at some point.
It sounds like it's a matter of representation. Conceptually in both cases
we are explicitly providing external
aliasing facts. Note that with the proposed `!nocapture` +
`"nocapture_use”` scheme every user of a `!nocapture`
stored pointer needs to be annotated.

We are looking at alternative solutions but those
are so far unrelated to he use case that is discussed here.


Given an escape analysis you might be interested in the same use case. A pointer
is stored but you can
see from the uses of the stored-to location that it does not escape. As you
pointed out, if those uses
are non-calls we cannot attach "nocapture_use" for now, which is a
problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate the load,
or the memory is already
aliasing something and it is unclear how much we can deduce anyway.
The unfortunately property of this approach is that we can’t simply compute the
analysis and annotate the
IR with the results. First we need to shape the IR in the way so we can attach
the results of the analysis.
This kind of shaping (store load forwarding to eliminate extra aliases) may
require the results of this analysis
itself (this is assuming we want to reuse some of the existing LLVM transforms
to do store load forwarding).

You seem to argue this approach is not a good fit for your analysis, which I
agree. That seems
to be not a problem on its own. Do you have an alternative proposal that would
subsume this
approach?
To me the fact that you need to annonate every single user of a `!nocapture`
stored pointer looks like the most
fragile part of the proposal. It makes it hard to generalize this scheme. It
makes it easy to misuse this scheme,
both for the frontends and for the optimizer. Moreover, current wording of the
proposal doesn't address these
issues.

How far can we get with an approach which is conservatively correct without
explicit `"nocapture_use”`
annotations?

Currently there are 2 places where BasicAA uses isNonEscapingLocalObject:
* Aliasing between isNonEscapingLocalObject and isEscapeSource
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L1563
* ModRef for a isNonEscapingLocalObject and a call
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L839

If we were to support nocapture stores without explicit `"nocapture_use”`
annotations we would need to
differentiate between isNonEscapingLocalObject with StoreCaptures==true and
StoreCaptures==false.
These are two different properties and we would need to handle them separately.

For the aliasing between isNonEscapingLocalObject and isEscapeSource we would
need to change
isEscapeSource for isNonEscapingLocalObject(StoreCaptures==false) case.
Specifically, if StoreCaptures==false
then isEscapeSource should not consider loads as escape sources.

For the ModRef query around calls it's more compilcated. The easiest thing
to do here is to not use
isNonEscapingLocalObject(StoreCaptures==false). I don't know if this is too
limiting for your example.
But we can come up with some analysis here as well. What we need to know is the
fact that a call doesn’t
capture pointers reachable through its arguments. This would require a new
argument attribute, something
like nocapture_content.

My next comment is only somewhat related to `"nocapture_use”` issue. Your
current proposal marks stores
as `!nocapture`. But I assume that the store can be marked as nocapture because
the underlying object is
unescaped. Why don't we mark this fact instead? If we have this fact then
any store to an unescaped memory
can be treated as nocapture (assuming we are doing conservatively correct
analysis for StoreCaptures==false).
This will give us a useful fact about the underlying object as well (it can be
used for things other than aliasing).

There is a caveat to this suggestion. The proposed `!nocapture` metadata makes
it possible to express
flow-sensitive facts (a store into the same underlying object might be nocapture
on one path, but capture on some
other). We would give up this flow-sensitivity if we make the underlying object
instead. On the other hand, global
property instead of a flow-sensitive propery will make it easier for a
conservatively correct analysis.



Partially related: Maybe you want to give a short presentation in our next AA
call about your
analysis and the ideas you have in this space:

https://docs.google.com/document/d/1ybwEKDVtIbhIhK50qYtwKsL50K-NvB6LfuBsfepBZ9Y/edit?usp=sharing
I’ll plan to join.

Artur


~ Johannes



Artur
If we would want to use `!nocapture`
more fine-grained we could try to come up with a way to tie it to a
"nocapture_use", but that would
certainly make it more complicated.
~ Johannes


Artur



Basically, what am I doing wrong that I get a miscompile on this example?

You don't get the clobber because you did not explicitly mark the use of
`%a` in `%a'`.

WDYT?

~ Johannes



Artur

Note that we do not inline a call with an "unkown" operand bundle, so
there is no fear we
accidentally produce such a situation as you pointed out. A "proper"
version of the example
would be:

```
a = new
store a, ptr, !nocapture
call foo(ptr, a) !nocapture_use(a)

void foo(arg_ptr. arg_a) {
  a' = load arg_ptr
  v1 = load arg_a
 ...
}
```
which should be OK.

Does that make sense?

~ Johannes


Artur

On Jan 7, 2021, at 4:20 PM, Johannes Doerfert via llvm-dev <llvm-dev at
lists.llvm.org<mailto:llvm-dev at lists.llvm.org><mailto:llvm-dev at
lists.llvm.org>> wrote:

TL;DR: A pointer stored in memory is not necessarily captured, let's add a
way to express this.

Phab:https://reviews.llvm.org/D93189

--- Commit Message / Rational ---

Runtime functions, as well as regular functions, might require a pointer
to be passed in memory even though the memory is simply a means to pass
(multiple) arguments. That is, the indirection through memory is only
used on the call edge and not otherwise relevant. However, such pointers
are currently assumed to escape as soon as they are stored in memory
even if the callee only reloads them and use them in a "non-escaping"
way.
Generally, storing a pointer might not cause it to escape if all "uses of
the memory" it is stored to all have the "nocapture" property.

To allow optimizations in the presence of pointers stored to memory we
introduce two new IR extensions. `!nocapture` metadata on stores and
"nocapture_use" operand bundles for call(base) instructions. The
former
ensures that the store can be ignored for the purpose of escape
analysis. The latter indicates that a call is using a pointer value
but not capturing it. This is important as the call might still read
or write the pointer and since the passing of the pointer through
memory is not considered "capturing" with the "nocapture"
metadata,
we need to otherwise indicate the potential read/write.

As an example use case where we can deduce `!nocapture` metadata,
consider the following code:

```
struct Payload {
 int *a;
 double *b;
};

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine) (void *), void *arg);

int use(double);

void fn(void *v) {
 Payload *p = (Payload*)(v);
 // Load the pointers from the payload and then dereference them,
 // this will not capture the pointers.
 int *a = p->a;
 double *b = p->b;
 *a = use(*b);
}

void foo(int *a, double *b) {
 Payload p = {a, b};
 pthread_create(..., &fn, &p);
}
```

Given the usage of the payload struct in `fn` we can conclude neither
`a` nor `b` in are captured in `foo`, however we could not express this
fact "locally" before. That is, we can deduce and annotate it for the
arguments `a` and `b` but only since there is no other use (later on).
Similarly, if the callee would not be known, we were not able to
describe the "nocapture" behavior of the API.

A follow up patch will introduce `!nocapture` metadata to stores
generated during OpenMP lowering. This will, among other things, fix
PR48475. I generally expect us to find more APIs that could benefit from
the annotation in addition to the deduction we can do if we see the callee.

---

As always, feedback is welcome. Feel free to look at the phab patch as well.

Thanks,
 Johannes


--
──────────
∽ Johannes

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at
lists.llvm.org><mailto:llvm-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210120/b1721193/attachment-0001.html>

llvm dev - Jan 2021 - [RFC] Introduce the `!nocapture` metadata and "nocapture_use" operand bundle

[llvm-dev] [RFC] Introduce the `!nocapture` metadata and "nocapture_use" operand bundle

[llvm-dev] [RFC] Introduce the `!nocapture` metadata and "nocapture_use" operand bundle