thr3ads.net - llvm dev - [llvm-dev] [RFC] Introducing the maynotprogress IR attribute [Sep 2020]

If this information is useful, please help other people find it:
Share via:

Johannes Doerfert via llvm-dev

2020-Sep-05 05:04 UTC

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

On 9/4/20 7:39 PM, Hal Finkel via llvm-dev wrote:
 >
 > On 9/4/20 6:31 PM, Atmn Patel via llvm-dev wrote:
 >> Hi All,
 >>
 >> We’ve prepared a new function attribute `maynotprogress` and loop
 >> metadata `llvm.loop.mustprogress` in order to better formalize the way
 >> LLVM deals with infinite loops without observable side-effects. This
 >> is deeply related to the implicit forward progress requirements in the
 >> IR.
 >>
 >> Background:
 >> There has been a demonstrated need for clarity within the forward
 >> progress requirements in LLVM IR. This discussion started with the
 >> longstanding bug [1], followed by subsequent discussion in [2,3].
 >> TL;DR is that LLVM IR needed a clear way to deal with the fact that
 >> C/C++ deems certain infinite side-effect free loops as UB while other
 >> loops, and other languages, have well defined infinite side-effect
 >> free loops.
 >>
 >> The proposed mechanism was to add an intrinsic: `llvm.sideeffect()`
 >> that should be inserted by the frontend into loops to indicate that
 >> this loop should be treated as having side-effects, and should not be
 >> optimized out. This was implemented in [4], and long story short, it’s
 >> been an imperfect solution for other language frontends in various
 >> ways.
 >
 >
 > Can you make the long story not quite so short? What kinds of problems
 > have cropped up with this solution?

You mean, in addition to the conceptual problem of introducing an
arbitrary side-effect to prevent transformations from happening?
(And yes, we should revisit `llvm.assume` next.)

So why do this at all:
1) We settle the discussion and finally define what semantic LLVM-IR
    has. Basically revisiting [1] and [5] with all the reasoning there.
    IMHO, it is in itself a good idea to write "something" about
forward
    progress down. Either non-attributed IR has it or not, but limbo is
    bad. For example, we are fairly conservative right now but still
    don't promise not to assume forward progress, the worst situation.
    In fact, we have optimizations that assume forward progress and some
    that don't, ..
2a) Assuming we always had an implicit forward progress guarantee:
     We avoid generic pessimizations for languages that require other
     semantics for the IR: Rust, C with constant loop expressions, ...
     see [1]. A function that might have an endless loop but which does
     not write memory does not write memory. I mean, we still want to
     perform transformations even if there is a path in which such a
     function is called.
2b) Assuming we do not have an implicit forward progress guarantee:
     We can delete loops that do not have side effects even if the trip
     count is unknown. I mean, we could have done that in the other case
     (2a) but we didn't in loop deletion. Though, we do remove a call if
     the function only contained such a loop, ...

Why this way:
  - The discussions before concluded IR does have a forward process
    guarantee [1,5]. So we don't want to pessimize existing code.
    That said, we want to exploit the property, e.g. in LoopDeletion,
    during the deduction of `willreturn` (and thereby for
    `isGuaranteedToTransferExecutionToSuccessor` and everything that
    transitively uses it), ...
  - Function attributes are the most fine-grained level, except
    instructions, to provide sticky semantics. The instruction level has
    however only coarse grained effects, that is why we used
    `llvm.sideeffect` so far.
  - Loop metadata is reasonably sticky in the few cases we might need it
    for optimizations, the key is dropping it doesn't compromise
    correctness.


 >>   Even C/C++ has loops with and loops without this requirement,
 >> though we could not distinguish so far.
 >
 >
 > What kinds of loop could we not distinguish? Can you please provide an
 > example?

IIRC, this loop is UB and cannot be reached:

```
int x = ((int)y) + 1;
while (x < y);
```

while this loop is not UB and can be reached:

```
while (1);
```

Though, I'm not a C language lawyer.


 >> In addition, there has been ambiguity regarding the forward-progress
 >> requirements in IR, as pointed out in [5].
 >>
 >> The introduction of the `maynotprogress` IR function attribute and the
 >> `llvm.loop.mustprogress` loop metadata tries to resolve this
 >> situation. The changes proposed are:
 >> - Document that IR Functions are by default expected to make
 >> forward-progress (as defined in the C++ Spec [6]).
 >> - Implement a `maynotprogress` IR function attribute (not droppable)
 >> to indicate that this function is exempt from the above requirement.
 >> This will allow frontends to disable IR optimizations that would
 >> otherwise optimize away their infinite loops without side-effects.
 >> - Implement `llvm.loop.mustprogress` as a loop metadata to notify to
 >> the LLVM optimization passes such as LoopDeletion that even if this
 >> function is permitted to not make progress, this loop is still
 >> required to make progress because it is not one of the infinite
 >> side-effect free loops permitted by the frontend language specs. Note
 >> that loop metadata can be dropped, so even if this metadata is
 >> dropped, we would not optimize away loops that we don’t optimize now
 >> and we wouldn’t preserve loops that we don’t preserve now.
 >
 >
 > I'm a bit worried about the following: It seems like you want to
 > handle C functions that have loops that might be infinite (i.e., those
 > with constant controlling expressions) by adding the maynotprogress
 > attribute to the containing function, and then this attribute to all
 > of the other loops. Is that correct?

That was the idea (for C/C++), yes.


 > Also, it seems semantically incorrect to inline functions with the
 > attribute into functions without the attribute unless you add the
 > attribute to the caller. Is that correct?

Yes. We actually talked about this and simply forgot to make the
attribute "sticky", same as for example SpeculativeLoadHardeningAttr.
Patch is underway.


 > If those are true, then we can end up with cases where, with functions
 > from the same C source file, we're either disallowing inlining or
 > pessimizing the optimization of all loops in the caller.

Technically true, assuming we do not use the loop metadata or it is
dropped. However, that is still an improvement to the status quo. Right
now, either we have a forward progress guarantee and the C loop that
should not be deleted might be deleted, or we don't have such a
guarantee and no loop is deleted regardless of the presence of one that
should not be deleted.

Note that it is not only about "empty" loops. Assuming finite loops
helps for loops that do something too. Take:

```
unsigned count = 0;
for (void *p = s; p != e; p += 4)
   ++count;
```

which is just fine optimized but if we dropped the `inbounds` from the
GEP, at some point, we end up with a loop instead of a closed form
expression:
   https://clang.godbolt.org/z/6dYTYe


 > Unless, when inlining, we add the attribute to the caller and also add
 > this metadata to all other loops?

For "maximal optimization opportunity" yes. For correctness, we
don't
need the metadata at all. Atmn is putting a patch up now to make the
attribute sticky, it's a one liner + tests, and then we can look at the
metadata in a follow up.


 > In any case, please explain the intended behavior of the attribute and
 > the metadata upon inlining.

The attribute will be attached to the caller upon inlining as this is
the only way to keep semantics correct. Metadata can be added to the
loops of the caller if the caller did not have the attribute, but that
is an optimization and not required. The patch for the first part will
be part of this change set.


 >> The current implementations are in:
 >> - Changes to the LoopDeletion Pass: https://reviews.llvm.org/D86844
 >> - Changes to the Clang Frontend: https://reviews.llvm.org/D86841
 >> - Changes to LangRef: https://reviews.llvm.org/D86233
 >> - Changed to IR: https://reviews.llvm.org/D85393
 >>
 >> The changes preserve what was previously accepted as the “default
 >> behavior” [5]. That is, you get forward progress assumption in case a
 >> function is not marked with the `maynotprogress` attribute. Here the
 >> default behavior is that LLVM IR functions are required to make
 >> forward-progress and are assumed to make forward progress. These
 >> attributes are aimed at helping frontends write correct code as per
 >> their language specs, and in addition, optimize out some dead loops
 >> that we weren’t able to optimize out before but could’ve.
 >>
 >> Feedback welcome.
 >>
 >> (The name of the function attribute and the loop metadata are still
 >> under discussion, and we’re definitely open to changing them.)
 >
 >
 > Some additional thoughts on the bikeshedding:
 >
 > I'm not in love with this name. might_not_progress would be better. We
 > could choose something more colloquial, like may_inf_loop. Or more
 > formal, like no_forward_progress_guarantee. I like this because it
 > seems the most technically accurate and isn't likely to be misread.
 > It's long, however, and even though we have attribute lists, it will
 > still appear a lot.
 >
 > I think that I like nfpg the best (for "no forward-progress
 > guarantee"). Why nfpg? It's technically accurate and short. Short
is
 > useful because, for some languages (any maybe even for some IR from
 > C), the attribute is going to appear on nearly every function. I know
 > that we have attribute lists, but even so. no_fpg is good too.

I'm given up on the name. To be honest, if you don't read the langref
you don't know what these things do half the time, at least not in
detail. Let's just number them, this would be `attribute70` I think ;)

Cheers,
   Johannes


 > Thanks again,
 >
 > Hal
 >
 >
 >>
 >> Atmn and Johannes
 >>
 >> [1] https://bugs.llvm.org/show_bug.cgi?id=965
 >> [2] https://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
 >> [3] https://lists.llvm.org/pipermail/llvm-dev/2017-October/118558.html
 >> [4] https://reviews.llvm.org/D38336
 >> [5] https://reviews.llvm.org/D65718
 >> [6] https://eel.is/c++draft/intro.progress
 >> _______________________________________________
 >> LLVM Developers mailing list
 >> llvm-dev at lists.llvm.org
 >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
 >

Atmn Patel via llvm-dev

2020-Sep-05 05:40 UTC

head link

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

On Sat, Sep 5, 2020 at 1:07 AM Johannes Doerfert
<johannesdoerfert at gmail.com> wrote:>
>
> On 9/4/20 7:39 PM, Hal Finkel via llvm-dev wrote:
>  >
>  > On 9/4/20 6:31 PM, Atmn Patel via llvm-dev wrote:
>  >> Hi All,
>  >>
>  >> We’ve prepared a new function attribute `maynotprogress` and loop
>  >> metadata `llvm.loop.mustprogress` in order to better formalize
the way
>  >> LLVM deals with infinite loops without observable side-effects.
This
>  >> is deeply related to the implicit forward progress requirements
in the
>  >> IR.
>  >>
>  >> Background:
>  >> There has been a demonstrated need for clarity within the forward
>  >> progress requirements in LLVM IR. This discussion started with
the
>  >> longstanding bug [1], followed by subsequent discussion in [2,3].
>  >> TL;DR is that LLVM IR needed a clear way to deal with the fact
that
>  >> C/C++ deems certain infinite side-effect free loops as UB while
other
>  >> loops, and other languages, have well defined infinite
side-effect
>  >> free loops.
>  >>
>  >> The proposed mechanism was to add an intrinsic:
`llvm.sideeffect()`
>  >> that should be inserted by the frontend into loops to indicate
that
>  >> this loop should be treated as having side-effects, and should
not be
>  >> optimized out. This was implemented in [4], and long story short,
it’s
>  >> been an imperfect solution for other language frontends in
various
>  >> ways.
>  >
>  >
>  > Can you make the long story not quite so short? What kinds of
problems
>  > have cropped up with this solution?
>
> You mean, in addition to the conceptual problem of introducing an
> arbitrary side-effect to prevent transformations from happening?
> (And yes, we should revisit `llvm.assume` next.)
>
> So why do this at all:
> 1) We settle the discussion and finally define what semantic LLVM-IR
>     has. Basically revisiting [1] and [5] with all the reasoning there.
>     IMHO, it is in itself a good idea to write "something" about
forward
>     progress down. Either non-attributed IR has it or not, but limbo is
>     bad. For example, we are fairly conservative right now but still
>     don't promise not to assume forward progress, the worst situation.
>     In fact, we have optimizations that assume forward progress and some
>     that don't, ..
> 2a) Assuming we always had an implicit forward progress guarantee:
>      We avoid generic pessimizations for languages that require other
>      semantics for the IR: Rust, C with constant loop expressions, ...
>      see [1]. A function that might have an endless loop but which does
>      not write memory does not write memory. I mean, we still want to
>      perform transformations even if there is a path in which such a
>      function is called.
> 2b) Assuming we do not have an implicit forward progress guarantee:
>      We can delete loops that do not have side effects even if the trip
>      count is unknown. I mean, we could have done that in the other case
>      (2a) but we didn't in loop deletion. Though, we do remove a call
if
>      the function only contained such a loop, ...
>
> Why this way:
>   - The discussions before concluded IR does have a forward process
>     guarantee [1,5]. So we don't want to pessimize existing code.
>     That said, we want to exploit the property, e.g. in LoopDeletion,
>     during the deduction of `willreturn` (and thereby for
>     `isGuaranteedToTransferExecutionToSuccessor` and everything that
>     transitively uses it), ...
>   - Function attributes are the most fine-grained level, except
>     instructions, to provide sticky semantics. The instruction level has
>     however only coarse grained effects, that is why we used
>     `llvm.sideeffect` so far.
>   - Loop metadata is reasonably sticky in the few cases we might need it
>     for optimizations, the key is dropping it doesn't compromise
>     correctness.
Another reason why the intrinsic is an imperfect solution is that the
`llvm.sideeffect()` intrinsic would in theory need to be emitted by
the frontend many times - potentially once for each loop and perhaps
one for each function too, depending on how the frontend language
views forward-progress.

The Rust Team implemented the intrinsic into the Rust compiler, and
found that there are significant compile time regressions (5-30%) [7]
(which is not entirely attributable to rustc needing to emit more
instructions, see [8] for some indication of this and further Rust
dialogue) because these intrinsics end up being ubiquitous and LLVM
isn't able to effectively coalesce redundant occurrences.
>
>  >>   Even C/C++ has loops with and loops without this requirement,
>  >> though we could not distinguish so far.
>  >
>  >
>  > What kinds of loop could we not distinguish? Can you please provide
an
>  > example?
>
> IIRC, this loop is UB and cannot be reached:
>
> ```
> int x = ((int)y) + 1;
> while (x < y);
> ```
>
> while this loop is not UB and can be reached:
>
> ```
> while (1);
> ```
>
> Though, I'm not a C language lawyer.
>
>
>  >> In addition, there has been ambiguity regarding the
forward-progress
>  >> requirements in IR, as pointed out in [5].
>  >>
>  >> The introduction of the `maynotprogress` IR function attribute
and the
>  >> `llvm.loop.mustprogress` loop metadata tries to resolve this
>  >> situation. The changes proposed are:
>  >> - Document that IR Functions are by default expected to make
>  >> forward-progress (as defined in the C++ Spec [6]).
>  >> - Implement a `maynotprogress` IR function attribute (not
droppable)
>  >> to indicate that this function is exempt from the above
requirement.
>  >> This will allow frontends to disable IR optimizations that would
>  >> otherwise optimize away their infinite loops without
side-effects.
>  >> - Implement `llvm.loop.mustprogress` as a loop metadata to notify
to
>  >> the LLVM optimization passes such as LoopDeletion that even if
this
>  >> function is permitted to not make progress, this loop is still
>  >> required to make progress because it is not one of the infinite
>  >> side-effect free loops permitted by the frontend language specs.
Note
>  >> that loop metadata can be dropped, so even if this metadata is
>  >> dropped, we would not optimize away loops that we don’t optimize
now
>  >> and we wouldn’t preserve loops that we don’t preserve now.
>  >
>  >
>  > I'm a bit worried about the following: It seems like you want to
>  > handle C functions that have loops that might be infinite (i.e.,
those
>  > with constant controlling expressions) by adding the maynotprogress
>  > attribute to the containing function, and then this attribute to all
>  > of the other loops. Is that correct?
>
> That was the idea (for C/C++), yes.
>
>
>  > Also, it seems semantically incorrect to inline functions with the
>  > attribute into functions without the attribute unless you add the
>  > attribute to the caller. Is that correct?
>
> Yes. We actually talked about this and simply forgot to make the
> attribute "sticky", same as for example
SpeculativeLoadHardeningAttr.
> Patch is underway.
>
>
>  > If those are true, then we can end up with cases where, with
functions
>  > from the same C source file, we're either disallowing inlining or
>  > pessimizing the optimization of all loops in the caller.
>
> Technically true, assuming we do not use the loop metadata or it is
> dropped. However, that is still an improvement to the status quo. Right
> now, either we have a forward progress guarantee and the C loop that
> should not be deleted might be deleted, or we don't have such a
> guarantee and no loop is deleted regardless of the presence of one that
> should not be deleted.
>
> Note that it is not only about "empty" loops. Assuming finite
loops
> helps for loops that do something too. Take:
>
> ```
> unsigned count = 0;
> for (void *p = s; p != e; p += 4)
>    ++count;
> ```
>
> which is just fine optimized but if we dropped the `inbounds` from the
> GEP, at some point, we end up with a loop instead of a closed form
> expression:
>    https://clang.godbolt.org/z/6dYTYe
>
>
>  > Unless, when inlining, we add the attribute to the caller and also
add
>  > this metadata to all other loops?
>
> For "maximal optimization opportunity" yes. For correctness, we
don't
> need the metadata at all. Atmn is putting a patch up now to make the
> attribute sticky, it's a one liner + tests, and then we can look at the
> metadata in a follow up.
>
>
>  > In any case, please explain the intended behavior of the attribute
and
>  > the metadata upon inlining.
>
> The attribute will be attached to the caller upon inlining as this is
> the only way to keep semantics correct. Metadata can be added to the
> loops of the caller if the caller did not have the attribute, but that
> is an optimization and not required. The patch for the first part will
> be part of this change set.
>
>
>  >> The current implementations are in:
>  >> - Changes to the LoopDeletion Pass:
https://reviews.llvm.org/D86844
>  >> - Changes to the Clang Frontend: https://reviews.llvm.org/D86841
>  >> - Changes to LangRef: https://reviews.llvm.org/D86233
>  >> - Changed to IR: https://reviews.llvm.org/D85393
>  >>
>  >> The changes preserve what was previously accepted as the “default
>  >> behavior” [5]. That is, you get forward progress assumption in
case a
>  >> function is not marked with the `maynotprogress` attribute. Here
the
>  >> default behavior is that LLVM IR functions are required to make
>  >> forward-progress and are assumed to make forward progress. These
>  >> attributes are aimed at helping frontends write correct code as
per
>  >> their language specs, and in addition, optimize out some dead
loops
>  >> that we weren’t able to optimize out before but could’ve.
>  >>
>  >> Feedback welcome.
>  >>
>  >> (The name of the function attribute and the loop metadata are
still
>  >> under discussion, and we’re definitely open to changing them.)
>  >
>  >
>  > Some additional thoughts on the bikeshedding:
>  >
>  > I'm not in love with this name. might_not_progress would be
better. We
>  > could choose something more colloquial, like may_inf_loop. Or more
>  > formal, like no_forward_progress_guarantee. I like this because it
>  > seems the most technically accurate and isn't likely to be
misread.
>  > It's long, however, and even though we have attribute lists, it
will
>  > still appear a lot.
>  >
>  > I think that I like nfpg the best (for "no forward-progress
>  > guarantee"). Why nfpg? It's technically accurate and short.
Short is
>  > useful because, for some languages (any maybe even for some IR from
>  > C), the attribute is going to appear on nearly every function. I know
>  > that we have attribute lists, but even so. no_fpg is good too.
>
> I'm given up on the name. To be honest, if you don't read the
langref
> you don't know what these things do half the time, at least not in
> detail. Let's just number them, this would be `attribute70` I think ;)
>
> Cheers,
>    Johannes
>
>
>  > Thanks again,
>  >
>  > Hal
>  >
>  >
>  >>
>  >> Atmn and Johannes
>  >>
>  >> [1] https://bugs.llvm.org/show_bug.cgi?id=965
>  >> [2]
https://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
>  >> [3]
https://lists.llvm.org/pipermail/llvm-dev/2017-October/118558.html
>  >> [4] https://reviews.llvm.org/D38336
>  >> [5] https://reviews.llvm.org/D65718
>  >> [6] https://eel.is/c++draft/intro.progress
>  >> _______________________________________________
>  >> LLVM Developers mailing list
>  >> llvm-dev at lists.llvm.org
>  >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>  >
>
Atmn

[7] https://github.com/rust-lang/compiler-team/issues/177#issuecomment-578549329
[8] https://blog.rust-lang.org/inside-rust/2020/03/19/terminating-rust.html

Nicolai Hähnle via llvm-dev

2020-Sep-07 15:56 UTC

head link

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

Hi Johannes and Atmn,

On Sat, Sep 5, 2020 at 7:07 AM Johannes Doerfert via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>  > In any case, please explain the intended behavior of the attribute
and
>  > the metadata upon inlining.
>
> The attribute will be attached to the caller upon inlining as this is
> the only way to keep semantics correct. Metadata can be added to the
> loops of the caller if the caller did not have the attribute, but that
> is an optimization and not required. The patch for the first part will
> be part of this change set.
I don't understand why this would be correct, unless you meant this
statement to only refer to loops in the caller that contain the
inlined call.

A deeper issue I have with the proposal is that I'm concerned about
adding more attributes which add constraints on program transforms.
I'm afraid I don't have the full history of these discussions, but is
there really a _good_ reason for doing so? It would feel more natural
to have no progress requirement by default and add a "willprogress"
attribute to indicate that the so annotated function will eventually
make progress (whatever that means, which is the other issue I have
with this :))

Assuming the simple flip of polarity from the previous paragraph, what
is the difference between the existing "willreturn" and the proposed
"willprogress"? Furthermore, if there is a difference, why is it
relevant to the compiler?

Assuming the simple flip of polarity, is a new attribute needed at
all? It seems like the "mustprogress" loop metadata would be
sufficient.

As a separate comment, I don't find the reference to the C++ spec in
https://reviews.llvm.org/D86233 to be informative enough. Whenever
that section of the C++ spec talks about "progress" it is referring to
how some abstract scheduler schedules execution of multiple threads.
That is, it is talking about the dynamic behavior of the
implementation. On the other hand, the proposed attribute presumably
makes a statement about the program itself _assuming that_ its
execution gets scheduled. So there is no applicable definition of
"progress" in the cited section of the C++ spec.

Cheers,
Nicolai

>  >> [1] https://bugs.llvm.org/show_bug.cgi?id=965
>  >> [2]
https://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
>  >> [3]
https://lists.llvm.org/pipermail/llvm-dev/2017-October/118558.html
>  >> [4] https://reviews.llvm.org/D38336
>  >> [5] https://reviews.llvm.org/D65718
>  >> [6] https://eel.is/c++draft/intro.progress

-- 
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.

Johannes Doerfert via llvm-dev

2020-Sep-07 16:49 UTC

head link

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

On 9/7/20 10:56 AM, Nicolai Hähnle wrote:
 > Hi Johannes and Atmn,
 >
 > On Sat, Sep 5, 2020 at 7:07 AM Johannes Doerfert via llvm-dev
 > <llvm-dev at lists.llvm.org> wrote:
 >>  > In any case, please explain the intended behavior of the 
attribute and
 >>  > the metadata upon inlining.
 >>
 >> The attribute will be attached to the caller upon inlining as this is
 >> the only way to keep semantics correct. Metadata can be added to the
 >> loops of the caller if the caller did not have the attribute, but that
 >> is an optimization and not required. The patch for the first part will
 >> be part of this change set.
 >
 > I don't understand why this would be correct, unless you meant this
 > statement to only refer to loops in the caller that contain the
 > inlined call.

I'm not sure I interpret "loops in the caller that contain the inlined
call" correctly but let me give it a try.

So this is the first inliner patch: https://reviews.llvm.org/D87180
It is alone sufficient for correctness of this scheme. A follow up can
provide better optimization opportunities by using metadata to annotate
loops in the caller, assuming the caller did not have the
`maynotprogress` function attribute already. All loops originally in the
caller would be given the loop metadata that indicates the new
`maynotprogress` inherited from the callee does not apply to them. If
the metadata is lost for some reason, we do not loose correctness.

 > A deeper issue I have with the proposal is that I'm concerned about
 > adding more attributes which add constraints on program transforms.

Let me start with something that seems to get lost:
We miscompile C/C++ programs right now and we do not apply
transformations to hide the issue one level.

That is,
```C
while(1);
```
is *not* deleted by LoopDeletion right now, which is correct.

Though,
```C
int b = a + 1;
while(b > a);
```
is *not* deleted by LoopDeletion either, which is not necessary and a
side effect of our existing failure to distinguish the two.

Finally,
```
void foo() { while (1); }
void bar() { foo(); }
```
can be optimized by us to `void bar() {}` which is *incorrect*.

So, to summarize, we are in a situation in which we constraint program
transformations, perform wrong program transformations, and require
languages like Rust to add spurious side-effects to avoid this.

 > I'm afraid I don't have the full history of these discussions, but
is
 > there really a _good_ reason for doing so? It would feel more natural
 > to have no progress requirement by default and add a
"willprogress"
 > attribute to indicate that the so annotated function will eventually
 > make progress (whatever that means, which is the other issue I have
 > with this :))

We can certainly turn everything around. Though, that would basically
regress our current IR and the "normal" C/C++ case (which is forward
progress requirement). So if people feel we want to make the IR more
restrictive than it is arguably right now, with an opt-in way to get the
current semantics back, I won't oppose that. However, all conversations
up to this point seemed to indicate that we do not want this. Please see
[1] and [5].

 > Assuming the simple flip of polarity from the previous paragraph, what
 > is the difference between the existing "willreturn" and the
proposed
 > "willprogress"? Furthermore, if there is a difference, why is it
 > relevant to the compiler?

I mean there is plenty of cases where you have one but not the other:
```
while (1) {
   atomic_add(X, 1);
}
```
does make progress but will not return. More importantly, willreturn is
a property we derive, willprogress is a property of the input language
embedded in the IR. We can use the latter to derive the former, sure,
but only one of them is rooted in (high-level) language semantics.

We want `willreturn` to improve
isGuaranteedToTransferExecutionToSuccessor (or something like that)
which then has various uses.

Take
```
foo();
*ptr = 0;
```
If we know `foo` is `willreturn` and `nounwind` we know `ptr` is
`derefereceable` prior to the call (on most targets). This allows to
hoist/sink code, improves alias analysis, ... On the other end, with
https://reviews.llvm.org/rGb22910daab95 we can remove more dead code
based on `willreturn` (and `nounwind`).

 > Assuming the simple flip of polarity, is a new attribute needed at
 > all? It seems like the "mustprogress" loop metadata would be
 > sufficient.

I think that is correct. Though, then we are back the regressing all
existing IR. I'm not opposing this choice but it seems to me of little
gain.

To put the regression into perspective: We would basically need to stop
removing an unused `readonly` + `nounwind` call as it could loop
indefinitely and that would be well defined. In the proposed scheme you
would not remove them if they carry the `maynotprogress` attribute, and
you can drop the `maynotprogress` attribute once you proven it will make
progress, e.g., once you proven `willreturn`.

 > As a separate comment, I don't find the reference to the C++ spec in
 > https://reviews.llvm.org/D86233 to be informative enough. Whenever
 > that section of the C++ spec talks about "progress" it is
referring to
 > how some abstract scheduler schedules execution of multiple threads.
 > That is, it is talking about the dynamic behavior of the
 > implementation. On the other hand, the proposed attribute presumably
 > makes a statement about the program itself _assuming that_ its
 > execution gets scheduled. So there is no applicable definition of
 > "progress" in the cited section of the C++ spec.

I don't understand. What is the alternative to "assuming that its
execution gets scheduled"?

~ Johannes

 > Cheers,
 > Nicolai
 >
 >
 >>  >> [1] https://bugs.llvm.org/show_bug.cgi?id=965
 >>  >> [2]
https://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
 >>  >> [3] 
https://lists.llvm.org/pipermail/llvm-dev/2017-October/118558.html
 >>  >> [4] https://reviews.llvm.org/D38336
 >>  >> [5] https://reviews.llvm.org/D65718
 >>  >> [6] https://eel.is/c++draft/intro.progress
 >
 >
 >

Hal Finkel via llvm-dev

2020-Sep-09 02:08 UTC

head link

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

On 9/5/20 12:40 AM, Atmn Patel wrote:> On Sat, Sep 5, 2020 at 1:07 AM Johannes Doerfert
> <johannesdoerfert at gmail.com> wrote:
>>
>> On 9/4/20 7:39 PM, Hal Finkel via llvm-dev wrote:
>>   >
>>   > On 9/4/20 6:31 PM, Atmn Patel via llvm-dev wrote:
>>   >> Hi All,
>>   >>
>>   >> We’ve prepared a new function attribute `maynotprogress` and
loop
>>   >> metadata `llvm.loop.mustprogress` in order to better
formalize the way
>>   >> LLVM deals with infinite loops without observable
side-effects. This
>>   >> is deeply related to the implicit forward progress
requirements in the
>>   >> IR.
>>   >>
>>   >> Background:
>>   >> There has been a demonstrated need for clarity within the
forward
>>   >> progress requirements in LLVM IR. This discussion started
with the
>>   >> longstanding bug [1], followed by subsequent discussion in
[2,3].
>>   >> TL;DR is that LLVM IR needed a clear way to deal with the
fact that
>>   >> C/C++ deems certain infinite side-effect free loops as UB
while other
>>   >> loops, and other languages, have well defined infinite
side-effect
>>   >> free loops.
>>   >>
>>   >> The proposed mechanism was to add an intrinsic:
`llvm.sideeffect()`
>>   >> that should be inserted by the frontend into loops to
indicate that
>>   >> this loop should be treated as having side-effects, and
should not be
>>   >> optimized out. This was implemented in [4], and long story
short, it’s
>>   >> been an imperfect solution for other language frontends in
various
>>   >> ways.
>>   >
>>   >
>>   > Can you make the long story not quite so short? What kinds of
problems
>>   > have cropped up with this solution?
>>
>> You mean, in addition to the conceptual problem of introducing an
>> arbitrary side-effect to prevent transformations from happening?
>> (And yes, we should revisit `llvm.assume` next.)
>>
>> So why do this at all:
>> 1) We settle the discussion and finally define what semantic LLVM-IR
>>      has. Basically revisiting [1] and [5] with all the reasoning
there.
>>      IMHO, it is in itself a good idea to write "something"
about forward
>>      progress down. Either non-attributed IR has it or not, but limbo
is
>>      bad. For example, we are fairly conservative right now but still
>>      don't promise not to assume forward progress, the worst
situation.
>>      In fact, we have optimizations that assume forward progress and
some
>>      that don't, ..
>> 2a) Assuming we always had an implicit forward progress guarantee:
>>       We avoid generic pessimizations for languages that require other
>>       semantics for the IR: Rust, C with constant loop expressions, ...
>>       see [1]. A function that might have an endless loop but which
does
>>       not write memory does not write memory. I mean, we still want to
>>       perform transformations even if there is a path in which such a
>>       function is called.
>> 2b) Assuming we do not have an implicit forward progress guarantee:
>>       We can delete loops that do not have side effects even if the
trip
>>       count is unknown. I mean, we could have done that in the other
case
>>       (2a) but we didn't in loop deletion. Though, we do remove a
call if
>>       the function only contained such a loop, ...
>>
>> Why this way:
>>    - The discussions before concluded IR does have a forward process
>>      guarantee [1,5]. So we don't want to pessimize existing code.
>>      That said, we want to exploit the property, e.g. in LoopDeletion,
>>      during the deduction of `willreturn` (and thereby for
>>      `isGuaranteedToTransferExecutionToSuccessor` and everything that
>>      transitively uses it), ...
>>    - Function attributes are the most fine-grained level, except
>>      instructions, to provide sticky semantics. The instruction level
has
>>      however only coarse grained effects, that is why we used
>>      `llvm.sideeffect` so far.
>>    - Loop metadata is reasonably sticky in the few cases we might need
it
>>      for optimizations, the key is dropping it doesn't compromise
>>      correctness.
> Another reason why the intrinsic is an imperfect solution is that the
> `llvm.sideeffect()` intrinsic would in theory need to be emitted by
> the frontend many times - potentially once for each loop and perhaps
> one for each function too, depending on how the frontend language
> views forward-progress.
>
> The Rust Team implemented the intrinsic into the Rust compiler, and
> found that there are significant compile time regressions (5-30%) [7]
> (which is not entirely attributable to rustc needing to emit more
> instructions, see [8] for some indication of this and further Rust
> dialogue) because these intrinsics end up being ubiquitous and LLVM
> isn't able to effectively coalesce redundant occurrences.

Thanks, Atmn, Johannes. I definitely agree that putting an artificial 
side effect in the loop is a conservative solution, and that we can do 
better. It's important that the rationale is in the RFC, and now it is.

There is another thing that I would like for us to document: do infinite 
loops have any relationship to thread synchronization? As I recall, this 
issue came up in the past when we discussed the deletion, or not, of 
infinite loops. If a thread writes a value to memory, and then enters an 
infinite loop, is that value eventually visible to other threads? My 
understanding is that some code depends on this property. If this is 
something that people would like to see guaranteed, I suspect that we 
may need specific handling (if nothing else, so we don't sink stores 
past possibly-infinite loops).

  -Hal

>
>>   >>   Even C/C++ has loops with and loops without this
requirement,
>>   >> though we could not distinguish so far.
>>   >
>>   >
>>   > What kinds of loop could we not distinguish? Can you please
provide an
>>   > example?
>>
>> IIRC, this loop is UB and cannot be reached:
>>
>> ```
>> int x = ((int)y) + 1;
>> while (x < y);
>> ```
>>
>> while this loop is not UB and can be reached:
>>
>> ```
>> while (1);
>> ```
>>
>> Though, I'm not a C language lawyer.
>>
>>
>>   >> In addition, there has been ambiguity regarding the
forward-progress
>>   >> requirements in IR, as pointed out in [5].
>>   >>
>>   >> The introduction of the `maynotprogress` IR function
attribute and the
>>   >> `llvm.loop.mustprogress` loop metadata tries to resolve this
>>   >> situation. The changes proposed are:
>>   >> - Document that IR Functions are by default expected to make
>>   >> forward-progress (as defined in the C++ Spec [6]).
>>   >> - Implement a `maynotprogress` IR function attribute (not
droppable)
>>   >> to indicate that this function is exempt from the above
requirement.
>>   >> This will allow frontends to disable IR optimizations that
would
>>   >> otherwise optimize away their infinite loops without
side-effects.
>>   >> - Implement `llvm.loop.mustprogress` as a loop metadata to
notify to
>>   >> the LLVM optimization passes such as LoopDeletion that even
if this
>>   >> function is permitted to not make progress, this loop is
still
>>   >> required to make progress because it is not one of the
infinite
>>   >> side-effect free loops permitted by the frontend language
specs. Note
>>   >> that loop metadata can be dropped, so even if this metadata
is
>>   >> dropped, we would not optimize away loops that we don’t
optimize now
>>   >> and we wouldn’t preserve loops that we don’t preserve now.
>>   >
>>   >
>>   > I'm a bit worried about the following: It seems like you
want to
>>   > handle C functions that have loops that might be infinite (i.e.,
those
>>   > with constant controlling expressions) by adding the
maynotprogress
>>   > attribute to the containing function, and then this attribute to
all
>>   > of the other loops. Is that correct?
>>
>> That was the idea (for C/C++), yes.
>>
>>
>>   > Also, it seems semantically incorrect to inline functions with
the
>>   > attribute into functions without the attribute unless you add
the
>>   > attribute to the caller. Is that correct?
>>
>> Yes. We actually talked about this and simply forgot to make the
>> attribute "sticky", same as for example
SpeculativeLoadHardeningAttr.
>> Patch is underway.
>>
>>
>>   > If those are true, then we can end up with cases where, with
functions
>>   > from the same C source file, we're either disallowing
inlining or
>>   > pessimizing the optimization of all loops in the caller.
>>
>> Technically true, assuming we do not use the loop metadata or it is
>> dropped. However, that is still an improvement to the status quo. Right
>> now, either we have a forward progress guarantee and the C loop that
>> should not be deleted might be deleted, or we don't have such a
>> guarantee and no loop is deleted regardless of the presence of one that
>> should not be deleted.
>>
>> Note that it is not only about "empty" loops. Assuming finite
loops
>> helps for loops that do something too. Take:
>>
>> ```
>> unsigned count = 0;
>> for (void *p = s; p != e; p += 4)
>>     ++count;
>> ```
>>
>> which is just fine optimized but if we dropped the `inbounds` from the
>> GEP, at some point, we end up with a loop instead of a closed form
>> expression:
>>     https://clang.godbolt.org/z/6dYTYe
>>
>>
>>   > Unless, when inlining, we add the attribute to the caller and
also add
>>   > this metadata to all other loops?
>>
>> For "maximal optimization opportunity" yes. For correctness,
we don't
>> need the metadata at all. Atmn is putting a patch up now to make the
>> attribute sticky, it's a one liner + tests, and then we can look at
the
>> metadata in a follow up.
>>
>>
>>   > In any case, please explain the intended behavior of the
attribute and
>>   > the metadata upon inlining.
>>
>> The attribute will be attached to the caller upon inlining as this is
>> the only way to keep semantics correct. Metadata can be added to the
>> loops of the caller if the caller did not have the attribute, but that
>> is an optimization and not required. The patch for the first part will
>> be part of this change set.
>>
>>
>>   >> The current implementations are in:
>>   >> - Changes to the LoopDeletion Pass:
https://reviews.llvm.org/D86844
>>   >> - Changes to the Clang Frontend:
https://reviews.llvm.org/D86841
>>   >> - Changes to LangRef: https://reviews.llvm.org/D86233
>>   >> - Changed to IR: https://reviews.llvm.org/D85393
>>   >>
>>   >> The changes preserve what was previously accepted as the
“default
>>   >> behavior” [5]. That is, you get forward progress assumption
in case a
>>   >> function is not marked with the `maynotprogress` attribute.
Here the
>>   >> default behavior is that LLVM IR functions are required to
make
>>   >> forward-progress and are assumed to make forward progress.
These
>>   >> attributes are aimed at helping frontends write correct code
as per
>>   >> their language specs, and in addition, optimize out some
dead loops
>>   >> that we weren’t able to optimize out before but could’ve.
>>   >>
>>   >> Feedback welcome.
>>   >>
>>   >> (The name of the function attribute and the loop metadata
are still
>>   >> under discussion, and we’re definitely open to changing
them.)
>>   >
>>   >
>>   > Some additional thoughts on the bikeshedding:
>>   >
>>   > I'm not in love with this name. might_not_progress would be
better. We
>>   > could choose something more colloquial, like may_inf_loop. Or
more
>>   > formal, like no_forward_progress_guarantee. I like this because
it
>>   > seems the most technically accurate and isn't likely to be
misread.
>>   > It's long, however, and even though we have attribute lists,
it will
>>   > still appear a lot.
>>   >
>>   > I think that I like nfpg the best (for "no forward-progress
>>   > guarantee"). Why nfpg? It's technically accurate and
short. Short is
>>   > useful because, for some languages (any maybe even for some IR
from
>>   > C), the attribute is going to appear on nearly every function. I
know
>>   > that we have attribute lists, but even so. no_fpg is good too.
>>
>> I'm given up on the name. To be honest, if you don't read the
langref
>> you don't know what these things do half the time, at least not in
>> detail. Let's just number them, this would be `attribute70` I think
;)
>>
>> Cheers,
>>     Johannes
>>
>>
>>   > Thanks again,
>>   >
>>   > Hal
>>   >
>>   >
>>   >>
>>   >> Atmn and Johannes
>>   >>
>>   >> [1] https://bugs.llvm.org/show_bug.cgi?id=965
>>   >> [2]
https://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
>>   >> [3]
https://lists.llvm.org/pipermail/llvm-dev/2017-October/118558.html
>>   >> [4] https://reviews.llvm.org/D38336
>>   >> [5] https://reviews.llvm.org/D65718
>>   >> [6] https://eel.is/c++draft/intro.progress
>>   >> _______________________________________________
>>   >> LLVM Developers mailing list
>>   >> llvm-dev at lists.llvm.org
>>   >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>   >
>>
> Atmn
>
> [7]
https://github.com/rust-lang/compiler-team/issues/177#issuecomment-578549329
> [8] https://blog.rust-lang.org/inside-rust/2020/03/19/terminating-rust.html
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

llvm dev - Sep 2020 - [RFC] Introducing the maynotprogress IR attribute

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute

[llvm-dev] [RFC] Introducing the maynotprogress IR attribute