Atmn Patel via llvm-dev
2020-Sep-04 23:31 UTC
[llvm-dev] [RFC] Introducing the maynotprogress IR attribute
Hi All, We’ve prepared a new function attribute `maynotprogress` and loop metadata `llvm.loop.mustprogress` in order to better formalize the way LLVM deals with infinite loops without observable side-effects. This is deeply related to the implicit forward progress requirements in the IR. Background: There has been a demonstrated need for clarity within the forward progress requirements in LLVM IR. This discussion started with the longstanding bug [1], followed by subsequent discussion in [2,3]. TL;DR is that LLVM IR needed a clear way to deal with the fact that C/C++ deems certain infinite side-effect free loops as UB while other loops, and other languages, have well defined infinite side-effect free loops. The proposed mechanism was to add an intrinsic: `llvm.sideeffect()` that should be inserted by the frontend into loops to indicate that this loop should be treated as having side-effects, and should not be optimized out. This was implemented in [4], and long story short, it’s been an imperfect solution for other language frontends in various ways. Even C/C++ has loops with and loops without this requirement, though we could not distinguish so far. In addition, there has been ambiguity regarding the forward-progress requirements in IR, as pointed out in [5]. The introduction of the `maynotprogress` IR function attribute and the `llvm.loop.mustprogress` loop metadata tries to resolve this situation. The changes proposed are: - Document that IR Functions are by default expected to make forward-progress (as defined in the C++ Spec [6]). - Implement a `maynotprogress` IR function attribute (not droppable) to indicate that this function is exempt from the above requirement. This will allow frontends to disable IR optimizations that would otherwise optimize away their infinite loops without side-effects. - Implement `llvm.loop.mustprogress` as a loop metadata to notify to the LLVM optimization passes such as LoopDeletion that even if this function is permitted to not make progress, this loop is still required to make progress because it is not one of the infinite side-effect free loops permitted by the frontend language specs. Note that loop metadata can be dropped, so even if this metadata is dropped, we would not optimize away loops that we don’t optimize now and we wouldn’t preserve loops that we don’t preserve now. The current implementations are in: - Changes to the LoopDeletion Pass: https://reviews.llvm.org/D86844 - Changes to the Clang Frontend: https://reviews.llvm.org/D86841 - Changes to LangRef: https://reviews.llvm.org/D86233 - Changed to IR: https://reviews.llvm.org/D85393 The changes preserve what was previously accepted as the “default behavior” [5]. That is, you get forward progress assumption in case a function is not marked with the `maynotprogress` attribute. Here the default behavior is that LLVM IR functions are required to make forward-progress and are assumed to make forward progress. These attributes are aimed at helping frontends write correct code as per their language specs, and in addition, optimize out some dead loops that we weren’t able to optimize out before but could’ve. Feedback welcome. (The name of the function attribute and the loop metadata are still under discussion, and we’re definitely open to changing them.) Atmn and Johannes [1] https://bugs.llvm.org/show_bug.cgi?id=965 [2] https://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html [3] https://lists.llvm.org/pipermail/llvm-dev/2017-October/118558.html [4] https://reviews.llvm.org/D38336 [5] https://reviews.llvm.org/D65718 [6] https://eel.is/c++draft/intro.progress
Hal Finkel via llvm-dev
2020-Sep-05 00:39 UTC
[llvm-dev] [RFC] Introducing the maynotprogress IR attribute
On 9/4/20 6:31 PM, Atmn Patel via llvm-dev wrote:> Hi All, > > We’ve prepared a new function attribute `maynotprogress` and loop > metadata `llvm.loop.mustprogress` in order to better formalize the way > LLVM deals with infinite loops without observable side-effects. This > is deeply related to the implicit forward progress requirements in the > IR. > > Background: > There has been a demonstrated need for clarity within the forward > progress requirements in LLVM IR. This discussion started with the > longstanding bug [1], followed by subsequent discussion in [2,3]. > TL;DR is that LLVM IR needed a clear way to deal with the fact that > C/C++ deems certain infinite side-effect free loops as UB while other > loops, and other languages, have well defined infinite side-effect > free loops. > > The proposed mechanism was to add an intrinsic: `llvm.sideeffect()` > that should be inserted by the frontend into loops to indicate that > this loop should be treated as having side-effects, and should not be > optimized out. This was implemented in [4], and long story short, it’s > been an imperfect solution for other language frontends in various > ways.Can you make the long story not quite so short? What kinds of problems have cropped up with this solution?> Even C/C++ has loops with and loops without this requirement, > though we could not distinguish so far.What kinds of loop could we not distinguish? Can you please provide an example?> > In addition, there has been ambiguity regarding the forward-progress > requirements in IR, as pointed out in [5]. > > The introduction of the `maynotprogress` IR function attribute and the > `llvm.loop.mustprogress` loop metadata tries to resolve this > situation. The changes proposed are: > - Document that IR Functions are by default expected to make > forward-progress (as defined in the C++ Spec [6]). > - Implement a `maynotprogress` IR function attribute (not droppable) > to indicate that this function is exempt from the above requirement. > This will allow frontends to disable IR optimizations that would > otherwise optimize away their infinite loops without side-effects. > - Implement `llvm.loop.mustprogress` as a loop metadata to notify to > the LLVM optimization passes such as LoopDeletion that even if this > function is permitted to not make progress, this loop is still > required to make progress because it is not one of the infinite > side-effect free loops permitted by the frontend language specs. Note > that loop metadata can be dropped, so even if this metadata is > dropped, we would not optimize away loops that we don’t optimize now > and we wouldn’t preserve loops that we don’t preserve now.I'm a bit worried about the following: It seems like you want to handle C functions that have loops that might be infinite (i.e., those with constant controlling expressions) by adding the maynotprogress attribute to the containing function, and then this attribute to all of the other loops. Is that correct? Also, it seems semantically incorrect to inline functions with the attribute into functions without the attribute unless you add the attribute to the caller. Is that correct? If those are true, then we can end up with cases where, with functions from the same C source file, we're either disallowing inlining or pessimizing the optimization of all loops in the caller. Unless, when inlining, we add the attribute to the caller and also add this metadata to all other loops? In any case, please explain the intended behavior of the attribute and the metadata upon inlining.> > The current implementations are in: > - Changes to the LoopDeletion Pass: https://reviews.llvm.org/D86844 > - Changes to the Clang Frontend: https://reviews.llvm.org/D86841 > - Changes to LangRef: https://reviews.llvm.org/D86233 > - Changed to IR: https://reviews.llvm.org/D85393 > > The changes preserve what was previously accepted as the “default > behavior” [5]. That is, you get forward progress assumption in case a > function is not marked with the `maynotprogress` attribute. Here the > default behavior is that LLVM IR functions are required to make > forward-progress and are assumed to make forward progress. These > attributes are aimed at helping frontends write correct code as per > their language specs, and in addition, optimize out some dead loops > that we weren’t able to optimize out before but could’ve. > > Feedback welcome. > > (The name of the function attribute and the loop metadata are still > under discussion, and we’re definitely open to changing them.)Some additional thoughts on the bikeshedding: I'm not in love with this name. might_not_progress would be better. We could choose something more colloquial, like may_inf_loop. Or more formal, like no_forward_progress_guarantee. I like this because it seems the most technically accurate and isn't likely to be misread. It's long, however, and even though we have attribute lists, it will still appear a lot. I think that I like nfpg the best (for "no forward-progress guarantee"). Why nfpg? It's technically accurate and short. Short is useful because, for some languages (any maybe even for some IR from C), the attribute is going to appear on nearly every function. I know that we have attribute lists, but even so. no_fpg is good too. Thanks again, Hal> > Atmn and Johannes > > [1] https://bugs.llvm.org/show_bug.cgi?id=965 > [2] https://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html > [3] https://lists.llvm.org/pipermail/llvm-dev/2017-October/118558.html > [4] https://reviews.llvm.org/D38336 > [5] https://reviews.llvm.org/D65718 > [6] https://eel.is/c++draft/intro.progress > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory
Johannes Doerfert via llvm-dev
2020-Sep-05 05:04 UTC
[llvm-dev] [RFC] Introducing the maynotprogress IR attribute
On 9/4/20 7:39 PM, Hal Finkel via llvm-dev wrote: > > On 9/4/20 6:31 PM, Atmn Patel via llvm-dev wrote: >> Hi All, >> >> We’ve prepared a new function attribute `maynotprogress` and loop >> metadata `llvm.loop.mustprogress` in order to better formalize the way >> LLVM deals with infinite loops without observable side-effects. This >> is deeply related to the implicit forward progress requirements in the >> IR. >> >> Background: >> There has been a demonstrated need for clarity within the forward >> progress requirements in LLVM IR. This discussion started with the >> longstanding bug [1], followed by subsequent discussion in [2,3]. >> TL;DR is that LLVM IR needed a clear way to deal with the fact that >> C/C++ deems certain infinite side-effect free loops as UB while other >> loops, and other languages, have well defined infinite side-effect >> free loops. >> >> The proposed mechanism was to add an intrinsic: `llvm.sideeffect()` >> that should be inserted by the frontend into loops to indicate that >> this loop should be treated as having side-effects, and should not be >> optimized out. This was implemented in [4], and long story short, it’s >> been an imperfect solution for other language frontends in various >> ways. > > > Can you make the long story not quite so short? What kinds of problems > have cropped up with this solution? You mean, in addition to the conceptual problem of introducing an arbitrary side-effect to prevent transformations from happening? (And yes, we should revisit `llvm.assume` next.) So why do this at all: 1) We settle the discussion and finally define what semantic LLVM-IR has. Basically revisiting [1] and [5] with all the reasoning there. IMHO, it is in itself a good idea to write "something" about forward progress down. Either non-attributed IR has it or not, but limbo is bad. For example, we are fairly conservative right now but still don't promise not to assume forward progress, the worst situation. In fact, we have optimizations that assume forward progress and some that don't, .. 2a) Assuming we always had an implicit forward progress guarantee: We avoid generic pessimizations for languages that require other semantics for the IR: Rust, C with constant loop expressions, ... see [1]. A function that might have an endless loop but which does not write memory does not write memory. I mean, we still want to perform transformations even if there is a path in which such a function is called. 2b) Assuming we do not have an implicit forward progress guarantee: We can delete loops that do not have side effects even if the trip count is unknown. I mean, we could have done that in the other case (2a) but we didn't in loop deletion. Though, we do remove a call if the function only contained such a loop, ... Why this way: - The discussions before concluded IR does have a forward process guarantee [1,5]. So we don't want to pessimize existing code. That said, we want to exploit the property, e.g. in LoopDeletion, during the deduction of `willreturn` (and thereby for `isGuaranteedToTransferExecutionToSuccessor` and everything that transitively uses it), ... - Function attributes are the most fine-grained level, except instructions, to provide sticky semantics. The instruction level has however only coarse grained effects, that is why we used `llvm.sideeffect` so far. - Loop metadata is reasonably sticky in the few cases we might need it for optimizations, the key is dropping it doesn't compromise correctness. >> Even C/C++ has loops with and loops without this requirement, >> though we could not distinguish so far. > > > What kinds of loop could we not distinguish? Can you please provide an > example? IIRC, this loop is UB and cannot be reached: ``` int x = ((int)y) + 1; while (x < y); ``` while this loop is not UB and can be reached: ``` while (1); ``` Though, I'm not a C language lawyer. >> In addition, there has been ambiguity regarding the forward-progress >> requirements in IR, as pointed out in [5]. >> >> The introduction of the `maynotprogress` IR function attribute and the >> `llvm.loop.mustprogress` loop metadata tries to resolve this >> situation. The changes proposed are: >> - Document that IR Functions are by default expected to make >> forward-progress (as defined in the C++ Spec [6]). >> - Implement a `maynotprogress` IR function attribute (not droppable) >> to indicate that this function is exempt from the above requirement. >> This will allow frontends to disable IR optimizations that would >> otherwise optimize away their infinite loops without side-effects. >> - Implement `llvm.loop.mustprogress` as a loop metadata to notify to >> the LLVM optimization passes such as LoopDeletion that even if this >> function is permitted to not make progress, this loop is still >> required to make progress because it is not one of the infinite >> side-effect free loops permitted by the frontend language specs. Note >> that loop metadata can be dropped, so even if this metadata is >> dropped, we would not optimize away loops that we don’t optimize now >> and we wouldn’t preserve loops that we don’t preserve now. > > > I'm a bit worried about the following: It seems like you want to > handle C functions that have loops that might be infinite (i.e., those > with constant controlling expressions) by adding the maynotprogress > attribute to the containing function, and then this attribute to all > of the other loops. Is that correct? That was the idea (for C/C++), yes. > Also, it seems semantically incorrect to inline functions with the > attribute into functions without the attribute unless you add the > attribute to the caller. Is that correct? Yes. We actually talked about this and simply forgot to make the attribute "sticky", same as for example SpeculativeLoadHardeningAttr. Patch is underway. > If those are true, then we can end up with cases where, with functions > from the same C source file, we're either disallowing inlining or > pessimizing the optimization of all loops in the caller. Technically true, assuming we do not use the loop metadata or it is dropped. However, that is still an improvement to the status quo. Right now, either we have a forward progress guarantee and the C loop that should not be deleted might be deleted, or we don't have such a guarantee and no loop is deleted regardless of the presence of one that should not be deleted. Note that it is not only about "empty" loops. Assuming finite loops helps for loops that do something too. Take: ``` unsigned count = 0; for (void *p = s; p != e; p += 4) ++count; ``` which is just fine optimized but if we dropped the `inbounds` from the GEP, at some point, we end up with a loop instead of a closed form expression: https://clang.godbolt.org/z/6dYTYe > Unless, when inlining, we add the attribute to the caller and also add > this metadata to all other loops? For "maximal optimization opportunity" yes. For correctness, we don't need the metadata at all. Atmn is putting a patch up now to make the attribute sticky, it's a one liner + tests, and then we can look at the metadata in a follow up. > In any case, please explain the intended behavior of the attribute and > the metadata upon inlining. The attribute will be attached to the caller upon inlining as this is the only way to keep semantics correct. Metadata can be added to the loops of the caller if the caller did not have the attribute, but that is an optimization and not required. The patch for the first part will be part of this change set. >> The current implementations are in: >> - Changes to the LoopDeletion Pass: https://reviews.llvm.org/D86844 >> - Changes to the Clang Frontend: https://reviews.llvm.org/D86841 >> - Changes to LangRef: https://reviews.llvm.org/D86233 >> - Changed to IR: https://reviews.llvm.org/D85393 >> >> The changes preserve what was previously accepted as the “default >> behavior” [5]. That is, you get forward progress assumption in case a >> function is not marked with the `maynotprogress` attribute. Here the >> default behavior is that LLVM IR functions are required to make >> forward-progress and are assumed to make forward progress. These >> attributes are aimed at helping frontends write correct code as per >> their language specs, and in addition, optimize out some dead loops >> that we weren’t able to optimize out before but could’ve. >> >> Feedback welcome. >> >> (The name of the function attribute and the loop metadata are still >> under discussion, and we’re definitely open to changing them.) > > > Some additional thoughts on the bikeshedding: > > I'm not in love with this name. might_not_progress would be better. We > could choose something more colloquial, like may_inf_loop. Or more > formal, like no_forward_progress_guarantee. I like this because it > seems the most technically accurate and isn't likely to be misread. > It's long, however, and even though we have attribute lists, it will > still appear a lot. > > I think that I like nfpg the best (for "no forward-progress > guarantee"). Why nfpg? It's technically accurate and short. Short is > useful because, for some languages (any maybe even for some IR from > C), the attribute is going to appear on nearly every function. I know > that we have attribute lists, but even so. no_fpg is good too. I'm given up on the name. To be honest, if you don't read the langref you don't know what these things do half the time, at least not in detail. Let's just number them, this would be `attribute70` I think ;) Cheers, Johannes > Thanks again, > > Hal > > >> >> Atmn and Johannes >> >> [1] https://bugs.llvm.org/show_bug.cgi?id=965 >> [2] https://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html >> [3] https://lists.llvm.org/pipermail/llvm-dev/2017-October/118558.html >> [4] https://reviews.llvm.org/D38336 >> [5] https://reviews.llvm.org/D65718 >> [6] https://eel.is/c++draft/intro.progress >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >