Johannes Doerfert via llvm-dev
2021-Apr-22 17:04 UTC
[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations
On 4/22/21 11:49 AM, David Blaikie wrote:> There seems to be a bunch of confusion and probably some > conflation/ambiguity about whether we're talking about IR constructs > on the C attributes. > > Johannes - I assume your claim is restricted mostly to the IR? That > having optnone not require or imply noinline improves orthogonality of > features and that there are reasonable use cases where one might want > optnone while allowing inlining (of the optnone function) or optnone > while disallowing inlining (of the optnone function)Yes, IR level is my concern. I'd be in favor of matching it in C but I don't care enough to go through the discussions. My latest use case: `s/optnone//g` should not result in a verifier error when you just want to try out an optimization on a single function.> > Paul - I think you're mostly thinking about/interested in the specific > source level/end user use case that motivated the initial > implementation of optnone. Where, I tend to agree - inlining an > optnone function is not advantageous to the user. Though it's possible > Johannes 's argument could be generalized from IR to C and still > apply: orthogonal features are more powerful and the user can always > compose them together to get what they want. (good chance they're > using attributes behind macros for ease of use anyway - they're a bit > verbose to write by hand all the time)I'd give the user the capability building blocks and let them work with that, but again, this is not my main concern right now.> There's also the -O0 use of optnone these days (clang puts optnone on > all functions when compiling with -O0 - the intent being to treat such > functions as though they were compiled in a separate object file > without optimizations (that's me projecting what I /think/ the mental > model should be) - which, similarly, I think will probably want to > keep the current behavior (no ipa/inlining and no optimization - > however that's phrased). > > Essentially the high level use cases of optnone all look like "imagine > if I compiled this in a separate object file without LTO" - apparently > even noipa+optnone isn't enough for that, maybe (I need to test that > more, based on a comment from Johannes earlier that inexact > definitions don't stop the inliner... )? Sounds like maybe it's more > like what I'm thinking of is "what if this function had weak linkage" > (ie: could be replaced by a totally different one)?Yes, weak should do the trick. That said, if you want "separate object file without LTO", go with `noinline` + `noipa`. This will make the call edges optimization barriers. If you want to also not optimize the function, add `optnone` as required. FWIW, if all functions are `optnone`, `noinline` and `noipa` are not needed, that is the -O0 case. More specifically, if your caller is `optnone`, `noinline` and `noipa` are not needed (for that caller). ~ Johannes> > On Thu, Apr 22, 2021 at 9:23 AM Johannes Doerfert > <johannesdoerfert at gmail.com> wrote: >> >> On 4/22/21 11:05 AM, paul.robinson at sony.com wrote: >>>> On 4/22/21 9:43 AM, paul.robinson at sony.com wrote: >>>>>> That said, I believe it is a mistake that `optnone` requires >>>>>> `noinline`. There is no reason for it to do so on the IR level. >>>>>> If you argue C-level `optnone` should imply `noinline`, that is >>>>>> a something worth discussing, though on the IR level we can >>>>>> decouple them. Use case, for example, the not-optimized version >>>>>> is called from functions that are `optnone` themselves while >>>>>> other call sites are inlined and your function is optimized. So >>>>>> you can use the two attributes to do context sensitive `optnone`. >>>>> The original intent for `optnone` was to imitate the -O0 pipeline >>>>> to the extent that was feasible. The -O0 pipeline (as constructed >>>>> by Clang) runs just the always-inliner, not the regular inliner; >>>>> so, functions marked `optnone` should not be inlined. The way >>>>> to achieve that effect most simply is to have `optnone` require >>>>> `noinline` and that's what we did. >>>>> >>>>> If we have `optnone` stop requiring `noinline` and teach the >>>>> inliner to inline an `optnone` callee only into an `optnone` caller, >>>>> then we are violating the intent that `optnone` imitate -O0, because >>>>> that inlining would not have happened at -O0. >>>> I think I initially read this wrong, hence the part below. >>>> After reading it again, I have one question: Why would the >>>> inliner inline something that is not `always_inline` into >>>> an `optnone` caller? That would violate the idea of `optnone`, >>>> IMHO, regardless if the callee is `optnone` or not. That is >>>> why I don't believe `noinline` on the callee is necessary >>>> for your use case. >>> The inliner should be ignoring `optnone` callers, so it would >>> never inline *anything* into an `optnone` caller. (Other than >>> an `alwaysinline` function.) >> My point is, it already does ignore `optnone` callers >> and inlines only `alwaysinline` calls into them: >> >> https://clang.godbolt.org/z/fznbjTEd5 >> >> >>> I had read this: >>> >>>>>> I believe it is a mistake that `optnone` requires `noinline`. >>> and the case that came to mind is inlining an `optnone` callee >>> into a not-`optnone` caller. The inlined copy would then be >>> treated to further optimization, which violates the idea of >>> `optnone`. >> But that is a composition issue. If you do not want to >> inline a `optnone` callee into an non-`optnone` caller, >> then add `noinline` to the callee. If you don't mind if >> it is inlined into non-`optnone` callers and optimized >> in there, then don't. My last email contained an example >> to show the different cases, you can mix and match IR >> attributes to get what you want. Requiring them to be >> tied is not improving anything but just restricting the >> options. >> >> >>> Now, the inliner already knows to avoid `noinline` callees, so >>> attaching `noinline` to `optnone` functions was (at the time) >>> considered an optimal way to avoid the problematic case. We >>> could instead teach the inliner to skip `optnone` callees, and >>> that would allow us to eliminate the requirement that `optnone` >>> functions must also be `noinline`. I am unclear why redefining >>> `optnone` to _imply_ `noinline` (rather than _require_ `noinline`) >>> is better, but then I don't work much with attributes. >> The inliner will not inline calls into an `optnone` caller >> if it is not necessary. As said before, that would violate >> the `optnone` idea for the caller, no matter what the callee >> looks like. So requiring `noinline` on the callee seems >> to me like a workaround or an oversight. >> >> It is better to not require them together because you can >> actually describe more distinct scenarios. Please take >> another look at my example in the last email, it shows >> what is possible if you split them. Furthermore, `optnone` >> does by design imply `noinline` for the call sites in the >> caller, or at least nobody argued that it shouldn't. Thus, >> requiring `noinline` on the callee is simply unnecessary >> as it does not add any value. >> >> >>> The notion of allowing an `optnone` caller to inline an `optnone` >>> callee sounds like it would also violate the intent of `optnone` >>> in that it should imitate -O0, where inlining is confined to >>> `alwaysinline` callees, and `optnone` is defined to conflict with >>> `alwaysinline` (because if you always inline something, you are >>> allowing it to have subsequent optimizations same as the caller, >>> which conflicts with `optnone`). >> Nobody said `optnone` callers should inline calls that are >> not always_inline, at least so far I have not seen that >> argument be made anywhere. I'll just skip this paragraph. >> >> >>> So, if you want to undo the _requirement_ that `optnone` must >>> have `noinline`, but then redefine `optnone` such that it can't >>> be inlined anywhere, you've done something that seems to have no >>> practical effect. Maybe that helps Attributor in some way, but >>> I don't see any other reason to be making this change. >> I do not want to say `optnone` cannot be inlined. `noinline` >> says it cannot be inlined. If you want it to not be inlined, >> use `noinline`, if you want it to not be optimized in it's >> own function, use `optnone`, if you want both, use both. >> >> The practical effect was literally showcased in my last email, >> please go back and look at the example. >> >> I don't know why the Attributor has to do with this, I'm happy >> to hear your thoughts on that though :) >> >> ~ Johannes >> >> >>> --paulr >>> >>
David Blaikie via llvm-dev
2021-Apr-22 17:21 UTC
[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations
On Thu, Apr 22, 2021 at 10:04 AM Johannes Doerfert <johannesdoerfert at gmail.com> wrote:> > > On 4/22/21 11:49 AM, David Blaikie wrote: > > There seems to be a bunch of confusion and probably some > > conflation/ambiguity about whether we're talking about IR constructs > > on the C attributes. > > > > Johannes - I assume your claim is restricted mostly to the IR? That > > having optnone not require or imply noinline improves orthogonality of > > features and that there are reasonable use cases where one might want > > optnone while allowing inlining (of the optnone function) or optnone > > while disallowing inlining (of the optnone function) > > Yes, IR level is my concern. I'd be in favor of matching it in C > but I don't care enough to go through the discussions. > > My latest use case: `s/optnone//g` should not result in a verifier > error when you just want to try out an optimization on a single function.I guess you meant adding optnone, rather than removing it? (removing optnone shouldn't cause any verifier errors, does it?)> > Paul - I think you're mostly thinking about/interested in the specific > > source level/end user use case that motivated the initial > > implementation of optnone. Where, I tend to agree - inlining an > > optnone function is not advantageous to the user. Though it's possible > > Johannes 's argument could be generalized from IR to C and still > > apply: orthogonal features are more powerful and the user can always > > compose them together to get what they want. (good chance they're > > using attributes behind macros for ease of use anyway - they're a bit > > verbose to write by hand all the time) > > I'd give the user the capability building blocks and let them work > with that, but again, this is not my main concern right now. > > > > There's also the -O0 use of optnone these days (clang puts optnone on > > all functions when compiling with -O0 - the intent being to treat such > > functions as though they were compiled in a separate object file > > without optimizations (that's me projecting what I /think/ the mental > > model should be) - which, similarly, I think will probably want to > > keep the current behavior (no ipa/inlining and no optimization - > > however that's phrased). > > > > Essentially the high level use cases of optnone all look like "imagine > > if I compiled this in a separate object file without LTO" - apparently > > even noipa+optnone isn't enough for that, maybe (I need to test that > > more, based on a comment from Johannes earlier that inexact > > definitions don't stop the inliner... )? Sounds like maybe it's more > > like what I'm thinking of is "what if this function had weak linkage" > > (ie: could be replaced by a totally different one)? > > Yes, weak should do the trick. That said, if you want "separate > object file without LTO", go with `noinline` + `noipa`. This will > make the call edges optimization barriers. If you want to also not > optimize the function, add `optnone` as required.Starts to feel like a long list to get what seems like one wholistic concept ("treat it as though it were a separate non-LTO object file" / "treat it as though this function had weak linkage"), but it's probably OK/a fine thing to do (not like there's a high cost to having multiple attributes since the attribute lists are shared, etc) - just psychologically for me, there seems to be one core concept and stitching it together from several attributes makes me worry that there are gaps (as there have been/what's motivating this discussion - though it certainly sounds like there will be fewer gaps after this work, for sure).> FWIW, if all functions are `optnone`, `noinline` and `noipa` are > not needed, that is the -O0 case. More specifically, if your caller > is `optnone`, `noinline` and `noipa` are not needed (for that caller).Right - though LTO is the case that motivated adding optnone for -O0, so it would be respected even under LTO - so in that case we'd want noinline and noipa, by the sounds of it. - Dave> ~ Johannes > > > > > > On Thu, Apr 22, 2021 at 9:23 AM Johannes Doerfert > > <johannesdoerfert at gmail.com> wrote: > >> > >> On 4/22/21 11:05 AM, paul.robinson at sony.com wrote: > >>>> On 4/22/21 9:43 AM, paul.robinson at sony.com wrote: > >>>>>> That said, I believe it is a mistake that `optnone` requires > >>>>>> `noinline`. There is no reason for it to do so on the IR level. > >>>>>> If you argue C-level `optnone` should imply `noinline`, that is > >>>>>> a something worth discussing, though on the IR level we can > >>>>>> decouple them. Use case, for example, the not-optimized version > >>>>>> is called from functions that are `optnone` themselves while > >>>>>> other call sites are inlined and your function is optimized. So > >>>>>> you can use the two attributes to do context sensitive `optnone`. > >>>>> The original intent for `optnone` was to imitate the -O0 pipeline > >>>>> to the extent that was feasible. The -O0 pipeline (as constructed > >>>>> by Clang) runs just the always-inliner, not the regular inliner; > >>>>> so, functions marked `optnone` should not be inlined. The way > >>>>> to achieve that effect most simply is to have `optnone` require > >>>>> `noinline` and that's what we did. > >>>>> > >>>>> If we have `optnone` stop requiring `noinline` and teach the > >>>>> inliner to inline an `optnone` callee only into an `optnone` caller, > >>>>> then we are violating the intent that `optnone` imitate -O0, because > >>>>> that inlining would not have happened at -O0. > >>>> I think I initially read this wrong, hence the part below. > >>>> After reading it again, I have one question: Why would the > >>>> inliner inline something that is not `always_inline` into > >>>> an `optnone` caller? That would violate the idea of `optnone`, > >>>> IMHO, regardless if the callee is `optnone` or not. That is > >>>> why I don't believe `noinline` on the callee is necessary > >>>> for your use case. > >>> The inliner should be ignoring `optnone` callers, so it would > >>> never inline *anything* into an `optnone` caller. (Other than > >>> an `alwaysinline` function.) > >> My point is, it already does ignore `optnone` callers > >> and inlines only `alwaysinline` calls into them: > >> > >> https://clang.godbolt.org/z/fznbjTEd5 > >> > >> > >>> I had read this: > >>> > >>>>>> I believe it is a mistake that `optnone` requires `noinline`. > >>> and the case that came to mind is inlining an `optnone` callee > >>> into a not-`optnone` caller. The inlined copy would then be > >>> treated to further optimization, which violates the idea of > >>> `optnone`. > >> But that is a composition issue. If you do not want to > >> inline a `optnone` callee into an non-`optnone` caller, > >> then add `noinline` to the callee. If you don't mind if > >> it is inlined into non-`optnone` callers and optimized > >> in there, then don't. My last email contained an example > >> to show the different cases, you can mix and match IR > >> attributes to get what you want. Requiring them to be > >> tied is not improving anything but just restricting the > >> options. > >> > >> > >>> Now, the inliner already knows to avoid `noinline` callees, so > >>> attaching `noinline` to `optnone` functions was (at the time) > >>> considered an optimal way to avoid the problematic case. We > >>> could instead teach the inliner to skip `optnone` callees, and > >>> that would allow us to eliminate the requirement that `optnone` > >>> functions must also be `noinline`. I am unclear why redefining > >>> `optnone` to _imply_ `noinline` (rather than _require_ `noinline`) > >>> is better, but then I don't work much with attributes. > >> The inliner will not inline calls into an `optnone` caller > >> if it is not necessary. As said before, that would violate > >> the `optnone` idea for the caller, no matter what the callee > >> looks like. So requiring `noinline` on the callee seems > >> to me like a workaround or an oversight. > >> > >> It is better to not require them together because you can > >> actually describe more distinct scenarios. Please take > >> another look at my example in the last email, it shows > >> what is possible if you split them. Furthermore, `optnone` > >> does by design imply `noinline` for the call sites in the > >> caller, or at least nobody argued that it shouldn't. Thus, > >> requiring `noinline` on the callee is simply unnecessary > >> as it does not add any value. > >> > >> > >>> The notion of allowing an `optnone` caller to inline an `optnone` > >>> callee sounds like it would also violate the intent of `optnone` > >>> in that it should imitate -O0, where inlining is confined to > >>> `alwaysinline` callees, and `optnone` is defined to conflict with > >>> `alwaysinline` (because if you always inline something, you are > >>> allowing it to have subsequent optimizations same as the caller, > >>> which conflicts with `optnone`). > >> Nobody said `optnone` callers should inline calls that are > >> not always_inline, at least so far I have not seen that > >> argument be made anywhere. I'll just skip this paragraph. > >> > >> > >>> So, if you want to undo the _requirement_ that `optnone` must > >>> have `noinline`, but then redefine `optnone` such that it can't > >>> be inlined anywhere, you've done something that seems to have no > >>> practical effect. Maybe that helps Attributor in some way, but > >>> I don't see any other reason to be making this change. > >> I do not want to say `optnone` cannot be inlined. `noinline` > >> says it cannot be inlined. If you want it to not be inlined, > >> use `noinline`, if you want it to not be optimized in it's > >> own function, use `optnone`, if you want both, use both. > >> > >> The practical effect was literally showcased in my last email, > >> please go back and look at the example. > >> > >> I don't know why the Attributor has to do with this, I'm happy > >> to hear your thoughts on that though :) > >> > >> ~ Johannes > >> > >> > >>> --paulr > >>> > >>