Johannes Doerfert via llvm-dev
2021-Apr-22 17:32 UTC
[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations
On 4/22/21 12:21 PM, David Blaikie wrote:> On Thu, Apr 22, 2021 at 10:04 AM Johannes Doerfert > <johannesdoerfert at gmail.com> wrote: >> >> On 4/22/21 11:49 AM, David Blaikie wrote: >>> There seems to be a bunch of confusion and probably some >>> conflation/ambiguity about whether we're talking about IR constructs >>> on the C attributes. >>> >>> Johannes - I assume your claim is restricted mostly to the IR? That >>> having optnone not require or imply noinline improves orthogonality of >>> features and that there are reasonable use cases where one might want >>> optnone while allowing inlining (of the optnone function) or optnone >>> while disallowing inlining (of the optnone function) >> Yes, IR level is my concern. I'd be in favor of matching it in C >> but I don't care enough to go through the discussions. >> >> My latest use case: `s/optnone//g` should not result in a verifier >> error when you just want to try out an optimization on a single function. > I guess you meant adding optnone, rather than removing it? (removing > optnone shouldn't cause any verifier errors, does it?)Yes, right, my bad. Removing `noinline` or adding `optnone` can get you in trouble.>>> Paul - I think you're mostly thinking about/interested in the specific >>> source level/end user use case that motivated the initial >>> implementation of optnone. Where, I tend to agree - inlining an >>> optnone function is not advantageous to the user. Though it's possible >>> Johannes 's argument could be generalized from IR to C and still >>> apply: orthogonal features are more powerful and the user can always >>> compose them together to get what they want. (good chance they're >>> using attributes behind macros for ease of use anyway - they're a bit >>> verbose to write by hand all the time) >> I'd give the user the capability building blocks and let them work >> with that, but again, this is not my main concern right now. >> >> >>> There's also the -O0 use of optnone these days (clang puts optnone on >>> all functions when compiling with -O0 - the intent being to treat such >>> functions as though they were compiled in a separate object file >>> without optimizations (that's me projecting what I /think/ the mental >>> model should be) - which, similarly, I think will probably want to >>> keep the current behavior (no ipa/inlining and no optimization - >>> however that's phrased). >>> >>> Essentially the high level use cases of optnone all look like "imagine >>> if I compiled this in a separate object file without LTO" - apparently >>> even noipa+optnone isn't enough for that, maybe (I need to test that >>> more, based on a comment from Johannes earlier that inexact >>> definitions don't stop the inliner... )? Sounds like maybe it's more >>> like what I'm thinking of is "what if this function had weak linkage" >>> (ie: could be replaced by a totally different one)? >> Yes, weak should do the trick. That said, if you want "separate >> object file without LTO", go with `noinline` + `noipa`. This will >> make the call edges optimization barriers. If you want to also not >> optimize the function, add `optnone` as required. > Starts to feel like a long list to get what seems like one wholistic > concept ("treat it as though it were a separate non-LTO object file" / > "treat it as though this function had weak linkage"), but it's > probably OK/a fine thing to do (not like there's a high cost to having > multiple attributes since the attribute lists are shared, etc) - just > psychologically for me, there seems to be one core concept and > stitching it together from several attributes makes me worry that > there are gaps (as there have been/what's motivating this discussion - > though it certainly sounds like there will be fewer gaps after this > work, for sure).If you want "separate non-LTO" behavior by design, put it in a different file and don't compile that file with LTO ;) Inside a single TU there is no "separate non-LTO" idea, it is a single TU after all. To get the same effect we build it from blocks that have a meaning in the single TU case. Sure, there might be gaps left, that usually means there is something missing in the single TU case as well so a new attribute is in order.>> FWIW, if all functions are `optnone`, `noinline` and `noipa` are >> not needed, that is the -O0 case. More specifically, if your caller >> is `optnone`, `noinline` and `noipa` are not needed (for that caller). > Right - though LTO is the case that motivated adding optnone for -O0, > so it would be respected even under LTO - so in that case we'd want > noinline and noipa, by the sounds of it.If you run only one file with -O0 and then LTO it with other files that do not have -O0, you probably want to add `noinline` + `noipa` at the "entry points" that you want to debug. I think -O0 could reasonably add all three arguments anyway, if you say O0 they all make sense (to me). If you want more control, you need to seed them manually. ~ Johannes> > - Dave > >> ~ Johannes >> >> >>> On Thu, Apr 22, 2021 at 9:23 AM Johannes Doerfert >>> <johannesdoerfert at gmail.com> wrote: >>>> On 4/22/21 11:05 AM, paul.robinson at sony.com wrote: >>>>>> On 4/22/21 9:43 AM, paul.robinson at sony.com wrote: >>>>>>>> That said, I believe it is a mistake that `optnone` requires >>>>>>>> `noinline`. There is no reason for it to do so on the IR level. >>>>>>>> If you argue C-level `optnone` should imply `noinline`, that is >>>>>>>> a something worth discussing, though on the IR level we can >>>>>>>> decouple them. Use case, for example, the not-optimized version >>>>>>>> is called from functions that are `optnone` themselves while >>>>>>>> other call sites are inlined and your function is optimized. So >>>>>>>> you can use the two attributes to do context sensitive `optnone`. >>>>>>> The original intent for `optnone` was to imitate the -O0 pipeline >>>>>>> to the extent that was feasible. The -O0 pipeline (as constructed >>>>>>> by Clang) runs just the always-inliner, not the regular inliner; >>>>>>> so, functions marked `optnone` should not be inlined. The way >>>>>>> to achieve that effect most simply is to have `optnone` require >>>>>>> `noinline` and that's what we did. >>>>>>> >>>>>>> If we have `optnone` stop requiring `noinline` and teach the >>>>>>> inliner to inline an `optnone` callee only into an `optnone` caller, >>>>>>> then we are violating the intent that `optnone` imitate -O0, because >>>>>>> that inlining would not have happened at -O0. >>>>>> I think I initially read this wrong, hence the part below. >>>>>> After reading it again, I have one question: Why would the >>>>>> inliner inline something that is not `always_inline` into >>>>>> an `optnone` caller? That would violate the idea of `optnone`, >>>>>> IMHO, regardless if the callee is `optnone` or not. That is >>>>>> why I don't believe `noinline` on the callee is necessary >>>>>> for your use case. >>>>> The inliner should be ignoring `optnone` callers, so it would >>>>> never inline *anything* into an `optnone` caller. (Other than >>>>> an `alwaysinline` function.) >>>> My point is, it already does ignore `optnone` callers >>>> and inlines only `alwaysinline` calls into them: >>>> >>>> https://clang.godbolt.org/z/fznbjTEd5 >>>> >>>> >>>>> I had read this: >>>>> >>>>>>>> I believe it is a mistake that `optnone` requires `noinline`. >>>>> and the case that came to mind is inlining an `optnone` callee >>>>> into a not-`optnone` caller. The inlined copy would then be >>>>> treated to further optimization, which violates the idea of >>>>> `optnone`. >>>> But that is a composition issue. If you do not want to >>>> inline a `optnone` callee into an non-`optnone` caller, >>>> then add `noinline` to the callee. If you don't mind if >>>> it is inlined into non-`optnone` callers and optimized >>>> in there, then don't. My last email contained an example >>>> to show the different cases, you can mix and match IR >>>> attributes to get what you want. Requiring them to be >>>> tied is not improving anything but just restricting the >>>> options. >>>> >>>> >>>>> Now, the inliner already knows to avoid `noinline` callees, so >>>>> attaching `noinline` to `optnone` functions was (at the time) >>>>> considered an optimal way to avoid the problematic case. We >>>>> could instead teach the inliner to skip `optnone` callees, and >>>>> that would allow us to eliminate the requirement that `optnone` >>>>> functions must also be `noinline`. I am unclear why redefining >>>>> `optnone` to _imply_ `noinline` (rather than _require_ `noinline`) >>>>> is better, but then I don't work much with attributes. >>>> The inliner will not inline calls into an `optnone` caller >>>> if it is not necessary. As said before, that would violate >>>> the `optnone` idea for the caller, no matter what the callee >>>> looks like. So requiring `noinline` on the callee seems >>>> to me like a workaround or an oversight. >>>> >>>> It is better to not require them together because you can >>>> actually describe more distinct scenarios. Please take >>>> another look at my example in the last email, it shows >>>> what is possible if you split them. Furthermore, `optnone` >>>> does by design imply `noinline` for the call sites in the >>>> caller, or at least nobody argued that it shouldn't. Thus, >>>> requiring `noinline` on the callee is simply unnecessary >>>> as it does not add any value. >>>> >>>> >>>>> The notion of allowing an `optnone` caller to inline an `optnone` >>>>> callee sounds like it would also violate the intent of `optnone` >>>>> in that it should imitate -O0, where inlining is confined to >>>>> `alwaysinline` callees, and `optnone` is defined to conflict with >>>>> `alwaysinline` (because if you always inline something, you are >>>>> allowing it to have subsequent optimizations same as the caller, >>>>> which conflicts with `optnone`). >>>> Nobody said `optnone` callers should inline calls that are >>>> not always_inline, at least so far I have not seen that >>>> argument be made anywhere. I'll just skip this paragraph. >>>> >>>> >>>>> So, if you want to undo the _requirement_ that `optnone` must >>>>> have `noinline`, but then redefine `optnone` such that it can't >>>>> be inlined anywhere, you've done something that seems to have no >>>>> practical effect. Maybe that helps Attributor in some way, but >>>>> I don't see any other reason to be making this change. >>>> I do not want to say `optnone` cannot be inlined. `noinline` >>>> says it cannot be inlined. If you want it to not be inlined, >>>> use `noinline`, if you want it to not be optimized in it's >>>> own function, use `optnone`, if you want both, use both. >>>> >>>> The practical effect was literally showcased in my last email, >>>> please go back and look at the example. >>>> >>>> I don't know why the Attributor has to do with this, I'm happy >>>> to hear your thoughts on that though :) >>>> >>>> ~ Johannes >>>> >>>> >>>>> --paulr >>>>>
via llvm-dev
2021-Apr-22 18:41 UTC
[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations
I'll just say,> If you want "separate non-LTO" behavior by design, put it in a > different file and don't compile that file with LTO ;)that is impractical in many build systems, and an excessive amount of work if you want to selectively build a some code with -O0 because you're debugging something at the moment. --paulr