thr3ads.net - llvm dev - [llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations [Apr 2021]

If this information is useful, please help other people find it:
Share via:

David Blaikie via llvm-dev

2021-Apr-22 02:29 UTC

[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

Implemented a first-pass at adding noipa IR/bitcode and the basic
functionality, noipa implying "may be unrefined"/not
is-definition-exact. https://reviews.llvm.org/D101011

On Mon, Apr 19, 2021 at 4:32 PM Johannes Doerfert
<johannesdoerfert at gmail.com> wrote:>
>
> On 4/19/21 4:41 PM, David Blaikie wrote:
> > On Sun, Apr 18, 2021 at 10:30 PM Johannes Doerfert
> > <johannesdoerfert at gmail.com>  wrote:
> >> On 4/18/21 10:51 PM, David Blaikie wrote:
> >>> On Sun, Apr 18, 2021 at 8:29 PM Johannes Doerfert <
> >>> johannesdoerfert at gmail.com> wrote:
> >>>
> >>>> I'm very much in favor of `noipa`. It comes up every
few months
> >>>> and it would be widely useful.
> >>> Out of curiosity, what sort of uses do you have in mind for
it?
> >> Most times people basically want `noinline` to also mean "no
> >> interprocedural optimization", but without `optnone`. So,
your
> >> function is optimized but actually called and the call result
> >> is used, no constants are propagated etc.
> >>
> >> Example:
> >>
> >> ```
> >> __attribute__((noipa))
> >> void foo() { return 1 + 2; }
> >> void bar() { return foo(); }
> >> ```
> >> should become
> >>
> >> ```
> >> __attribute__((noipa))
> >> void foo() { return 3; }
> >> void bar() { return foo(); }
> >> ```
> >> which it does not right now.
> > I'm curious what the use case is you've come across (the
justification
> > for the GCC implementation of noipa was mostly for compiler testing -
> > which is my interest in having these semantics (under optnone or
> > otherwise) - so just curious what other use cases I should have in
> > mind, etc)
>
> I looked for `noipa` in my inbox, here are some results that
> show different use cases people brought up since March 2020:
>
> https://reviews.llvm.org/D75815#1939277
> https://bugs.llvm.org/show_bug.cgi?id=46463
> https://reviews.llvm.org/D93838#2472155
> https://reviews.llvm.org/D97971#2608302
>
> Another use case is runtime call detection in the presence of definitions.
> So, we detect `malloc` and also various OpenMP runtime calls, which works
> fine because those are usually declarations. However, sometimes they are
> not and then we can easily end up with signatures that do not match what we
> expect anymore. At least that happens if we link in the OpenMP GPU runtime
> into an application.
Ah, thanks for all the links/context!
> >>>> I'd expose it via Clang and -O0 could
> >>>> set it as well (for the LTO case).
> >>>>
> >>>> When it comes to inexact definitions, optnone functions,
and existing
> >>>> attributes,
> >>>> I'd be in favor of 1) always allowing the use of
existing attributes,
> >>>>
> >>> I'm not sure what you mean by this ^ - could you
rephrase/elaborate?
> >>>
> >>>
> >>>> and 2) not deriving new ones for an inexact or optnone
definition.
> >>>>
> >>> Also this ^ I'm similarly confused/unclear about.
> >> So if you have a call of F, and F has attribute A, we can use
> >> that fact at the call site, regardless of the definition of F.
> >> F could be `optnone` or with non-exact linkage, but the
information
> >> attached to it is still usable.
> > +1 SGTM.
> >
> >> If we go for the above we can never derive/attach information
> >> for a non-exact linkage definitions. That way we prevent IPO from
> >> using information that might be invalid if the definition is
replaced.
> > Yup, sounds good.
> >
> >> It is all about where you disturb the ipo deduction in this case,
I think
> >> it is more beneficial to not attach new things but an argument
could be
> >> made to allow that but no propagation.
> > Allow adding them, but never using them? Yeah, that doesn't seem
> > especially helpful/useful - the attributes are entirely for IPO, so if
> > you want to block IPO it seems best not to add them.
>
> We could use them *inside* the function, but we can make that work
> differently as well. IPO seems the more important target.
Ah, right. Yeah, agreed.
> >> Both have benefits, its' not 100%
> >> clear what is more desirable at the end of the day.
> >>
> >>
> >>>> This is how the Attributor determines if it a function
level attribute
> >>>> could
> >>>> be derived or if we should only stick with the existing
information:
> >>>>
> >>>>        /// Determine whether the function \p F is IPO
amendable
> >>>>        ///
> >>>>        /// If a function is exactly defined or it has
alwaysinline attribute
> >>>>        /// and is viable to be inlined, we say it is IPO
amendable
> >>>>        bool isFunctionIPOAmendable(const Function &F)
{
> >>>>          return F.hasExactDefinition() ||
> >>>> InfoCache.InlineableFunctions.count(&F);
> >>>>        }
> >>>>
> >>>> So, if the above check doesn't hold we will not add
new attributes but
> >>>> we will
> >>>> still use existing ones. This seems to me the right way to
allow
> >>>> users/frontends
> >>>> to provide information selectively.
> >>>>
> >>> Yep, that sounds right to me (if you put attributes on an
optnone/noipa
> >>> function, they should be usable/used - but none should be
discovered/added
> >>> later by inspection of the implementation of such a function)
- currently
> >>> doesn't seem to be the case for the (old pass manager?)
FunctionAttrs pass,
> >>> so I have to figure some things out there.
> >> That is what I tried to say above, I think.
> >>
> >> In the end, I want to know that foo does not access memory but
> >> bar could for all we know:
> >>
> >> ```
> >> __attribute__((pure, optnone))         // or non-exact linkage
> >> void pure_optnone() { /* empty */ }
> >>
> >> __attribute__((optnone))               // or non-exact linkage
> >> void optnone() { /* empty */ }
> >>
> >> void foo() { pure_optnone(); }
> >>
> >> void bar() { optnone(); }
> >> ```
> > Got it,
> >
> > I'll see about posting an implementation of noipa and switching
> > __attribute__((optnone)) over to lower to LLVM's optnone+noipa
rather
> > than optnone+noinline.
>
> FWIW, I think `noipa` should not imply `noinline`, unsure if you
> had that in mind or not.
Do you think it should require that noipa also carries noinline? (the
way optnone currently requires noinline) Or should we let the
non-inlining fall out naturally from the non-exact definition
property?
> > Happy if someone wants to add clang support for an
> > __attribute__((noipa)) lowering to that LLVM noipa once it's in
(maybe
> > I'll do it, guess it's probably fairly cheap/easy).
>
> Agreed, I won't volunteer right now, I doubt that I'll get to it
> anytime soon. That said, I actually would like to use `noipa`, see
> above.
>
> ~ Johannes
>
>
> > - Dave
> >
> >> ~ Johannes
> >>
> >>
> >>>> That said, right now the Attributor will not propagate any
information
> >>>> from an
> >>>> optnone function or derive new information. Nevertheless,
I'd be in
> >>>> favor to allow
> >>>> existing information to be used for IPO.
> >>>>
> >>> *nod* I think I'm with you there.
> >>>
> >>> - Dave
> >>>
> >>>
> >>>> ~ Johannes
> >>>>
> >>>>
> >>>> On 4/18/21 8:40 PM, David Blaikie via llvm-dev wrote:
> >>>>> Prototyping the idea of "isDefinitionExact"
returning false for optnone
> >>>>> (whether or not we split it out into noipo or not)
I've tripped over
> >>>>> something it seems I created 5 years ago:
> >>>>>
> >>>>> I added some IPC support for optnone to GlobalsModRef:
> >>>>>
> >>>>
https://github.com/llvm/llvm-project/commit/c662b501508200076e581beb9345a7631173a1d8#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1
> >>>>> - so it wouldn't conclude properties of an optnone
function.
> >>>>>
> >>>>> But I then made a follow-up commit (without a lot of
context as to why,
> >>>>> unfortunately :/ ) that allowed GlobasModRef to use
existing attributes
> >>>> on
> >>>>> an optnone function:
> >>>>>
> >>>>
https://github.com/llvm/llvm-project/commit/7a9b788830da0a426fb0ff0a4cec6d592bb026e9#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1
> >>>>> But it seems making the function definition inexact,
breaks the unit
> >>>>> testing added in the latter commit. I suppose then
it's an open question
> >>>>> whether existing attributes on an inexact definition
should be used at
> >>>> all?
> >>>>> (I don't know what motivated me to support them
for optnone)
> >>>>>
> >>>>> Oh, and here's a change from Chandler around the
same time similarly
> >>>>> blocking some ipo for optnone:
> >>>>>
> >>>>
https://github.com/llvm/llvm-project/commit/0fb998110abcf3d67495d12f854a1576b182d811#diff-cc618a9485181a9246c4e0367dc9f1a19d3cb6811d1e488713f53a753d3da60c
> >>>>> - in this case preventing FunctionAttrs from deriving
the attributes for
> >>>> an
> >>>>> optnone function. That functionality looks like it can
be subsumed by the
> >>>>> inexact approach - applying inexact to optnone and
removing the change in
> >>>>> Chandler's patch still passes the tests. (hmm,
tested - not quite, but
> >>>> more
> >>>>> work to do there)
> >>>>>
> >>>>> On Sun, Apr 18, 2021 at 10:06 AM David
Blaikie<dblaikie at gmail.com>
> >>>> wrote:
> >>>>>> On Sun, Apr 18, 2021 at 9:43 AM Roman
Lebedev<lebedev.ri at gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> There's 'noipa' attribute in GCC,
currently it is not supported by
> >>>> clang.
> >>>>>>> Theoretically, how would one implement it?
> >>>>>>>
> >>>>>> If we wanted to do this really robustly, I guess
we might have to
> >>>>>> introduce some sort of "here's the usual
way to check if this is a
> >>>>>> definition/get the body of the function"
(which for noipa it says
> >>>> "there is
> >>>>>> no body/don't look here") and "no,
really, I need the definition" (for
> >>>>>> actual code generation).
> >>>>>>
> >>>>>> Though I'm not advocating for that - I'm
OK with a more
> >>>> ad-hoc/best-effort
> >>>>>> implementation targeting the -O0/debugging
assistant
> >>>>>> __attribute__((optnone)) kind of use case - happy
to fix cases as they
> >>>> come
> >>>>>> up to improve the user experience for these
situations.
> >>>>>>
> >>>>>> Maybe we could get away with generalizing this by
having an optnone (or
> >>>>>> noipa) function appear "interposable"
even though it doesn't have a real
> >>>>>> interposable linkage? That should hinder/disable
any IPA.
> >>>>>>
> >>>>>> Hmm, looks like GlobalValue::isDefinitionExact
would be best to return
> >>>>>> false in this case (whatever we end up naming it)
/maybe/
> >>>>>> mayBeDerefined should return false too.
> >>>>>>
> >>>>>> Yeah, I guess if we can implement such a robust
generalization, then
> >>>> it'd
> >>>>>> probably be OK/easy enough to implement both noipa
and optnone implies
> >>>>>> noipa the same as it implies noinline (well, I
guess noipa would subsume
> >>>>>> the noinline implication - if the function
isn't exact, the inliner
> >>>> won't
> >>>>>> inline it so there wouldn't be any need for
the explicit noinline)
> >>>>>>
> >>>>>>
> >>>>>>> With your proposal, clang `noipa` attribute
could be lowered
> >>>>>>> to `optnone` on the whole function, To me that
seems like
> >>>>>>> too much of a hammer, should that be the path
forward.
> >>>>>>>
> >>>>>> I agree that lowering noipa to optnone would be a
very aggressive form
> >>>> of
> >>>>>> noipa - likely if we want to support noipa it
would be to support it
> >>>>>> separately and maybe either lower -O0 (& maybe
> >>>> __attribute__((optnone))) to
> >>>>>> both optnone+noipa+noinline (since optnone already
implies noinline) or
> >>>>>> make optnone imply ipa/be a superset of it
implicitly (if we do have
> >>>> noipa
> >>>>>> it's probably best to have "optnone
requires noipa" the same way
> >>>> "optnone
> >>>>>> requires noinline" rather than an implicit
superset sort of thing).
> >>>>>>
> >>>>>> I think that'd certainly be appropriate for
-O0, and I'd argue it'd be
> >>>>>> appropriate for __attribute__((optnone)) because I
think it'd be what
> >>>>>> people expect/is consistent with the motivation
for the attribute (for
> >>>>>> debuggability - so you wouldn't want a caller
to not fill in
> >>>>>> parameters/pass in garbage because it knows the
implementation doesn't
> >>>>>> matter, or not use the result because it knows
what the result should
> >>>> be).
> >>>>>>> Would it not be best to not conflate the two,
> >>>>>>> and just introduce the `noipa` attribute?
> >>>>>>>
> >>>>>> I think we'd still want to conflate them for
user-facing functionality,
> >>>>>> even if they were separable at the IR level.
> >>>>>>
> >>>>>> - Dave
> >>>>>>
> >>>>>>
> >>>>>>> Roman
> >>>>>>>
> >>>>>>> On Sun, Apr 18, 2021 at 7:37 PM David
Blaikie<dblaikie at gmail.com>
> >>>> wrote:
> >>>>>>>> While trying to reproduce some debug info
thing (I don't have the
> >>>> exact
> >>>>>>> example at the moment - but I think it was
more aggressive than the
> >>>> example
> >>>>>>> I have now, but something like this:
> >>>>>>>> __attribute__((optnone)) int f1() {
> >>>>>>>>      return 3;
> >>>>>>>> }
> >>>>>>>> int main() {
> >>>>>>>>      return f1();
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> (actually I think in my case I had a
variable to hold the return value
> >>>>>>> from f1, with the intent that this
variable's location couldn't use a
> >>>>>>> constant - a load from a volatile variable
would probably have provided
> >>>>>>> similar functionality in this case)
> >>>>>>>> LLVM (& specifically Sparse
Conditional Constant Propagation,
> >>>>>>> llvm/lib/Transforms/Scalar/SCCP.cpp) optimizes
this code noting that f1
> >>>>>>> always returns 3, so rather than using the
return value from the call
> >>>> to
> >>>>>>> f1, it ends up hardcoding the return value:
> >>>>>>>> define dso_local i32 @main()
local_unnamed_addr #1 {
> >>>>>>>>
> >>>>>>>> entry:
> >>>>>>>>
> >>>>>>>>      %call = tail call i32 @_Z2f1v()
> >>>>>>>>
> >>>>>>>>      ret i32 3
> >>>>>>>>
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I consider this a bug - in that optnone is
used to implement -O0 for
> >>>>>>> LTO, so it seemed to me that the correct
behavior is for an optnone
> >>>>>>> function to behave as though it were compiled
in another object file
> >>>>>>> outside the purview of optimizations -
interprocedural or
> >>>> intraprocedural.
> >>>>>>>> So I senthttps://reviews.llvm.org/D100353 
to fix that.
> >>>>>>>>
> >>>>>>>> Florian pointed out that this wasn't
quite specified in the LangRef,
> >>>>>>> which says this about optnone:
> >>>>>>>> This function attribute indicates that
most optimization passes will
> >>>>>>> skip this function, with the exception of
interprocedural optimization
> >>>>>>> passes. Code generation defaults to the “fast”
instruction selector.
> >>>> This
> >>>>>>> attribute cannot be used together with the
alwaysinline attribute; this
> >>>>>>> attribute is also incompatible with the
minsize attribute and the
> >>>> optsize
> >>>>>>> attribute.
> >>>>>>>> This attribute requires the noinline
attribute to be specified on the
> >>>>>>> function as well, so the function is never
inlined into any caller.
> >>>> Only
> >>>>>>> functions with the alwaysinline attribute are
valid candidates for
> >>>> inlining
> >>>>>>> into the body of this function.
> >>>>>>>> So the spec of optnone is unclear (or
arguably explicitly disallows)
> >>>>>>> whether interprocedural optimizations should
treat optnone functions
> >>>> in any
> >>>>>>> particular way.
> >>>>>>>> So I was going to update the wording to
rephrase this to say
> >>>>>>> "Interprocedural optimizations should
treat this function as though it
> >>>> were
> >>>>>>> defined in an isolated module/object."
(perhaps "interprocedural
> >>>>>>> optimizations should treat optnone functions
as opaque" or "as though
> >>>> they
> >>>>>>> were only declarations")
> >>>>>>>> The choice of this direction was based on
my (possibly incorrect or
> >>>>>>> debatable) understanding of optnone, that it
was equivalent to the
> >>>> function
> >>>>>>> being in a separate/non-lto object. (this
seems consistent with the way
> >>>>>>> optnone is used to implement -O0 under lto -
you could imagine a user
> >>>>>>> debugging a binary, using -O0 for the code
they're interested in
> >>>> debugging,
> >>>>>>> and potentially using an interactive debugger
to change some state in
> >>>> the
> >>>>>>> function causing it to return a different
value - which would get quite
> >>>>>>> confusing if the return value was effectively
hardcoded into the
> >>>> caller)
> >>>>>>>> What're folks thoughts on this?
> >>>>>>>>
> >>>>>>>> - Dave
> >>>>> _______________________________________________
> >>>>> LLVM Developers mailing list
> >>>>> llvm-dev at lists.llvm.org
> >>>>>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Johannes Doerfert via llvm-dev

2021-Apr-22 04:08 UTC

head link

[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

On 4/21/21 9:29 PM, David Blaikie wrote:>
>>> I'll see about posting an implementation of noipa and switching
>>> __attribute__((optnone)) over to lower to LLVM's optnone+noipa
rather
>>> than optnone+noinline.
>> FWIW, I think `noipa` should not imply `noinline`, unsure if you
>> had that in mind or not.
> Do you think it should require that noipa also carries noinline? (the
> way optnone currently requires noinline) Or should we let the
> non-inlining fall out naturally from the non-exact definition
> property?
>So, non-exact definitions do not prevent inlining. You can even
create an internal copy and use that for IPO, think of it as
"inline-then-outline".

That said, I believe it is a mistake that `optnone` requires
`noinline`. There is no reason for it to do so on the IR level.
If you argue C-level `optnone` should imply `noinline`, that is
a something worth discussing, though on the IR level we can
decouple them. Use case, for example, the not-optimized version
is called from functions that are `optnone` themselves while
other call sites are inlined and your function is optimized. So
you can use the two attributes to do context sensitive `optnone`.

Circling back to `noipa`, I'm very much in favor of letting it
compose freely with the others, at least in the IR. So, it does
not require, nor imply `noinline` or `optnone`. Similarly,
`noinline` does not imply `noipa`, neither does `optnone`. The
latter might be surprising but I imagine I can use function
attributes of an `optnone` function at the call site but I will
not if the function is `noipa`.

Others might have different opinions though.

~ Johannes

llvm dev - Apr 2021 - Revisiting/refining the definition of optnone with interprocedural transformations

[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations