thr3ads.net - llvm dev - [llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations [Apr 2021]

David Blaikie via llvm-dev

2021-Apr-19 21:41 UTC

[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

On Sun, Apr 18, 2021 at 10:30 PM Johannes Doerfert
<johannesdoerfert at gmail.com> wrote:
>
> On 4/18/21 10:51 PM, David Blaikie wrote:
> > On Sun, Apr 18, 2021 at 8:29 PM Johannes Doerfert <
> > johannesdoerfert at gmail.com> wrote:
> >
> >> I'm very much in favor of `noipa`. It comes up every few months
> >> and it would be widely useful.
> >
> > Out of curiosity, what sort of uses do you have in mind for it?
>
> Most times people basically want `noinline` to also mean "no
> interprocedural optimization", but without `optnone`. So, your
> function is optimized but actually called and the call result
> is used, no constants are propagated etc.
>
> Example:
>
> ```
> __attribute__((noipa))
> int foo() { return 1 + 2; }
> int bar() { return foo(); }
> ```
> should become
>
> ```
> __attribute__((noipa))
> int foo() { return 3; }
> int bar() { return foo(); }
> ```
> which it does not right now.
I'm curious what the use case is you've come across (the justification
for the GCC implementation of noipa was mostly for compiler testing -
which is my interest in having these semantics (under optnone or
otherwise) - so just curious what other use cases I should have in
mind, etc)
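
For anyone who wants to try the quoted example, here is a minimal sketch, assuming a GCC- or Clang-style compiler that provides `__has_attribute`. Recent GCC releases accept `__attribute__((noipa))`; Clang did not at the time of this thread, so the hypothetical `DEMO_NOIPA` macro below falls back to `noinline`, which is weaker because it only blocks inlining and not other interprocedural propagation.

```cpp
#include <cassert>

// DEMO_NOIPA is a hypothetical helper macro, not part of any compiler.
#if defined(__has_attribute)
#  if __has_attribute(noipa)
#    define DEMO_NOIPA __attribute__((noipa))
#  endif
#endif
#ifndef DEMO_NOIPA
#  define DEMO_NOIPA __attribute__((noinline))  // weaker fallback
#endif

// The body itself may still be optimized (1 + 2 folded to 3).
DEMO_NOIPA int foo() { return 1 + 2; }

// With noipa the intent is that the call stays and its result is used,
// instead of bar() being rewritten to return the constant directly.
int bar() { return foo(); }

// Small self-check; the observable result is the same either way.
void selfCheck() { assert(bar() == 3); }
```

The difference between the attributes is only visible in the generated code for `bar`, not in program output, so inspecting the assembly or IR is the way to see it.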
> >> I'd expose it via Clang and -O0 could
> >> set it as well (for the LTO case).
> >>
> >> When it comes to inexact definitions, optnone functions, and existing
> >> attributes,
> >> I'd be in favor of 1) always allowing the use of existing attributes,
> >>
> > I'm not sure what you mean by this ^ - could you rephrase/elaborate?
> >
> >
> >> and 2) not deriving new ones for an inexact or optnone definition.
> >>
> > Also this ^ I'm similarly confused/unclear about.
>
> So if you have a call of F, and F has attribute A, we can use
> that fact at the call site, regardless of the definition of F.
> F could be `optnone` or with non-exact linkage, but the information
> attached to it is still usable.
+1 SGTM.
> If we go for the above we can never derive/attach information
> for non-exact linkage definitions. That way we prevent IPO from
> using information that might be invalid if the definition is replaced.
Yup, sounds good.
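
The two rules agreed on above (use existing attributes, never derive new ones for inexact or optnone definitions) can be sketched as a toy model. This is only an illustration: `ToyFunction`, `mayDeriveNewAttributes`, and `callSiteMayAssume` are made-up names and not LLVM API.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for a function; none of these names are LLVM API.
struct ToyFunction {
  bool HasExactDefinition;           // false for interposable/inexact linkage
  bool IsOptNone;                    // carries optnone (or a future noipa)
  std::set<std::string> Attributes;  // attributes already attached
};

// Rule 2: only inspect the body to derive new attributes when the
// definition is exact and the function is not optnone.
bool mayDeriveNewAttributes(const ToyFunction &F) {
  return F.HasExactDefinition && !F.IsOptNone;
}

// Rule 1: an attribute that is already attached may be used at a call
// site regardless of how the function is defined.
bool callSiteMayAssume(const ToyFunction &F, const std::string &Attr) {
  return F.Attributes.count(Attr) != 0;
}
```

In this model an optnone function that the user marked `readonly` still lets callers assume `readonly`, while an unannotated optnone function gives callers nothing.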
> It is all about where you disturb the ipo deduction in this case, I think
> it is more beneficial to not attach new things but an argument could be
> made to allow that but no propagation.
Allow adding them, but never using them? Yeah, that doesn't seem
especially helpful/useful - the attributes are entirely for IPO, so if
you want to block IPO it seems best not to add them.
> Both have benefits, it's not 100%
> clear what is more desirable at the end of the day.
>
>
> >
> >
> >> This is how the Attributor determines if a function-level attribute
> >> could be derived or if we should only stick with the existing information:
> >>
> >>       /// Determine whether the function \p F is IPO amendable
> >>       ///
> >>       /// If a function is exactly defined or it has alwaysinline attribute
> >>       /// and is viable to be inlined, we say it is IPO amendable
> >>       bool isFunctionIPOAmendable(const Function &F) {
> >>         return F.hasExactDefinition() ||
> >>                InfoCache.InlineableFunctions.count(&F);
> >>       }
> >>
> >> So, if the above check doesn't hold we will not add new attributes but
> >> we will still use existing ones. This seems to me the right way to allow
> >> users/frontends to provide information selectively.
> >>
> > Yep, that sounds right to me (if you put attributes on an optnone/noipa
> > function, they should be usable/used - but none should be discovered/added
> > later by inspection of the implementation of such a function) - currently
> > doesn't seem to be the case for the (old pass manager?) FunctionAttrs pass,
> > so I have to figure some things out there.
>
> That is what I tried to say above, I think.
>
> In the end, I want to know that foo does not access memory but
> bar could for all we know:
>
> ```
> __attribute__((pure, optnone))         // or non-exact linkage
> void pure_optnone() { /* empty */ }
>
> __attribute__((optnone))               // or non-exact linkage
> void optnone() { /* empty */ }
>
> void foo() { pure_optnone(); }
>
> void bar() { optnone(); }
> ```
Got it,

I'll see about posting an implementation of noipa and switching
__attribute__((optnone)) over to lower to LLVM's optnone+noipa rather
than optnone+noinline.

Happy if someone wants to add clang support for an
__attribute__((noipa)) lowering to that LLVM noipa once it's in (maybe
I'll do it, guess it's probably fairly cheap/easy).

- Dave
> ~ Johannes
>
>
> >
> >
> >> That said, right now the Attributor will not propagate any information
> >> from an optnone function or derive new information. Nevertheless, I'd be
> >> in favor to allow existing information to be used for IPO.
> >>
> > *nod* I think I'm with you there.
> >
> > - Dave
> >
> >
> >> ~ Johannes
> >>
> >>
> >> On 4/18/21 8:40 PM, David Blaikie via llvm-dev wrote:
> >>> Prototyping the idea of "isDefinitionExact" returning false for optnone
> >>> (whether or not we split it out into noipa) I've tripped over
> >>> something it seems I created 5 years ago:
> >>>
> >>> I added some IPO support for optnone to GlobalsModRef:
> >>>
> >>> https://github.com/llvm/llvm-project/commit/c662b501508200076e581beb9345a7631173a1d8#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1
> >>> - so it wouldn't conclude properties of an optnone function.
> >>>
> >>> But I then made a follow-up commit (without a lot of context as to why,
> >>> unfortunately :/ ) that allowed GlobalsModRef to use existing attributes
> >>> on an optnone function:
> >>>
> >>> https://github.com/llvm/llvm-project/commit/7a9b788830da0a426fb0ff0a4cec6d592bb026e9#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1
> >>> But it seems making the function definition inexact breaks the unit
> >>> testing added in the latter commit. I suppose then it's an open question
> >>> whether existing attributes on an inexact definition should be used at
> >>> all? (I don't know what motivated me to support them for optnone)
> >>>
> >>> Oh, and here's a change from Chandler around the same time similarly
> >>> blocking some ipo for optnone:
> >>>
> >>> https://github.com/llvm/llvm-project/commit/0fb998110abcf3d67495d12f854a1576b182d811#diff-cc618a9485181a9246c4e0367dc9f1a19d3cb6811d1e488713f53a753d3da60c
> >>> - in this case preventing FunctionAttrs from deriving the attributes for
> >>> an optnone function. That functionality looks like it can be subsumed by
> >>> the inexact approach - applying inexact to optnone and removing the
> >>> change in Chandler's patch still passes the tests. (hmm, tested - not
> >>> quite, but more work to do there)
> >>>
> >>> On Sun, Apr 18, 2021 at 10:06 AM David Blaikie <dblaikie at gmail.com>
> >>> wrote:
> >>>> On Sun, Apr 18, 2021 at 9:43 AM Roman Lebedev <lebedev.ri at gmail.com>
> >>>> wrote:
> >>>>
> >>>>> There's 'noipa' attribute in GCC, currently it is not supported by
> >>>>> clang. Theoretically, how would one implement it?
> >>>>>
> >>>> If we wanted to do this really robustly, I guess we might have to
> >>>> introduce some sort of "here's the usual way to check if this is a
> >>>> definition/get the body of the function" (which for noipa it says
> >>>> "there is no body/don't look here") and "no, really, I need the
> >>>> definition" (for actual code generation).
> >>>>
> >>>> Though I'm not advocating for that - I'm OK with a more
> >>>> ad-hoc/best-effort implementation targeting the -O0/debugging assistant
> >>>> __attribute__((optnone)) kind of use case - happy to fix cases as they
> >>>> come up to improve the user experience for these situations.
> >>>>
> >>>> Maybe we could get away with generalizing this by having an optnone (or
> >>>> noipa) function appear "interposable" even though it doesn't have a real
> >>>> interposable linkage? That should hinder/disable any IPA.
> >>>>
> >>>> Hmm, looks like GlobalValue::isDefinitionExact would be best to return
> >>>> false in this case (whatever we end up naming it) /maybe/
> >>>> mayBeDerefined should return false too.
> >>>>
> >>>> Yeah, I guess if we can implement such a robust generalization, then
> >>>> it'd probably be OK/easy enough to implement both noipa and optnone
> >>>> implies noipa the same as it implies noinline (well, I guess noipa would
> >>>> subsume the noinline implication - if the function isn't exact, the
> >>>> inliner won't inline it so there wouldn't be any need for the explicit
> >>>> noinline)
> >>>>
> >>>>> With your proposal, clang `noipa` attribute could be lowered
> >>>>> to `optnone` on the whole function. To me that seems like
> >>>>> too much of a hammer, should that be the path forward.
> >>>>>
> >>>> I agree that lowering noipa to optnone would be a very aggressive form
> >>>> of noipa - likely if we want to support noipa it would be to support it
> >>>> separately and maybe either lower -O0 (& maybe
> >>>> __attribute__((optnone))) to both optnone+noipa+noinline (since optnone
> >>>> already implies noinline) or make optnone imply noipa/be a superset of
> >>>> it implicitly (if we do have noipa it's probably best to have "optnone
> >>>> requires noipa" the same way "optnone requires noinline" rather than an
> >>>> implicit superset sort of thing).
> >>>>
> >>>> I think that'd certainly be appropriate for -O0, and I'd argue it'd be
> >>>> appropriate for __attribute__((optnone)) because I think it'd be what
> >>>> people expect/is consistent with the motivation for the attribute (for
> >>>> debuggability - so you wouldn't want a caller to not fill in
> >>>> parameters/pass in garbage because it knows the implementation doesn't
> >>>> matter, or not use the result because it knows what the result should
> >>>> be).
> >>>>
> >>>>> Would it not be best to not conflate the two,
> >>>>> and just introduce the `noipa` attribute?
> >>>>>
> >>>> I think we'd still want to conflate them for user-facing functionality,
> >>>> even if they were separable at the IR level.
> >>>>
> >>>> - Dave
> >>>>
> >>>>
> >>>>> Roman
> >>>>>
> >>>>> On Sun, Apr 18, 2021 at 7:37 PM David Blaikie <dblaikie at gmail.com>
> >>>>> wrote:
> >>>>>> While trying to reproduce some debug info thing (I don't have the
> >>>>>> exact example at the moment - but I think it was more aggressive than
> >>>>>> the example I have now), but something like this:
> >>>>>>
> >>>>>> __attribute__((optnone)) int f1() {
> >>>>>>     return 3;
> >>>>>> }
> >>>>>> int main() {
> >>>>>>     return f1();
> >>>>>> }
> >>>>>>
> >>>>>> (actually I think in my case I had a variable to hold the return value
> >>>>>> from f1, with the intent that this variable's location couldn't use a
> >>>>>> constant - a load from a volatile variable would probably have provided
> >>>>>> similar functionality in this case)
> >>>>>>
> >>>>>> LLVM (& specifically Sparse Conditional Constant Propagation,
> >>>>>> llvm/lib/Transforms/Scalar/SCCP.cpp) optimizes this code noting that f1
> >>>>>> always returns 3, so rather than using the return value from the call
> >>>>>> to f1, it ends up hardcoding the return value:
> >>>>>>
> >>>>>> define dso_local i32 @main() local_unnamed_addr #1 {
> >>>>>> entry:
> >>>>>>     %call = tail call i32 @_Z2f1v()
> >>>>>>     ret i32 3
> >>>>>> }
> >>>>>>
> >>>>>> I consider this a bug - in that optnone is used to implement -O0 for
> >>>>>> LTO, so it seemed to me that the correct behavior is for an optnone
> >>>>>> function to behave as though it were compiled in another object file
> >>>>>> outside the purview of optimizations - interprocedural or
> >>>>>> intraprocedural.
> >>>>>>
> >>>>>> So I sent https://reviews.llvm.org/D100353 to fix that.
> >>>>>>
> >>>>>> Florian pointed out that this wasn't quite specified in the LangRef,
> >>>>>> which says this about optnone:
> >>>>>>
> >>>>>> This function attribute indicates that most optimization passes will
> >>>>>> skip this function, with the exception of interprocedural optimization
> >>>>>> passes. Code generation defaults to the “fast” instruction selector.
> >>>>>> This attribute cannot be used together with the alwaysinline attribute;
> >>>>>> this attribute is also incompatible with the minsize attribute and the
> >>>>>> optsize attribute.
> >>>>>>
> >>>>>> This attribute requires the noinline attribute to be specified on the
> >>>>>> function as well, so the function is never inlined into any caller.
> >>>>>> Only functions with the alwaysinline attribute are valid candidates for
> >>>>>> inlining into the body of this function.
> >>>>>>
> >>>>>> So the spec of optnone is unclear (or arguably explicitly disallows)
> >>>>>> whether interprocedural optimizations should treat optnone functions
> >>>>>> in any particular way.
> >>>>>>
> >>>>>> So I was going to update the wording to rephrase this to say
> >>>>>> "Interprocedural optimizations should treat this function as though it
> >>>>>> were defined in an isolated module/object." (perhaps "interprocedural
> >>>>>> optimizations should treat optnone functions as opaque" or "as though
> >>>>>> they were only declarations")
> >>>>>>
> >>>>>> The choice of this direction was based on my (possibly incorrect or
> >>>>>> debatable) understanding of optnone, that it was equivalent to the
> >>>>>> function being in a separate/non-lto object. (this seems consistent
> >>>>>> with the way optnone is used to implement -O0 under lto - you could
> >>>>>> imagine a user debugging a binary, using -O0 for the code they're
> >>>>>> interested in debugging, and potentially using an interactive debugger
> >>>>>> to change some state in the function causing it to return a different
> >>>>>> value - which would get quite confusing if the return value was
> >>>>>> effectively hardcoded into the caller)
> >>>>>>
> >>>>>> What're folks' thoughts on this?
> >>>>>>
> >>>>>> - Dave
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
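
The original example quoted at the bottom of this message, together with the volatile-load alternative it mentions, can be written out as a small sketch. Assumptions: a GCC- or Clang-style compiler with `__has_attribute`; `DEMO_OPTNONE`, `g_value`, and the function names other than `f1` are invented for illustration. On compilers without `optnone` the macro falls back to `noinline`, which is not equivalent.

```cpp
#include <cassert>

// DEMO_OPTNONE is a hypothetical helper macro, not part of any compiler.
#if defined(__has_attribute)
#  if __has_attribute(optnone)
#    define DEMO_OPTNONE __attribute__((optnone))
#  endif
#endif
#ifndef DEMO_OPTNONE
#  define DEMO_OPTNONE __attribute__((noinline))  // not equivalent, fallback only
#endif

// The function from the thread: its body is not optimized, but callers
// may still learn (via IPO) that it always returns 3.
DEMO_OPTNONE int f1() { return 3; }

// Alternative mentioned in the thread: a volatile load. The compiler must
// perform the load and cannot assume which value it yields.
volatile int g_value = 3;
int f1Volatile() { return g_value; }

int callF1() { return f1(); }
int callF1Volatile() { return f1Volatile(); }

// Changing g_value at run time (for example from a debugger) is reflected
// in the caller, which is the behavior wanted for debugging.
void selfCheck() {
  assert(callF1() == 3);
  g_value = 7;
  assert(callF1Volatile() == 7);
  g_value = 3;
}
```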

Johannes Doerfert via llvm-dev

2021-Apr-19 23:32 UTC

[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

On 4/19/21 4:41 PM, David Blaikie wrote:
> On Sun, Apr 18, 2021 at 10:30 PM Johannes Doerfert
> <johannesdoerfert at gmail.com> wrote:
>> On 4/18/21 10:51 PM, David Blaikie wrote:
>>> On Sun, Apr 18, 2021 at 8:29 PM Johannes Doerfert <
>>> johannesdoerfert at gmail.com> wrote:
>>>
>>>> I'm very much in favor of `noipa`. It comes up every few months
>>>> and it would be widely useful.
>>> Out of curiosity, what sort of uses do you have in mind for it?
>> Most times people basically want `noinline` to also mean "no
>> interprocedural optimization", but without `optnone`. So, your
>> function is optimized but actually called and the call result
>> is used, no constants are propagated etc.
>>
>> Example:
>>
>> ```
>> __attribute__((noipa))
>> int foo() { return 1 + 2; }
>> int bar() { return foo(); }
>> ```
>> should become
>>
>> ```
>> __attribute__((noipa))
>> int foo() { return 3; }
>> int bar() { return foo(); }
>> ```
>> which it does not right now.
> I'm curious what the use case is you've come across (the justification
> for the GCC implementation of noipa was mostly for compiler testing -
> which is my interest in having these semantics (under optnone or
> otherwise) - so just curious what other use cases I should have in
> mind, etc)
I looked for `noipa` in my inbox, here are some results that
show different use cases people brought up since March 2020:

https://reviews.llvm.org/D75815#1939277
https://bugs.llvm.org/show_bug.cgi?id=46463
https://reviews.llvm.org/D93838#2472155
https://reviews.llvm.org/D97971#2608302

Another use case is runtime call detection in the presence of definitions.
So, we detect `malloc` and also various OpenMP runtime calls, which works
fine because those are usually declarations. However, sometimes they are
not, and then we can easily end up with signatures that no longer match
what we expect. At least that happens if we link the OpenMP GPU runtime
into an application.
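
A toy sketch of why name-based detection becomes fragile once the runtime functions are definitions. Everything here is hypothetical: `ToyCallee`, the signature strings, and both matcher functions are invented for illustration and do not reflect how LLVM actually recognizes runtime calls.

```cpp
#include <cassert>
#include <string>

// Toy description of a callee; not LLVM API.
struct ToyCallee {
  std::string Name;
  std::string Signature;  // invented textual form, e.g. "ptr(i64)"
  bool IsDeclaration;     // true if only a declaration is visible
};

// Name-only detection: fine while the callee is a plain declaration.
bool matchesByNameOnly(const ToyCallee &C) { return C.Name == "malloc"; }

// Stricter detection: also require the expected signature, so a
// definition whose signature no longer matches is not misidentified.
bool matchesByNameAndSignature(const ToyCallee &C) {
  return C.Name == "malloc" && C.Signature == "ptr(i64)";
}
```

An attribute like `noipa` on the linked-in definitions would address the problem from the other side, by keeping those definitions in the shape the detection expects.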

>>>> I'd expose it via Clang and -O0 could
>>>> set it as well (for the LTO case).
>>>>
>>>> When it comes to inexact definitions, optnone functions, and existing
>>>> attributes,
>>>> I'd be in favor of 1) always allowing the use of existing attributes,
>>>>
>>> I'm not sure what you mean by this ^ - could you rephrase/elaborate?
>>>
>>>
>>>> and 2) not deriving new ones for an inexact or optnone definition.
>>>>
>>> Also this ^ I'm similarly confused/unclear about.
>> So if you have a call of F, and F has attribute A, we can use
>> that fact at the call site, regardless of the definition of F.
>> F could be `optnone` or with non-exact linkage, but the information
>> attached to it is still usable.
> +1 SGTM.
>
>> If we go for the above we can never derive/attach information
>> for non-exact linkage definitions. That way we prevent IPO from
>> using information that might be invalid if the definition is replaced.
> Yup, sounds good.
>
>> It is all about where you disturb the ipo deduction in this case, I think
>> it is more beneficial to not attach new things but an argument could be
>> made to allow that but no propagation.
> Allow adding them, but never using them? Yeah, that doesn't seem
> especially helpful/useful - the attributes are entirely for IPO, so if
> you want to block IPO it seems best not to add them.
We could use them *inside* the function, but we can make that work
differently as well. IPO seems the more important target.

>> Both have benefits, it's not 100%
>> clear what is more desirable at the end of the day.
>>
>>
>>>> This is how the Attributor determines if a function-level attribute
>>>> could be derived or if we should only stick with the existing information:
>>>>
>>>>        /// Determine whether the function \p F is IPO amendable
>>>>        ///
>>>>        /// If a function is exactly defined or it has alwaysinline attribute
>>>>        /// and is viable to be inlined, we say it is IPO amendable
>>>>        bool isFunctionIPOAmendable(const Function &F) {
>>>>          return F.hasExactDefinition() ||
>>>>                 InfoCache.InlineableFunctions.count(&F);
>>>>        }
>>>>
>>>> So, if the above check doesn't hold we will not add new attributes but
>>>> we will still use existing ones. This seems to me the right way to allow
>>>> users/frontends to provide information selectively.
>>>>
>>> Yep, that sounds right to me (if you put attributes on an optnone/noipa
>>> function, they should be usable/used - but none should be discovered/added
>>> later by inspection of the implementation of such a function) - currently
>>> doesn't seem to be the case for the (old pass manager?) FunctionAttrs pass,
>>> so I have to figure some things out there.
>> That is what I tried to say above, I think.
>>
>> In the end, I want to know that foo does not access memory but
>> bar could for all we know:
>>
>> ```
>> __attribute__((pure, optnone))         // or non-exact linkage
>> void pure_optnone() { /* empty */ }
>>
>> __attribute__((optnone))               // or non-exact linkage
>> void optnone() { /* empty */ }
>>
>> void foo() { pure_optnone(); }
>>
>> void bar() { optnone(); }
>> ```
> Got it,
>
> I'll see about posting an implementation of noipa and switching
> __attribute__((optnone)) over to lower to LLVM's optnone+noipa rather
> than optnone+noinline.
FWIW, I think `noipa` should not imply `noinline`, unsure if you
had that in mind or not.

> Happy if someone wants to add clang support for an
> __attribute__((noipa)) lowering to that LLVM noipa once it's in (maybe
> I'll do it, guess it's probably fairly cheap/easy).
Agreed, I won't volunteer right now, I doubt that I'll get to it
anytime soon. That said, I actually would like to use `noipa`, see
above.

~ Johannes

