thr3ads.net - llvm dev - [llvm-dev] [RFC] Adding function attributes to represent codegen optimization level [Apr 2018]

via llvm-dev

2018-Apr-05 15:44 UTC

[llvm-dev] [RFC] Adding function attributes to represent codegen optimization level

On 2018-04-04 22:00, Mehdi AMINI wrote:
> On Tue, Apr 3, 2018 at 12:47, via llvm-dev <llvm-dev at lists.llvm.org>
> wrote:
>
>> All,
>> A recent commit, D43040/r324557, changed the behavior of the gold plugin
>> when compiling with LTO.  The change now causes the codegen optimization
>> level to default to CodeGenOpt::Default (i.e., -O2) rather than use the
>> LTO optimization level.  The argument was made that the LTO optimization
>> level should control the amount of cross-module optimizations done by
>> LTO, but it should not control the codegen optimization level; that
>> should be based off of the optimization level used during the initial
>> compilation phase (i.e., bitcode generation).
>
> I actually don't understand this clearly.
>
> Unless we're saying that we would change the IR optimization level
> either using the -OX flag during LTO (which is clumsy, because what is
> a "cross-module optimization" alone?), why would the `-OX` flag change
> the Codegen optimization level when passed to clang without LTO, but
> it wouldn't during LTO?

I'm simply stating the argument made by Peter in r324557; this is not my 
opinion.  Personally, I think it seems reasonable to allow the 
optimization flag used during the link step to control the codegen 
optimization level.  However, this is no longer the case after r324557.

FWIW, I would be very much on-board with reverting r324557 and then 
changing lld to mirror the behavior of the gold plugin, but I don't know 
if that's the consensus in the community.

> Are we encoding O1/O2/O3 optimization level into function attributes
> and trying to honor these during the LTO IR optimization pipeline as
> well?

No.  The intent of these attributes is to control the codegen pipeline 
only.  Of course, this is all based on the assumption that the 
optimization level used during bitcode generation should also be used 
in the codegen pipeline with LTO.

I don't have a strong opinion either way.  I just want codegen to respect 
the fact that I specified -O3 during both the bitcode generation and 
link steps, but that's not the case anymore.  :)

  Chad
> 
> Thanks,
> 
> --
> Mehdi
> 
>> Assuming the argument is reasonable (it makes sense to me), I was hoping
>> to solicit feedback on how to proceed.  The suggestion in D43040/r324557
>> was to add function attributes to represent the compile-time
>> optimization level (which also seems reasonable to me).
>>
>> As a first step, I've put together two patches: 1) an llvm patch that
>> adds the function attributes to the LLVM IR and 2) a clang patch that
>> attaches these attributes to each function based on the codegen
>> optimization level.  I then use the function level attributes to
>> "reconstruct" the codegen optimization level used with LTO.
>>
>> Please understand this is very much a WIP and just a very small step
>> towards a final solution.
>>
>> Here are the patches for reference:
>> Clang: D45226
>> LLVM: D45225
>>
>> Regards,
>> Chad
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Peter Collingbourne via llvm-dev

2018-Apr-06 20:56 UTC

[llvm-dev] [RFC] Adding function attributes to represent codegen optimization level

On Thu, Apr 5, 2018 at 8:44 AM, via llvm-dev <llvm-dev at lists.llvm.org>
wrote:
> On 2018-04-04 22:00, Mehdi AMINI wrote:
>
>> I actually don't understand this clearly.
>>
>> Unless we're saying that we would change the IR optimization level
>> either using the -OX flag during LTO (which is clumsy, because what is
>> a "cross-module optimization" alone?), why would the `-OX` flag change
>> the Codegen optimization level when passed to clang without LTO, but
>> it wouldn't during LTO?
>
> I'm simply stating the argument made by Peter in r324557; this is not my
> opinion.  Personally, I think it seems reasonable to allow the optimization
> flag used during the link step to control the codegen optimization level.
> However, this is no longer the case after r324557.
>
> FWIW, I would be very much on-board with reverting r324557 and then
> changing lld to mirror the behavior of the gold plugin, but I don't know if
> that's the consensus in the community.

To answer your question Mehdi, what I mean by "cross-module optimization"
is simply a series of passes that operates on a module after having linked
parts of other modules into it, that would result in IPO between modules.
For example, an inlining pass followed by scalar optimization passes.

The way I think about LTO is that it effectively splits the pass pipeline
in two, which lets us put cross-module optimizations in the middle.

What this means semantically is that LTO opt level 0 would essentially run
the two parts of the pipeline one after the other, giving you roughly
the same binary as a non-LTO build, while still allowing LTO-only features
such as CFI to work. One might also have chosen to compile parts of one's
program with different optimization levels, and those levels would need to
be respected by the code generator. For this to work, we must at least use
the same CG opt level that was used at compile time.

Higher LTO opt levels would result in more passes being run in the middle,
perhaps at more aggressive settings, which would result in more
cross-module optimizations. But we still should at least try to approximate
the optimization level requested for each particular function.
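
As a rough illustration of the "pipeline split in two" model described above, the sketch below builds a list of stage labels for a given LTO opt level. It is standalone C++ and does not use any LLVM API; the function name `buildLtoPipeline` and every stage label are hypothetical placeholders, not real pass names.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the LTO pipeline as a list of stage labels.
// None of these labels correspond to real LLVM pass names.
std::vector<std::string> buildLtoPipeline(unsigned ltoOptLevel) {
  std::vector<std::string> stages;

  // First half: runs per module at compile time (bitcode generation).
  stages.push_back("first half: per-module pipeline (compile time)");

  // Middle: cross-module work that only exists at link time.
  // At level 0 nothing is added, so the two halves run back to back.
  if (ltoOptLevel >= 1) {
    stages.push_back("cross-module inlining");
    stages.push_back("scalar optimization after inlining");
  }
  // Assumption: higher levels add more work or more aggressive settings.
  if (ltoOptLevel >= 3)
    stages.push_back("more aggressive cross-module optimization");

  // Second half: ends in codegen, which should honor the opt level that
  // was requested for each function at compile time.
  stages.push_back("second half: codegen at compile-time CG opt level");
  return stages;
}
```

In this model the first and last stages are the same at every LTO opt level; only the middle grows, which is the property argued for above.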

Ideally, we would use the same optimization level that would have been used
at compile time. Such an optimization level would be communicated via an
attribute, as proposed here. However, in the absence of that information,
it does seem reasonable to make a guess about the user intent from the LTO
opt level. If a user specifies an LTO opt level of 3, it probably means
that the user cares a lot about performance, so we can guess a CG opt level
of CodeGenOpt::Aggressive. Otherwise, we can guess a CG opt level of
CodeGenOpt::Default since this would seem to provide the best balance of
performance, code size and debuggability.
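
The guess described in the paragraph above can be written down in a few lines. This is a minimal sketch, not the code in LLVM: `CodeGenOptLevel` is a local stand-in for `CodeGenOpt::Level` so that the example compiles without LLVM headers, and `guessCGOptLevel` is a hypothetical name.

```cpp
// Local stand-in for LLVM's CodeGenOpt levels, so the sketch is standalone.
enum class CodeGenOptLevel { None, Less, Default, Aggressive };

// Guess a CG opt level from the LTO opt level when no per-function
// information is available: level 3 maps to Aggressive, everything
// else maps to Default.
CodeGenOptLevel guessCGOptLevel(unsigned ltoOptLevel) {
  return ltoOptLevel >= 3 ? CodeGenOptLevel::Aggressive
                          : CodeGenOptLevel::Default;
}
```

Note that under this rule an LTO opt level of 0 still yields `Default` for codegen, which matches the "Otherwise" branch in the text.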

So this is the direction that I would propose:
- Remove ability to override CG opt level from LTO API. For now, we can
infer it from the LTO opt level as mentioned above.
- Add function attributes for signaling compile-time opt level and start
moving towards using them in preference to TargetMachine::OptLevel.
- Remove code for inferring CG opt level from LTO opt level, as it is now
redundant with the function attribute.
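
The intermediate state of this plan (after the second step, before the third) could look roughly like the sketch below: prefer the per-function attribute when it is present, and fall back to the guess from the LTO opt level otherwise. Everything here is an assumption for illustration: `FunctionInfo`, `hasOptLevelAttr`, `attrLevel` and `effectiveCGOptLevel` are hypothetical names, and the real attribute spelling is whatever the patches settle on.

```cpp
// Local stand-in for LLVM's CodeGenOpt levels, so the sketch is standalone.
enum class CodeGenOptLevel { None, Less, Default, Aggressive };

// Hypothetical per-function record of the proposed attribute.
struct FunctionInfo {
  bool hasOptLevelAttr;       // false for IR that lacks the attribute
  CodeGenOptLevel attrLevel;  // only meaningful if hasOptLevelAttr is true
};

// Prefer the compile-time level carried by the function; otherwise infer
// a level from the LTO opt level (3 -> Aggressive, else Default).
CodeGenOptLevel effectiveCGOptLevel(const FunctionInfo &fn,
                                    unsigned ltoOptLevel) {
  if (fn.hasOptLevelAttr)
    return fn.attrLevel;
  return ltoOptLevel >= 3 ? CodeGenOptLevel::Aggressive
                          : CodeGenOptLevel::Default;
}
```

Once every producer emits the attribute, the fallback branch becomes dead code, which is the third step of the list.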

This would seem to get us to a desired state without regressing users who
might depend on being able to use the aggressive CG opt level from LTO.

Thoughts?

Peter


Mehdi AMINI via llvm-dev

2018-Apr-07 03:53 UTC

[llvm-dev] [RFC] Adding function attributes to represent codegen optimization level

Hi,



On Fri, Apr 6, 2018 at 13:56, Peter Collingbourne <peter at pcc.me.uk> wrote:

> So this is the direction that I would propose:
> - Remove ability to override CG opt level from LTO API. For now, we can
> infer it from the LTO opt level as mentioned above.
> - Add function attributes for signaling compile-time opt level and start
> moving towards using them in preference to TargetMachine::OptLevel.
> - Remove code for inferring CG opt level from LTO opt level, as it is now
> redundant with the function attribute.
>
> This would seem to get us to a desired state without regressing users who
> might depend on being able to use the aggressive CG opt level from LTO.
>
> Thoughts?

That all seems reasonable to me. That said, I haven't given much thought
to the opt-level through function attributes recently.

From what I remember, it was hard to figure out the implementation when
inlining two functions (O3 -> O2 or vice-versa), and also some parts of the
pipeline just can't be split because they operate module-wise.
For instance, if O3 includes an extra `globalopt` pass that O2 does not
include, how do you handle this when some functions are marked as O2 and
others as O3?
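
To make the module-wise question concrete, here is a small standalone sketch of two candidate policies for a pass that only exists at O3. Neither policy is proposed in this thread; the function names and the use of plain integers for opt levels are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <vector>

// Each element is the opt level (0..3) a function was compiled with.

// Policy A (hypothetical): run the O3-only module pass if any function
// in the module asked for O3.
bool runIfAnyAtO3(const std::vector<int> &levels) {
  return std::any_of(levels.begin(), levels.end(),
                     [](int level) { return level >= 3; });
}

// Policy B (hypothetical): run it only if every function asked for O3.
bool runIfAllAtO3(const std::vector<int> &levels) {
  return !levels.empty() &&
         std::all_of(levels.begin(), levels.end(),
                     [](int level) { return level >= 3; });
}
```

For a module mixing O2 and O3 functions the two policies disagree: A applies an O3-only transformation to a module containing O2 functions, while B withholds it from the O3 functions. That disagreement is the difficulty being described.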

The only thing I could reason about at the time was that O0 really means
no-optimization and it could be translated somehow to `optnone`.

Best,

-- 
Mehdi




Sean Silva via llvm-dev

2018-Apr-09 04:05 UTC

[llvm-dev] [RFC] Adding function attributes to represent codegen optimization level

On Fri, Apr 6, 2018, 1:56 PM Peter Collingbourne via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> So this is the direction that I would propose:
> - Remove ability to override CG opt level from LTO API. For now, we can
> infer it from the LTO opt level as mentioned above.
> - Add function attributes for signaling compile-time opt level and start
> moving towards using them in preference to TargetMachine::OptLevel.
> - Remove code for inferring CG opt level from LTO opt level, as it is now
> redundant with the function attribute.
>
Long term, what opt level would older IR get? (I.e. IR missing an explicit
opt level)

-- Sean Silva
