thr3ads.net - llvm dev - [llvm-dev] [RFC] Turn the MachineOutliner on by default in AArch64 under -Oz [Apr 2018]

Jessica Paquette via llvm-dev

2018-Apr-21 02:06 UTC

[llvm-dev] [RFC] Turn the MachineOutliner on by default in AArch64 under -Oz

Hi all,

The MachineOutliner has come a long way since the original incarnation presented
at the 2016 LLVM Developer's Meeting [1]. In particular, we've been
pushing a lot on the AArch64 target for the MachineOutliner. It's mature
enough at this point that we'd like to take things a step further and turn
it on by default in AArch64 under -Oz. Since the primary goal of -Oz is
"make it as small as possible", the outliner is a good addition to the
-Oz pass pipeline.

For a detailed description of the MachineOutliner, see the original RFC [2].

Comparing -Oz to -Oz + outlining on the latest trunk compiler, we've observed:

* A geomean ~4.4% text size reduction of the CTMark tests (min = 0.3% on
tramp3d-v4, max = 15.4% on kc)

* A geomean compile-time overhead of ~1.1% (min = 0.2% on 7zip, max = 2.2% on
sqlite3)
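
For readers who want to reproduce a figure like the geomean above, here is a minimal sketch. It assumes the number is the geometric mean of per-benchmark size ratios (outlined size divided by baseline size), which the post does not spell out. The function name and the sizes in the example are made up for illustration and are not CTMark measurements.

```python
import math


def geomean_reduction(sizes):
    """Geometric-mean text size reduction, returned as a fraction.

    `sizes` maps a benchmark name to (baseline_bytes, outlined_bytes).
    """
    ratios = [outlined / baseline for baseline, outlined in sizes.values()]
    # Average the logs, then exponentiate: the usual way to form a geomean.
    mean_ratio = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    return 1.0 - mean_ratio


# Hypothetical sizes for illustration only; NOT measured data.
example = {
    "bench_a": (1000, 900),   # 10% smaller with outlining
    "bench_b": (2000, 1980),  # 1% smaller with outlining
}
print(f"geomean reduction: {geomean_reduction(example):.1%}")
```

A geomean is less sensitive than an arithmetic mean to a single outlier such as the 15.4% result on kc.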

We perform regular testing to ensure the outliner produces correct AArch64 code
at -Oz. Tests include the LLVM test suite and standard external test suites such
as SPEC. All tests compile and execute. We've also been making sure that the
outliner produces debuggable code. Users are still guaranteed to have sane
backtraces in the presence of outlined functions.

Added exposure to various programs would help the outlining algorithm mature
further. This, in turn, will help the overall outlining project. For example,
there have been a few discussions on implementing an IR-level outlining pass [3,
4]. Ultimately, the goal is to create a shared outlining interface. This
interface would allow the outliner to exist at any level of representation [4].
The general outlining algorithm will be part of the shared interface. Thus, in
the spirit of incremental improvement, it makes sense to begin
"stress-testing" it sooner rather than later.

There are a few patches necessary to facilitate this. They are available in the
patches section of this email. I’ll summarize what they do here for the sake of
discussion though.

The first patch is one that teaches the backend about size optimization levels.
This is comparable to what's done in the inliner. Today, the only way to
tell if something is optimizing for size is by looking at function attributes.
This is fine for function passes, but insufficient for module passes like the
MachineOutliner. The function attribute approach forces the outliner to iterate
over every function in the module before deciding to take action. If -Oz
isn't passed in, then the outliner will not find any functions worth
outlining from. This would incur unnecessary compile-time overhead. Thus, we
decided the best course of action is to teach the backend about size options.

The second patch teaches llc to handle -Oz and -Os.

The third patch teaches targets about the outliner. A target will be able to
specify if and when it wants outlining on by default. It also adds a flag to
disable the MachineOutliner for users that don’t want outlining behaviour when
it is enabled by default.

The final patch teaches clang to pass the new size information down to the
backend. This allows us to run something like clang -Oz … foo.c and have the
outliner run.

Thanks for taking the time to read this!
Jessica

*** Patches ***

1. Teaching the backend about -Oz/-Os: https://reviews.llvm.org/D45914
2. Teach llc about -Oz/-Os: https://reviews.llvm.org/D45915
3. Teaching the target about the outliner and enabling it by default under
AArch64: https://reviews.llvm.org/D45916
4. Teaching clang to pass -Oz/-Os down to the backend:
https://reviews.llvm.org/D45917

*** References ***
[1] Reducing Code Size Using Outlining
(https://www.youtube.com/watch?v=yorld-WSOeU)

[2] Original RFC
(http://lists.llvm.org/pipermail/llvm-dev/2016-August/104170.html)

[3] [RFC] Add IR level interprocedural outliner for code size.
(http://lists.llvm.org/pipermail/llvm-dev/2017-July/115666.html)

[4] [RFC] PT.2 Add IR level interprocedural outliner for code size.
(http://lists.llvm.org/pipermail/llvm-dev/2017-September/117153.html)

Friedman, Eli via llvm-dev

2018-Apr-23 20:24 UTC

On 4/20/2018 7:06 PM, Jessica Paquette via llvm-dev wrote:
> We perform regular testing to ensure the outliner produces correct
> AArch64 code at -Oz. Tests include the LLVM test suite and standard 
> external test suites such as SPEC. All tests compile and 
> execute. We've also been making sure that the outliner produces 
> debuggable code. Users are still guaranteed to have sane backtraces in 
> the presence of outlined functions.
>
> Added exposure to various programs would help the outlining algorithm 
> mature further. This, in turn, will help the overall outlining 
> project. For example, there have been a few discussions on 
> implementing an IR-level outlining pass [3, 4]. Ultimately, the goal 
> is to create a shared outlining interface. This interface would allow 
> the outliner to exist at any level of representation [4]. The general 
> outlining algorithm will be part of the shared interface. Thus, in the 
> spirit of incremental improvement, it makes sense to begin 
> "stress-testing" it sooner than later.

I just tried some tests, and I'm seeing a bunch of failures on SPEC at
-O3; looks like mostly crashes at runtime. I can try to reduce a
testcase if you need it.
>
> There are a few patches necessary to facilitate this. They are 
> available in the patches section of this email. I’ll summarize what 
> they do here for the sake of discussion though.
>
> The first patch is one that teaches the backend about size 
> optimization levels. This is comparable to what's done in the inliner. 
> Today, the only way to tell if something is optimizing for size is by 
> looking at function attributes. This is fine for function passes, but 
> insufficient for module passes like the MachineOutliner. The function 
> attribute approach forces the outliner to iterate over every function 
> in the module before deciding to take action. If -Oz isn't passed in, 
> then the outliner will not find any functions worth outlining from. 
> This would incur unnecessary compile-time overhead. Thus, we decided 
> the best course of action is to teach the backend about size options.

I don't think this is really the right approach. With LTO, you can have
a mix of functions, some of which are minsize, and some of which are
not. Or with profile info, we might want to outline only cold code (I
guess this isn't implemented yet, but potentially future work). Tying
whether we run the outliner to a command-line flag restricts the
possible uses; either the entire module gets outlining, or none of it does.

In general, we've been moving away from global settings so we can 
optimize more effectively in this sort of scenario.
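
To make the per-function alternative concrete, here is a minimal sketch in Python rather than LLVM's C++ API. The `Function` class, the `cold` flag, and `outlining_candidates` are all hypothetical names invented for this illustration; only the `minsize` attribute comes from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class Function:
    name: str
    minsize: bool = False  # models the `minsize` function attribute
    cold: bool = False     # hypothetical profile-derived flag (future work)


def outlining_candidates(module, outline_cold=False):
    """Return the names of functions the outliner may take code from."""
    return [
        f.name
        for f in module
        if f.minsize or (outline_cold and f.cold)
    ]


# Models an LTO-merged module mixing -Oz and non -Oz translation units.
module = [
    Function("parse", minsize=True),
    Function("hot_loop"),
    Function("error_path", cold=True),
]
print(outlining_candidates(module))
print(outlining_candidates(module, outline_cold=True))
```

With a single global flag, all three functions would be treated the same way. Deciding per function lets the mixed module outline only where size matters.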

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

Jessica Paquette via llvm-dev

2018-Apr-23 20:41 UTC

Hi Eli,
> I just tried some tests, and I'm seeing a bunch of failures on SPEC at
> -O3; looks like mostly crashes at runtime. I can try to reduce a testcase
> if you need it.

If you could do that, that would be great. Our testing has been primarily for
-Oz and -O2, so I haven’t looked at -O3 at all.

> I don't think this is really the right approach. With LTO, you can have
> a mix of functions, some of which are minsize, and some of which are not.
> Or with profile info, we might want to outline only cold code (I guess
> this isn't implemented yet, but potentially future work). Tying whether
> we run the outliner to a command-line flag restricts the possible uses;
> either the entire module gets outlining, or none of it does.

I’m worried that walking the entire list of functions in the module when nothing
has the minsize attribute would incur unnecessary compile-time overhead. If
that’s a reasonable thing to do though, I’m fine with that approach. It’d be a
less invasive change, and would give us the desired LTO behaviour for free.

- Jessica

Jessica Paquette via llvm-dev

2018-Apr-26 18:48 UTC

> I don't think this is really the right approach. With LTO, you can have
> a mix of functions, some of which are minsize, and some of which are not.
> Or with profile info, we might want to outline only cold code (I guess
> this isn't implemented yet, but potentially future work). Tying whether
> we run the outliner to a command-line flag restricts the possible uses;
> either the entire module gets outlining, or none of it does.

I’ve updated the main patch (https://reviews.llvm.org/D45916) to use the
function attribute approach instead. It’s a lot cleaner and keeps the changes
far more self-contained. This should make it easier to define custom outlining
behaviour based on function attributes, target-specific requirements, etc. The
other patches have been abandoned because they are no longer required.

The compile-time overhead should only appear in AArch64 after this patch. It
should only incur the 1% overhead if -Oz is passed in. Otherwise, there will be
a very small overhead stemming from looping over the functions in the module and
checking for the minsize attribute.
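
The "very small overhead" case can be pictured with the following sketch. It is a Python model, not the code in D45916; `Function` and `should_run_outliner` are hypothetical names, and whether the real patch exits early like this is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Function:
    name: str
    minsize: bool = False  # models the `minsize` function attribute


def should_run_outliner(functions):
    """Decide whether any function in the module asks for minimum size."""
    # any() stops at the first minsize function, so the worst case
    # (no function is minsize) costs one attribute check per function.
    return any(f.minsize for f in functions)


print(should_run_outliner([Function("a"), Function("b", minsize=True)]))
print(should_run_outliner([Function("a"), Function("b")]))
```

The scan is linear in the number of functions and does no per-instruction work, which is why it should be cheap compared with running the outliner itself.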

I also fixed the -O3 SPEC failure, so I don’t think there’s anything
outstanding left to fix.

- Jessica

Jessica Paquette via llvm-dev

2018-May-22 18:09 UTC

Ping!

Any objections to this?

Eli, you’ve been submitting a few patches to the outliner lately. Since I think
Matthias is a little too busy to review the patch right now, do you think you
could take up the review for it? If you have no objections, I’d like to push
this forward.

- Jessica

Matthias Braun via llvm-dev

2018-Jul-13 21:17 UTC

> On Apr 23, 2018, at 1:24 PM, Friedman, Eli via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> 
> I don't think this is really the right approach. With LTO, you can have
> a mix of functions, some of which are minsize, and some of which are not.
> Or with profile info, we might want to outline only cold code (I guess
> this isn't implemented yet, but potentially future work). Tying whether
> we run the outliner to a command-line flag restricts the possible uses;
> either the entire module gets outlining, or none of it does.

Just to give some alternative view on this (currently going through the patches
and wondering if things really have to be that complicated...):

- O0-O3 are handled by adding more/less passes into the pass pipeline and
  thereby enabling/disabling optimizations.
- When LTOing (mostly) the O0-O3 of the last LTO/linking step is what counts
  AFAIK.
- We probably want to have smarter behavior when mixing compilation units with
  different optlevels, but we don't have one today.
- So why do we start creating a local solution for mixing -Os with non-Os code
  and the outliner here?

- Matthias
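
For contrast with the per-function attribute check, the pipeline-level gating described in the list above might be modelled as follows. This is only a sketch: `build_pipeline` and the pass names are placeholders, not real LLVM pass names or APIs.

```python
def build_pipeline(opt_level, min_size=False):
    """Return the ordered pass names for one compilation.

    Pass names here are placeholders, not real LLVM pass names.
    """
    passes = ["isel", "regalloc", "emit"]
    if opt_level >= 2:
        # Higher optimization levels insert extra passes.
        passes.insert(1, "scheduler")
    if min_size:
        # Pipeline-level gating: the outliner runs on the whole module
        # or not at all, regardless of per-function attributes.
        passes.insert(len(passes) - 1, "outliner")
    return passes


print(build_pipeline(0))
print(build_pipeline(2, min_size=True))
```

Under this model, the level chosen at the final LTO step decides what runs, which is the existing behaviour the list refers to for O0-O3.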
