Yi Lin via llvm-dev
2017-Sep-22 05:04 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
Hi all,

I am trying to understand the effectiveness of various LLVM optimisations when a language targets LLVM (or C) as its backend.

The following is my approach (please correct me if I did anything wrong):

I am trying to explicitly control the optimisation passes in LLVM. I disable optimisation in clang and instead emit unoptimised LLVM IR, then use opt to optimise it. These are the steps I run:

* clang -O0 -S -mllvm -disable-llvm-optzns -emit-llvm -momit-leaf-frame-pointer a.c -o a.ll
* opt -(PASSES) a.ll -o a.bc
* llc a.bc -filetype=obj -o a.o

To evaluate the effectiveness of the optimisation passes, I started with an 'add-one-in' approach. The baseline has no optimisation passes, and I iterate through all the O1 passes, enabling exactly one pass for each run (the driver loop is sketched in the postscript below). I did not try to understand those passes, so this is a black-box test. It shows how effective each single optimisation is on its own (ignoring correlations between passes). The process can be iterative, e.g. identify the most effective pass, always enable it, and then 'add-one-in' over the remaining passes. I also plan to take a 'leave-one-out' approach, in which the baseline has all optimisations enabled and one pass is disabled at a time.

Here is the result for the 'add-one-in' approach on some micro benchmarks:

https://drive.google.com/drive/folders/0B9EKhGby1cv9YktaS3NxUVg2Zk0

The result seems a bit surprising. A few passes, such as licm, sroa, instcombine and mem2reg, each seem to deliver performance very close to O1 (which includes all the passes). Figure 7 is an example. If my methodology is correct, my guess is that those optimisations may share some common internal passes, which actually deliver most of the improvements. I am wondering if this is true.

Any suggestions or critiques are welcome.

Thanks,
Yi
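P.S. For concreteness, the add-one-in runs are driven by a loop along these lines. The pass names here are only illustrative placeholders, not the exact O1 pass list, and run_benchmark.sh is a stand-in for my timing harness:

  # Build and benchmark one binary per candidate pass, with only that pass enabled.
  for PASS in mem2reg sroa instcombine licm; do
    clang -O0 -S -mllvm -disable-llvm-optzns -emit-llvm -momit-leaf-frame-pointer a.c -o a.ll
    opt -$PASS a.ll -o a.bc
    llc a.bc -filetype=obj -o a.o
    cc a.o -o a.out && ./run_benchmark.sh a.out "$PASS"
  done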
Craig Topper via llvm-dev
2017-Sep-22 05:10 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
Having -O0 on your clang command line causes all functions to get marked with an 'optnone' attribute, which prevents opt from being able to optimize them later. You should also add "-Xclang -disable-O0-optnone" to your command line.

~Craig
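With that extra flag, and keeping your other flags unchanged, the IR-generation step would look something like:

  clang -O0 -Xclang -disable-O0-optnone -S -mllvm -disable-llvm-optzns -emit-llvm -momit-leaf-frame-pointer a.c -o a.ll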
Haidl, Michael via llvm-dev
2017-Sep-22 05:14 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
Craig was faster on the optnone flag (this applies if you are using Clang 5 and above). However, I have observed that some of the opt passes ignore optnone in some cases, e.g. -break-crit-edges.

You can use the -stats flag with opt to get a list of statistics about what a particular pass did (if it collects statistics, of course).
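For example, something along these lines (licm is used here only as an arbitrary example pass; the statistics are printed to stderr):

  opt -stats -licm a.ll -o a.bc 2> licm-stats.txt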
Yi Lin via llvm-dev
2017-Sep-22 05:21 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
Thank you very much. That explains the results. I am running the benchmarks again with '-Xclang -disable-O0-optnone'.

Thanks,
Yi
Yi Lin via llvm-dev
2017-Sep-22 07:17 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
I noticed that there is a '-run-pass' argument for llc. I am wondering if I can take a similar approach with the machine-level optimisations/passes in llc. Are those passes optional (so I can turn them off)? And how can I get the MIR format that llc expects with '-run-pass'?

Thanks a lot.

Cheers,
Yi
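My rough, unverified guess is that something along these lines might work, where <machine-pass> is a placeholder rather than a pass name I have checked:

  llc a.bc -stop-after=<machine-pass> -o a.mir   # stop the codegen pipeline after a given pass and emit MIR
  llc a.mir -run-pass=<machine-pass> -o -        # run a single machine pass over that MIR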
Yi Lin via llvm-dev
2017-Sep-26 07:04 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
I feel I am still doing something wrong, as the performance does not seem to change with the different passes I use. The command lines I am using are:

* clang -O0 -Xclang -disable-O0-optnone -S -mllvm -disable-llvm-optzns -emit-llvm -momit-leaf-frame-pointer a.c -o a.ll
* opt -(PASS_FLAG) a.ll -o a.bc
* llc a.bc -filetype=obj -o a.o

I tried PASS_FLAG as all the passes from O1, as a single specific pass from O1, and also directly used '-O1' and '-O0'. The performance variation seems to be noise only (+/- 1%). Also, clang warns me about unused arguments for '-Xclang -disable-O0-optnone', although the result is different from not using the argument. I am using clang 5.0.

Any help would be appreciated.

Thanks,
Yi
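To double-check things on my side, I plan to verify that the optnone attribute is really gone and that opt actually changes the IR, roughly along these lines:

  grep optnone a.ll                  # should print nothing if -disable-O0-optnone took effect
  opt -S -mem2reg a.ll -o a.opt.ll   # emit textual IR so the before/after can be compared
  diff a.ll a.opt.ll | head          # a non-empty diff means the pass changed something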