Yi Lin via llvm-dev
2017-Sep-22 05:04 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
Hi all,

I am trying to understand the effectiveness of various LLVM optimisations when a language targets LLVM (or C) as its backend.

The following is my approach (please correct me if I did anything wrong):

I am trying to explicitly control the optimisation passes in LLVM. I disable optimisation in clang and instead emit unoptimised LLVM IR, then use opt to optimise it. These are the steps I run:

* clang -O0 -S -mllvm -disable-llvm-optzns -emit-llvm -momit-leaf-frame-pointer a.c -o a.ll
* opt -(PASSES) a.ll -o a.bc
* llc a.bc -filetype=obj -o a.o

To evaluate the effectiveness of the optimisation passes, I started with an 'add-one-in' approach. The baseline has no optimisation passes, and I iterate through all the O1 passes, enabling exactly one pass for each run (the driver loop is sketched in the postscript below). I did not try to understand those passes, so this is a black-box test. It shows how effective each single optimisation is on its own (ignoring correlations between passes). The process can be iterative, e.g. identify the most effective pass, always enable it, and then 'add-one-in' over the remaining passes. I also plan to take a 'leave-one-out' approach, in which the baseline has all optimisations enabled and one pass is disabled at a time.

Here is the result for the 'add-one-in' approach on some micro benchmarks:

https://drive.google.com/drive/folders/0B9EKhGby1cv9YktaS3NxUVg2Zk0

The result seems a bit surprising. A few passes, such as licm, sroa, instcombine and mem2reg, each seem to deliver performance very close to O1 (which includes all the passes). Figure 7 is an example. If my methodology is correct, my guess is that those optimisations may share some common internal passes, which actually deliver most of the improvements. I am wondering if this is true.

Any suggestions or critiques are welcome.

Thanks,
Yi
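P.S. For concreteness, the add-one-in runs are driven by a loop along these lines. The pass names here are only illustrative placeholders, not the exact O1 pass list, and run_benchmark.sh is a stand-in for my timing harness:

  # Build and benchmark one binary per candidate pass, with only that pass enabled.
  for PASS in mem2reg sroa instcombine licm; do
    clang -O0 -S -mllvm -disable-llvm-optzns -emit-llvm -momit-leaf-frame-pointer a.c -o a.ll
    opt -$PASS a.ll -o a.bc
    llc a.bc -filetype=obj -o a.o
    cc a.o -o a.out && ./run_benchmark.sh a.out "$PASS"
  done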
Craig Topper via llvm-dev
2017-Sep-22 05:10 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
Having -O0 on your clang command line causes all functions to get marked with an 'optnone' attribute, which prevents opt from being able to optimize them later. You should also add "-Xclang -disable-O0-optnone" to your command line.

~Craig
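With that extra flag, and keeping your other flags unchanged, the IR-generation step would look something like:

  clang -O0 -Xclang -disable-O0-optnone -S -mllvm -disable-llvm-optzns -emit-llvm -momit-leaf-frame-pointer a.c -o a.ll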
Haidl, Michael via llvm-dev
2017-Sep-22 05:14 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
Craig was faster on the optnone flag (this applies if you are using Clang 5 and above). However, I have observed that some of the opt passes ignore optnone in some cases, e.g. -break-crit-edges.

You can use the -stats flag with opt to get a list of statistics about what a particular pass did (if it collects statistics, of course).
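For example, something along these lines (licm is used here only as an arbitrary example pass; the statistics are printed to stderr):

  opt -stats -licm a.ll -o a.bc 2> licm-stats.txt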
Yi Lin via llvm-dev
2017-Sep-22 05:21 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
Thank you very much. That explains the results. I am running the benchmarks again with '-Xclang -disable-O0-optnone'.

Thanks,
Yi
Yi Lin via llvm-dev
2017-Sep-22 07:17 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
I noticed that there is a '-run-pass' argument for llc. I am wondering if I can take a similar approach with the machine-level optimisations/passes in llc. Are those passes optional (so I can turn them off)? And how can I get the MIR format that llc expects with '-run-pass'?

Thanks a lot.

Cheers,
Yi
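My rough, unverified guess is that something along these lines might work, where <machine-pass> is a placeholder rather than a pass name I have checked:

  llc a.bc -stop-after=<machine-pass> -o a.mir   # stop the codegen pipeline after a given pass and emit MIR
  llc a.mir -run-pass=<machine-pass> -o -        # run a single machine pass over that MIR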
Yi Lin via llvm-dev
2017-Sep-26 07:04 UTC
[llvm-dev] Effectiveness of llvm optimisation passes
I feel I am still doing something wrong, as the performance does not seem to change with the different passes I use. The command lines I am using are:

* clang -O0 -Xclang -disable-O0-optnone -S -mllvm -disable-llvm-optzns -emit-llvm -momit-leaf-frame-pointer a.c -o a.ll
* opt -(PASS_FLAG) a.ll -o a.bc
* llc a.bc -filetype=obj -o a.o

I tried PASS_FLAG as all the passes from O1, as a single specific pass from O1, and also directly used '-O1' and '-O0'. The performance variation seems to be noise only (+/- 1%). Also, clang warns me about unused arguments for '-Xclang -disable-O0-optnone', although the result is different from not using the argument. I am using clang 5.0.

Any help would be appreciated.

Thanks,
Yi
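To double-check things on my side, I plan to verify that the optnone attribute is really gone and that opt actually changes the IR, roughly along these lines:

  grep optnone a.ll                  # should print nothing if -disable-O0-optnone took effect
  opt -S -mem2reg a.ll -o a.opt.ll   # emit textual IR so the before/after can be compared
  diff a.ll a.opt.ll | head          # a non-empty diff means the pass changed something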