Mehdi AMINI via llvm-dev
2018-Jan-08 05:56 UTC
[llvm-dev] Relationship between clang, opt and llc
Hi Toddy,

You can achieve what you're looking for with a pipeline based on
`clang -Ox` + `opt -Ox` + `llc -Ox` (or lld instead of llc), but this is not
guaranteed to be well supported across releases of the compiler.

Otherwise, if there are some performance-related (or not...) command-line
options you think clang is missing or would benefit from, I invite you to
propose adding them on cfe-dev at lists.llvm.org and submit a patch!

Best,

--
Mehdi

2018-01-07 21:03 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>:

> Thanks a lot, Mehdi.
>
> For GCC, there are around 190 optimization flags exposed as command-line
> options.
> For Clang/LLVM, the number is 40, and many important optimization
> parameters are not exposed at all, such as the loop unrolling factor and
> the inline function size parameters.
>
> I understand there are very different ideas about whether or not to expose
> many flags to the end user.
> Personally, I believe it is reasonable to keep end-user controllable
> command-line options minimal for user-friendliness.
> However, for users who care a lot about a tiny bit of performance
> improvement, like the HPC community, it may be better to expose as many
> fine-grained tunables in the form of command-line options as possible. Or,
> at least, there should be a way to achieve this fairly easily.
>
> I am curious about which way is best for my purpose.
> Please see my latest reply for 3 possible fine-grained optimization
> pipelines.
> Looking forward to more discussion.
>
> Thanks a lot!
>
> On Sun, Jan 7, 2018 at 10:11 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>> Hi,
>>
>> "SetC" options are LLVM cl::opt options; they are intended for LLVM
>> developers and experimentation. If a setting is intended to be used as a
>> public API, there is usually a programmatic way of setting it in LLVM.
>> "SetA" is what clang as a C++ compiler exposes to the end user.
>> Internally clang will (most of the time) use one or multiple LLVM APIs to
>> propagate a setting.
>>
>> Best,
>>
>> --
>> Mehdi
>>
>> 2018-01-05 17:41 GMT-08:00 toddy wang via llvm-dev <llvm-dev at lists.llvm.org>:
>>
>>> Craig, thanks a lot!
>>>
>>> I'm actually confused by clang optimization flags.
>>>
>>> If I run clang -help, it shows many optimization options (denoted as
>>> set A) and non-optimization options (denoted as set B).
>>> If I run llvm-as < /dev/null | opt -O0/1/2/3 -disable-output
>>> -debug-pass=Arguments, it also shows many optimization flags (denoted as
>>> set C).
>>>
>>> There are many options in set C that are not in set A, and also options
>>> in set A that are not in set C.
>>>
>>> The general question is: what is the relationship between set A and set
>>> C at the same optimization level O0/O1/O2/O3?
>>> Another question is: how do I specify an option in set C as a clang
>>> command-line option if it is not in A?
>>>
>>> For example, -dse is in set C but not in set A. How can I specify it as
>>> a clang option? Or can I simply not do that?
>>>
>>> On Fri, Jan 5, 2018 at 7:55 PM, Craig Topper <craig.topper at gmail.com> wrote:
>>>
>>>> O0 didn't start applying optnone until r304127 in May 2017, which is
>>>> after the 4.0 family was branched. So only 5.0, 6.0, and trunk have that
>>>> behavior. Commit message copied below.
>>>>
>>>> Author: Mehdi Amini <joker.eph at gmail.com>
>>>> Date: Mon May 29 05:38:20 2017 +0000
>>>>
>>>>     IRGen: Add optnone attribute on function during O0
>>>>
>>>>     Amongst other, this will help LTO to correctly handle/honor files
>>>>     compiled with O0, helping debugging failures.
>>>>     It also seems in line with how we handle other options, like how
>>>>     -fnoinline adds the appropriate attribute as well.
>>>>
>>>>     Differential Revision: https://reviews.llvm.org/D28404
>>>>
>>>> ~Craig
>>>>
>>>> On Fri, Jan 5, 2018 at 4:49 PM, toddy wang <wenwangtoddy at gmail.com> wrote:
>>>>
>>>>> @Zhaopei, thanks for the clarification.
>>>>>
>>>>> @Craig and @Michael, for clang 4.0.1, -Xclang -disable-O0-optnone
>>>>> gives the following error message. From which version is
>>>>> -disable-O0-optnone supported?
>>>>>
>>>>> [twang15 at c89 temp]$ clang++ -O0 -Xclang -disable-O0-optnone -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc
>>>>> error: unknown argument: '-disable-O0-optnone'
>>>>>
>>>>> [twang15 at c89 temp]$ clang++ --version
>>>>> clang version 4.0.1 (tags/RELEASE_401/final)
>>>>> Target: x86_64-unknown-linux-gnu
>>>>>
>>>>> On Fri, Jan 5, 2018 at 4:45 PM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>
>>>>>> If you pass -O0 to clang, most functions will be tagged with an
>>>>>> optnone function attribute that will prevent opt and llc from
>>>>>> optimizing them, even if you pass -O3 to opt and llc. This is the
>>>>>> most likely cause of the slowdown in 2.
>>>>>>
>>>>>> You can disable the optnone function attribute behavior by passing
>>>>>> "-Xclang -disable-O0-optnone" to clang.
>>>>>>
>>>>>> ~Craig
>>>>>>
>>>>>> On Fri, Jan 5, 2018 at 1:19 PM, toddy wang via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> I tried the following on the LULESH1.0 serial version
>>>>>>> (https://codesign.llnl.gov/lulesh/LULESH.cc):
>>>>>>>
>>>>>>> 1. clang++ -O3 LULESH.cc; ./a.out 20
>>>>>>>    Runtime: 9.487353 seconds
>>>>>>>
>>>>>>> 2. clang++ -O0 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc;
>>>>>>>    opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o;
>>>>>>>    clang++ b.o -o b.out; ./b.out 20
>>>>>>>    Runtime: 24.15 seconds
>>>>>>>
>>>>>>> 3. clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc;
>>>>>>>    opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o;
>>>>>>>    clang++ b.o -o b.out; ./b.out 20
>>>>>>>    Runtime: 9.53 seconds
>>>>>>>
>>>>>>> 1 and 3 have almost the same performance, while 2 is significantly
>>>>>>> worse. I expected 1, 2, and 3 to differ only trivially.
>>>>>>>
>>>>>>> Is this a wrong expectation?
>>>>>>>
>>>>>>> @Peizhao, what did you try in your last post?
>>>>>>>
>>>>>>> On Tue, Apr 11, 2017 at 12:15 PM, Peizhao Ou via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> It's really nice of you to point out the -Xclang option; it makes
>>>>>>>> things much easier. I really appreciate your help!
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Peizhao
>>>>>>>>
>>>>>>>> On Mon, Apr 10, 2017 at 10:12 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>>>>>>>
>>>>>>>>> On Apr 10, 2017, at 5:21 PM, Craig Topper via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>> clang -O0 does not disable all optimization passes that modify the
>>>>>>>>> IR. In fact, it causes most functions to get tagged with noinline
>>>>>>>>> to prevent inlining.
>>>>>>>>>
>>>>>>>>> It also disables lifetime intrinsics emission, TBAA, etc.
>>>>>>>>>
>>>>>>>>> What you really need to do is
>>>>>>>>>
>>>>>>>>> clang -O3 -c -emit-llvm -o source.bc -v
>>>>>>>>>
>>>>>>>>> Find the -cc1 command line from that output. Execute that command
>>>>>>>>> with --disable-llvm-passes. Leave the -O3 and everything else.
>>>>>>>>>
>>>>>>>>> That's a bit complicated: CC1 options can be passed through with
>>>>>>>>> -Xclang, for example here by just adding `-Xclang
>>>>>>>>> -disable-llvm-passes` to the regular clang invocation.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Mehdi
>>>>>>>>>
>>>>>>>>> You should be able to feed the output from that command to opt/llc
>>>>>>>>> and get consistent results.
>>>>>>>>>
>>>>>>>>> ~Craig
>>>>>>>>>
>>>>>>>>> On Mon, Apr 10, 2017 at 4:57 PM, Peizhao Ou via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>>> Hi folks,
>>>>>>>>>>
>>>>>>>>>> I am wondering about the relationship between clang, opt and llc.
>>>>>>>>>> I understand that this has been asked, e.g.,
>>>>>>>>>> http://stackoverflow.com/questions/40350990/relationship-between-clang-opt-llc-and-llvm-linker.
>>>>>>>>>> Sorry for posting a similar question again, but I still have
>>>>>>>>>> something that hasn't been resolved yet.
>>>>>>>>>>
>>>>>>>>>> More specifically, I am wondering about the following two
>>>>>>>>>> approaches to compiling an optimized executable:
>>>>>>>>>>
>>>>>>>>>> 1. clang -O3 -c source.c -o source.o
>>>>>>>>>>    ...
>>>>>>>>>>    clang a.o b.o c.o ... -o executable
>>>>>>>>>>
>>>>>>>>>> 2. clang -O0 -c -emit-llvm source.c -o source.bc
>>>>>>>>>>    opt -O3 source.bc -o source.bc
>>>>>>>>>>    llc -O3 -filetype=obj source.bc -o source.o
>>>>>>>>>>    ...
>>>>>>>>>>    clang a.o b.o c.o ... -o executable
>>>>>>>>>>
>>>>>>>>>> I took a look at the source code of the clang tool and the opt
>>>>>>>>>> tool. They both seem to use the
>>>>>>>>>> PassManagerBuilder::populateModulePassManager() and
>>>>>>>>>> PassManagerBuilder::populateFunctionPassManager() functions to
>>>>>>>>>> add passes to their optimization pipeline; and for the backend,
>>>>>>>>>> clang and llc both use the addPassesToEmitFile() function to
>>>>>>>>>> generate object code.
>>>>>>>>>>
>>>>>>>>>> So presumably the above two approaches to generating an optimized
>>>>>>>>>> executable should do the same thing. However, I am seeing that
>>>>>>>>>> the second approach is around 2% slower than the first approach
>>>>>>>>>> (which is the way developers usually build) pretty consistently.
>>>>>>>>>>
>>>>>>>>>> Can anyone point me to the reasons why this happens? Or even
>>>>>>>>>> correct my wrong understanding of the relationship between these
>>>>>>>>>> two approaches?
>>>>>>>>>>
>>>>>>>>>> PS: I used the -debug-pass=Structure option to print out the
>>>>>>>>>> passes. They seem the same except that the first approach has an
>>>>>>>>>> extra pass called "-add-discriminator", but I don't think that's
>>>>>>>>>> the reason.
>>>>>>>>>>
>>>>>>>>>> Peizhao
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
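The recipe that comes out of the quoted exchange (keep the optimization level on clang, skip its LLVM passes with `-Xclang -disable-llvm-passes`, then run opt and llc separately) can be sketched as a small shell script. This is only a sketch under stated assumptions: the `RUN` variable, the `build_split` function, and the output file names are made up for illustration, and by default the script just prints the commands rather than running them. As noted in the reply above, such a split pipeline is not guaranteed to stay supported across compiler releases.

```shell
#!/bin/sh
# Hypothetical dry-run switch: with RUN unset, each command is only echoed.
# Invoke the script with RUN= (empty) to actually execute the tools.
RUN=${RUN-echo}

# build_split SRC OUT (hypothetical helper)
# Compiles SRC through clang -> opt -> llc -> link, producing executable OUT.
build_split() {
    src=$1
    out=$2
    # Front end only: keep -O3 so the emitted IR is not tagged for -O0
    # (optnone/noinline), but skip clang's own LLVM passes.
    $RUN clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o "$out.bc" "$src"
    # Middle end: run the -O3 pass pipeline on the bitcode.
    $RUN opt -O3 "$out.bc" -o "$out.opt.bc"
    # Back end: emit an object file.
    $RUN llc -O3 -filetype=obj "$out.opt.bc" -o "$out.o"
    # Link through the clang driver.
    $RUN clang++ "$out.o" -o "$out"
}

build_split LULESH.cc lulesh
```

If the front end has to run at -O0 with clang 5.0 or later, the thread suggests also passing `-Xclang -disable-O0-optnone` so that opt and llc are not blocked by the optnone attribute; clang 4.0.1 rejects that flag.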
toddy wang via llvm-dev
2018-Jan-08 06:32 UTC
[llvm-dev] Relationship between clang, opt and llc
Hi Mehdi,

Now we have 5 pipelines. (The first 3 I have described in detail above;
please refer to my latest reply for details.)

1. clang + opt + gold
2. clang + opt + lld
3. clang + GNU ld / gold / lld

4. clang + opt + llc + clang
   clang -emit-llvm -O1 -Xclang -disable-llvm-passes: C/C++ to .bc
     generation with minimal front-end optimization.
   opt: optimization of a single .bc file.
   llc: object-file generation from a single .bc file, plus back-end
     optimization (no link-time optimization is possible, since llc works on
     one .bc file at a time).
   clang again: links all object files into the final executable. (Although
     in principle there can be link-time optimization even on object files,
     it requires a lot of work and is machine-dependent. This may also be the
     reason why modern compilers like LLVM/GCC/ICC do not perform LTO at the
     object level. Still, the object level may yield extra benefit even after
     LTO has been applied at the intermediate level, because more information
     is visible at the object level.)

   `clang -Ox` + `opt -Ox` + `llc -Ox` is too coarse-grained.

5. Modify clang to align with GCC/ICC so that many tunables are exposed on
   the clang command line. I am not sure how much work is needed, but it at
   least requires an overall understanding of the compiler internals, which
   can be figured out gradually.

I believe 5 is interesting, but 2 may be good enough. More experiments are
needed before a decision is made.
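Pipeline 4 from this message (clang per source file, then opt, then llc, then one final clang link) could be scripted roughly as follows. This is a sketch, not a supported workflow: `build_pipeline4`, `RUN`, `OPT_FLAGS`, and `LLC_FLAGS` are hypothetical names introduced here, the per-stage flag variables stand in for whatever finer-grained options one wants to experiment with, and the script only prints the commands unless `RUN` is set to empty.

```shell
#!/bin/sh
# Hypothetical dry-run switch: echo commands by default, run them with RUN= .
RUN=${RUN-echo}
# Per-stage flags (assumed defaults); override them to try finer-grained options.
OPT_FLAGS=${OPT_FLAGS:--O3}
LLC_FLAGS=${LLC_FLAGS:--O3}

# build_pipeline4 EXE SRC... (hypothetical helper)
# Note: file names containing spaces are not handled in this sketch.
build_pipeline4() {
    exe=$1
    shift
    objs=""
    for src in "$@"; do
        base=${src%.*}
        # Front end: bitcode generation with minimal front-end optimization.
        $RUN clang++ -O1 -Xclang -disable-llvm-passes -c -emit-llvm -o "$base.bc" "$src"
        # Middle end: optimize one bitcode file at a time.
        $RUN opt $OPT_FLAGS "$base.bc" -o "$base.opt.bc"
        # Back end: one object file per bitcode file (so no LTO here).
        $RUN llc $LLC_FLAGS -filetype=obj "$base.opt.bc" -o "$base.o"
        objs="$objs $base.o"
    done
    # Link all object files into the final executable.
    $RUN clang++ $objs -o "$exe"
}

build_pipeline4 app a.cc b.cc
```

Keeping the flags for each stage in separate variables makes it easy to vary one stage while holding the others fixed, which is the kind of experiment the message asks for.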
Sean Silva via llvm-dev
2018-Jan-08 07:02 UTC
[llvm-dev] Relationship between clang, opt and llc
For the types of things that you are looking for, you may just want to try a bunch of -mllvm options. You can tune inlining and unrolling threshold like that, for example. On Jan 7, 2018 10:33 PM, "toddy wang via llvm-dev" <llvm-dev at lists.llvm.org> wrote:> Hi Mehdi, > > Now we have 5 pipelines. (In addition to the first 3, which I have > described in detail above, please refer my latest reply for details) > 1. clang + opt + gold > 2. clang + opt + lld > 3. clang + GNU ld/ gold /lld > > 4. clang + opt + llc + clang > clang -emit-llvm -O1 -Xclang -disable-llvm-passes for c/c++ to .bc > generation and minimal front-end optimization > opt for single bc file optimization > llc single bc file to obj file generation and back-end optimization (no > link-time optimization is possible, since llc works on 1 bc file at a time) > clang again for linking all obj file to generate final executable. (although > in principle there can be a link-time optimization even with all obj files, > it requires a lot of work and is machine-dependent. This may also be the > reason why modern compilers like LLVM/GCC/ICC, etc performs LTO not at obj > level. But, obj level may yield extra benefit even LTO at intermediate > level has been applied by compilers, because obj level can see more > information.) > > `clang -Ox` + `opt -Ox` + `llc -Ox` is too coarse-grain. > > 5. Modify clang to align with GCC/ICC so that many tunables are exposed at > clang command line. Not sure how much work is needed, but at least requires > an overall understanding of compiler internals, which can be gradually > figured out. > > I believe 5 is interesting, but 2 may be good enough. More experiments are > needed before decision is made. 
> > On Mon, Jan 8, 2018 at 12:56 AM, Mehdi AMINI <joker.eph at gmail.com> wrote: > >> Hi Toddy, >> >> You can achieve what you're looking for with a pipeline based on `clang >> -Ox` + `opt -Ox` + `llc -Ox` (or lld instead of llc), but this won't be >> guarantee'd to be well supported across releases of the compiler. >> >> Otherwise, if there are some performance-releated (or not...) command >> line options you think clang is missing / would benefit, I invite you to >> propose adding them to cfe-dev at lists.llvm.org and submit a patch! >> >> Best, >> >> -- >> Mehdi >> >> 2018-01-07 21:03 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>: >> >>> Thanks a lot, Mehdi. >>> >>> For GCC, there are around 190 optimization flags exposed as command-line >>> options. >>> For Clang/LLVM, the number is 40, and many important optimization >>> parameters are not exposed at all, such as loop unrolling factor, inline >>> function size parameters. >>> >>> I understand there is very different idea for whether or not expose many >>> flags to end-user. >>> Personally, I believe it is a reasonable to keep end-user controllable >>> command-line options minimal for user-friendliness. >>> However, for users who care a lot for a tiny bit performance >>> improvement, like HPC community, it may be better to expose as many >>> fine-grained tunables in the form of command line options as possible. Or, >>> at least there should be a way to achieve this fairly easy. >>> >>> I am curious about which way is the best for my purpose. >>> Please see my latest reply for 3 possible fine-grained optimization >>> pipeline. >>> Looking forward to more discussions. >>> >>> Thanks a lot! >>> >>> On Sun, Jan 7, 2018 at 10:11 AM, Mehdi AMINI <joker.eph at gmail.com> >>> wrote: >>> >>>> Hi, >>>> >>>> "SetC" options are LLVM cl::opt options, they are intended for LLVM >>>> developer and experimentations. 
If a settings is intended to be used as a >>>> public API, there is usually a programmatic way of setting it in LLVM. >>>> "SetA" is what clang as a C++ compiler exposes to the end-user. >>>> Internally clang will (most of the time) use one or multiple LLVM APIs to >>>> propagate a settings. >>>> >>>> Best, >>>> >>>> -- >>>> Mehdi >>>> >>>> 2018-01-05 17:41 GMT-08:00 toddy wang via llvm-dev < >>>> llvm-dev at lists.llvm.org>: >>>> >>>>> Craig, thanks a lot! >>>>> >>>>> I'm actually confused by clang optimization flags. >>>>> >>>>> If I run clang -help, it will show many optimizations (denoted as set >>>>> A) and non-optimization options (denoted as set B). >>>>> If I run llvm-as < /dev/null | opt -O0/1/2/3 -disable-output >>>>> -debug-pass=Arguments, it also shows many optimization flags (denote as set >>>>> C). >>>>> >>>>> There are many options in set C while not in set A, and also options >>>>> in set A but not in set C. >>>>> >>>>> The general question is: what is the relationship between set A and >>>>> set C, at the same optimization level O0/O1/O2/O3? >>>>> Another question is: how to specify an option in set C as a clang >>>>> command line option, if it is not in A? >>>>> >>>>> For example, -dse is in set C but not in set A, how can I specify it >>>>> as a clang option? Or simply I cannot do that. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Jan 5, 2018 at 7:55 PM, Craig Topper <craig.topper at gmail.com> >>>>> wrote: >>>>> >>>>>> O0 didn't start applying optnone until r304127 in May 2017 which is >>>>>> after the 4.0 family was branched. So only 5.0, 6.0, and trunk have that >>>>>> behavior. 
Commit message copied below >>>>>> >>>>>> Author: Mehdi Amini <joker.eph at gmail.com> >>>>>> >>>>>> Date: Mon May 29 05:38:20 2017 +0000 >>>>>> >>>>>> >>>>>> IRGen: Add optnone attribute on function during O0 >>>>>> >>>>>> >>>>>> >>>>>> Amongst other, this will help LTO to correctly handle/honor files >>>>>> >>>>>> compiled with O0, helping debugging failures. >>>>>> >>>>>> It also seems in line with how we handle other options, like how >>>>>> >>>>>> -fnoinline adds the appropriate attribute as well. >>>>>> >>>>>> >>>>>> >>>>>> Differential Revision: https://reviews.llvm.org/D28404 >>>>>> >>>>>> >>>>>> >>>>>> ~Craig >>>>>> >>>>>> On Fri, Jan 5, 2018 at 4:49 PM, toddy wang <wenwangtoddy at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> @Zhaopei, thanks for the clarification. >>>>>>> >>>>>>> @Craig and @Michael, for clang 4.0.1, -Xclang -disable-O0-optnone >>>>>>> gives the following error message. From which version -disable-O0-optnone >>>>>>> gets supported? >>>>>>> >>>>>>> [twang15 at c89 temp]$ clang++ -O0 -Xclang -disable-O0-optnone -Xclang >>>>>>> -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc >>>>>>> error: unknown argument: '-disable-O0-optnone' >>>>>>> >>>>>>> [twang15 at c89 temp]$ clang++ --version >>>>>>> clang version 4.0.1 (tags/RELEASE_401/final) >>>>>>> Target: x86_64-unknown-linux-gnu >>>>>>> >>>>>>> On Fri, Jan 5, 2018 at 4:45 PM, Craig Topper <craig.topper at gmail.com >>>>>>> > wrote: >>>>>>> >>>>>>>> If you pass -O0 to clang, most functions will be tagged with an >>>>>>>> optnone function attribute that will prevent opt and llc even if you pass >>>>>>>> -O3 to opt and llc. This is the mostly likely cause for the slow down in 2. 
>>>>>>>> >>>>>>>> You can disable the optnone function attribute behavior by passing >>>>>>>> "-Xclang -disable-O0-optnone" to clang >>>>>>>> >>>>>>>> ~Craig >>>>>>>> >>>>>>>> On Fri, Jan 5, 2018 at 1:19 PM, toddy wang via llvm-dev < >>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>> >>>>>>>>> I tried the following on LULESH1.0 serial version ( >>>>>>>>> https://codesign.llnl.gov/lulesh/LULESH.cc) >>>>>>>>> >>>>>>>>> 1. clang++ -O3 LULESH.cc; ./a.out 20 >>>>>>>>> Runtime: 9.487353 second >>>>>>>>> >>>>>>>>> 2. clang++ -O0 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc >>>>>>>>> LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o ; >>>>>>>>> clang++ b.o -o b.out; ./b.out 20 >>>>>>>>> Runtime: 24.15 seconds >>>>>>>>> >>>>>>>>> 3. clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc >>>>>>>>> LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o ; >>>>>>>>> clang++ b.o -o b.out; ./b.out 20 >>>>>>>>> Runtime: 9.53 seconds >>>>>>>>> >>>>>>>>> 1 and 3 have almost the same performance, while 2 is significantly >>>>>>>>> worse, while I expect 1, 2 ,3 should have trivial difference. >>>>>>>>> >>>>>>>>> Is this a wrong expectation? >>>>>>>>> >>>>>>>>> @Peizhao, what did you try in your last post? >>>>>>>>> >>>>>>>>> On Tue, Apr 11, 2017 at 12:15 PM, Peizhao Ou via llvm-dev < >>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>> >>>>>>>>>> It's really nice of you pointing out the -Xclang option, it makes >>>>>>>>>> things much easier. I really appreciate your help! 
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Peizhao
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 10, 2017 at 10:12 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Apr 10, 2017, at 5:21 PM, Craig Topper via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> clang -O0 does not disable all optimization passes that modify the
>>>>>>>>>>> IR. In fact, it causes most functions to get tagged with noinline to
>>>>>>>>>>> prevent inlining.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It also disables lifetime intrinsics emission and TBAA, etc.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> What you really need to do is
>>>>>>>>>>>
>>>>>>>>>>> clang -O3 -c -emit-llvm -o source.bc -v
>>>>>>>>>>>
>>>>>>>>>>> Find the -cc1 command line from that output. Execute that
>>>>>>>>>>> command with -disable-llvm-passes. Leave the -O3 and everything else.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> That's a bit complicated: CC1 options can be passed through with
>>>>>>>>>>> -Xclang; here, for example, just add `-Xclang -disable-llvm-passes`
>>>>>>>>>>> to the regular clang invocation.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Mehdi
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> You should be able to feed the output from that command to
>>>>>>>>>>> opt/llc and get consistent results.
>>>>>>>>>>>
>>>>>>>>>>> ~Craig
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Apr 10, 2017 at 4:57 PM, Peizhao Ou via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi folks,
>>>>>>>>>>>>
>>>>>>>>>>>> I am wondering about the relationship between clang, opt and llc. I
>>>>>>>>>>>> understand that this has been asked before, e.g.,
>>>>>>>>>>>> http://stackoverflow.com/questions/40350990/relationship-between-clang-opt-llc-and-llvm-linker.
>>>>>>>>>>>> Sorry for posting a similar question again, but I still have
>>>>>>>>>>>> something that hasn't been resolved yet.
>>>>>>>>>>>>
>>>>>>>>>>>> More specifically, I am wondering about the following two
>>>>>>>>>>>> approaches to compiling an optimized executable:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. clang -O3 -c source.c -o source.o
>>>>>>>>>>>>    ...
>>>>>>>>>>>>    clang a.o b.o c.o ... -o executable
>>>>>>>>>>>>
>>>>>>>>>>>> 2. clang -O0 -c -emit-llvm -o source.bc
>>>>>>>>>>>>    opt -O3 source.bc -o source.bc
>>>>>>>>>>>>    llc -O3 -filetype=obj source.bc -o source.o
>>>>>>>>>>>>    ...
>>>>>>>>>>>>    clang a.o b.o c.o ... -o executable
>>>>>>>>>>>>
>>>>>>>>>>>> I took a look at the source code of the clang tool and the opt
>>>>>>>>>>>> tool; they both seem to use the
>>>>>>>>>>>> PassManagerBuilder::populateModulePassManager() and
>>>>>>>>>>>> PassManagerBuilder::populateFunctionPassManager() functions to add
>>>>>>>>>>>> passes to their optimization pipeline. For the backend, clang and
>>>>>>>>>>>> llc both use the addPassesToEmitFile() function to generate object
>>>>>>>>>>>> code.
>>>>>>>>>>>>
>>>>>>>>>>>> So presumably the above two approaches to generating an optimized
>>>>>>>>>>>> executable should do the same thing. However, I am seeing that the
>>>>>>>>>>>> second approach is around 2% slower than the first approach (which
>>>>>>>>>>>> is what developers usually use) pretty consistently.
>>>>>>>>>>>>
>>>>>>>>>>>> Can anyone point me to the reasons why this happens? Or even
>>>>>>>>>>>> correct my wrong understanding of the relationship between these
>>>>>>>>>>>> two approaches?
>>>>>>>>>>>>
>>>>>>>>>>>> PS: I used the -debug-pass=Structure option to print out the
>>>>>>>>>>>> passes; they seem the same except that the first approach has an
>>>>>>>>>>>> extra pass called "-add-discriminator", but I don't think that's
>>>>>>>>>>>> the reason.
>>>>>>>>>>>>
>>>>>>>>>>>> Peizhao
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
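
For reference, the split pipeline the replies in this thread converge on can be sketched as a small script. This is only an illustration, not something posted by anyone in the thread. The SRC, OUT and DRY_RUN variables and the run/split_pipeline helpers are hypothetical names. By default the script only prints the commands; set DRY_RUN=0 to actually run them, which assumes clang++, opt and llc from the same LLVM release are on PATH. As Mehdi notes at the top, this kind of pipeline is not guaranteed to be well supported across releases.

```shell
#!/bin/sh
# Sketch of the split pipeline discussed in this thread:
# clang front end only, then opt, then llc, then link with clang++.
# SRC, OUT and DRY_RUN are hypothetical names chosen for this example.
SRC="${SRC:-LULESH.cc}"
OUT="${OUT:-b.out}"
DRY_RUN="${DRY_RUN:-1}"

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

split_pipeline() {
    # Per the replies above, -O3 avoids the optnone/noinline tagging that
    # -O0 applies; -Xclang -disable-llvm-passes leaves the IR for opt.
    run clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc "$SRC" &&
    run opt -O3 a.bc -o b.bc &&
    run llc -O3 -filetype=obj b.bc -o b.o &&
    run clang++ b.o -o "$OUT"
}

split_pipeline
```

If the front end has to stay at -O0, Craig's suggestion of adding "-Xclang -disable-O0-optnone" would go on the first command. As the error message quoted above shows, clang 4.0.1 does not accept that option.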