Mehdi AMINI via llvm-dev
2018-Jan-08 16:53 UTC
[llvm-dev] Relationship between clang, opt and llc
2018-01-08 8:41 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>:

> Hi Mehdi,
>
> It seems -mllvm does not work as expected. Anything wrong?
>
> [twang15 at c92 temp]$ clang++ -O3 -mllvm -deadargelim LULESH.cc
> clang (LLVM option parsing): Unknown command line argument '-deadargelim'. Try: 'clang (LLVM option parsing) -help'
> clang (LLVM option parsing): Did you mean '-regalloc'?
>
> [twang15 at c92 temp]$ clang++ -O3 -mllvm deadargelim LULESH.cc
> clang (LLVM option parsing): Unknown command line argument 'deadargelim'. Try: 'clang (LLVM option parsing) -help'

You can't schedule passes this way; you can only set parameters such as -unroll-threshold=<uint>.

--
Mehdi

> -Tao
>
> On Mon, Jan 8, 2018 at 11:12 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>> 2018-01-07 23:16 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>:
>>
>>> -mllvm <value> Additional arguments to forward to LLVM's option processing
>>>
>>> This is dumped by clang. I am not sure what I am supposed to put as value in order to tune the unrolling/inlining thresholds.
>>
>> As the help says, this is used to pass arguments to LLVM itself. If you remember your earlier question about setA (clang options) and setC (opt options), this allows you to reach setC from the clang command line.
>> Any option that you see in the output of `opt --help` can be set from clang using `-mllvm`. Same caveat as I mentioned before: these aren't supposed to be end-user options.
>>
>> --
>> Mehdi
>>
>>> On Mon, Jan 8, 2018 at 2:02 AM, Sean Silva <chisophugis at gmail.com> wrote:
>>>
>>>> For the types of things that you are looking for, you may just want to try a bunch of -mllvm options. You can tune inlining and unrolling thresholds like that, for example.
>>>>
>>>> On Jan 7, 2018 10:33 PM, "toddy wang via llvm-dev" <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hi Mehdi,
>>>>>
>>>>> Now we have 5 pipelines. (In addition to the first 3, which I have described in detail above; please refer to my latest reply for details.)
>>>>> 1. clang + opt + gold
>>>>> 2. clang + opt + lld
>>>>> 3. clang + GNU ld/gold/lld
>>>>>
>>>>> 4. clang + opt + llc + clang
>>>>> clang -emit-llvm -O1 -Xclang -disable-llvm-passes for C/C++ to .bc generation and minimal front-end optimization
>>>>> opt for single bc file optimization
>>>>> llc for single bc file to obj file generation and back-end optimization (no link-time optimization is possible, since llc works on 1 bc file at a time)
>>>>> clang again for linking all obj files to generate the final executable. (Although in principle there can be a link-time optimization even with all obj files, it requires a lot of work and is machine-dependent. This may also be the reason why modern compilers like LLVM/GCC/ICC, etc. perform LTO not at obj level. But obj level may yield extra benefit even if LTO at intermediate level has been applied by compilers, because obj level can see more information.)
>>>>>
>>>>> `clang -Ox` + `opt -Ox` + `llc -Ox` is too coarse-grained.
>>>>>
>>>>> 5. Modify clang to align with GCC/ICC so that many tunables are exposed at the clang command line. Not sure how much work is needed, but it at least requires an overall understanding of compiler internals, which can be gradually figured out.
>>>>>
>>>>> I believe 5 is interesting, but 2 may be good enough. More experiments are needed before a decision is made.
>>>>>
>>>>> On Mon, Jan 8, 2018 at 12:56 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>>>
>>>>>> Hi Toddy,
>>>>>>
>>>>>> You can achieve what you're looking for with a pipeline based on `clang -Ox` + `opt -Ox` + `llc -Ox` (or lld instead of llc), but this won't be guaranteed to be well supported across releases of the compiler.
>>>>>>
>>>>>> Otherwise, if there are some performance-related (or not...) command line options you think clang is missing / would benefit from, I invite you to propose adding them to cfe-dev at lists.llvm.org and submit a patch!
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> --
>>>>>> Mehdi
>>>>>>
>>>>>> 2018-01-07 21:03 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>:
>>>>>>
>>>>>>> Thanks a lot, Mehdi.
>>>>>>>
>>>>>>> For GCC, there are around 190 optimization flags exposed as command-line options.
>>>>>>> For Clang/LLVM, the number is 40, and many important optimization parameters are not exposed at all, such as loop unrolling factor and inline function size parameters.
>>>>>>>
>>>>>>> I understand there are very different ideas about whether or not to expose many flags to the end user.
>>>>>>> Personally, I believe it is reasonable to keep end-user controllable command-line options minimal for user-friendliness.
>>>>>>> However, for users who care a lot about a tiny bit of performance improvement, like the HPC community, it may be better to expose as many fine-grained tunables in the form of command line options as possible. Or, at least there should be a way to achieve this fairly easily.
>>>>>>>
>>>>>>> I am curious about which way is the best for my purpose.
>>>>>>> Please see my latest reply for 3 possible fine-grained optimization pipelines.
>>>>>>> Looking forward to more discussions.
>>>>>>>
>>>>>>> Thanks a lot!
>>>>>>>
>>>>>>> On Sun, Jan 7, 2018 at 10:11 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> "SetC" options are LLVM cl::opt options; they are intended for LLVM developers and experimentation. If a setting is intended to be used as a public API, there is usually a programmatic way of setting it in LLVM.
>>>>>>>> "SetA" is what clang as a C++ compiler exposes to the end-user. Internally clang will (most of the time) use one or multiple LLVM APIs to propagate a setting.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> --
>>>>>>>> Mehdi
>>>>>>>>
>>>>>>>> 2018-01-05 17:41 GMT-08:00 toddy wang via llvm-dev <llvm-dev at lists.llvm.org>:
>>>>>>>>
>>>>>>>>> Craig, thanks a lot!
>>>>>>>>>
>>>>>>>>> I'm actually confused by clang optimization flags.
>>>>>>>>>
>>>>>>>>> If I run clang -help, it will show many optimization options (denoted as set A) and non-optimization options (denoted as set B).
>>>>>>>>> If I run llvm-as < /dev/null | opt -O0/1/2/3 -disable-output -debug-pass=Arguments, it also shows many optimization flags (denoted as set C).
>>>>>>>>>
>>>>>>>>> There are many options in set C that are not in set A, and also options in set A but not in set C.
>>>>>>>>>
>>>>>>>>> The general question is: what is the relationship between set A and set C, at the same optimization level O0/O1/O2/O3?
>>>>>>>>> Another question is: how to specify an option in set C as a clang command line option, if it is not in A?
>>>>>>>>>
>>>>>>>>> For example, -dse is in set C but not in set A; how can I specify it as a clang option? Or can I simply not do that?
>>>>>>>>>
>>>>>>>>> On Fri, Jan 5, 2018 at 7:55 PM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> O0 didn't start applying optnone until r304127 in May 2017, which is after the 4.0 family was branched. So only 5.0, 6.0, and trunk have that behavior. Commit message copied below:
>>>>>>>>>>
>>>>>>>>>> Author: Mehdi Amini <joker.eph at gmail.com>
>>>>>>>>>> Date: Mon May 29 05:38:20 2017 +0000
>>>>>>>>>>
>>>>>>>>>> IRGen: Add optnone attribute on function during O0
>>>>>>>>>>
>>>>>>>>>> Amongst other, this will help LTO to correctly handle/honor files
>>>>>>>>>> compiled with O0, helping debugging failures.
>>>>>>>>>> It also seems in line with how we handle other options, like how
>>>>>>>>>> -fnoinline adds the appropriate attribute as well.
>>>>>>>>>>
>>>>>>>>>> Differential Revision: https://reviews.llvm.org/D28404
>>>>>>>>>>
>>>>>>>>>> ~Craig
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 5, 2018 at 4:49 PM, toddy wang <wenwangtoddy at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> @Zhaopei, thanks for the clarification.
>>>>>>>>>>>
>>>>>>>>>>> @Craig and @Michael, for clang 4.0.1, -Xclang -disable-O0-optnone gives the following error message. From which version is -disable-O0-optnone supported?
>>>>>>>>>>>
>>>>>>>>>>> [twang15 at c89 temp]$ clang++ -O0 -Xclang -disable-O0-optnone -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc
>>>>>>>>>>> error: unknown argument: '-disable-O0-optnone'
>>>>>>>>>>>
>>>>>>>>>>> [twang15 at c89 temp]$ clang++ --version
>>>>>>>>>>> clang version 4.0.1 (tags/RELEASE_401/final)
>>>>>>>>>>> Target: x86_64-unknown-linux-gnu
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 5, 2018 at 4:45 PM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If you pass -O0 to clang, most functions will be tagged with an optnone function attribute that will prevent opt and llc from optimizing them, even if you pass -O3 to opt and llc. This is the most likely cause of the slowdown in 2.
>>>>>>>>>>>>
>>>>>>>>>>>> You can disable the optnone function attribute behavior by passing "-Xclang -disable-O0-optnone" to clang.
>>>>>>>>>>>>
>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jan 5, 2018 at 1:19 PM, toddy wang via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I tried the following on the LULESH 1.0 serial version (https://codesign.llnl.gov/lulesh/LULESH.cc):
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. clang++ -O3 LULESH.cc; ./a.out 20
>>>>>>>>>>>>> Runtime: 9.487353 seconds
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2. clang++ -O0 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o; clang++ b.o -o b.out; ./b.out 20
>>>>>>>>>>>>> Runtime: 24.15 seconds
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3. clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o; clang++ b.o -o b.out; ./b.out 20
>>>>>>>>>>>>> Runtime: 9.53 seconds
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1 and 3 have almost the same performance, while 2 is significantly worse; I expected 1, 2, and 3 to differ only trivially.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is this a wrong expectation?
>>>>>>>>>>>>>
>>>>>>>>>>>>> @Peizhao, what did you try in your last post?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Apr 11, 2017 at 12:15 PM, Peizhao Ou via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It's really nice of you to point out the -Xclang option; it makes things much easier. I really appreciate your help!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Peizhao
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Apr 10, 2017 at 10:12 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Apr 10, 2017, at 5:21 PM, Craig Topper via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> clang -O0 does not disable all optimization passes that modify the IR. In fact, it causes most functions to get tagged with noinline to prevent inlining.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It also disables lifetime intrinsics emission, TBAA, etc.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What you really need to do is
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> clang -O3 -c -emit-llvm -o source.bc -v
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Find the -cc1 command line from that output. Execute that command with --disable-llvm-passes. Leave the -O3 and everything else.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's a bit complicated: CC1 options can be passed through with -Xclang, for example here just adding to the regular clang invocation `-Xclang -disable-llvm-passes`
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Mehdi
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You should be able to feed the output from that command to opt/llc and get consistent results.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Apr 10, 2017 at 4:57 PM, Peizhao Ou via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi folks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am wondering about the relationship between clang, opt and llc. I understand that this has been asked, e.g., http://stackoverflow.com/questions/40350990/relationship-between-clang-opt-llc-and-llvm-linker. Sorry for posting a similar question again, but I still have something that hasn't been resolved yet.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> More specifically, I am wondering about the following two approaches to compiling an optimized executable:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. clang -O3 -c source.c -o source.o
>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>> clang a.o b.o c.o ... -o executable
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. clang -O0 -c -emit-llvm -o source.bc
>>>>>>>>>>>>>>>> opt -O3 source.bc -o source.bc
>>>>>>>>>>>>>>>> llc -O3 -filetype=obj source.bc -o source.o
>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>> clang a.o b.o c.o ... -o executable
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I took a look at the source code of the clang tool and the opt tool; they both seem to use the PassManagerBuilder::populateModulePassManager() and PassManagerBuilder::populateFunctionPassManager() functions to add passes to their optimization pipeline, and for the backend, clang and llc both use the addPassesToEmitFile() function to generate object code.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So presumably the above two approaches to generating an optimized executable file should do the same thing. However, I am seeing that the second approach is around 2% slower than the first approach (which is the way developers usually use) pretty consistently.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can anyone point me to the reasons why this happens? Or even correct my wrong understanding of the relationship between these two approaches?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> PS: I used the -debug-pass=Structure option to print out the passes; they seem the same except that the first approach has an extra pass called "-add-discriminator", but I don't think that's the reason.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Peizhao
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
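A minimal sketch of the distinction drawn in the reply above, assuming a recent clang and opt are on PATH: `-mllvm` forwards one parameter per use, while a pass is requested from opt in the split pipeline. The values 300 and 500 and the output names are hypothetical, chosen only for illustration. `-passes=` is the new-pass-manager spelling; releases contemporary with this thread accepted `opt -deadargelim` instead. The script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run helper: print each command, and execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

SRC=LULESH.cc   # source file used in the thread
UNROLL=300      # hypothetical value, for illustration only
INLINE=500      # hypothetical value, for illustration only

# Parameters: each -mllvm forwards exactly one argument to LLVM's option parser.
TUNE_CMD="clang++ -O3 -mllvm -unroll-threshold=$UNROLL -mllvm -inline-threshold=$INLINE $SRC -o tuned.out"
run $TUNE_CMD

# Passes: -mllvm cannot schedule them, so emit bitcode with the LLVM passes
# disabled (as in the thread) and ask opt for the pass directly.
EMIT_CMD="clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc $SRC"
PASS_CMD="opt -passes=deadargelim a.bc -o b.bc"
run $EMIT_CMD
run $PASS_CMD
```

Since these are developer-facing options, their names and defaults may change between releases, so it is worth checking them against the installed toolchain before relying on them.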
toddy wang via llvm-dev
2018-Jan-08 16:59 UTC
[llvm-dev] Relationship between clang, opt and llc
On Mon, Jan 8, 2018 at 11:53 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:> > > 2018-01-08 8:41 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>: > >> Hi Medhi, >> >> It seems -mllvm does not work as expected. Anything wrong? >> >> [twang15 at c92 temp]$ clang++ -O3 -mllvm *-deadargelim* LULESH.cc >> clang (LLVM option parsing): Unknown command line argument >> '-deadargelim'. Try: 'clang (LLVM option parsing) -help' >> clang (LLVM option parsing): Did you mean '-regalloc'? >> >> [twang15 at c92 temp]$ clang++ -O3 -mllvm *deadargelim* LULESH.cc >> clang (LLVM option parsing): Unknown command line argument >> 'deadargelim'. Try: 'clang (LLVM option parsing) -help' >> > > You can't schedule passes this way, only set parameters > like -unroll-threshold=<uint> etc. > > Where can I find options like -unroll-threshold=<uint>? I cannot find itin either opt -help or clang -help.> -- > Mehdi > > >> >> -Tao >> >> On Mon, Jan 8, 2018 at 11:12 AM, Mehdi AMINI <joker.eph at gmail.com> wrote: >> >>> >>> >>> 2018-01-07 23:16 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>: >>> >>>> -mllvm <value> Additional arguments to forward to LLVM's >>>> option processing >>>> >>>> This is dumped by clang. I am not sure what I am supposed to put as >>>> value in order to tune unrolling/inlining threshold. >>>> >>> >>> >>> As the help says, this is used to pass argument to LLVM itself. If you >>> remember you earlier question about setA (clang options) and setC (opt >>> options), this allows to reach setC from the clang command line. >>> Any option that you see in the output of `opt --help` can be set from >>> clang using `-mllvm`. Same caveat as I mentioned before: these aren't >>> supposed to be end-user options. >>> >>> -- >>> Mehdi >>> >>> >>> >>>> >>>> On Mon, Jan 8, 2018 at 2:02 AM, Sean Silva <chisophugis at gmail.com> >>>> wrote: >>>> >>>>> For the types of things that you are looking for, you may just want to >>>>> try a bunch of -mllvm options. 
You can tune inlining and unrolling >>>>> threshold like that, for example. >>>>> >>>>> On Jan 7, 2018 10:33 PM, "toddy wang via llvm-dev" < >>>>> llvm-dev at lists.llvm.org> wrote: >>>>> >>>>>> Hi Mehdi, >>>>>> >>>>>> Now we have 5 pipelines. (In addition to the first 3, which I have >>>>>> described in detail above, please refer my latest reply for details) >>>>>> 1. clang + opt + gold >>>>>> 2. clang + opt + lld >>>>>> 3. clang + GNU ld/ gold /lld >>>>>> >>>>>> 4. clang + opt + llc + clang >>>>>> clang -emit-llvm -O1 -Xclang -disable-llvm-passes for c/c++ to .bc >>>>>> generation and minimal front-end optimization >>>>>> opt for single bc file optimization >>>>>> llc single bc file to obj file generation and back-end optimization >>>>>> (no link-time optimization is possible, since llc works on 1 bc file at a >>>>>> time) >>>>>> clang again for linking all obj file to generate final executable. (although >>>>>> in principle there can be a link-time optimization even with all obj files, >>>>>> it requires a lot of work and is machine-dependent. This may also be the >>>>>> reason why modern compilers like LLVM/GCC/ICC, etc performs LTO not at obj >>>>>> level. But, obj level may yield extra benefit even LTO at intermediate >>>>>> level has been applied by compilers, because obj level can see more >>>>>> information.) >>>>>> >>>>>> `clang -Ox` + `opt -Ox` + `llc -Ox` is too coarse-grain. >>>>>> >>>>>> 5. Modify clang to align with GCC/ICC so that many tunables are >>>>>> exposed at clang command line. Not sure how much work is needed, but at >>>>>> least requires an overall understanding of compiler internals, which can be >>>>>> gradually figured out. >>>>>> >>>>>> I believe 5 is interesting, but 2 may be good enough. More >>>>>> experiments are needed before decision is made. 
>>>>>> >>>>>> On Mon, Jan 8, 2018 at 12:56 AM, Mehdi AMINI <joker.eph at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Hi Toddy, >>>>>>> >>>>>>> You can achieve what you're looking for with a pipeline based on >>>>>>> `clang -Ox` + `opt -Ox` + `llc -Ox` (or lld instead of llc), but this won't >>>>>>> be guarantee'd to be well supported across releases of the compiler. >>>>>>> >>>>>>> Otherwise, if there are some performance-releated (or not...) >>>>>>> command line options you think clang is missing / would benefit, I invite >>>>>>> you to propose adding them to cfe-dev at lists.llvm.org and submit a >>>>>>> patch! >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> -- >>>>>>> Mehdi >>>>>>> >>>>>>> 2018-01-07 21:03 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>: >>>>>>> >>>>>>>> Thanks a lot, Mehdi. >>>>>>>> >>>>>>>> For GCC, there are around 190 optimization flags exposed as >>>>>>>> command-line options. >>>>>>>> For Clang/LLVM, the number is 40, and many important optimization >>>>>>>> parameters are not exposed at all, such as loop unrolling factor, inline >>>>>>>> function size parameters. >>>>>>>> >>>>>>>> I understand there is very different idea for whether or not expose >>>>>>>> many flags to end-user. >>>>>>>> Personally, I believe it is a reasonable to keep end-user >>>>>>>> controllable command-line options minimal for user-friendliness. >>>>>>>> However, for users who care a lot for a tiny bit performance >>>>>>>> improvement, like HPC community, it may be better to expose as many >>>>>>>> fine-grained tunables in the form of command line options as possible. Or, >>>>>>>> at least there should be a way to achieve this fairly easy. >>>>>>>> >>>>>>>> I am curious about which way is the best for my purpose. >>>>>>>> Please see my latest reply for 3 possible fine-grained optimization >>>>>>>> pipeline. >>>>>>>> Looking forward to more discussions. >>>>>>>> >>>>>>>> Thanks a lot! 
>>>>>>>> >>>>>>>> On Sun, Jan 7, 2018 at 10:11 AM, Mehdi AMINI <joker.eph at gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> "SetC" options are LLVM cl::opt options, they are intended for >>>>>>>>> LLVM developer and experimentations. If a settings is intended to be used >>>>>>>>> as a public API, there is usually a programmatic way of setting it in LLVM. >>>>>>>>> "SetA" is what clang as a C++ compiler exposes to the end-user. >>>>>>>>> Internally clang will (most of the time) use one or multiple LLVM APIs to >>>>>>>>> propagate a settings. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Mehdi >>>>>>>>> >>>>>>>>> 2018-01-05 17:41 GMT-08:00 toddy wang via llvm-dev < >>>>>>>>> llvm-dev at lists.llvm.org>: >>>>>>>>> >>>>>>>>>> Craig, thanks a lot! >>>>>>>>>> >>>>>>>>>> I'm actually confused by clang optimization flags. >>>>>>>>>> >>>>>>>>>> If I run clang -help, it will show many optimizations (denoted as >>>>>>>>>> set A) and non-optimization options (denoted as set B). >>>>>>>>>> If I run llvm-as < /dev/null | opt -O0/1/2/3 -disable-output >>>>>>>>>> -debug-pass=Arguments, it also shows many optimization flags (denote as set >>>>>>>>>> C). >>>>>>>>>> >>>>>>>>>> There are many options in set C while not in set A, and also >>>>>>>>>> options in set A but not in set C. >>>>>>>>>> >>>>>>>>>> The general question is: what is the relationship between set A >>>>>>>>>> and set C, at the same optimization level O0/O1/O2/O3? >>>>>>>>>> Another question is: how to specify an option in set C as a clang >>>>>>>>>> command line option, if it is not in A? >>>>>>>>>> >>>>>>>>>> For example, -dse is in set C but not in set A, how can I specify >>>>>>>>>> it as a clang option? Or simply I cannot do that. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, Jan 5, 2018 at 7:55 PM, Craig Topper < >>>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> O0 didn't start applying optnone until r304127 in May 2017 which >>>>>>>>>>> is after the 4.0 family was branched. So only 5.0, 6.0, and trunk have that >>>>>>>>>>> behavior. Commit message copied below >>>>>>>>>>> >>>>>>>>>>> Author: Mehdi Amini <joker.eph at gmail.com> >>>>>>>>>>> >>>>>>>>>>> Date: Mon May 29 05:38:20 2017 +0000 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> IRGen: Add optnone attribute on function during O0 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Amongst other, this will help LTO to correctly handle/honor >>>>>>>>>>> files >>>>>>>>>>> >>>>>>>>>>> compiled with O0, helping debugging failures. >>>>>>>>>>> >>>>>>>>>>> It also seems in line with how we handle other options, >>>>>>>>>>> like how >>>>>>>>>>> >>>>>>>>>>> -fnoinline adds the appropriate attribute as well. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Differential Revision: https://reviews.llvm.org/D28404 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> ~Craig >>>>>>>>>>> >>>>>>>>>>> On Fri, Jan 5, 2018 at 4:49 PM, toddy wang < >>>>>>>>>>> wenwangtoddy at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> @Zhaopei, thanks for the clarification. >>>>>>>>>>>> >>>>>>>>>>>> @Craig and @Michael, for clang 4.0.1, -Xclang >>>>>>>>>>>> -disable-O0-optnone gives the following error message. From which >>>>>>>>>>>> version -disable-O0-optnone gets supported? 
>>>>>>>>>>>> >>>>>>>>>>>> [twang15 at c89 temp]$ clang++ -O0 -Xclang -disable-O0-optnone >>>>>>>>>>>> -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc >>>>>>>>>>>> error: unknown argument: '-disable-O0-optnone' >>>>>>>>>>>> >>>>>>>>>>>> [twang15 at c89 temp]$ clang++ --version >>>>>>>>>>>> clang version 4.0.1 (tags/RELEASE_401/final) >>>>>>>>>>>> Target: x86_64-unknown-linux-gnu >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jan 5, 2018 at 4:45 PM, Craig Topper < >>>>>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> If you pass -O0 to clang, most functions will be tagged with >>>>>>>>>>>>> an optnone function attribute that will prevent opt and llc even if you >>>>>>>>>>>>> pass -O3 to opt and llc. This is the mostly likely cause for the slow down >>>>>>>>>>>>> in 2. >>>>>>>>>>>>> >>>>>>>>>>>>> You can disable the optnone function attribute behavior by >>>>>>>>>>>>> passing "-Xclang -disable-O0-optnone" to clang >>>>>>>>>>>>> >>>>>>>>>>>>> ~Craig >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Jan 5, 2018 at 1:19 PM, toddy wang via llvm-dev < >>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> I tried the following on LULESH1.0 serial version ( >>>>>>>>>>>>>> https://codesign.llnl.gov/lulesh/LULESH.cc) >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. clang++ -O3 LULESH.cc; ./a.out 20 >>>>>>>>>>>>>> Runtime: 9.487353 second >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2. clang++ -O0 -Xclang -disable-llvm-passes -c -emit-llvm -o >>>>>>>>>>>>>> a.bc LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o ; >>>>>>>>>>>>>> clang++ b.o -o b.out; ./b.out 20 >>>>>>>>>>>>>> Runtime: 24.15 seconds >>>>>>>>>>>>>> >>>>>>>>>>>>>> 3. 
clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o >>>>>>>>>>>>>> a.bc LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o ; >>>>>>>>>>>>>> clang++ b.o -o b.out; ./b.out 20 >>>>>>>>>>>>>> Runtime: 9.53 seconds >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1 and 3 have almost the same performance, while 2 is >>>>>>>>>>>>>> significantly worse, while I expect 1, 2 ,3 should have trivial difference. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Is this a wrong expectation? >>>>>>>>>>>>>> >>>>>>>>>>>>>> @Peizhao, what did you try in your last post? >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Apr 11, 2017 at 12:15 PM, Peizhao Ou via llvm-dev < >>>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> It's really nice of you pointing out the -Xclang option, it >>>>>>>>>>>>>>> makes things much easier. I really appreciate your help! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>> Peizhao >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Mon, Apr 10, 2017 at 10:12 PM, Mehdi Amini < >>>>>>>>>>>>>>> mehdi.amini at apple.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Apr 10, 2017, at 5:21 PM, Craig Topper via llvm-dev < >>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> clang -O0 does not disable all optimization passes modify >>>>>>>>>>>>>>>> the IR.; In fact it causes most functions to get tagged with noinline to >>>>>>>>>>>>>>>> prevent inlinining >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> It also disable lifetime instrinsics emission and TBAA, etc. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> What you really need to do is >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> clang -O3 -c emit-llvm -o source.bc -v >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Find the -cc1 command line from that output. Execute that >>>>>>>>>>>>>>>> command with --disable-llvm-passes. leave the -O3 and everything else. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> That’s a bit complicated: CC1 options can be passed through >>>>>>>>>>>>>>>> with -Xclang, for example here just adding to the regular clang invocation >>>>>>>>>>>>>>>> ` -Xclang -disable-llvm-passes` >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> — >>>>>>>>>>>>>>>> Mehdi >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> You should be able to feed the output from that command to >>>>>>>>>>>>>>>> opt/llc and get consistent results. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ~Craig >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Mon, Apr 10, 2017 at 4:57 PM, Peizhao Ou via llvm-dev < >>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi folks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am wondering about the relationship clang, opt and llc. >>>>>>>>>>>>>>>>> I understand that this has been asked, e.g., >>>>>>>>>>>>>>>>> http://stackoverflow.com/questions/40350990/relationsh >>>>>>>>>>>>>>>>> ip-between-clang-opt-llc-and-llvm-linker. Sorry for >>>>>>>>>>>>>>>>> posting a similar question again, but I still have something that hasn't >>>>>>>>>>>>>>>>> been resolved yet. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> More specifically I am wondering about the following two >>>>>>>>>>>>>>>>> approaches compiling optimized executable: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 1. clang -O3 -c source.c -o source.o >>>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>>> clang a.o b.o c.o ... -o executable >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2. clang -O0 -c -emit-llvm -o source.bc >>>>>>>>>>>>>>>>> opt -O3 source.bc -o source.bc >>>>>>>>>>>>>>>>> llc -O3 -filetype=obj source.bc -o source.o >>>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>>> clang a.o b.o c.o ... 
-o executable >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I took a look at the source code of the clang tool and the >>>>>>>>>>>>>>>>> opt tool, they both seem to use the PassManagerBuilder::populateModulePassManager() >>>>>>>>>>>>>>>>> and PassManagerBuilder::populateFunctionPassManager() >>>>>>>>>>>>>>>>> functions to add passes to their optimization pipeline; and for the >>>>>>>>>>>>>>>>> backend, the clang and llc both use the addPassesToEmitFile() function to >>>>>>>>>>>>>>>>> generate object code. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So presumably the above two approaches to generating >>>>>>>>>>>>>>>>> optimized executable file should do the same thing. However, I am seeing >>>>>>>>>>>>>>>>> that the second approach is around 2% slower than the first approach (which >>>>>>>>>>>>>>>>> is the way developers usually use) pretty consistently. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Can anyone point me to the reasons why this happens? Or >>>>>>>>>>>>>>>>> even correct my wrong understanding of the relationship between these two >>>>>>>>>>>>>>>>> approaches? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> PS: I used the -debug-pass=Structure option to print out >>>>>>>>>>>>>>>>> the passes, they seem the same except that the first approach has an extra >>>>>>>>>>>>>>>>> pass called "-add-discriminator", but I don't think that's the reason. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Peizhao >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>> LLVM Developers mailing list >>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org >>>>>>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
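The distinction drawn in the reply above (`-mllvm` forwards option values to LLVM but does not schedule passes) can be sketched as follows. This is only an illustration under stated assumptions: the `run` helper, the function names, and the threshold value 150 are hypothetical, and the script prints the commands by default rather than executing them. The `opt -deadargelim` spelling is the legacy form used in this thread; newer `opt` releases may expect `-passes=deadargelim` instead.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires clang++ and opt on PATH).
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

SRC=LULESH.cc   # source file used in the thread

# (a) Tune a parameter of a pass that is already part of the -O3 pipeline.
#     150 is an arbitrary illustrative value, not a recommendation.
tune_unroll() {
  run clang++ -O3 -mllvm -unroll-threshold=150 "$SRC"
}

# (b) Run one specific pass: emit unoptimized bitcode, then call opt directly.
#     Legacy pass flag as in the thread; an assumption for your opt version.
run_single_pass() {
  run clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc "$SRC"
  run opt -deadargelim a.bc -o b.bc
}

tune_unroll
run_single_pass
```

Case (a) stays inside a normal clang invocation; case (b) is the route to take when a particular pass has to be run, since clang rejects pass names given to `-mllvm`.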
Mehdi AMINI via llvm-dev
2018-Jan-08 17:48 UTC
[llvm-dev] Relationship between clang, opt and llc
2018-01-08 8:59 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>:
> On Mon, Jan 8, 2018 at 11:53 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>> 2018-01-08 8:41 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>:
>>
>>> Hi Medhi,
>>>
>>> It seems -mllvm does not work as expected. Anything wrong?
>>>
>>> [twang15 at c92 temp]$ clang++ -O3 -mllvm -deadargelim LULESH.cc
>>> clang (LLVM option parsing): Unknown command line argument
>>> '-deadargelim'. Try: 'clang (LLVM option parsing) -help'
>>> clang (LLVM option parsing): Did you mean '-regalloc'?
>>>
>>> [twang15 at c92 temp]$ clang++ -O3 -mllvm deadargelim LULESH.cc
>>> clang (LLVM option parsing): Unknown command line argument
>>> 'deadargelim'. Try: 'clang (LLVM option parsing) -help'
>>>
>>
>> You can't schedule passes this way, only set parameters
>> like -unroll-threshold=<uint> etc.
>>
> Where can I find options like -unroll-threshold=<uint>? I cannot find it
> in either opt -help or clang -help.
>

This one shows up in `opt --help-hidden`. Otherwise, look in the source code for each transformation. (Remember when I mentioned that these are intended for LLVM developers and are not end-user facing?)

--
Mehdi

> -- >> Mehdi >> >> >>> >>> -Tao >>> >>> On Mon, Jan 8, 2018 at 11:12 AM, Mehdi AMINI <joker.eph at gmail.com> >>> wrote: >>> >>>> >>>> >>>> 2018-01-07 23:16 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>: >>>> >>>>> -mllvm <value> Additional arguments to forward to LLVM's >>>>> option processing >>>>> >>>>> This is dumped by clang. I am not sure what I am supposed to put as >>>>> value in order to tune unrolling/inlining threshold. >>>>> >>>> >>>> >>>> As the help says, this is used to pass argument to LLVM itself. If you >>>> remember you earlier question about setA (clang options) and setC (opt >>>> options), this allows to reach setC from the clang command line. >>>> Any option that you see in the output of `opt --help` can be set from >>>> clang using `-mllvm`.
Same caveat as I mentioned before: these aren't >>>> supposed to be end-user options. >>>> >>>> -- >>>> Mehdi >>>> >>>> >>>> >>>>> >>>>> On Mon, Jan 8, 2018 at 2:02 AM, Sean Silva <chisophugis at gmail.com> >>>>> wrote: >>>>> >>>>>> For the types of things that you are looking for, you may just want >>>>>> to try a bunch of -mllvm options. You can tune inlining and unrolling >>>>>> threshold like that, for example. >>>>>> >>>>>> On Jan 7, 2018 10:33 PM, "toddy wang via llvm-dev" < >>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>> >>>>>>> Hi Mehdi, >>>>>>> >>>>>>> Now we have 5 pipelines. (In addition to the first 3, which I have >>>>>>> described in detail above, please refer my latest reply for details) >>>>>>> 1. clang + opt + gold >>>>>>> 2. clang + opt + lld >>>>>>> 3. clang + GNU ld/ gold /lld >>>>>>> >>>>>>> 4. clang + opt + llc + clang >>>>>>> clang -emit-llvm -O1 -Xclang -disable-llvm-passes for c/c++ to .bc >>>>>>> generation and minimal front-end optimization >>>>>>> opt for single bc file optimization >>>>>>> llc single bc file to obj file generation and back-end optimization >>>>>>> (no link-time optimization is possible, since llc works on 1 bc file at a >>>>>>> time) >>>>>>> clang again for linking all obj file to generate final executable. (although >>>>>>> in principle there can be a link-time optimization even with all obj files, >>>>>>> it requires a lot of work and is machine-dependent. This may also be the >>>>>>> reason why modern compilers like LLVM/GCC/ICC, etc performs LTO not at obj >>>>>>> level. But, obj level may yield extra benefit even LTO at intermediate >>>>>>> level has been applied by compilers, because obj level can see more >>>>>>> information.) >>>>>>> >>>>>>> `clang -Ox` + `opt -Ox` + `llc -Ox` is too coarse-grain. >>>>>>> >>>>>>> 5. Modify clang to align with GCC/ICC so that many tunables are >>>>>>> exposed at clang command line. 
Not sure how much work is needed, but at >>>>>>> least requires an overall understanding of compiler internals, which can be >>>>>>> gradually figured out. >>>>>>> >>>>>>> I believe 5 is interesting, but 2 may be good enough. More >>>>>>> experiments are needed before decision is made. >>>>>>> >>>>>>> On Mon, Jan 8, 2018 at 12:56 AM, Mehdi AMINI <joker.eph at gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi Toddy, >>>>>>>> >>>>>>>> You can achieve what you're looking for with a pipeline based on >>>>>>>> `clang -Ox` + `opt -Ox` + `llc -Ox` (or lld instead of llc), but this won't >>>>>>>> be guarantee'd to be well supported across releases of the compiler. >>>>>>>> >>>>>>>> Otherwise, if there are some performance-releated (or not...) >>>>>>>> command line options you think clang is missing / would benefit, I invite >>>>>>>> you to propose adding them to cfe-dev at lists.llvm.org and submit a >>>>>>>> patch! >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> -- >>>>>>>> Mehdi >>>>>>>> >>>>>>>> 2018-01-07 21:03 GMT-08:00 toddy wang <wenwangtoddy at gmail.com>: >>>>>>>> >>>>>>>>> Thanks a lot, Mehdi. >>>>>>>>> >>>>>>>>> For GCC, there are around 190 optimization flags exposed as >>>>>>>>> command-line options. >>>>>>>>> For Clang/LLVM, the number is 40, and many important optimization >>>>>>>>> parameters are not exposed at all, such as loop unrolling factor, inline >>>>>>>>> function size parameters. >>>>>>>>> >>>>>>>>> I understand there is very different idea for whether or not >>>>>>>>> expose many flags to end-user. >>>>>>>>> Personally, I believe it is a reasonable to keep end-user >>>>>>>>> controllable command-line options minimal for user-friendliness. >>>>>>>>> However, for users who care a lot for a tiny bit performance >>>>>>>>> improvement, like HPC community, it may be better to expose as many >>>>>>>>> fine-grained tunables in the form of command line options as possible. Or, >>>>>>>>> at least there should be a way to achieve this fairly easy. 
>>>>>>>>> >>>>>>>>> I am curious about which way is the best for my purpose. >>>>>>>>> Please see my latest reply for 3 possible fine-grained >>>>>>>>> optimization pipeline. >>>>>>>>> Looking forward to more discussions. >>>>>>>>> >>>>>>>>> Thanks a lot! >>>>>>>>> >>>>>>>>> On Sun, Jan 7, 2018 at 10:11 AM, Mehdi AMINI <joker.eph at gmail.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> "SetC" options are LLVM cl::opt options, they are intended for >>>>>>>>>> LLVM developer and experimentations. If a settings is intended to be used >>>>>>>>>> as a public API, there is usually a programmatic way of setting it in LLVM. >>>>>>>>>> "SetA" is what clang as a C++ compiler exposes to the end-user. >>>>>>>>>> Internally clang will (most of the time) use one or multiple LLVM APIs to >>>>>>>>>> propagate a settings. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Mehdi >>>>>>>>>> >>>>>>>>>> 2018-01-05 17:41 GMT-08:00 toddy wang via llvm-dev < >>>>>>>>>> llvm-dev at lists.llvm.org>: >>>>>>>>>> >>>>>>>>>>> Craig, thanks a lot! >>>>>>>>>>> >>>>>>>>>>> I'm actually confused by clang optimization flags. >>>>>>>>>>> >>>>>>>>>>> If I run clang -help, it will show many optimizations (denoted >>>>>>>>>>> as set A) and non-optimization options (denoted as set B). >>>>>>>>>>> If I run llvm-as < /dev/null | opt -O0/1/2/3 -disable-output >>>>>>>>>>> -debug-pass=Arguments, it also shows many optimization flags (denote as set >>>>>>>>>>> C). >>>>>>>>>>> >>>>>>>>>>> There are many options in set C while not in set A, and also >>>>>>>>>>> options in set A but not in set C. >>>>>>>>>>> >>>>>>>>>>> The general question is: what is the relationship between set A >>>>>>>>>>> and set C, at the same optimization level O0/O1/O2/O3? >>>>>>>>>>> Another question is: how to specify an option in set C as a >>>>>>>>>>> clang command line option, if it is not in A? 
>>>>>>>>>>> >>>>>>>>>>> For example, -dse is in set C but not in set A, how can I >>>>>>>>>>> specify it as a clang option? Or simply I cannot do that. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, Jan 5, 2018 at 7:55 PM, Craig Topper < >>>>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> O0 didn't start applying optnone until r304127 in May 2017 >>>>>>>>>>>> which is after the 4.0 family was branched. So only 5.0, 6.0, and trunk >>>>>>>>>>>> have that behavior. Commit message copied below >>>>>>>>>>>> >>>>>>>>>>>> Author: Mehdi Amini <joker.eph at gmail.com> >>>>>>>>>>>> >>>>>>>>>>>> Date: Mon May 29 05:38:20 2017 +0000 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> IRGen: Add optnone attribute on function during O0 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Amongst other, this will help LTO to correctly >>>>>>>>>>>> handle/honor files >>>>>>>>>>>> >>>>>>>>>>>> compiled with O0, helping debugging failures. >>>>>>>>>>>> >>>>>>>>>>>> It also seems in line with how we handle other options, >>>>>>>>>>>> like how >>>>>>>>>>>> >>>>>>>>>>>> -fnoinline adds the appropriate attribute as well. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Differential Revision: https://reviews.llvm.org/D28404 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> ~Craig >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jan 5, 2018 at 4:49 PM, toddy wang < >>>>>>>>>>>> wenwangtoddy at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> @Zhaopei, thanks for the clarification. >>>>>>>>>>>>> >>>>>>>>>>>>> @Craig and @Michael, for clang 4.0.1, -Xclang >>>>>>>>>>>>> -disable-O0-optnone gives the following error message. From which >>>>>>>>>>>>> version -disable-O0-optnone gets supported? 
>>>>>>>>>>>>> >>>>>>>>>>>>> [twang15 at c89 temp]$ clang++ -O0 -Xclang -disable-O0-optnone >>>>>>>>>>>>> -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc >>>>>>>>>>>>> error: unknown argument: '-disable-O0-optnone' >>>>>>>>>>>>> >>>>>>>>>>>>> [twang15 at c89 temp]$ clang++ --version >>>>>>>>>>>>> clang version 4.0.1 (tags/RELEASE_401/final) >>>>>>>>>>>>> Target: x86_64-unknown-linux-gnu >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Jan 5, 2018 at 4:45 PM, Craig Topper < >>>>>>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> If you pass -O0 to clang, most functions will be tagged with >>>>>>>>>>>>>> an optnone function attribute that will prevent opt and llc even if you >>>>>>>>>>>>>> pass -O3 to opt and llc. This is the mostly likely cause for the slow down >>>>>>>>>>>>>> in 2. >>>>>>>>>>>>>> >>>>>>>>>>>>>> You can disable the optnone function attribute behavior by >>>>>>>>>>>>>> passing "-Xclang -disable-O0-optnone" to clang >>>>>>>>>>>>>> >>>>>>>>>>>>>> ~Craig >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jan 5, 2018 at 1:19 PM, toddy wang via llvm-dev < >>>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I tried the following on LULESH1.0 serial version ( >>>>>>>>>>>>>>> https://codesign.llnl.gov/lulesh/LULESH.cc) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. clang++ -O3 LULESH.cc; ./a.out 20 >>>>>>>>>>>>>>> Runtime: 9.487353 second >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2. clang++ -O0 -Xclang -disable-llvm-passes -c -emit-llvm -o >>>>>>>>>>>>>>> a.bc LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o ; >>>>>>>>>>>>>>> clang++ b.o -o b.out; ./b.out 20 >>>>>>>>>>>>>>> Runtime: 24.15 seconds >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 3. 
clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o >>>>>>>>>>>>>>> a.bc LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o ; >>>>>>>>>>>>>>> clang++ b.o -o b.out; ./b.out 20 >>>>>>>>>>>>>>> Runtime: 9.53 seconds >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1 and 3 have almost the same performance, while 2 is >>>>>>>>>>>>>>> significantly worse, while I expect 1, 2 ,3 should have trivial difference. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Is this a wrong expectation? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> @Peizhao, what did you try in your last post? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Apr 11, 2017 at 12:15 PM, Peizhao Ou via llvm-dev < >>>>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> It's really nice of you pointing out the -Xclang option, it >>>>>>>>>>>>>>>> makes things much easier. I really appreciate your help! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>> Peizhao >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Mon, Apr 10, 2017 at 10:12 PM, Mehdi Amini < >>>>>>>>>>>>>>>> mehdi.amini at apple.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Apr 10, 2017, at 5:21 PM, Craig Topper via llvm-dev < >>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> clang -O0 does not disable all optimization passes modify >>>>>>>>>>>>>>>>> the IR.; In fact it causes most functions to get tagged with noinline to >>>>>>>>>>>>>>>>> prevent inlinining >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> It also disable lifetime instrinsics emission and TBAA, >>>>>>>>>>>>>>>>> etc. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> What you really need to do is >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> clang -O3 -c emit-llvm -o source.bc -v >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Find the -cc1 command line from that output. Execute that >>>>>>>>>>>>>>>>> command with --disable-llvm-passes. leave the -O3 and everything else. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> That’s a bit complicated: CC1 options can be passed >>>>>>>>>>>>>>>>> through with -Xclang, for example here just adding to the regular clang >>>>>>>>>>>>>>>>> invocation ` -Xclang -disable-llvm-passes` >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> — >>>>>>>>>>>>>>>>> Mehdi >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> You should be able to feed the output from that command to >>>>>>>>>>>>>>>>> opt/llc and get consistent results. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ~Craig >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, Apr 10, 2017 at 4:57 PM, Peizhao Ou via llvm-dev < >>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi folks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I am wondering about the relationship clang, opt and llc. >>>>>>>>>>>>>>>>>> I understand that this has been asked, e.g., >>>>>>>>>>>>>>>>>> http://stackoverflow.com/questions/40350990/relationsh >>>>>>>>>>>>>>>>>> ip-between-clang-opt-llc-and-llvm-linker. Sorry for >>>>>>>>>>>>>>>>>> posting a similar question again, but I still have something that hasn't >>>>>>>>>>>>>>>>>> been resolved yet. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> More specifically I am wondering about the following two >>>>>>>>>>>>>>>>>> approaches compiling optimized executable: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 1. clang -O3 -c source.c -o source.o >>>>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>>>> clang a.o b.o c.o ... -o executable >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2. clang -O0 -c -emit-llvm -o source.bc >>>>>>>>>>>>>>>>>> opt -O3 source.bc -o source.bc >>>>>>>>>>>>>>>>>> llc -O3 -filetype=obj source.bc -o source.o >>>>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>>>> clang a.o b.o c.o ... 
-o executable >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I took a look at the source code of the clang tool and >>>>>>>>>>>>>>>>>> the opt tool, they both seem to use the PassManagerBuilder::populateModulePassManager() >>>>>>>>>>>>>>>>>> and PassManagerBuilder::populateFunctionPassManager() >>>>>>>>>>>>>>>>>> functions to add passes to their optimization pipeline; and for the >>>>>>>>>>>>>>>>>> backend, the clang and llc both use the addPassesToEmitFile() function to >>>>>>>>>>>>>>>>>> generate object code. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So presumably the above two approaches to generating >>>>>>>>>>>>>>>>>> optimized executable file should do the same thing. However, I am seeing >>>>>>>>>>>>>>>>>> that the second approach is around 2% slower than the first approach (which >>>>>>>>>>>>>>>>>> is the way developers usually use) pretty consistently. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can anyone point me to the reasons why this happens? Or >>>>>>>>>>>>>>>>>> even correct my wrong understanding of the relationship between these two >>>>>>>>>>>>>>>>>> approaches? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PS: I used the -debug-pass=Structure option to print out >>>>>>>>>>>>>>>>>> the passes, they seem the same except that the first approach has an extra >>>>>>>>>>>>>>>>>> pass called "-add-discriminator", but I don't think that's the reason. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Peizhao
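The lookup described in this reply (`opt --help-hidden`) can be sketched as a small filter. This is a minimal sketch, assuming `opt` is installed and on `PATH`; the helper name `filter_opts` and the keyword `unroll` are illustrative only and do not come from the thread.

```shell
#!/bin/sh
# Keep only the help lines that mention the given keyword (case-insensitive).
# Reads help text on stdin so it can be used with any tool's help output.
filter_opts() {
  grep -i -e "$1"
}

# Query the real tool only if it is installed; otherwise say so and move on.
if command -v opt >/dev/null 2>&1; then
  opt --help-hidden | filter_opts unroll
else
  echo "opt not found on PATH; skipping lookup"
fi
```

An option found this way, such as `-unroll-threshold=<uint>`, can then be forwarded from clang with `-mllvm`. The same caveat from the thread applies: these are developer-facing options and are not guaranteed to be stable across releases.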