Bhatu via llvm-dev
2019-May-13 07:49 UTC
[llvm-dev] Is it possible to reproduce the result of opt -O3 manually?
I think this has to do with how the pass manager is populated when we give -O3 vs. when we give particular pass names. Some passes have multiple createXYZPass() methods that accept arguments too. These methods call non-default pass constructors, which in turn cause the passes to behave differently. E.g.:

  Pass *llvm::createLICMPass() { return new LegacyLICMPass(); }
  Pass *llvm::createLICMPass(unsigned LicmMssaOptCap,
                             unsigned LicmMssaNoAccForPromotionCap) {
    return new LegacyLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap);
  }

or

  Pass *createLoopVectorizePass() { return new LoopVectorize(); }
  Pass *createLoopVectorizePass(bool InterleaveOnlyWhenForced,
                                bool VectorizeOnlyWhenForced) {
    return new LoopVectorize(InterleaveOnlyWhenForced, VectorizeOnlyWhenForced);
  }

When we give pass names, opt calls the default constructor (e.g. LoopVectorize()), whereas when we give -O3 it can call a different version. You can check PassManagerBuilder.cpp (populateModulePassManager, populateFunctionPassManager) to see where the different versions are added. Those must be the points in the pipeline where the IR starts differing.

On Sat, May 11, 2019 at 10:09 PM Mehdi AMINI via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Hi,
>
> On Thu, May 9, 2019 at 5:20 PM Rahim Mammadli via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> Dear developers,
>>
>> I am trying to reproduce the results of applying opt -O3 to a source file in the form of LLVM IR. I want to get the same IR by manually ordering the passes used by O3 and passing them to opt.
>>
>> To illustrate what I am doing on an example, as input I use the linpack benchmark from the LLVM test suite [1]:
>>
>> 1. First I produce the intermediate representation using clang:
>>    clang -O3 -Xclang -disable-llvm-optzns -emit-llvm -S linpack-pc.c -o linpack-pc.ll
>>
>> 2. Then I use opt to optimize the IR:
>>    opt -S -O3 -o linpack-pc-3.ll linpack-pc.ll
>>
>> Now my goal is to produce IR identical to linpack-pc-3.ll by passing a sequence of optimizations to opt. To get the list of optimizations used by opt for O3, I run:
>>    opt -O3 -disable-output -debug-pass=Arguments linpack-pc.ll
>>
>> which produces (shortened to avoid wasting space):
>>    Pass Arguments: -tti -targetlibinfo -tbaa ...
>>    Pass Arguments: -targetlibinfo -tti -tbaa ...
>>    Pass Arguments: -domtree
>>
>> So apparently there are three sequences of passes applied to the IR as part of O3. I wasn't able to reproduce the same IR as linpack-pc-3.ll using these passes: I tried applying them sequentially as well as concatenating them into a single sequence passed to opt. Neither produced the needed output. Moreover, the performance of the final executable degraded by about 35%. I'm using LLVM 3.8 and my OS is Ubuntu 16.04.
>>
>> [1] https://github.com/llvm/llvm-test-suite/blob/master/SingleSource/Benchmarks/Linpack/linpack-pc.c
>>
>> I'd very much appreciate it if you could help me with this. Thank you.
>
> Your approach seems sensible to me. I have usually debugged this kind of problem by piping the output of the two runs with `-print-after-all` to files and diffing them to find out where the difference first appears.
>
> --
> Mehdi

--
Regards
Bhatu
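For reference, a minimal sketch of the PassManagerBuilder route described above, roughly what opt does internally for -O3 with the legacy pass manager. This is an illustration rather than opt's actual code: the set of builder fields set here and the inliner-creation overload used vary between LLVM releases, so treat those details as assumptions.

  #include "llvm/IR/LegacyPassManager.h"
  #include "llvm/Transforms/IPO.h"                    // createFunctionInliningPass
  #include "llvm/Transforms/IPO/PassManagerBuilder.h"

  // Populate the legacy pass managers the way -O3 does: PassManagerBuilder,
  // not the caller, decides which createXYZPass() overloads get used.
  void buildO3LikePipeline(llvm::legacy::PassManager &MPM,
                           llvm::legacy::FunctionPassManager &FPM) {
    llvm::PassManagerBuilder Builder;
    Builder.OptLevel = 3;        // forwarded to the OptLevel-aware passes
    Builder.SizeLevel = 0;
    Builder.LoopVectorize = true;
    Builder.SLPVectorize = true;
    // opt forwards OptLevel/SizeLevel to the inliner here; the exact
    // overload differs between releases, so this sketch uses the default.
    Builder.Inliner = llvm::createFunctionInliningPass();
    Builder.populateFunctionPassManager(FPM);
    Builder.populateModulePassManager(MPM);
  }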
Rahim Mammadli via llvm-dev
2019-May-14 13:42 UTC
[llvm-dev] Is it possible to reproduce the result of opt -O3 manually?
Dear Bhatu and Mehdi,

Thank you for your helpful suggestions.

Indeed, the -print-after-all option is quite useful for finding the differences between the two approaches. As Bhatu pointed out, when -O3 is given as an option, some of the passes are initialized differently than when a list of passes is simply handed to opt. Some passes even take OptLevel as a parameter, which influences their behavior; e.g. the following passes depend directly on the OptLevel parameter:
* createSimpleLoopUnrollPass()
* createLoopUnrollAndJamPass()
* createLoopUnrollPass()

However, the only way of setting the OptLevel and SizeLevel parameters through the CLI is the -O<n> optimization flag, which in turn results in all of the associated passes being added to the list of passes. Is there a way of setting OptLevel = 3 and SizeLevel = 0 without running all the passes associated with -O3? That would probably get me closest to replicating O3, but I cannot figure out how to do it.

Kind Regards,

Rahim Mammadli
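From the opt command line there is indeed no such knob, but in a small driver the same parameterized constructors that the -O3 pipeline uses can be called directly, so a hand-picked subset of passes can still see OptLevel = 3. A rough sketch, assuming the OptLevel parameter of createLoopUnrollPass and the header locations from newer LLVM releases (they may not match 3.8):

  #include "llvm/IR/LegacyPassManager.h"
  #include "llvm/Transforms/Scalar.h"      // createLoopUnrollPass
  #include "llvm/Transforms/Vectorize.h"   // createLoopVectorizePass

  // Hand-pick a few passes, but construct them the way the -O3 pipeline
  // would, instead of relying on the default constructors that opt's
  // -passname spellings fall back to.
  void addHandPickedPasses(llvm::legacy::PassManager &MPM) {
    MPM.add(llvm::createLoopUnrollPass(/*OptLevel=*/3));
    MPM.add(llvm::createLoopVectorizePass(/*InterleaveOnlyWhenForced=*/false,
                                          /*VectorizeOnlyWhenForced=*/false));
  }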
Bhatu via llvm-dev
2019-May-15 14:36 UTC
[llvm-dev] Is it possible to reproduce the result of opt -O3 manually?
Hi,

I don't see a clean solution to achieve this. A hacky way would be to modify the default constructors of those individual passes to replicate the behavior they have under O3. If the same pass is scheduled multiple times in different ways, then you could create new proxy passes (loop-unroll-simple, loop-unroll-O3) that run the same pass with just different parameters.

(+ llvm-dev)

--
Regards
Bhatu
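To make the proxy-pass idea concrete: in a private tree one could add thin wrapper creation functions that fix the parameters, one per configuration. The sketch below shows only the creation functions; actual command-line registration needs the usual INITIALIZE_PASS boilerplate, and the OptLevel parameter of createLoopUnrollPass is an assumption based on newer LLVM releases rather than 3.8.

  #include "llvm/Pass.h"
  #include "llvm/Transforms/Scalar.h"   // createLoopUnrollPass

  // One entry point per configuration, so each variant can be scheduled
  // explicitly (e.g. as hypothetical -loop-unroll-simple / -loop-unroll-O3).
  llvm::Pass *createLoopUnrollSimpleProxyPass() {
    return llvm::createLoopUnrollPass(/*OptLevel=*/2);
  }
  llvm::Pass *createLoopUnrollO3ProxyPass() {
    return llvm::createLoopUnrollPass(/*OptLevel=*/3);
  }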