Alexandros Lamprineas via llvm-dev
2018-Sep-26 18:52 UTC
[llvm-dev] [RFC] Delay Phi Operand Folding
Hi llvm-dev,

I was thinking of making InstCombine capable of turning the folding of Phi operands on and off. I am not entirely sure this is the right approach. My motivation is similar to https://reviews.llvm.org/D50723.

I noticed that InstCombine tries to fold Phi operands and can therefore sink instructions. When this happens too early in the pass pipeline, it can prevent other optimization passes, such as GVN-Hoist, from being beneficial.

I was looking at a particular case where InstCombine would sink a chain of instructions from both sides of a diamond structure, except for two geps at the beginning of the chain. Each side block had exactly the same sequence of instructions, but with the geps in reverse order. The Phis corresponding to the geps were then turned into selects by SimplifyCFG, resulting in a sub-optimal sequence with the selects happening too early and presumably creating stalls in the execution pipeline:

                 IR just before InstCombine
                    /                  \
    Use(...(Use(load(gep 0))))    Use(...(Use(load(gep 1))))
    Use(...(Use(load(gep 1))))    Use(...(Use(load(gep 0))))
                    \                  /

                 IR after all passes

    Use(...(Use(load(select(gep 0, gep 1)))))
    Use(...(Use(load(select(gep 1, gep 0)))))

Applying my patch from https://reviews.llvm.org/D52568 allows GVN-Hoist to hoist the whole chain:

    select(Use(...(Use(load(gep 0)))), Use(...(Use(load(gep 1)))))
    select(Use(...(Use(load(gep 1)))), Use(...(Use(load(gep 0)))))

I am posting some performance numbers targeting Cortex-A57 (AArch64), reported by LNT for llvm-test-suite, spec2000, and spec2006 at -O3, using a recent LLVM trunk revision with my patch applied.
Performance improvements in execution time:
-------------------------------------------------------------
SingleSource/Benchmarks/Shootout-C++/Shootout-C++-ackermann                       -38.33%
MultiSource/Benchmarks/Olden/em3d/em3d                                            -12.41%
SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt   -7.02%
SingleSource/Benchmarks/Shootout-C++/Shootout-C++-methcall                         -7.02%
MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32                       -4.19%
MultiSource/Applications/hexxagon/hexxagon                                         -4.17%
SingleSource/Benchmarks/Misc/whetstone                                             -2.62%
SingleSource/Benchmarks/Misc-C++/stepanov_container                                -2.56%
MultiSource/Benchmarks/ASC_Sequoia/IRSmk/IRSmk                                     -2.41%
External/SPEC/CINT2006/401.bzip2/401.bzip2                                         -2.18%
External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk                                 -2.11%
External/SPEC/CINT2000/253.perlbmk/253.perlbmk                                     -2.03%
SingleSource/Benchmarks/Polybench/stencils/fdtd-apml/fdtd-apml                     -2.03%
SingleSource/Benchmarks/Misc/evalloop                                              -1.88%
SingleSource/Benchmarks/Misc/himenobmtxpa                                          -1.75%
SingleSource/Benchmarks/Misc/oourafft                                              -1.64%
MultiSource/Benchmarks/Ptrdist/bc/bc                                               -1.57%
MultiSource/Applications/spiff/spiff                                               -1.54%
SingleSource/Benchmarks/CoyoteBench/fftbench                                       -1.47%
MultiSource/Applications/aha/aha                                                   -1.41%
MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1                              -1.35%
MultiSource/Benchmarks/llubenchmark/llu                                            -1.25%
MultiSource/Benchmarks/mafft/pairlocalalign                                        -1.24%
External/SPEC/CINT2006/471.omnetpp/471.omnetpp                                     -1.23%
External/SPEC/CFP2000/183.equake/183.equake                                        -1.10%
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm                 -1.03%

Performance regressions in execution time:
----------------------------------------------------------
MultiSource/Benchmarks/Ptrdist/anagram/anagram                                      1.19%
MultiSource/Benchmarks/TSVC/Expansion-dbl/Expansion-dbl                             1.87%
MultiSource/Applications/lambda-0.1.3/lambda                                        4.06%

Regards,
Alexandros
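
For readers who want something concrete, the diamond described above can be approximated at the source level. The following is only a hypothetical C++ sketch: the names chain, diamond, selectEarly, selectLate and same are made up for illustration, chain is an arbitrary stand-in for the Use(...(Use(...))) sequence, and none of this is taken from the actual test case or from either patch. selectEarly mirrors the "select the address, then load" shape and selectLate mirrors the "compute both chains, then select" shape. A compiler is free to produce either form (or something else) from any of these functions, so the sketch only shows that the shapes compute the same values.

```cpp
#include <cstdint>

struct Pair {
  std::uint64_t first;
  std::uint64_t second;
};

// Hypothetical stand-in for the Use(...(Use(x))) chain in the post.
static inline std::uint64_t chain(std::uint64_t v) {
  return (v * 3 + 7) ^ (v >> 2);
}

// Diamond: both arms run the same chain, with the two element
// addresses (the "geps") used in opposite order.
Pair diamond(const std::uint64_t *p, bool c) {
  std::uint64_t a, b;
  if (c) {
    a = chain(p[0]);
    b = chain(p[1]);
  } else {
    a = chain(p[1]);
    b = chain(p[0]);
  }
  return {a, b};
}

// Shape corresponding to load(select(gep 0, gep 1)): the address is
// selected first, so the load and the chain depend on the select.
Pair selectEarly(const std::uint64_t *p, bool c) {
  const std::uint64_t *pa = c ? &p[0] : &p[1];
  const std::uint64_t *pb = c ? &p[1] : &p[0];
  return {chain(*pa), chain(*pb)};
}

// Shape corresponding to select(Use(load(gep 0)), Use(load(gep 1))):
// each chain is computed once and only the final results are selected.
Pair selectLate(const std::uint64_t *p, bool c) {
  const std::uint64_t x = chain(p[0]);
  const std::uint64_t y = chain(p[1]);
  return {c ? x : y, c ? y : x};
}

// Helper for comparing results of the three shapes.
bool same(Pair l, Pair r) {
  return l.first == r.first && l.second == r.second;
}
```

Note that both arms of the diamond already load both elements, so computing the two chains unconditionally in selectLate does not introduce any memory access that the original code would not perform.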