Alexandros Lamprineas via llvm-dev
2018-Sep-26 18:52 UTC
[llvm-dev] [RFC] Delay Phi Operand Folding
Hi llvm-dev,
I am thinking of making InstCombine capable of turning the folding of Phi
operands on and off. I am not entirely sure this is the right approach. My
motivation is similar to https://reviews.llvm.org/D50723. I noticed that
InstCombine tries to fold Phi operands and can therefore sink instructions. When
this happens too early in the pass pipeline, it can prevent other optimization
passes like GVN-Hoist from being beneficial. I was looking at a particular case
where InstCombine would sink a chain of instructions from both sides of a
diamond structure, except for two geps at the beginning of the chain. Each side
block had the exact same sequence of instructions, with the geps in reverse
order. The Phis corresponding to the geps were then turned into selects by
SimplifyCFG, resulting in a sub-optimal sequence with the selects happening too
early and presumably creating stalls in the execution pipeline:
IR just before InstCombine:

                        /          \
Use(...(Use(load(gep 0))))          Use(...(Use(load(gep 1))))
Use(...(Use(load(gep 1))))          Use(...(Use(load(gep 0))))
                        \          /

IR after all passes:

Use(...(Use(load(select(gep 0, gep 1)))))
Use(...(Use(load(select(gep 1, gep 0)))))
Applying my patch from https://reviews.llvm.org/D52568 allows GVN-Hoist to hoist
the whole chain:
select(Use(...(Use(load(gep 0)))), Use(...(Use(load(gep 1)))))
select(Use(...(Use(load(gep 1)))), Use(...(Use(load(gep 0)))))
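
To make the three shapes above concrete, here is a rough source-level sketch in
C++. It is only an illustration, not code from the patch or from the benchmark I
was looking at: the function names and the body of chain() are hypothetical
stand-ins for the Use(...(Use(load(...)))) chain, and the actual IR a compiler
produces for these functions may differ.

```cpp
#include <cassert>

// Hypothetical stand-in for the Use(...(Use(load(...)))) chain in the diagrams.
static int chain(int loaded) { return loaded * 3 + 1; }

struct Result {
  int first;
  int second;
};

// Shape before InstCombine: each side of the diamond runs both chains,
// with the two element addresses (gep 0 / gep 1) in opposite order.
Result diamond(const int *a, bool cond) {
  Result r;
  if (cond) {
    r.first = chain(a[0]);
    r.second = chain(a[1]);
  } else {
    r.first = chain(a[1]);
    r.second = chain(a[0]);
  }
  return r;
}

// Shape after sinking: the addresses are selected first, so every load and
// every step of the chain depends on the result of a select.
Result sunk(const int *a, bool cond) {
  const int *p = cond ? &a[0] : &a[1];
  const int *q = cond ? &a[1] : &a[0];
  return Result{chain(*p), chain(*q)};
}

// Shape after hoisting: both chains run unconditionally and the selects
// happen last, on the already computed values.
Result hoisted(const int *a, bool cond) {
  int x = chain(a[0]);
  int y = chain(a[1]);
  return Result{cond ? x : y, cond ? y : x};
}
```

All three functions compute the same values; they differ only in where the
selection sits relative to the loads and the chain, which is the scheduling
difference described above.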
Below are performance numbers reported by LNT for llvm-test-suite, spec2000, and
spec2006 at -O3, targeting Cortex-A57 (AArch64), using a recent LLVM trunk
revision with my patch applied.
Performance improvements in execution time:
-------------------------------------------------------------
SingleSource/Benchmarks/Shootout-C++/Shootout-C++-ackermann -38.33%
MultiSource/Benchmarks/Olden/em3d/em3d -12.41%
SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt -7.02%
SingleSource/Benchmarks/Shootout-C++/Shootout-C++-methcall -7.02%
MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 -4.19%
MultiSource/Applications/hexxagon/hexxagon -4.17%
SingleSource/Benchmarks/Misc/whetstone -2.62%
SingleSource/Benchmarks/Misc-C++/stepanov_container -2.56%
MultiSource/Benchmarks/ASC_Sequoia/IRSmk/IRSmk -2.41%
External/SPEC/CINT2006/401.bzip2/401.bzip2 -2.18%
External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk -2.11%
External/SPEC/CINT2000/253.perlbmk/253.perlbmk -2.03%
SingleSource/Benchmarks/Polybench/stencils/fdtd-apml/fdtd-apml -2.03%
SingleSource/Benchmarks/Misc/evalloop -1.88%
SingleSource/Benchmarks/Misc/himenobmtxpa -1.75%
SingleSource/Benchmarks/Misc/oourafft -1.64%
MultiSource/Benchmarks/Ptrdist/bc/bc -1.57%
MultiSource/Applications/spiff/spiff -1.54%
SingleSource/Benchmarks/CoyoteBench/fftbench -1.47%
MultiSource/Applications/aha/aha -1.41%
MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1 -1.35%
MultiSource/Benchmarks/llubenchmark/llu -1.25%
MultiSource/Benchmarks/mafft/pairlocalalign -1.24%
External/SPEC/CINT2006/471.omnetpp/471.omnetpp -1.23%
External/SPEC/CFP2000/183.equake/183.equake -1.10%
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm -1.03%
Performance regressions in execution time:
----------------------------------------------------------
MultiSource/Benchmarks/Ptrdist/anagram/anagram 1.19%
MultiSource/Benchmarks/TSVC/Expansion-dbl/Expansion-dbl 1.87%
MultiSource/Applications/lambda-0.1.3/lambda 4.06%
Regards,
Alexandros