via llvm-dev
2016-Mar-29 13:20 UTC
[llvm-dev] [CodeGen] CodeSize - TailMerging and BlockPlacement
Hi everyone,

The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. I did an experiment of adding an additional BranchFolding and BlockPlacement after the existing BlockPlacement (i.e., -block-placement -branch-folder -block-placement) targeting an AArch64 backend. Thousands of instructions can be removed in the SPEC2006 benchmarks, as shown below. I checked the binaries and did not find any increase in unwanted instructions. The change causes no noticeable regression in any benchmark and sometimes yields a small improvement (1%-3%).

Benchmark        Change in instruction count
473.astar            -7
401.bzip2          -110
403.gcc         -13,006
445.gobmk        -1,716
464.h264ref        -684
456.hmmer          -391
462.libquantum       -4
429.mcf              -4
471.omnetpp      -1,980
400.perlbench    -4,176
458.sjeng          -338
450.soplex         -395
483.xalancbmk    -4,183
447.dealII         -186
433.milc            -34
444.namd           -104
453.povray       -1,785
482.sphinx3        -112

I propose to factor out the relevant code from BranchFolding into a utility and call it from BlockPlacement whenever the layout is changed. This is similar to D18226 and D18411, which factor tail duplication into a utility and call it from BlockPlacement.

Any thoughts, advice, or comments?

Best,

Haicheng
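For readers unfamiliar with tail merging, the following is a minimal sketch of the core idea only, not LLVM's BranchFolding code. It assumes a toy model in which a block is a list of instruction strings; the names Block, commonTailLength, and instructionsSaved are hypothetical, and the "one extra branch" cost is a simplifying assumption (a block that falls through into the shared tail would not need it).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine basic block (hypothetical, not an LLVM type).
using Block = std::vector<std::string>;

// Count the identical trailing instructions shared by two blocks.
std::size_t commonTailLength(const Block &a, const Block &b) {
  std::size_t n = 0;
  while (n < a.size() && n < b.size() &&
         a[a.size() - 1 - n] == b[b.size() - 1 - n])
    ++n;
  return n;
}

// Estimate instructions removed by keeping the shared tail once.
// Assumption: the block that loses its tail needs one new branch to
// reach the kept copy, so a tail of length 1 saves nothing.
std::size_t instructionsSaved(const Block &a, const Block &b) {
  std::size_t tail = commonTailLength(a, b);
  return tail > 1 ? tail - 1 : 0;
}
```

In this model, two blocks ending in the same "str x0, [x2]" and "ret" share a tail of length 2, so merging would save one instruction. Which blocks end up as merge candidates depends on the layout, which is why a changed layout can expose new opportunities.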
Hal Finkel via llvm-dev
2016-Apr-22 00:34 UTC
[llvm-dev] [CodeGen] CodeSize - TailMerging and BlockPlacement
> I propose to factor out the relevant code from BranchFolding into a
> utility, and call it from BlockPlacement whenever the layout is
> changed. It is similar to D18226 and D18411 which factor tail
> duplication into a utility and call it from BlockPlacement. Any
> thoughts, advice, or comments?

Did anyone yet provide you with feedback on this? It seems like a reasonable plan to me.

 -Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
Haicheng Wu via llvm-dev
2016-Apr-22 14:56 UTC
[llvm-dev] [CodeGen] CodeSize - TailMerging and BlockPlacement
Hi Hal,

Thank you for your interest. I haven't received any feedback, but I have already started the work. Some early cleanup work has already been committed, and more will be posted for review soon.

Best,

Haicheng