Dilan Manatunga via llvm-dev
2017-Jul-09 23:36 UTC
[llvm-dev] Loop branching inefficiencies in Backend output
Hi, I am working on a custom backend, and I am trying to figure out how to deal with some branching inefficiencies in my output code, and the best way to fix it. So, let's say I am compiling a small function that takes the sum of an array. int loop(int* array, int n) { int ret = 0; for (int i = 0; i < n; i++) { ret += array[i]; } return ret; } The problem I am having is that in the generated code, we got something along these lines. loop: <loop-code> branch cond exit-block branch loop exit-block: <exit-block code> Now, due to the basic block placement, that could be simplified to the following, removing an extra branch instruction. loop: <loop-code> branch !cond loop exit-block: <exit-block code> So, I decided to investigate to see if this problem occurs in other backends (basically to see if I am missing some implementation in my backend). I found that you can see this issue with the same code for the NVPTX backend, but not for the x86 or ARM backends. When I looked at the debug output for the x86 backend though, I can't figure out how they realize to get rid of that branch. Up to the last point that MachineInstrs are printed (the rewrite virtual registers pass), the MachineInstrs still show the two jump paradigm. But in the final output, it somehow becomes one branch case. I know one solution is for me to add a pass after basic block placement to change these instructions, but I am wondering if there is a recommended way to do this. Especially as this seems like a common problem, where an existing generic pass would have some interface a backend should support to allow branch re-writes to create the single branch case. Thanks, -Dilan -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170709/dfe6d347/attachment.html>
Friedman, Eli via llvm-dev
2017-Jul-10 20:01 UTC
[llvm-dev] Loop branching inefficiencies in Backend output
On 7/9/2017 4:36 PM, Dilan Manatunga via llvm-dev wrote:> When I looked at the debug output for the x86 backend though, I can't > figure out how they realize to get rid of that branch. Up to the last > point that MachineInstrs are printed (the rewrite virtual registers > pass), the MachineInstrs still show the two jump paradigm. But in the > final output, it somehow becomes one branch case.The "-print-after-all" flag is often useful to figure out what various passes do.> > I know one solution is for me to add a pass after basic block > placement to change these instructions, but I am wondering if there is > a recommended way to do this. Especially as this seems like a common > problem, where an existing generic pass would have some interface a > backend should support to allow branch re-writes to create the single > branch case.This transform already exists as target-independent code in BranchFolding.cpp. Is TargetInstrInfo::reverseBranchCondition implemented for your target? -Eli -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Dilan Manatunga via llvm-dev
2017-Jul-10 20:41 UTC
[llvm-dev] Loop branching inefficiencies in Backend output
Thanks! That is exactly what I was looking for. I was basing most of my stuff on NVPTX and apparently they don't bother implementing the reverseBranchCondition. Also, thanks for print-after-all flag. Should be useful for me. Thanks again for all the help. -Dilan On Mon, Jul 10, 2017 at 1:01 PM Friedman, Eli <efriedma at codeaurora.org> wrote:> On 7/9/2017 4:36 PM, Dilan Manatunga via llvm-dev wrote: > > When I looked at the debug output for the x86 backend though, I can't > > figure out how they realize to get rid of that branch. Up to the last > > point that MachineInstrs are printed (the rewrite virtual registers > > pass), the MachineInstrs still show the two jump paradigm. But in the > > final output, it somehow becomes one branch case. > > The "-print-after-all" flag is often useful to figure out what various > passes do. > > > > > I know one solution is for me to add a pass after basic block > > placement to change these instructions, but I am wondering if there is > > a recommended way to do this. Especially as this seems like a common > > problem, where an existing generic pass would have some interface a > > backend should support to allow branch re-writes to create the single > > branch case. > > This transform already exists as target-independent code in > BranchFolding.cpp. Is TargetInstrInfo::reverseBranchCondition > implemented for your target? > > -Eli > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux > Foundation Collaborative Project > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170710/01691c87/attachment.html>