thr3ads.net - llvm dev - [llvm-dev] Loop branching inefficiencies in Backend output [Jul 2017]

Dilan Manatunga via llvm-dev

2017-Jul-09 23:36 UTC

[llvm-dev] Loop branching inefficiencies in Backend output

Hi,

I am working on a custom backend, and I am trying to figure out the best
way to fix some branching inefficiencies in my output code.

So, let's say I am compiling a small function that takes the sum of an
array.
int loop(int* array, int n) {
  int ret = 0;
  for (int i = 0; i < n; i++) {
    ret += array[i];
  }
  return ret;
}

The problem I am having is that the generated code looks something along
these lines:

loop:
    <loop-code>
    branch cond exit-block
    branch loop
exit-block:
    <exit-block code>

Now, because basic block placement puts exit-block directly after the loop,
the conditional branch could be inverted and the exit could simply fall
through, removing an extra branch instruction:

loop:
    <loop-code>
    branch !cond loop
exit-block:
    <exit-block code>
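
To make the rewrite above concrete, here is a minimal sketch of it on a toy block model. Everything in it (ToyBlock, foldBranchToFallThrough, countBranches) is a hypothetical name invented for illustration and is not part of LLVM; it only assumes a block that ends in a conditional branch followed by an unconditional one.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a block's terminators (assumption: not LLVM's
// MachineBasicBlock, just enough state to show the rewrite).
struct ToyBlock {
  std::string Name;
  bool HasCondBranch = false;   // "branch cond CondTarget"
  bool CondNegated = false;     // true means the branch tests !cond
  std::string CondTarget;
  bool HasUncondBranch = false; // "branch UncondTarget"
  std::string UncondTarget;
};

// If the block ends in "branch cond X; branch Y" and X is the next block in
// layout, rewrite it to "branch !cond Y" and let control fall through to X.
// Returns true if the block was changed.
bool foldBranchToFallThrough(ToyBlock &BB, const std::string &LayoutSuccessor) {
  assert(!LayoutSuccessor.empty() && "need a layout successor to fall into");
  if (!BB.HasCondBranch || !BB.HasUncondBranch)
    return false;
  if (BB.CondTarget != LayoutSuccessor)
    return false;
  BB.CondNegated = !BB.CondNegated;  // cond becomes !cond
  BB.CondTarget = BB.UncondTarget;   // branch back to the loop instead
  BB.HasUncondBranch = false;        // the second branch is no longer needed
  BB.UncondTarget.clear();
  return true;
}

// Number of branch instructions the block would emit.
int countBranches(const ToyBlock &BB) {
  return (BB.HasCondBranch ? 1 : 0) + (BB.HasUncondBranch ? 1 : 0);
}
```

For the loop block above, with exit-block as the layout successor, the sketch goes from two branches to one and leaves a negated conditional branch targeting loop.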

So, I decided to investigate to see if this problem occurs in other
backends (basically to see if I am missing some implementation in my
backend). I found that you can see this issue with the same code for the
NVPTX backend, but not for the x86 or ARM backends.

When I looked at the debug output for the x86 backend, though, I couldn't
figure out where that branch gets removed. Up to the last point at which
MachineInstrs are printed (the rewrite virtual registers pass), they still
show the two-jump pattern. But in the final output, it somehow becomes the
single-branch form.

I know one solution is for me to add a pass after basic block placement to
change these instructions, but I am wondering if there is a recommended way
to do this. This seems like a common problem, so I would expect an existing
generic pass to expose some interface that a backend implements to allow
branch rewrites into the single-branch form.

Thanks,
-Dilan

Friedman, Eli via llvm-dev

2017-Jul-10 20:01 UTC

[llvm-dev] Loop branching inefficiencies in Backend output

On 7/9/2017 4:36 PM, Dilan Manatunga via llvm-dev wrote:
> When I looked at the debug output for the x86 backend though, I can't 
> figure out how they realize to get rid of that branch. Up to the last 
> point that MachineInstrs are printed (the rewrite virtual registers 
> pass), the MachineInstrs still show the two jump paradigm. But in the 
> final output, it somehow becomes one branch case.
The "-print-after-all" flag is often useful to figure out what various
passes do.
>
> I know one solution is for me to add a pass after basic block 
> placement to change these instructions, but I am wondering if there is 
> a recommended way to do this. Especially as this seems like a common 
> problem, where an existing generic pass would have some interface a 
> backend should support to allow branch re-writes to create the single 
> branch case.
This transform already exists as target-independent code in 
BranchFolding.cpp.  Is TargetInstrInfo::reverseBranchCondition 
implemented for your target?
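
As a rough illustration of what such a hook does, here is a self-contained sketch. The assumption it relies on is the documented return convention of TargetInstrInfo::reverseBranchCondition: false means the condition was reversed in place, true means it could not be reversed (which is what the default implementation returns). The condition codes and the function name toyReverseBranchCondition are hypothetical, and the condition list is modeled as a std::vector<int> rather than LLVM's operand list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical condition codes for an imaginary target.
enum ToyCondCode { COND_EQ, COND_NE, COND_LT, COND_GE };

// Toy model of a target's reverseBranchCondition hook.
// Cond[0] holds the condition code, as a real target might store it in the
// first operand of its condition list (an assumption for this sketch).
// Returns false if Cond was reversed in place, true if it cannot be reversed.
bool toyReverseBranchCondition(std::vector<int> &Cond) {
  if (Cond.empty())
    return true; // nothing to reverse
  switch (Cond[0]) {
  case COND_EQ: Cond[0] = COND_NE; return false;
  case COND_NE: Cond[0] = COND_EQ; return false;
  case COND_LT: Cond[0] = COND_GE; return false;
  case COND_GE: Cond[0] = COND_LT; return false;
  default:
    return true; // unknown code: report failure and leave Cond untouched
  }
}
```

A target that always reports failure here gives the generic code no way to invert "branch cond exit-block", so the trailing "branch loop" has to stay.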

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

Dilan Manatunga via llvm-dev

2017-Jul-10 20:41 UTC

[llvm-dev] Loop branching inefficiencies in Backend output

Thanks! That is exactly what I was looking for. I based most of my backend
on NVPTX, and apparently it doesn't implement reverseBranchCondition.

Also, thanks for the -print-after-all flag. It should be useful for me.

Thanks again for all the help.
-Dilan

