thr3ads.net - llvm dev - [llvm-dev] Instcombine-code-sinking increases the value’s live range [Sep 2021]


Chuang-Yu Cheng via llvm-dev

2021-Sep-29 08:52 UTC

[llvm-dev] Instcombine-code-sinking increases the value’s live range

Hi,

By default, InstCombinePass tries to sink an instruction into a
successor basic block when possible, so that the instruction isn’t
executed on a path where its result isn’t needed. But doing that can
also increase the live ranges of other values. For example:

entry:
  ..
  %6 = load float, ..
  %s.0 = load float, ..
  %mul22 = fmul float %6, %s.0
  %add23 = fadd float %mul22, zeroinitializer

  %7 = load float, ..
  %s.1 = load float, ..
  %mul26 = fmul float %7, %s.1
  %add27 = fadd float %add23, %mul26

  ..
  br i1 %cmp, label %cleanup, label %if.end1

if.end1:
  %15 = load float, ..
  %add67 = fadd float %add27, %15
  store float %add67, ..
  br label %cleanup

cleanup:
  return


In the original input, only %add27 is live across the branch, but after
InstCombine with instcombine-code-sinking=true (the default), %6, %s.0,
%7 and %s.1 all have longer live ranges instead.

entry:
  ..
  %6 = load float, ..
  %s.0 = load float, ..

  %7 = load float, ..
  %s.1 = load float, ..

  ..
  br i1 %cmp, label %cleanup, label %if.end1

if.end1:
  %mul22 = fmul float %6, %s.0
  %add23 = fadd float %mul22, zeroinitializer

  %mul26 = fmul float %7, %s.1
  %add27 = fadd float %add23, %mul26

  %15 = load float, ..
  %add67 = fadd float %add27, %15
  store float %add67, ..
  br label %cleanup

cleanup:
  return

The issue we see is that our customized register allocator keeps values
such as %6, %s.0, %7 and %s.1 in registers for a long time.

My questions are:

Does LLVM expect the backend's instruction scheduler and register
allocator to handle this properly?

Can this be solved by llvm’s GlobalISel?

Thank you!
CY

Chuang-Yu Cheng via llvm-dev

2021-Oct-13 09:20 UTC


Answering my own question :P

The original input pattern is as below:

```c
local memory
for (...) {
  a function (has side effect) which copies from global to local memory
  access data in local memory and do compute
}

if (...)
  return;
store the computed result back.
```

If the for loop is fully unrolled, and the computing part is sunk into
the basic block which stores the computed result back, then the backend
compiler needs to find somewhere (registers or memory) to keep the
copied data until that block.

I've tested with aarch64 and amdgcn. With this test pattern, both
targets spill the data to memory.

If the for loop copies the data directly instead of calling a copy
function, both targets generate better basic block layouts (aarch64
via the "Machine code sinking (machine-sink)" pass, amdgcn via the
"Code sinking (sink)" pass).


Amara Emerson via llvm-dev

2021-Oct-14 05:02 UTC


> On Sep 29, 2021, at 1:52 AM, Chuang-Yu Cheng via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Can this be solved by llvm’s GlobalISel?

GlobalISel’s function-scope optimization doesn’t really help in these cases
unless the target can somehow fold expressions into simpler instructions. If
that’s not possible, the generated code should be fairly similar to that of
SelectionDAG.
