thr3ads.net - llvm dev - [llvm-dev] [RFC] New pass: LoopExitValues [Sep 2015]


Steve King via llvm-dev

2015-Aug-31 17:16 UTC

[llvm-dev] [RFC] New pass: LoopExitValues

Hello LLVM,
This is a proposal for a new pass that improves performance and code
size in some nested loop situations.  The pass is target
independent.

From the description in the file header:

This optimization finds loop exit values reevaluated after the loop
execution and replaces them by the corresponding exit values if they
are available. Such sequences can arise after the
SimplifyIndVals+LoopStrengthReduce passes. This pass should be run
after LoopStrengthReduce.
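
To make the description concrete, here is a minimal source-level C sketch of the pattern. It is not taken from the patch, and the pass itself works on LLVM IR rather than C; the names fill_recompute, fill_reuse, check_fill_variants, Start and Count are hypothetical and chosen only for illustration. The first function re-evaluates a value after the loop that the loop's own induction variable already holds at exit; the second reuses that exit value.

```c
#include <assert.h>

/* Hypothetical example: End is re-evaluated after the loop even though
   Idx already holds the same value when the loop exits. */
unsigned fill_recompute(unsigned *Buf, unsigned Start, unsigned Count) {
  unsigned Idx = Start;
  for (unsigned I = 0; I < Count; ++I)
    Buf[Idx++] = I;
  unsigned End = Start + Count; /* equals Idx at loop exit */
  return End;
}

/* Same function with the loop exit value reused instead of recomputed. */
unsigned fill_reuse(unsigned *Buf, unsigned Start, unsigned Count) {
  unsigned Idx = Start;
  for (unsigned I = 0; I < Count; ++I)
    Buf[Idx++] = I;
  return Idx;
}

/* Asserts that both variants return the same value and write the same
   elements. Requires Start + Count <= 16. */
void check_fill_variants(unsigned Start, unsigned Count) {
  unsigned A[16] = {0}, B[16] = {0};
  assert(Start + Count <= 16);
  unsigned EndA = fill_recompute(A, Start, Count);
  unsigned EndB = fill_reuse(B, Start, Count);
  assert(EndA == EndB);
  for (unsigned I = 0; I < 16; ++I)
    assert(A[I] == B[I]);
}
```

Calling check_fill_variants(3, 5) exercises both versions on the same input and checks that they agree.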

A former colleague created this pass back in LLVM 2.9 and we've been
using it ever since.  I've done some light refactoring and
modernization.

This pass broke 4 existing tests that were sensitive to generated
code.  I've corrected all these, but please give them special
scrutiny.

The patch is available here: http://reviews.llvm.org/D12494

Please advise.

Regards,
-steve

Jake VanAdrighem via llvm-dev

2015-Sep-01 00:52 UTC

Do you have some specific performance measurements?

Jake

On Mon, Aug 31, 2015 at 10:16 AM, Steve King via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hello LLVM,
> This is a proposal for a new pass that improves performance and code
> size in some nested loop situations.  The pass is target independent.
> From the description in the file header:
>
> This optimization finds loop exit values reevaluated after the loop
> execution and replaces them by the corresponding exit values if they
> are available. Such sequences can arise after the
> SimplifyIndVals+LoopStrengthReduce passes. This pass should be run
> after LoopStrengthReduce.
>
> A former colleague created this pass back in LLVM 2.9 and we've been
> using it ever since.  I've done some light refactoring and
> modernization.
>
> This pass broke 4 existing tests that were sensitive to generated
> code.  I've corrected all these, but please give them special
> scrutiny.
>
> The patch is available here: http://reviews.llvm.org/D12494
>
> Please advise.
>
> Regards,
> -steve

Steve King via llvm-dev

2015-Sep-01 18:06 UTC

On Mon, Aug 31, 2015 at 5:52 PM, Jake VanAdrighem
<jvanadrighem at gmail.com> wrote:
> Do you have some specific performance measurements?

Averaging 4 runs of 10000 iterations each of Coremark on my X86_64
desktop showed:

-O2 performance: +2.9% faster with the L.E.V. pass
-Os size: 1.5% smaller with the L.E.V. pass

In the case of Coremark, the benefit comes mainly from the matrix
portion of the benchmark, which uses nested loops.  Similarly, I used
a matrix multiplication for the regression test as shown below.  The
L.E.V. pass eliminated 4 instructions.

void matrix_mul(unsigned int Size, unsigned int *Dst,
                unsigned int *Src, unsigned int Val) {
  for (int Outer = 0; Outer < Size; ++Outer)
    for (int Inner = 0; Inner < Size; ++Inner)
      Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val;
}


With LoopExitValues
-------------------------------
matrix_mul:
    testl %edi, %edi
    je .LBB0_5
    xorl %r9d, %r9d
    xorl %r8d, %r8d
.LBB0_2:
    xorl %r11d, %r11d
.LBB0_3:
    movl %r9d, %r10d
    movl (%rdx,%r10,4), %eax
    imull %ecx, %eax
    movl %eax, (%rsi,%r10,4)
    incl %r11d
    incl %r9d
    cmpl %r11d, %edi
    jne .LBB0_3
    incl %r8d
    cmpl %edi, %r8d
    jne .LBB0_2
.LBB0_5:
    retq



Without LoopExitValues:
-----------------------------------
matrix_mul:
    pushq %rbx           # Eliminated by L.E.V. pass
.Ltmp0:
.Ltmp1:
    testl %edi, %edi
    je .LBB0_5
    xorl %r8d, %r8d
    xorl %r9d, %r9d
.LBB0_2:
    xorl %r10d, %r10d
    movl %r8d, %eax              # Eliminated by L.E.V. pass
.LBB0_3:
    movl %eax, %r11d
    movl (%rdx,%r11,4), %ebx
    imull %ecx, %ebx
    movl %ebx, (%rsi,%r11,4)
    incl %r10d
    incl %eax
    cmpl %r10d, %edi
    jne .LBB0_3
    incl %r9d
    addl %edi, %r8d            # Eliminated by L.E.V. pass
    cmpl %edi, %r9d
    jne .LBB0_2
.LBB0_5:
    popq %rbx                    # Eliminated by L.E.V. pass
    retq
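
Read at the source level, the difference between the two listings is roughly the following. This is a hand-written C sketch, not output of the pass; scale_recompute, scale_reuse, check_scale_variants, Base and Idx are names chosen here for illustration. In the first variant the row base is kept in its own variable, copied into the running index before each inner loop, and advanced after it (compare the movl %r8d, %eax and addl %edi, %r8d instructions). In the second, the value the running index holds when the inner loop exits is simply carried into the next row.

```c
#include <assert.h>

/* Row base tracked separately and advanced after every inner loop. */
void scale_recompute(unsigned Size, unsigned *Dst, const unsigned *Src,
                     unsigned Val) {
  unsigned Base = 0;
  for (unsigned Outer = 0; Outer < Size; ++Outer) {
    unsigned Idx = Base; /* copy of the row base */
    for (unsigned Inner = 0; Inner < Size; ++Inner, ++Idx)
      Dst[Idx] = Src[Idx] * Val;
    Base += Size; /* same value Idx holds at inner loop exit */
  }
}

/* Inner loop exit value of Idx reused as the next row's base. */
void scale_reuse(unsigned Size, unsigned *Dst, const unsigned *Src,
                 unsigned Val) {
  unsigned Idx = 0;
  for (unsigned Outer = 0; Outer < Size; ++Outer)
    for (unsigned Inner = 0; Inner < Size; ++Inner, ++Idx)
      Dst[Idx] = Src[Idx] * Val;
}

/* Asserts that both variants produce identical, correct results on a
   small 4x4 input. */
void check_scale_variants(unsigned Val) {
  enum { N = 4 };
  unsigned Src[N * N], A[N * N], B[N * N];
  for (unsigned I = 0; I < N * N; ++I) {
    Src[I] = I + 1;
    A[I] = 0;
    B[I] = 0;
  }
  scale_recompute(N, A, Src, Val);
  scale_reuse(N, B, Src, Val);
  for (unsigned I = 0; I < N * N; ++I) {
    assert(A[I] == B[I]);
    assert(A[I] == Src[I] * Val);
  }
}
```

The remaining two eliminated instructions, the pushq/popq of %rbx, do not appear in this sketch; in the listings they go away because the version with the pass no longer uses the callee-saved %rbx.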
