thr3ads.net - llvm dev - [llvm-dev] LLC does not do proper copy propagation (or copy coalescing) [Jun 2017]

Alex Susu via llvm-dev

2017-Jun-15 20:26 UTC

[llvm-dev] LLC does not do proper copy propagation (or copy coalescing)

Hello.
     Could you please tell me how I can get the back end (llc) to optimize the
following piece of assembly code, which llc generated:

       // NOTE: my processor accepts loops in the form of REPEAT(num_times)..END_REPEAT
       R0 = ...
       REPEAT(256)
         R5 = R0; // basically unnecessary reg. copy
         REPEAT(256)
           R10 = LS[R4];
           R2 = LS[R5];
           R4 = R4 + R1;
           R5 = R5 + R1; // should be R0 = R0 + R1
           R10 = R2 * R10;
           R3 = R3 + R10;
         END_REPEAT;
         REDUCE R3;
         R0 = R5; // basically unnecessary reg. copy
       END_REPEAT;

     The deficiencies in the code above are basically created by PHI
elimination, and by the fact that no proper register copy propagation is
applied to the machine instructions before register allocation.
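
     A minimal sketch, in Python, of why the two copies are redundant. It
simulates the loop nest as generated and the variant that increments R0
directly, then compares the registers that are live afterwards. Everything
concrete here is an assumption made for illustration: registers are modeled as
scalar integers rather than SIMD vectors, LS is a plain list, REDUCE R3 is left
out because it does not touch R0, R4 or R5, and the repeat counts and initial
values are hypothetical.

```python
def run(optimized, outer=4, inner=4):
    """Simulate the loop nest; 'optimized' drops the R5 <-> R0 copies."""
    LS = list(range(1, 65))  # hypothetical local-store contents
    r = {"R0": 0, "R1": 1, "R3": 0, "R4": 32, "R5": 0}  # hypothetical start state
    idx = "R0" if optimized else "R5"  # register used as the inner-loop index
    for _ in range(outer):
        if not optimized:
            r["R5"] = r["R0"]          # R5 = R0 (copy under discussion)
        for _ in range(inner):
            r10 = LS[r["R4"]]          # R10 = LS[R4]
            r2 = LS[r[idx]]            # R2 = LS[R5]  (or LS[R0])
            r["R4"] += r["R1"]         # R4 = R4 + R1
            r[idx] += r["R1"]          # R5 = R5 + R1 (or R0 = R0 + R1)
            r10 = r2 * r10             # R10 = R2 * R10
            r["R3"] += r10             # R3 = R3 + R10
        # REDUCE R3 is not modeled (assumption, see above)
        if not optimized:
            r["R0"] = r["R5"]          # R0 = R5 (copy under discussion)
    return r["R0"], r["R3"], r["R4"]


print(run(optimized=False) == run(optimized=True))
```

     Under these assumptions both variants end with the same R0, R3 and R4,
which is the behaviour a coalescer would have to preserve when merging R5
into R0.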

     I see 3 options to address my problem:
       - implement a case that handles this in PHI elimination
         (PHIElimination.cpp);
       - create a new pass that does copy propagation (based on DFA) on machine
         instructions before Register Allocation;
       - optimize copy coalescing, either the standard one or the one activated
         by -pbqp-coalescing in lib/CodeGen/RegAllocPBQP.cpp (there is also an
         email about PBQP coalescing at
         http://lists.llvm.org/pipermail/llvm-dev/2016-June/100523.html).

   Best regards,
     Alex

陳韋任 via llvm-dev

2017-Jun-15 20:41 UTC

>
>     I see 3 options to address my problem:
>       - implement a case that handles this in PHI elimination
> (PHIElimination.cpp);
>       - create a new pass that does copy propagation (based on DFA) on
> machine instructions before Register Allocation;
>       - optimize copy coalescing such as the standard one or the one
> activated by -pbqp-coalescing in lib/CodeGen/RegAllocPBQP.cpp (there is an
> email also about PBQP coalescing at
> http://lists.llvm.org/pipermail/llvm-dev/2016-June/100523.html).
>
Usually this is done by copy coalescing. Do you know why your copies cannot be
eliminated? Is your case not handled well by the existing copy coalescing
(RegisterCoalescer.cpp, for example)?

HTH,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Homepage: https://people.cs.nctu.edu.tw/~chenwj

Alex Susu via llvm-dev

2017-Jun-16 23:28 UTC

Hello.
     Wei-Ren, as I pointed out in my previous email, the piece of code below
has the deficiency that it uses register R5 where it could use R0. This happens
because in LLVM IR I created 2 variables, varIndexInner and varIndexOuter: I
have 2 loops, the variable has to be advanced in the inner loop, and I need to
preserve its value when going to the next iteration of the outer loop.
       // NOTE: my processor accepts loops in the form of REPEAT(num_times)..END_REPEAT
       R0 = ...
       REPEAT(256)
         R5 = R0; // basically unnecessary reg. copy
         REPEAT(256)
           R10 = LS[R4];
           R2 = LS[R5];
           R4 = R4 + R1;
           R5 = R5 + R1; // should be R0 = R0 + R1
           R10 = R2 * R10;
           R3 = R3 + R10;
         END_REPEAT;
         REDUCE R3;
         R0 = R5; // basically unnecessary reg. copy
       END_REPEAT;


     The reason RegisterCoalescer.cpp is not able to remove the copies I
mentioned is that R0 and R5 have interfering live intervals.

     I'm trying to implement a case in RegisterCoalescer.cpp that handles the
optimization I want, but it seems a bit complicated. (However, it seems more
natural to do a standard copy propagation with data-flow analysis on the
MachineBasicBlocks with virtual registers, after coming out of SSA form.
Muchnick's book from 1997 discusses this in detail in Section 12.5.)
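
     To make the idea concrete, here is a minimal sketch of copy propagation
driven by a forward "available copies" data-flow analysis. It is written in
Python on a toy three-address IR rather than on LLVM MachineInstrs, and the
CFG, register names and helper functions (transfer, available_copies,
propagate) are all hypothetical; it is a generic illustration of the technique,
not the listing from this thread and not an LLVM pass.

```python
def transfer(block, avail):
    """Apply one block's effect to a set of available copies (dst, src)."""
    avail = set(avail)
    for dst, op, srcs in block:
        # Redefining dst invalidates every copy that mentions it.
        avail = {(d, s) for d, s in avail if dst not in (d, s)}
        if op == "COPY" and srcs[0] != dst:
            avail.add((dst, srcs[0]))
    return avail


def available_copies(blocks, preds, entry):
    """Forward analysis; the meet over predecessors is set intersection."""
    universe = {(d, s[0]) for b in blocks.values() for d, op, s in b if op == "COPY"}
    out = {name: set(universe) for name in blocks}  # optimistic start
    inn = {name: set() for name in blocks}
    changed = True
    while changed:
        changed = False
        for name in blocks:
            ps = [] if name == entry else preds.get(name, [])
            new_in = set.intersection(*(out[p] for p in ps)) if ps else set()
            new_out = transfer(blocks[name], new_in)
            if new_in != inn[name] or new_out != out[name]:
                inn[name], out[name] = new_in, new_out
                changed = True
    return inn


def propagate(blocks, preds, entry):
    """Rewrite each use of a copied register to the copy's source."""
    inn = available_copies(blocks, preds, entry)
    result = {}
    for name, block in blocks.items():
        avail = set(inn[name])
        rewritten = []
        for dst, op, srcs in block:
            mapping = dict(sorted(avail))
            rewritten.append((dst, op, tuple(mapping.get(s, s) for s in srcs)))
            avail = transfer([(dst, op, srcs)], avail)
        result[name] = rewritten
    return result


# Hypothetical diamond-shaped CFG: 'right' redefines b, so the copy b = a
# is not available at 'join' and the use of b there must stay.
blocks = {
    "entry": [("b", "COPY", ("a",)), ("c", "ADD", ("b", "one"))],
    "left": [("d", "LOAD", ("b",))],
    "right": [("b", "ADD", ("b", "one"))],
    "join": [("e", "MUL", ("b", "c"))],
}
preds = {"left": ["entry"], "right": ["entry"], "join": ["left", "right"]}
result = propagate(blocks, preds, "entry")
for name, block in result.items():
    print(name, block)
```

     In this sketch the pass only rewrites uses; removing the copy
instructions that become dead afterwards would be left to a separate dead code
elimination step.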

     More exactly, the registers and copies involved in the above ASM code
(text copied from the stderr of llc) are:
       BB#0:
         vreg99 = 0 // IMPORTANT: this instruction is dead, and I guess that if
                    // it were DCE-ed, RegisterCoalescer.cpp would be able to
                    // optimize my code

       BB#1:
         vreg94 = some_data_offset

       BB#3:
         vreg99 = COPY vreg94 // This copy does propagate

       BB#4:
         vreg61 = LOAD vreg99
         vreg99 = ADD vreg99, 1
         jmp_cond BB#4, BB#9

       BB#9:
         vreg94 = COPY vreg99 // This copy does NOT propagate
         jmp_cond BB#3

     Can somebody tell me how I can run Dead Code Elimination and then
RegisterCoalescer again in llc, in order to see whether that lets me optimize
this piece of code?

     I'm interested in this optimization because the code runs on a very wide
SIMD processor and every instruction counts.

   Thank you very much,
     Alex



