thr3ads.net - llvm dev - [LLVMdev] code generation removes duplicated instructions [Jul 2011]


D S Khudia

2011-Jul-06 13:55 UTC

[LLVMdev] code generation removes duplicated instructions

Thank you for replying. Yes, the remaining part of the BB is in the split
basic block.

The following is an example of the code generated for ARM and x86 from the
same IR BB. In the x86 code I can see that the same computation is done
twice, the results are stored in two different registers, and those two
registers are then used for the comparison. By the way, I am duplicating
instructions and inserting comparisons to catch transient errors.

IR BB:

bb:                                               ; preds = %bb1.split
  %0 = load i32* %i, align 4
  %HV10_ = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %0
  %1 = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %0
  %HVCmp15 = icmp ne i32* %1, %HV10_
  br i1 %HVCmp15, label %relExit, label %bb.split

x86 asm:

.LBB0_1:                                # %bb
                                        #   in Loop: Header=BB0_5 Depth=1
  leal  972(%esp), %eax
  movl  568(%esp), %ecx
  imull $4, %ecx, %edx
  addl  %eax, %edx
  imull $4, %ecx, %ecx
  addl  %eax, %ecx
  cmpl  %edx, %ecx
  movl  %ecx, 508(%esp)         # 4-byte Spill
  jne .LBB0_88

arm asm:

.LBB0_1:                                @ %bb
                                        @   in Loop: Header=BB0_5 Depth=1
  ldr r0, [sp, #444]
  add r1, sp, #53, 28         @ 848
  add r0, r1, r0, lsl #2
  cmp r0, r0
  str r0, [sp, #384]
  bne .LBB0_88
  b .LBB0_2
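
To make the difference between the two listings concrete, here is a minimal sketch in Python. It is a toy model and not LLVM code: the function names, the base address 848 and the fault model (a single bit flip in one register after the address is computed) are assumptions made for illustration only. It shows that a check on two independently computed copies can notice a corrupted copy, while a check that compares one register against itself (cmp r0, r0) can never fire.

```python
# Toy model (not LLVM code) of a duplicate-and-compare check on an address.
# All names and the fault model are hypothetical, chosen for illustration.

WORD = 4  # bytes per i32 element


def element_address(base, index):
    """Address of a[index], like the getelementptr in the IR above."""
    return base + WORD * index


def flip_bit(value, bit):
    """Model a transient fault by flipping one bit of a register value."""
    return value ^ (1 << bit)


def check_separate_copies(base, index, fault_bit=None):
    """x86-style output: two computations held in two registers."""
    original = element_address(base, index)
    duplicate = element_address(base, index)
    if fault_bit is not None:
        original = flip_bit(original, fault_bit)  # fault hits one copy only
    return original != duplicate  # True means "branch to relExit"


def check_shared_register(base, index, fault_bit=None):
    """ARM-style output: one computation compared against itself."""
    shared = element_address(base, index)
    if fault_bit is not None:
        shared = flip_bit(shared, fault_bit)  # the only copy is corrupted
    return shared != shared  # always False, so the fault goes unnoticed


print(check_separate_copies(848, 3))               # no fault injected
print(check_separate_copies(848, 3, fault_bit=5))  # fault in one copy
print(check_shared_register(848, 3, fault_bit=5))  # fault in the shared copy
```

The sketch only covers a fault that lands after the address has been computed; it says nothing about faults in values that both copies share, such as the loaded index.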

Thanks
Daya

On Wed, Jul 6, 2011 at 6:18 AM, Renato Golin <renato.golin at arm.com>
wrote:
> On 6 July 2011 02:31, D S Khudia <daya.khudia at gmail.com> wrote:
> >   %0 = load i32* %i, align 4
> >   %HV14_ = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %0
> >   %1 = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %0
> >   %HVCmp7 = icmp ne i32* %1, %HV14_
> >   br i1 %HVCmp7, label %relExit, label %bb.split
> >
> > So that HV14_ is a new instruction and I am inserting a comparison to jump
> > to a newly created basic block. Somehow the code generation for arm removes
> > the duplicated instruction and cmp instruction in arm assembly looks as
> > follows.
> > cmp r0, r0
>
> Hi Daya,
>
> This is perfectly legal, since the two registers have exactly the same
> value (a[i]) and the comparison will always be the same.
>
> I suppose the rest of the original IR (the other array load, the
> stores and the increment) are in the branched basic blocks...
>
>
> > This defeats the purpose of doing the duplication in the first place. Does
> > anyone have any insight on this? Can anyone suggest some starting points to
> > debug this?
>
> What is the purpose of the duplication?
>
> cheers,
> --renato

Renato Golin

2011-Jul-06 14:27 UTC


On 6 July 2011 14:55, D S Khudia <daya.khudia at gmail.com> wrote:
> The following is an example code generation for arm and x86 for a same IR
> BB. In the x86 code I can see that the same computation is done twice and
> result is stored in two different registers and then these two different
> registers are used for comparison.

Yes, but you shouldn't rely on it, since the compiler is free to
optimize that away.

> By the way I am duplicating instruction
> and inserting comparison to catch transient errors.
I thought so. Try running llc with -O0 or disable specific
optimizations (like dead-code elimination) to keep your comparisons
intact.

But since the two values are identical, it's possible that even so,
both will live in the same register.

cheers,
--renato

D S Khudia

2011-Jul-06 14:57 UTC


Hello,

The code snippets pasted in the previous email were generated at -O0 with
llc. Since I am inserting a new basic block (containing a printf statement
and a program exit) that is jumped to based on the result of the comparison,
the compiler cannot, and should not, optimize the comparison away by means
of DCE or anything else.

The same thing happens for the following duplication.

bb6.split:                                        ; preds = %bb6
  %33 = load i32* %32, align 4
  %34 = load i32* %i, align 4
  %HV4_3 = sub nsw i32 %34, %33
  %35 = sub nsw i32 %34, %33
  %HV4_2 = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %HV4_3
  %36 = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %35
  %LDCmp6 = icmp ne i32* %36, %HV4_2
  br i1 %LDCmp6, label %relExit, label %bb6.split.split

In the above example even the operands are not the same (%HV4_3 versus %35),
and I would guess the compiler cannot be that smart at -O0. I suspect
something is wrong with the code generation for ARM.
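
One possible explanation, which nothing in this thread confirms, is that the merging needs no optimization pass at all. A code generator that uniques structurally identical expressions while it builds its internal representation of a block (LLVM's SelectionDAG does unique identical nodes) would also merge a chain of duplicates, because the duplicated sub gets the same identity first and the two getelementptrs then see identical operands. The following is a toy value-numbering sketch in Python, not LLVM code; the function name and the tuple encoding of instructions are assumptions made for illustration, and the constant i32 0 index is left out for brevity.

```python
# Toy value numbering (not LLVM code): structurally identical pure
# expressions receive the same value number. Names are hypothetical.

def number_block(instructions):
    """Map each result name to a value number.

    instructions: list of (result, opcode, operands) tuples.
    Loads always get a fresh number; every other operation is keyed by
    (opcode, value numbers of its operands), so duplicates collapse.
    """
    numbers = {}  # name -> value number
    table = {}    # (opcode, operand numbers) -> value number
    counter = 0

    def fresh():
        nonlocal counter
        counter += 1
        return counter

    for result, opcode, operands in instructions:
        operand_numbers = []
        for name in operands:
            if name not in numbers:
                numbers[name] = fresh()  # value defined outside the block
            operand_numbers.append(numbers[name])
        if opcode == "load":
            numbers[result] = fresh()
        else:
            key = (opcode, tuple(operand_numbers))
            if key not in table:
                table[key] = fresh()
            numbers[result] = table[key]
    return numbers


# The bb6.split block from above, encoded as tuples.
bb6_split = [
    ("%33", "load", ["%32"]),
    ("%34", "load", ["%i"]),
    ("%HV4_3", "sub", ["%34", "%33"]),
    ("%35", "sub", ["%34", "%33"]),
    ("%HV4_2", "gep", ["%a", "%HV4_3"]),
    ("%36", "gep", ["%a", "%35"]),
]

numbers = number_block(bb6_split)
print(numbers["%HV4_3"] == numbers["%35"])  # the two subs
print(numbers["%HV4_2"] == numbers["%36"])  # the two getelementptrs
```

If something like this is what happens on ARM, the icmp ends up comparing a value with itself even though the IR names two different operands.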

What other way of duplicating do you suggest, since you mentioned that I
shouldn't rely on duplication the way I am doing it?

Thanks a lot. I really appreciate your help.
Daya

