thr3ads.net - llvm dev - [LLVMdev] Help me improve two-address code [Apr 2009]

Greg McGary

2009-Apr-16 22:17 UTC

[LLVMdev] Help me improve two-address code

I have my new port limping enough to compile a very basic function:

int
foo (int a, int b, int c, int d)
{
  return a + b - c + d;
}

clang-cc -O2 yields:

define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) nounwind readnone {
entry:
    %add = add i32 %b, %a        ; <i32> [#uses=1]
    %sub = sub i32 %add, %c        ; <i32> [#uses=1]
    %add4 = add i32 %sub, %d        ; <i32> [#uses=1]
    ret i32 %add4
}

which lowers to this assembler code (note: args arrive in r1..r12, and 
results are returned in r1..r3.):

foo:
    add r2,r1      ### add r1,r2 is better
    sub r2,r3
    mov r1,r2      ### unnecessary!!
    add r1,r4
    jmp [r30]
    .end foo

The mov insn would be unnecessary if the operand order for the first add 
were reversed.  For this function, GCC does the right thing.
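
The claim about the redundant mov can be checked with a small simulation. The following is a minimal sketch, not anything from the port itself: it assumes the destination-first operand convention implied by the listing (`add rX,rY` computes rX = rX + rY), it assumes that after commuting the first add the remaining instructions would simply operate on r1, and it uses plain Python integers rather than 32-bit wrapping arithmetic. The names `run`, `EMITTED` and `COMMUTED` are hypothetical.

```python
def run(program, a, b, c, d):
    """Execute (op, dst, src) two-address instructions and return r1.

    Arguments arrive in r1..r4 and the result is read from r1,
    following the calling convention described in the post.
    """
    regs = {"r1": a, "r2": b, "r3": c, "r4": d}
    for op, dst, src in program:
        if op == "add":
            regs[dst] = regs[dst] + regs[src]   # dst doubles as first source
        elif op == "sub":
            regs[dst] = regs[dst] - regs[src]
        elif op == "mov":
            regs[dst] = regs[src]
        else:
            raise ValueError(f"unknown op: {op}")
    return regs["r1"]


# Sequence from the listing above (the final jmp is omitted).
EMITTED = [
    ("add", "r2", "r1"),
    ("sub", "r2", "r3"),
    ("mov", "r1", "r2"),
    ("add", "r1", "r4"),
]

# Assumed sequence once the first add is commuted: the running value
# stays in r1, so no copy into the return register is needed.
COMMUTED = [
    ("add", "r1", "r2"),
    ("sub", "r1", "r3"),
    ("add", "r1", "r4"),
]

if __name__ == "__main__":
    for args in [(1, 2, 3, 4), (10, -5, 7, 0)]:
        a, b, c, d = args
        assert run(EMITTED, *args) == run(COMMUTED, *args) == a + b - c + d
    print(len(EMITTED), "instructions vs", len(COMMUTED))
```

Both sequences compute a + b - c + d in r1; the commuted one is shorter by exactly the mov.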

Is there some optimizer knob I'm not turning properly?  In more complex 
cases, GCC does poorly with two-address operand choices and so bloats 
the code with unnecessary register moves.  I have high hopes LLVM can do 
better, so this result for a simple case is bothersome.

G

Evan Cheng

2009-Apr-16 23:04 UTC

On Apr 16, 2009, at 3:17 PM, Greg McGary wrote:
> I have my new port limping enough to compile a very basic function:
>
> int
> foo (int a, int b, int c, int d)
> {
>  return a + b - c + d;
> }
>
> clang-cc -O2 yields:
>
> define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) nounwind readnone {
> entry:
>    %add = add i32 %b, %a        ; <i32> [#uses=1]
>    %sub = sub i32 %add, %c        ; <i32> [#uses=1]
>    %add4 = add i32 %sub, %d        ; <i32> [#uses=1]
>    ret i32 %add4
> }
>
> which lowers to this assembler code (note: args arrive in r1..r12, and
> results are returned in r1..r3.):
>
> foo:
>    add r2,r1      ### add r1,r2 is better
>    sub r2,r3
>    mov r1,r2      ### unnecessary!!
>    add r1,r4
>    jmp [r30]
>    .end foo
>
> The mov insn would be unnecessary if the operand order for the first add
> were reversed.  For this function, GCC does the right thing.
>
> Is there some optimizer knob I'm not turning properly?  In more complex
> cases, GCC does poorly with two-address operand choices and so bloats
> the code with unnecessary register moves.  I have high hopes LLVM can do
> better, so this result for a simple case is bothersome.
Are you marking add as commutable? Are you marking mov as a copy
instruction?

Evan
>
> G

Greg McGary

2009-Apr-16 23:25 UTC

Evan Cheng wrote:
> On Apr 16, 2009, at 3:17 PM, Greg McGary wrote:
>
>> Is there some optimizer knob I'm not turning properly?  In more complex
>> cases, GCC does poorly with two-address operand choices and so bloats
>> the code with unnecessary register moves.  I have high hopes LLVM
>> can do better, so this result for a simple case is bothersome.
>
> Are you marking add as commutable? Are you marking mov as a copy
> instruction?
>   
How do I mark them?  For the commutative property, I observed this 
definition:

def add        : SDNode<"ISD::ADD"       , SDTIntBinOp   ,
                        [SDNPCommutative, SDNPAssociative]>;

... and assumed it was sufficient, since I saw no other targets making 
special arrangements.

I see no obvious (to me, anyway 8^) "copy instruction" property.  The 
insn in question is generated by copyRegToReg(), and satisfies the 
isMoveInstr() predicate.

G
