thr3ads.net - llvm dev - [LLVMdev] Post-inc combining [Feb 2011]


Jonas Paulsson

2011-Jan-28 07:13 UTC

[LLVMdev] Post-inc combining

Hi,

I would like to transform an LLVM function containing a load and an add of the
base address inside a loop to a post-incremented load. In
DAGCombiner.cpp::CombineToPostIndexedLoadStore(), it says that the add cannot be
folded if, for instance, it is a predecessor/successor of the load. I find this
odd, as this is exactly what I would like to handle: a simple loop with an
address that is incremented in each iteration.

I am considering using a target intrinsic for this purpose, as the SCEV
interface is available on the LLVM IR. In this way, I could get a DAG with a
post-inc-load node instead of the load and add nodes.

Is this a work in progress? Please explain why these constraints are put in the
above mentioned method as they do not seem to facilitate post-inc instruction
combining.

Best regards,

Jonas Paulsson

 		 	   		  

Anton Korobeynikov

2011-Jan-28 12:21 UTC


[LLVMdev] Post-inc combining

Hello

> Is this a work in progress? Please explain why these constraints are put in
> the above mentioned method as they do not seem to facilitate post-inc
> instruction combining.

Have you looked at how the stuff is implemented in the existing backends?
At least ARM and MSP430 have post-inc stuff working.

-- 
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

Bob Wilson

2011-Jan-28 16:56 UTC


[LLVMdev] Post-inc combining

On Jan 27, 2011, at 11:13 PM, Jonas Paulsson wrote:
> Hi,
>
> I would like to transform an LLVM function containing a load and an add of
> the base address inside a loop to a post-incremented load. In
> DAGCombiner.cpp::CombineToPostIndexedLoadStore(), it says that the add cannot
> be folded if, for instance, it is a predecessor/successor of the load. I find
> this odd, as this is exactly what I would like to handle: a simple loop with
> an address that is incremented in each iteration.
>
> I am considering using a target intrinsic for this purpose, as the SCEV
> interface is available on the LLVM IR. In this way, I could get a DAG with a
> post-inc-load node instead of the load and add nodes.
>
> Is this a work in progress? Please explain why these constraints are put in
> the above mentioned method as they do not seem to facilitate post-inc
> instruction combining.

The "predecessor" and "successor" terminology used there
refers to the DAG, not to the order of the operations in the LLVM IR.  For
example, if the result of the ADD is the value being stored to memory, then you
couldn't fold that into a post-inc STORE:

 %x = add i32 %addr, 4;
 store i32 %x, i32* %addr

In the DAG for that, the ADD is a predecessor of the STORE.  If the result of
the add is used for some other memory reference, then it would not be a
predecessor and could be folded.
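
To make the predecessor rule concrete, here is a minimal sketch in C of a toy
operand graph. It is not the LLVM implementation: the Node type and the
functions is_predecessor, can_fold_post_inc, store_of_add_result and
store_of_other_value are hypothetical names invented for illustration, and the
model ignores chain edges and everything else a real SelectionDAG tracks. It
only shows why a store whose stored value is the add itself fails the check,
while a store of an unrelated value passes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OPS 2

/* Toy stand-in for a DAG node (hypothetical, not an LLVM type):
   each node simply lists the nodes it uses as operands. */
typedef struct Node {
    const char *name;
    const struct Node *ops[MAX_OPS];
    size_t num_ops;
} Node;

/* True if `target` is reachable from `n` through operand edges,
   i.e. `target` is a predecessor of `n` in this toy graph. */
bool is_predecessor(const Node *target, const Node *n) {
    assert(target != NULL && n != NULL);
    for (size_t i = 0; i < n->num_ops; ++i) {
        if (n->ops[i] == target || is_predecessor(target, n->ops[i]))
            return true;
    }
    return false;
}

/* Simplified assumption: the add may be folded into the memory
   operation only if neither node is a predecessor of the other.
   Sharing the base pointer as an operand does not count. */
bool can_fold_post_inc(const Node *add, const Node *mem) {
    return !is_predecessor(add, mem) && !is_predecessor(mem, add);
}

/* %x = add %addr, 4 ; store %x -> %addr : the store uses the add. */
bool store_of_add_result(void) {
    Node addr  = {"addr",  {NULL, NULL}, 0};
    Node four  = {"4",     {NULL, NULL}, 0};
    Node add   = {"add",   {&addr, &four}, 2};
    Node store = {"store", {&add, &addr}, 2};  /* value, pointer */
    return can_fold_post_inc(&add, &store);
}

/* Same add, but the store writes an unrelated value %val. */
bool store_of_other_value(void) {
    Node addr  = {"addr",  {NULL, NULL}, 0};
    Node four  = {"4",     {NULL, NULL}, 0};
    Node val   = {"val",   {NULL, NULL}, 0};
    Node add   = {"add",   {&addr, &four}, 2};
    Node store = {"store", {&val, &addr}, 2};  /* value, pointer */
    return can_fold_post_inc(&add, &store);
}
```

In this model store_of_add_result() reports that folding is not allowed,
because the add is an operand of the store, and store_of_other_value() reports
that it is, because the two nodes only share the base pointer.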

Jonas Paulsson

2011-Feb-07 07:32 UTC


[LLVMdev] Post-inc combining

When I compile the following program for ARM:

  for(i=0;i<n2;i+=n3)
    {
      s+=a[i];
    }

with GCC, I get the following loop body, which uses a post-modify load:

.L4:
        add     r1, r1, r3
        ldr     r4, [ip], r6
        rsb     r5, r3, r1
        cmp     r2, r5
        add     r0, r0, r4
        bgt     .L4

With LLVM, however, I get:

.LBB0_3:                                @ %for.body
                                        @ =>This Inner Loop Header: Depth=1
        add     r12, lr, r3
        ldr     lr, [r0, lr, lsl #2]
        add     r1, lr, r1
        cmp     r12, r2
        mov     lr, r12
        blt     .LBB0_3

This does not seem to use auto-increment addressing.

I wonder what I should do to get loops to use auto-increment in general, for
instance in this simple loop:

  for(i=0;i<256;i++)
    {
        s+=a[i];
    }

which now yields:

.LBB0_1:                                @ %for.body
                                        @ =>This Inner Loop Header: Depth=1
        ldr     r3, [r0, r2]
        add     r2, r2, #4
        add     r1, r3, r1
        cmp     r2, #1, 22      @ 1024
        bne     .LBB0_1

This uses r0 as the base address with r2 as the offset. On my target, it is
much preferred to use auto-inc in cases like this. I repeat my question, as I
don't quite understand why ARM uses the ldr/add pair here instead of
post-inc. I would like the DAG combiner to handle cases like this, but
it does not seem to do so.

Thank you,

Jonas
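
For reference, the two addressing shapes discussed above can be written at the
source level as follows. This is only a sketch: sum_indexed and sum_postinc are
hypothetical names, and nothing here guarantees that a given compiler or
target will select a post-indexed load for the second form. It just shows the
"load through the pointer, then advance the pointer" pattern that a post-inc
load implements in one instruction, next to the base-plus-offset form from the
loop above.

```c
#include <assert.h>
#include <stddef.h>

/* Base + offset form: the address is recomputed from a[] and i
   on every iteration, as in the ldr r3, [r0, r2] loop. */
int sum_indexed(const int *a, size_t n) {
    assert(a != NULL);
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Pointer-bump form: each iteration loads through p and then
   advances p by one element, which is the post-increment pattern. */
int sum_postinc(const int *a, size_t n) {
    assert(a != NULL);
    int s = 0;
    const int *p = a;
    const int *end = a + n;
    while (p != end)
        s += *p++;
    return s;
}
```

Both functions compute the same sum; they differ only in how the address of
the next element is formed.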





