I've attached two examples of patterns for updating an array. They're simplified examples of some code generation I'm doing where I need to pass a large, unknown number of arguments (ints, doubles, pointers to other information) to a function.

The two patterns are:

Stack[0] = 0;
Stack[1] = 1;
Stack[2] = 42;

and

int i = 0;
Stack[i++] = 0;
Stack[i++] = 1;
Stack[i++] = 42;

I've found that the optimizer is much faster at handling the second pattern than the first. With 3000 assignments, as in the attached examples, "opt -std-compile-opts -S func1.ll -o func1.opt" takes 10 seconds, while func2.ll takes 0.2 s. Anyone have any ideas why?

I've tested with LLVM 2.9 and LLVM 3.0 RC1.

Regards,
Michael

(Attachment: opt.zip, application/x-zip-compressed, 196440 bytes)
On Mon, Nov 21, 2011 at 8:58 AM, Michael Smith <Michael.Smith at synopsys.com> wrote:
> The two patterns are:
>
> Stack[0] = 0;
> Stack[1] = 1;
> Stack[2] = 42;
>
> and
>
> int i = 0;
> Stack[i++] = 0;
> Stack[i++] = 1;
> Stack[i++] = 42;

The optimizer is probably not sophisticated enough to figure out that the array indices in your first pattern are sequentially incrementing integers. The generated code has to store each individual array index constant somewhere, load it into a register, multiply it by the size of your array elements, then add it to the pointer to the base of the array. It does that for each array access.

On some architectures, loading constant values can be expensive because the machine code words do not provide enough bits to encode arbitrary immediate constants. The constants are most often stored just after the end of the function's executable code, then read into a register with PC-relative addressing. Other architectures allow one to load any immediate constant, but this comes at the cost of placing the value of the constant just after the instruction that loads it, which stalls the processor's pipeline until the constant has been loaded into the register.

In your second pattern, the simple way would be to have a fixed pointer to the base of your array and a register that is used as an offset. Most architectures offer autoincrement/autodecrement addressing modes for access patterns just as you describe, in which the index register is incremented by the size of your array elements just after it is used as an offset. In this case, though, the optimizer is smart enough to see that you're indexing sequentially through an array: rather than using a base register and an offset register, the initial value of i is added to the base register, and then that register is used all by itself, again with autoincrementing.
Not having to use a second register for the index frees that register for some other purpose, so other parts of your code can be optimized more effectively.

--
Don Quixote de la Mancha
Dulcinea Technologies Corporation
Software of Elegance and Beauty
http://www.dulcineatech.com
quixote at dulcineatech.com
Where does the -time-passes option say the time is going?

Sent from my iPhone

On Nov 21, 2011, at 8:58 AM, Michael Smith <Michael.Smith at synopsys.com> wrote:
> I've found that the optimizer is much faster at handling the second pattern than the first. With 3000 assignments, as in the attached examples, "opt -std-compile-opts -S func1.ll -o func1.opt" takes 10 seconds, while func2.ll takes 0.2 s. Anyone have any ideas why?
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
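Concretely, that means re-running the original command lines with -time-passes added; opt then prints a per-pass timing report to stderr when it exits (a sketch using the flags from the original message):

```shell
# Slow input: the timing report shows which pass dominates the 10 s.
opt -std-compile-opts -time-passes -S func1.ll -o func1.opt

# Fast input, for comparison.
opt -std-compile-opts -time-passes -S func2.ll -o func2.opt
```

Comparing the two reports pass by pass points at where the extra time is spent.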
Dead Code Elimination. I'm going to try reworking it following some of the insights Don Quixote provided.

From: Cameron Zwarich [mailto:zwarich at apple.com]
Sent: Monday, November 21, 2011 12:17 PM
To: Michael Smith
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Optimization of array access

> Where does the -time-passes option say the time is going?