Here is the patch for the new Operator Strength Reduction optimization pass
that I have written. The bulk of the code is in

  lib/Transforms/Scalar/OperatorStrengthReduce.cpp

The optimization is based on the algorithm described in the following paper:

  Keith D. Cooper, L. Taylor Simpson, Christopher A. Vick, "Operator strength
  reduction," ACM Transactions on Programming Languages and Systems (TOPLAS),
  v.23 n.5, p.603-625, September 2001.
  http://portal.acm.org/citation.cfm?id=504709.504710

I have embedded the paper's pseudo code into comment blocks within the code.
The algorithm finds reduction opportunities in both array accesses and
explicit multiplications within loops.

Next, I plan on writing the regression tests.

thanks,
Brian

Brian West <bnwest <at> rice.edu> writes:
> I am currently writing a new optimization pass for LLVM, based on the
> paper Operator Strength Reduction (Taylor Simpson et al, 2001). I
> noticed from the Developer Policy page that my code will need to be
> reviewed before it is committed to the trunk.
>
> The Policy page also indicated that all "major" changes should be
> announced before coding and should be done as a series of incremental
> changes. I am not sure whether a new stand-alone optimization pass
> counts as a major change.
>
> I have run my pass against the test suite (as described in "How to
> test my pass",
> http://comments.gmane.org/gmane.comp.compilers.llvm.devel/32875). I
> have fixed all of the compilation errors and all but one of the
> execution errors.
>
> I have made a pass over my code to conform to the LLVM coding standards.
>
> I have not written the regression tests yet.
>
> I also still need to work on the performance, but as is it is faster
> than the existing -loop-reduce pass.
>
> My question is how to proceed. What steps do I need to take to get my
> code reviewed and committed to the trunk?
>
> thanks,
> Brian West

Duncan Sands <baldrick <at> free.fr> writes:
> Hi Brian,
>
> > My question is how to proceed. What steps do I need to take to get my
> > code reviewed and committed to the trunk?
>
> the first step is to send a patch adding your pass (and hopefully the
> associated documentation!) to LLVM to the mailing list, so that others
> can test and study your code.
>
> Ciao,
>
> Duncan.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: osr.patch
Type: application/octet-stream
Size: 81300 bytes
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20101027/faef5b6d/attachment.obj>
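For readers unfamiliar with how such a transform plugs into LLVM, here is a
minimal, hypothetical skeleton of an era-appropriate FunctionPass. The class
name, the "osr" pass argument, and the required analyses are illustrative
assumptions, not details taken from the attached patch, and the exact
constructor and registration boilerplate varies between LLVM versions.

  // Hypothetical skeleton only -- not code from osr.patch.
  #include "llvm/Pass.h"
  #include "llvm/Function.h"
  #include "llvm/Analysis/Dominators.h"
  using namespace llvm;

  namespace {
    struct OperatorStrengthReduce : public FunctionPass {
      static char ID;
      OperatorStrengthReduce() : FunctionPass(ID) {}

      virtual bool runOnFunction(Function &F) {
        // The real pass would walk F's SSA graph with Tarjan's SCC finder,
        // classify induction variables and region constants, and rewrite
        // reducible operations; the body is elided in this sketch.
        return false;  // report whether the IR was modified
      }

      virtual void getAnalysisUsage(AnalysisUsage &AU) const {
        // The paper's algorithm relies on dominance information.
        AU.addRequired<DominatorTree>();
      }
    };
  }

  char OperatorStrengthReduce::ID = 0;
  static RegisterPass<OperatorStrengthReduce>
    X("osr", "Operator Strength Reduction");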
On Wed, Oct 27, 2010 at 1:29 PM, Brian West <bnwest at rice.edu> wrote:
> Here is the patch for the new Operator Strength Reduction optimization
> pass that I have written. The bulk of the code is in
>
>   lib/Transforms/Scalar/OperatorStrengthReduce.cpp
>
> The algorithm finds reduction opportunities in both array accesses and
> explicit multiplications within loops.
>
> Next, I plan on writing the regression tests.

Comments:
1. Please don't top-post.
2. Use Instruction::getOpcodeName instead of your OpcodeToString function.
3. LLVM already has a significant amount of infrastructure for loop
   passes; why does this pass have its own code for finding loops?
4. What's the use case for this pass? Can this pass improve performance
   of generated code beyond the existing -loop-reduce pass? If not, could
   it usefully act as a faster replacement for -loop-reduce for fast code
   generation? (Perhaps someone doing fast code generation with LLVM
   could comment on whether -loop-reduce is useful, and whether the
   performance is an issue.)

-Eli
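As a minimal sketch of the suggestion in point 2, assuming an Instruction
reference is in hand (the helper name debugPrintOpcode is made up for
illustration, and the header paths are those of LLVM circa 2010):

  // Illustrative only: print an instruction's opcode name for debug output
  // using the accessor LLVM already provides, instead of a hand-rolled
  // OpcodeToString mapping.
  #include "llvm/Instruction.h"
  #include "llvm/Support/raw_ostream.h"

  static void debugPrintOpcode(const llvm::Instruction &I) {
    // getOpcodeName() returns strings such as "add", "mul", "getelementptr".
    llvm::errs() << "visiting " << I.getOpcodeName() << "\n";
  }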
On 10/27/10 8:34 PM, Eli Friedman wrote:
> On Wed, Oct 27, 2010 at 1:29 PM, Brian West <bnwest at rice.edu> wrote:
>> Here is the patch for the new Operator Strength Reduction optimization
>> pass that I have written. The bulk of the code is in
>>
>>   lib/Transforms/Scalar/OperatorStrengthReduce.cpp
>>
>> The algorithm finds reduction opportunities in both array accesses and
>> explicit multiplications within loops.
>>
>> Next, I plan on writing the regression tests.
>
> Comments:
> 1. Please don't top-post.
> 2. Instruction::getOpcodeName instead of your OpcodeToString function.

Can do. I did not see getOpcodeName() used in lib/Target/CBackend/CBackend.cpp,
so I wrote my own. FWIW, I am only using OpcodeToString() for debug output.

> 3. LLVM already has a significant amount of infrastructure for loop
>    passes; why does this pass have its own code for finding loops?

I saw the loop infrastructure for CFG loops. This algorithm finds loops in
the data flow (more precisely: strongly-connected components in the SSA
graph), e.g.

  bb5:
    %20 = add nsw i32 %i.0, 1
    br label %bb6

  bb6:
    %i.0 = phi i32 [ 0, %entry ], [ %20, %bb5 ]

There is a data-flow loop between %20 and %i.0. The OSR paper has a nice
figure showing data-flow loops. Here is a small excerpt from the OSR paper:

  "OSR is driven by a simple depth-first search of the SSA-graph, using
  Tarjan’s strongly-connected component finder. The SCC-finder lets OSR
  discover induction variables in topological order and process them as
  they are discovered. As the SCCs are discovered, they are processed by a
  set of mutually-recursive routines that test for induction variables and
  region constants, and then reduce the appropriate operations."

> 4. What's the use case for this pass? Can this pass improve performance
>    of generated code beyond the existing -loop-reduce pass? If not,
>    could it usefully act as a faster replacement for -loop-reduce for
>    fast code generation? (Perhaps someone doing fast code generation
>    with LLVM could comment on whether -loop-reduce is useful, and
>    whether the performance is an issue.)

-loop-reduce (LSR) only looks at the implicit multiplication in array
accesses. The following code has an explicit multiplication within a loop
that LSR will not reduce:

  extern float loop4(float* a, int start, int stop) {
    int i;
    float sum;
    float *pEntry;
    char *p;
    unsigned long offset;
    const unsigned int float_size = sizeof(float);

    sum = 0.0;
    p = (char*)a;
    for (i = start-1; i < stop; ++i) {
      offset = i * float_size;
      pEntry = (float*) (p + offset);
      sum += *pEntry;
    }
    return sum;
  }

OSR also finds reductions in some esoteric array accesses that LSR misses.
I found these LSR misses while writing less-than-real-world code to break
OSR. I can include these examples, if requested.

For array accesses, OSR and LSR produce similar LLVM IR, and I am guessing
that their execution times should be similar. Empirically, the OSR
optimization is compile-time faster than LSR. I have also noticed that LSR
has more "analysis" requirements: Induction Variable Users, Natural Loop
Information, Canonicalize Natural Loops, and Scalar Evolution Analysis.
Both OSR and LSR require the Dominator Tree Construction analysis pass.

I did not mention in the original email (and should have) that OSR needs
-instcombine to be run after it for cleanup. Also -licm, -reassociate,
-gvn and -sccp can be enabling optimizations for OSR.

> -Eli

Eli,
Thanks for your comments.

Brian West
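To make the effect concrete, below is a hand-written sketch (not actual
output of the pass; the name loop4_reduced is made up) of the additive-update
form that strength reduction aims for on the loop4 example above: the
per-iteration multiplication i * float_size is replaced by an offset that
grows by float_size on each trip through the loop.

  extern float loop4_reduced(float* a, int start, int stop) {
    int i;
    float sum;
    float *pEntry;
    char *p;
    unsigned long offset;
    const unsigned int float_size = sizeof(float);

    sum = 0.0;
    p = (char*)a;
    offset = (start-1) * float_size;  /* computed once, before the loop */
    for (i = start-1; i < stop; ++i) {
      pEntry = (float*) (p + offset);
      sum += *pEntry;
      offset += float_size;  /* strength-reduced: add replaces multiply */
    }
    return sum;
  }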