llvm dev - Nov 2010 - [LLVMdev] Unrolling power sum calculations into constant time expressions

Hello,

I noticed that feeding 'clang -O3' with functions like:
int sum1(int x) {
     int ret = 0;
     for(int i = 0; i < x; i++)
         ret += i;
     return ret;
}
int sum2(int x) {
     int ret = 0;
     for(int i = 0; i < x; i++)
         ret += i*i;
     return ret;
}
...
int sum20(int x) {
     int ret = 0;
     for(int i = 0; i < x; i++)
         ret += i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i;
     return ret;
}
etc.
Makes LLVM unroll all those loops into constant time expressions!

A sum \sum_{i=0}^{n} i^k can be derived into a formula which is a 
polynomial of degree k+1, but it is not a trivial derivation (I wouldn't 
do it by hand for k > 2) and I thought that someone hardcoded such a 
case into llvm loop optimizer, and I've been looking through the loop 
analysis and loop passes code, but couldn't find it anywhere...
And what llvm generates doesn't look like hard-coded formulas for sums 
of powers, because it doesn't do it as optimally as it could, e.g. 
sum20(x) could be expressed as a 21-degree polynomial with constant 
coefficients, so it could be compiled to about 21 muls, 21 adds and 1 
div. Yet, llvm generates about 300 instructions -- but no looping, 
constant time regardless of x!

So I suppose the unrolling of this loop is made by a combination of 
simpler optimizations... which is just... "wow".
I've been looking through the analyses and passes code and have no clue 
what may be able to make such great optimizations -- I am really 
impressed by the derivation power of this.
Any clue what can trigger such optimizations (I am totally new to llvm 
code...)?

Cheers,
Juliusz Sompolski

llvm dev - Nov 2010 - [LLVMdev] Unrolling power sum calculations into constant time expressions

[LLVMdev] Unrolling power sum calculations into constant time expressions

[LLVMdev] Unrolling power sum calculations into constant time expressions

[LLVMdev] Unrolling power sum calculations into constant time expressions

Possibly Parallel Threads