Juliusz Sompolski
2010-Nov-23 11:25 UTC
[LLVMdev] Unrolling power sum calculations into constant time expressions
Hello, I noticed that feeding 'clang -O3' with functions like: int sum1(int x) { int ret = 0; for(int i = 0; i < x; i++) ret += i; return ret; } int sum2(int x) { int ret = 0; for(int i = 0; i < x; i++) ret += i*i; return ret; } ... int sum20(int x) { int ret = 0; for(int i = 0; i < x; i++) ret += i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i; return ret; } etc. Makes LLVM unroll all those loops into constant time expressions! A sum \sum_{i=0}^{n} i^k can be derived into a formula which is a polynomial of degree k+1, but it is not a trivial derivation (I wouldn't do it by hand for k > 2) and I thought that someone hardcoded such a case into llvm loop optimizer, and I've been looking through the loop analysis and loop passes code, but couldn't find it anywhere... And what llvm generates doesn't look like hard-coded formulas for sums of powers, because it doesn't do it as optimally as it could, e.g. sum20(x) could be expressed as a 21-degree polynomial with constant coefficients, so it could be compiled to about 21 muls, 21 adds and 1 div. Yet, llvm generates about 300 instructions -- but no looping, constant time regardless of x! So I suppose the unrolling of this loop is made by a combination of simpler optimizations... which is just... "wow". I've been looking through the analyses and passes code and have no clue what may be able to make such great optimizations -- I am really impressed by the derivation power of this. Any clue what can trigger such optimizations (I am totally new to llvm code...)? Cheers, Juliusz Sompolski
Frits van Bommel
2010-Nov-23 12:28 UTC
[LLVMdev] Unrolling power sum calculations into constant time expressions
On Tue, Nov 23, 2010 at 12:25 PM, Juliusz Sompolski <julek at vectorwise.com> wrote:> A sum \sum_{i=0}^{n} i^k can be derived into a formula which is a > polynomial of degree k+1, but it is not a trivial derivation (I wouldn't > do it by hand for k > 2) and I thought that someone hardcoded such a > case into llvm loop optimizer, and I've been looking through the loop > analysis and loop passes code, but couldn't find it anywhere... > And what llvm generates doesn't look like hard-coded formulas for sums > of powers, because it doesn't do it as optimally as it could, e.g. > sum20(x) could be expressed as a 21-degree polynomial with constant > coefficients, so it could be compiled to about 21 muls, 21 adds and 1 > div. Yet, llvm generates about 300 instructions -- but no looping, > constant time regardless of x! > > So I suppose the unrolling of this loop is made by a combination of > simpler optimizations... which is just... "wow". > I've been looking through the analyses and passes code and have no clue > what may be able to make such great optimizations -- I am really > impressed by the derivation power of this. > Any clue what can trigger such optimizations (I am totally new to llvm > code...)?My guess is something based on Scalar Evolution.
Nick Lewycky
2010-Nov-24 06:18 UTC
[LLVMdev] Unrolling power sum calculations into constant time expressions
Juliusz Sompolski wrote:> Hello, > > I noticed that feeding 'clang -O3' with functions like: > int sum1(int x) { > int ret = 0; > for(int i = 0; i< x; i++) > ret += i; > return ret; > } > int sum2(int x) { > int ret = 0; > for(int i = 0; i< x; i++) > ret += i*i; > return ret; > } > ... > int sum20(int x) { > int ret = 0; > for(int i = 0; i< x; i++) > ret += i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i; > return ret; > } > etc. > Makes LLVM unroll all those loops into constant time expressions! > > A sum \sum_{i=0}^{n} i^k can be derived into a formula which is a > polynomial of degree k+1, but it is not a trivial derivation (I wouldn't > do it by hand for k> 2)See lib/Analysis/ScalarEvolution.cpp's BinomialCoefficient(). Notably the comment near the top of the function. LLVM IR permits arbitrary bitwidth integer arithmetic which the backends lower to math that is actually supported on the target. Nick and I thought that someone hardcoded such a> case into llvm loop optimizer, and I've been looking through the loop > analysis and loop passes code, but couldn't find it anywhere... > And what llvm generates doesn't look like hard-coded formulas for sums > of powers, because it doesn't do it as optimally as it could, e.g. > sum20(x) could be expressed as a 21-degree polynomial with constant > coefficients, so it could be compiled to about 21 muls, 21 adds and 1 > div. Yet, llvm generates about 300 instructions -- but no looping, > constant time regardless of x! > > So I suppose the unrolling of this loop is made by a combination of > simpler optimizations... which is just... "wow". > I've been looking through the analyses and passes code and have no clue > what may be able to make such great optimizations -- I am really > impressed by the derivation power of this. > Any clue what can trigger such optimizations (I am totally new to llvm > code...)? > > Cheers, > Juliusz Sompolski > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >