Tony Jiang via llvm-dev
2017-Aug-31 22:19 UTC
[llvm-dev] [RFC] Value Range Based Optimization Opportunity in LLVM
Hi All, The macro UNISKIP mentioned in the second example should be LENGTH. Sorry about that. Regards, Tony Jiang, M.Sc. LLVM PPC Backend Development IBM Toronto Lab, C2/712/8200/MKM 8200 Warden Ave, Markham, L6G 1C7 Email: jtony at ca.ibm.com Phone: 905-413-3676 From: Tony Jiang/Toronto/IBM To: llvm-dev at lists.llvm.org Date: 08/31/2017 04:54 PM Subject: [RFC] Value Range Based Optimization Opportunity in LLVM Hi All, We have recently found some optimization opportunities created by replicating code into branches in order to enable optimization. In general, the optimization opportunity we are pursuing is like the following. Given pseudo-code: // block A if (some condition) // block B // block C If it can be efficiently proven that some portion of block C can be simplified had control flow not gone through the if statement, it might be useful to convert this CFG into a diamond and hoist that portion of block C into both block B and the new block. Consider the following example: int test(int *Ptr, int a, int b, int c, int d) { int Ret = 0; if (__builtin_expect(!Ptr, 0)) { Ret = calli(a); // return Ret / (a|1) / (b|1) / (c|1) / (d|1); // Copy return to here } return Ret / (a|1) / (b|1) / (c|1) / (d|1); // This can be simplified to return 0 } In this case, replicating the return statement in the branch allows the optimizer to prove the value of Ret at the end of the function is 0 and eliminate the arithmetic calculations. A second example: unsigned long funcReturningArbitraryi64(unsigned char *p); #define LENGTH(uv) ( (uv) < 0x80 ? 1 : \ (uv) < 0x800 ? 2 : \ (uv) < 0x10000 ? 3 : \ (uv) < 0x200000 ? 4 : \ (uv) < 0x4000000 ? 5 : \ (uv) < 0x80000000 ? 6 : 7 ) int func(unsigned char *p, bool flag) { unsigned long c = *p; int len; // ... #ifdef _ORIG if(flag) { // ... c = funcReturningArbitraryi64(p); } len = LENGTH(c); #else if(flag) { // ... c = funcReturningArbitraryi64(p); len = LENGTH(c); } else { len = LENGTH(c); } #endif // ... return len; } In this case, we see that creating an else block and replicating the return statement in both the if and else branch (like the code snippet guarded by the #else) enables the macro UNISKIP in the else branch to be optimized. Most of the examples we have come up with so far are centered around value ranges along the conditional branches. When the range of values a symbol can have along different branches is provably different, opportunities for optimization may arise. However, this likely isn't the only category of optimizations that could benefit from this. Is there an existing transformation in LLVM that should be doing this already that is missing this opportunity? If not, we would like to pursue adding this. Of course, this optimization would be subject to a cost model as it may result in code growth. For example, it may not be advantageous to do this if the simplified branch is cold. If anyone has any comments/suggestions we are very much interested in hearing them. Regards, Tony Jiang, M.Sc. LLVM PPC Backend Development IBM Toronto Lab, C2/712/8200/MKM 8200 Warden Ave, Markham, L6G 1C7 Email: jtony at ca.ibm.com Phone: 905-413-3676 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170831/046c09bf/attachment.html>