Gil Dogon
2007-Jan-29 09:29 UTC
[LLVMdev] A question about GetElementPtr common subexpression elimination/loop invariant code motion
Hello. I have a problem which is quite basic for array optimization, amd I wonder whether I am missing something, but I could not find the LLVM pass that does it. Consider the following code snippet: int test() { int mat[7][7][7]; int i,j,k,sum=0; for(i=0;i<7;i++){ for(j=0;j<7;j++){ for(k=0;k<7;k++){ sum+=mat[i][j][k]^mat[i][j][k^1]; } } } return sum; } When I run llvm on it (I am using a rather dated llvm 1.7, but I have the feeling this optimization is already there somewhere) I get the following llvm code : int %test() { entry: %mat = alloca [7 x [7 x [7 x int]]], align 16 ; <[7 x [7 x [7 x int]]]*> [#uses=2] br label %cond_true cond_true: ; preds = %bb31, %bb22, %cond_true, %entry %j.1.2.ph = phi int [ 0, %entry ], [ %j.1.2.ph, %cond_true ], [ %tmp24, %bb22 ], [ 0, %bb31 ] ; <int> [#uses=4] %i.0.0.ph = phi int [ 0, %entry ], [ %i.0.0.ph, %cond_true ], [ %i.0.0.ph, %bb22 ], [ %tmp33, %bb31 ] ; <int> [#uses=5] %k.2.4 = phi int [ 0, %entry ], [ %tmp19, %cond_true ], [ 0, %bb22 ], [ 0, %bb31 ] ; <int> [#uses=3] %sum.0.4 = phi int [ 0, %entry ], [ %tmp17, %cond_true ], [ %tmp17, %bb22 ], [ %tmp17, %bb31 ] ; <int> [#uses=1] %tmp5 = getelementptr [7 x [7 x [7 x int]]]* %mat, int 0, int %i.0.0.ph, int %j.1.2.ph, int %k.2.4 ; <int*> [#uses=1] %tmp6 = load int* %tmp5 ; <int> [#uses=1] %tmp10 = xor int %k.2.4, 1 ; <int> [#uses=1] %tmp13 = getelementptr [7 x [7 x [7 x int]]]* %mat, int 0, int %i.0.0.ph, int %j.1.2.ph, int %tmp10 ; <int*> [#uses=1] %tmp14 = load int* %tmp13 ; <int> [#uses=1] %tmp15 = xor int %tmp14, %tmp6 ; <int> [#uses=1] %tmp17 = add int %tmp15, %sum.0.4 ; <int> [#uses=4] %tmp19 = add int %k.2.4, 1 ; <int> [#uses=2] %tmp13 = setgt int %tmp19, 6 ; <bool> [#uses=1] br bool %tmp13, label %bb22, label %cond_true bb22: ; preds = %cond_true %tmp24 = add int %j.1.2.ph, 1 ; <int> [#uses=2] %tmp2719 = setgt int %tmp24, 6 ; <bool> [#uses=1] br bool %tmp2719, label %bb31, label %cond_true bb31: ; preds = %bb22 %tmp33 = add int %i.0.0.ph, 1 ; <int> [#uses=2] %tmp36 = setgt int %tmp33, 6 ; <bool> [#uses=1] br bool %tmp36, label %bb40, label %cond_true bb40: ; preds = %bb31 ret int %tmp17 } Now the problem with this code , is that the calculation of the address mat[i][j] which is done by the (two) getelementptr instructions is quite expensive (involving at least two multiplications and one addition) hence it actualy should have been moved out of the inner loop. At the very least inside the loop it should have been calculated once , and not twice. Anyway this is just a syptom of a general phenomenon in which it seems my LLVM assumes that getelemenptrs are atomic and it does not try to do any optimization on them. Can someone tell me if there is a known solution /existing code transformation that does the optimization ?
Chris Lattner
2007-Jan-29 19:54 UTC
[LLVMdev] A question about GetElementPtr common subexpression elimination/loop invariant code motion
On Mon, 29 Jan 2007, Gil Dogon wrote:> Now the problem with this code , is that the calculation of the address > mat[i][j] which is done by the (two) getelementptr instructions > is quite expensive (involving at least two multiplications and one > addition) hence it actualy should have been moved out of the inner loop.Right.> and not twice. Anyway this is just a syptom of a general > phenomenon in which it seems my LLVM assumes that getelemenptrs are > atomic and it does not try to do any optimization on > them.LLVM assumes that the code generator will handle this optimization when the getelementptr's are lowered into integer arithmetic. Many things can only be optimized after lowering in the code generator, and LICM of simple integer arithmetic is one trivial example. The goal of the LLVM-level optimizers is to handle the heavier lifting stuff, such as code restructuring, loop optimization, memory optimization, etc. -Chris -- http://nondot.org/sabre/ http://llvm.org/
Hi, I'm looking to do a semester-long course project involving LLVM. To avoid duplicating efforts, I wanted to know if the following projects (lifted off http://llvm.org/OpenProjects.html) are done with or are currently worked upon. Atleast I couldn't find evidence of these in the 1.9 sources. 1. Implement GVN-PRE, a powerful and simple Partial Redundancy Elimination algorithm for SSA form 2. Implement a Dependence Analysis Infrastructure - Design some way to represent and query dep analysis 3. Value range propagation pass>From what I've seen of LLVM, I think this is a cool CompilerInfrastructure you've built! Thanks, Prashanth