Seung Jae Lee
2007-Aug-25 05:07 UTC
[LLVMdev] constructing 'for' statement from LLVM bitcode
Hello, guys. I am trying to construct higher-level 'for' from the low-level LLVM bitcode(ver 1.9). It's partly successful thanks to David A. Greene's advice suggested to use Control Dependence Graph(CDG). I could find which BB contributes to form which loop with CDG. For example, for this simple function: ----------------------------------------------------------- void bsloop(int n, int pM, Params* ps) { int i, sim; for (sim=0; sim < pM; sim++) { for (i = 0; i < n; i++) { Params* p = ps + i; double eps = norm(0,1); } } } ----------------------------------------------------------- LLVM bitcode is emitted like this: ----------------------------------------------------------- void %bsloop(int %n, int %pM, %struct.Params* %ps) { entry: %pM = cast int %pM to uint ; <uint> [#uses=1] %tmp1822 = setgt int %pM, 0 ; <bool> [#uses=1] br bool %tmp1822, label %bb9.outer, label %return bb9.outer: ; preds = %bb12, %entry %indvar23 = phi uint [ 0, %entry ], [ %indvar.next24, %bb12 ] ; <uint> [#uses=1] br label %bb9 bb1: ; preds = %bb9 %tmp = tail call double %mtrandres53( sbyte 1 ) ; <double> [#uses=0] %indvar.next = add uint %indvar, 1 ; <uint> [#uses=1] br label %bb9 bb9: ; preds = %bb1, %bb9.outer %indvar = phi uint [ 0, %bb9.outer ], [ %indvar.next, %bb1 ] ; <uint> [#uses=2] %i.1 = cast uint %indvar to int ; <int> [#uses=1] %tmp = setlt int %i.1, %n ; <bool> [#uses=1] br bool %tmp, label %bb1, label %bb12 bb12: ; preds = %bb9 %indvar.next24 = add uint %indvar23, 1 ; <uint> [#uses=2] %exitcond = seteq uint %indvar.next24, %pM ; <bool> [#uses=1] br bool %exitcond, label %return, label %bb9.outer return: ; preds = %bb12, %entry ret void } ----------------------------------------------------------- This code above offers me info of Control Flow Graph(CFG). (If I could number each BB above like this:) ----------------------------------------------------------- entry -> (0) bb9.outer -> (1) bb1 -> (2) bb9 -> (3) bb12 -> (4) return -> (5) ----------------------------------------------------------- With the CFG info, I could make the dominator tree, dominance frontier and finally Control Depedence Graph(CDG) shown as follows: ----------------------------------------------------------- (x) | CDG of (x) (0) | (1),(3),(4) (1) | - (2) | - (3) | (2),(3) (4) | (1),(3),(4) (5) | - ----------------------------------------------------------- Now that (3) is a member of CDG of (3) and (4) is a member of CDG of (4), they are found as loops and the loop which is composed of BB (2) and (3) is inside of the loop comprises BB (1)~(4). I can manage to re-construct 'for' with this CDG info but generally it might be much complex if 'for' is nested several times and mixed with 'if' and so on in the high-level code so it's not sure if I can succeed in those cases. Do you have any idea on how I can construct 'for' more systemically with this CDG info from LLVM bitcode? Thanks, SJL
Chris Lattner
2007-Aug-25 05:23 UTC
[LLVMdev] constructing 'for' statement from LLVM bitcode
On Aug 24, 2007, at 10:07 PM, Seung Jae Lee wrote:> Do you have any idea on how I can construct 'for' more systemically > with this CDG info from LLVM bitcode? >I strongly suggest looking at the Muchnick book: http:// www.amazon.com/Advanced-Compiler-Design-Implementation-Muchnick/dp/ 1558603204 It has a section on "structural analysis" that you will find useful. Why do you want "for statements"? -Chris
Wojciech Matyjewicz
2007-Sep-15 13:07 UTC
[LLVMdev] constructing 'for' statement from LLVM bitcode
Hi, Seung Jae Lee wrote:> Hello, guys. > I am trying to construct higher-level 'for' from the low-level LLVM bitcode(ver 1.9). > It's partly successful thanks to David A. Greene's advice suggested to use Control Dependence Graph(CDG). > I could find which BB contributes to form which loop with CDG.I know my post comes late... Are you still interested in this subject? No? Just ignore it;) I have been recently dealing with a task which is somehow similar to yours. I was trying to reconstruct simple 'for' loops and check if it's possible to automatically parallelize their execution.> For example, for this simple function: > ----------------------------------------------------------- > void bsloop(int n, int pM, Params* ps) { > int i, sim; > for (sim=0; sim < pM; sim++) { > for (i = 0; i < n; i++) { > Params* p = ps + i; > double eps = norm(0,1); > } > } > }I think it would be possible to reconstruct loops like this. However, it might be very difficult (impossible in general?) if they contained 'break', 'continue' or 'goto' statements. Have you tried the use LoopInfo pass for your aim? If you compile the example to bitcode without optimization and then choose optimizer passes manually: opt -mem2reg -instcombine -indvars -o bsloop-opt.bc bsloop.bc you will get the following code for bsloop() function (llvm and llvm-gcc-4.2 from SVN): -------------------------------------------------------- define void @bsloop(i32 %n, i32 %pM, %struct.Params* %ps) { entry: br label %bb16 bb1: ; preds = %bb8 %tmp4 = call i32 (...)* @norm( i32 0, i32 1 ) %indvar.next = add i32 %i.0, 1 br label %bb8 bb8: ; preds = %bb16, %bb1 %i.0 = phi i32 [ %indvar.next, %bb1 ], [ 0, %bb16 ] %tmp11 = icmp slt i32 %i.0, %n br i1 %tmp11, label %bb1, label %bb13 bb13: %indvar.next1 = add i32 %sim.0, 1 br label %bb16 bb16: %sim.0 = phi i32 [ 0, %entry ], [ %indvar.next1, %bb13 ] %tmp19 = icmp slt i32 %sim.0, %pM br i1 %tmp19, label %bb8, label %return return: ret void } -------------------------------------------------------- Then, running LoopInfo pass: opt -analyze -loops bsloop-opt.bc prints: -------------------------------------------------------- Printing analysis 'Natural Loop Construction' for function 'bsloop': Loop Containing: %bb16, %bb13, %bb8, %bb1 Loop Containing: %bb8, %bb1 -------------------------------------------------------- Using the LoopInfo interface allows you to gather much useful information. In the internal loop basic block bb8 is the header, %i.0 is the canonical induction variable (and the only induction variable). Loop has a trip count %n which is a loop invariant. Inspecting the header shows it doesn't produce any values that are used outside of it, except for the loop induction variable. Furthermore, it doesn't produce any side effects. Hence, in this simple case a 'for' loop could be reconstructed: FOR %i.0 = 0 TO %n - 1 STEP 1: %tmp4 = call i32 (...)* @norm( i32 0, i32 1 ) //%indvar.next is no longer needed I know such a heuristic applies to very simple 'for' loops only. However, "very simple 'for' loops" are quite popular;) --Wojtek