Hello LLVM developers,

I have a few questions regarding the passes that are run after instruction selection and before register allocation. I am writing a scheduling pass (modulo scheduling). Before I ask my questions, I will first explain the approach I am taking.

- Currently, I am running the passes in the following order (-debug-pass=Structure output):

    Remove unreachable machine basic blocks
    Live Variable Analysis
    Eliminate PHI nodes for register allocation
    Two-Address instruction pass
    Process Implicit Definitions.
    MachineDominator Tree Construction
    Machine Natural Loop Construction
    Modulo scheduing   <== modulo scheduling pass inserted here
    Slot index numbering
    Live Interval Analysis
    MachineDominator Tree Construction
    Machine Natural Loop Construction
    Simple Register Coalescing
    Calculate spill weights
    Live Stack Slot Analysis
    Virtual Register Map
    Linear Scan Register Allocator

- The scheduling pass can schedule only loops whose body is a single basic block. It only looks for loops that have 1 or 2 basic blocks (the number of BBs depends on whether or not the loop header and the latch are the same MBB). Basic blocks outside the loop remain unchanged, except for the ones that precede and succeed the loop. Also, basic blocks for the prologue and epilogue are added to the CFG.

- Prior to scheduling, redundant moves that were generated by the phi-elimination pass and the two-address instruction pass are removed, and the basic block in the loop is simplified as much as possible.

For example, the header BB of a loop is transformed as follows (note that the information in LiveVariables is not updated, so there may be inconsistencies):

BB2: preheader, BB3: header & latch, BB4: exit

(before transformation)

BB#2: derived from LLVM BB %entry.bb_crit_edge
    Predecessors according to CFG: BB#0
        %reg1025<def> = MOVr %reg1034<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1024<def> = MOVr %reg1033<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1036<def> = MOVi 0, pred:14, pred:%reg0, opt:%reg0
        %reg1038<def> = MOVr %reg1024<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1039<def> = MOVr %reg1025<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1040<def> = MOVr %reg1036<kill>, pred:14, pred:%reg0, opt:%reg0
    Successors according to CFG: BB#3

BB#3: derived from LLVM BB %bb
    Predecessors according to CFG: BB#2 BB#3
        %reg1026<def> = MOVr %reg1038<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1027<def> = MOVr %reg1039<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1028<def> = MOVr %reg1040<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1030<def> = MOVr %reg1027<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1037<def>, %reg1030<def> = LDR_POST %reg1030, %reg0, 4, pred:14, pred:%reg0
        %reg1029<def> = ADDrr %reg1037<kill>, %reg1028, pred:14, pred:%reg0, opt:%reg0
        %reg1031<def> = SUBri %reg1026<kill>, 1, pred:14, pred:%reg0, opt:%reg0
        CMPzri %reg1031, 0, pred:14, pred:%reg0, %CPSR<imp-def>
        %reg1038<def> = MOVr %reg1031<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1039<def> = MOVr %reg1030<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1040<def> = MOVr %reg1029<kill>, pred:14, pred:%reg0, opt:%reg0
        Bcc <BB#3>, pred:1, pred:%CPSR<kill>
    Successors according to CFG: BB#4 BB#3

BB#4: derived from LLVM BB %bb.bb2_crit_edge
    Predecessors according to CFG: BB#3
        %reg1041<def> = MOVr %reg1028<kill>, pred:14, pred:%reg0, opt:%reg0
    Successors according to CFG: BB#5

(after transformation)

BB#3:
        %reg1028<def> = MOVr %reg1040<kill>, pred:14, pred:%reg0, opt:%reg0
        %reg1037<def>, %reg1039<def> = LDR_POST %reg1039, %reg0, 4, pred:14, pred:%reg0
        %reg1040<def> = ADDrr %reg1037<kill>, %reg1028, pred:14, pred:%reg0, opt:%reg0
        %reg1038<def> = SUBri %reg1038<kill>, 1, pred:14, pred:%reg0, opt:%reg0
        CMPzri %reg1038, 0, pred:14, pred:%reg0, %CPSR<imp-def>
        Bcc <BB#3>, pred:1, pred:%CPSR<kill>

Here are my questions:

1. Which passes after the scheduling pass can be run without modification? I suspect LiveIntervalAnalysis will not be able to handle the transformed BB, judging from the way it handles two-address code and phijoins. Will the other passes need to be changed as well?

2. Is the scheduling pass inserted in the right position? Currently the scheduling pass is run right before Slot index numbering and LiveInterval analysis, since I thought it would require a lot of work to fix the indexes and intervals if the scheduling pass were run after these two passes.

3. If the scheduling pass does local register allocation too, is there a way to tell the register allocation pass that is run later not to touch it?

Any advice, comments and suggestions are appreciated.

Thank you.
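
For illustration, the loop filter described in the post (one block when the header and latch are the same MBB, two blocks otherwise) could be sketched roughly as follows. This is only a toy model under stated assumptions: ToyLoop and isCandidateLoop are hypothetical names invented here, blocks are identified by plain integers, and none of this is LLVM API or the poster's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a machine loop: just block numbers.
struct ToyLoop {
    std::vector<int> blocks; // numbers of the basic blocks inside the loop
    int header;              // block number of the loop header
    int latch;               // block number of the block that branches back
};

// True if 'block' is one of the loop's blocks.
static bool contains(const ToyLoop& loop, int block) {
    return std::find(loop.blocks.begin(), loop.blocks.end(), block) != loop.blocks.end();
}

// Assumed filter: accept a 1-block loop whose header is also the latch,
// or a 2-block loop made of a distinct header and latch.
bool isCandidateLoop(const ToyLoop& loop) {
    assert(!loop.blocks.empty() && "a loop has at least one block");
    if (!contains(loop, loop.header) || !contains(loop, loop.latch))
        return false;
    const std::size_t n = loop.blocks.size();
    if (loop.header == loop.latch)
        return n == 1;
    return n == 2;
}
```

With the example from the post, where BB#3 is both header and latch, `isCandidateLoop(ToyLoop{{3}, 3, 3})` would accept the loop.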
Jakob Stoklund Olesen
2010-Aug-12 02:51 UTC
[LLVMdev] Need advice on writing scheduling pass
On Aug 11, 2010, at 12:14 PM, Akira Hatanaka wrote:

> Remove unreachable machine basic blocks
> Live Variable Analysis
> Eliminate PHI nodes for register allocation
> Two-Address instruction pass
> Process Implicit Definitions.
> MachineDominator Tree Construction
> Machine Natural Loop Construction
> Modulo scheduing <== modulo scheduling pass inserted here
> Slot index numbering
> Live Interval Analysis
> MachineDominator Tree Construction
> Machine Natural Loop Construction
> Simple Register Coalescing
> Calculate spill weights
> Live Stack Slot Analysis
> Virtual Register Map
> Linear Scan Register Allocator

[...]

> Here are my questions:
> 1. Which passes after the scheduling pass can be run without modification? I suspect LiveIntervalAnalysis will not be able to handle the transformed BB judging from the way it handles two-address code and phijoins. Will the other passes need to be changed as well?

"Simple Register Coalescing" can handle any code, but the live intervals must be correct.

> 2. Is the scheduling pass inserted in the right position? Currently the scheduling pass is run right before Slot index numbering and LiveInterval analysis, since I thought it would require a lot of work to fix the indexes and intervals if the scheduling pass were run after these two passes.

I recommend that you do not edit machine code between "Live Variable Analysis" and "Live Interval Analysis".

LiveIntervals cannot handle general code; it requires code that is in SSA form except for the specific edits made by phi-elim and 2-addr. It also requires the kill flags and the live variable analysis information to be correct.

If you insert your pass before LiveVariables, you must preserve SSA form.

If you insert your pass after LiveIntervals, you must update the intervals manually and correctly. If you don't, everything breaks. It's a pain, sorry!

> 3. If the scheduling pass does local register allocation too, is there a way to tell the register allocation pass that is run later not to touch it?

Yes, simply replace the virtual registers with the allocated physical registers. Then the register allocator won't touch them.

Remember to create live intervals for the physical registers. That is how the register allocator detects interference.

> Any advice, comments and suggestions are appreciated.

It is much easier to edit machine code while it is in SSA form. That is before LiveVariables.

/jakob
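
As a rough illustration of the placement advice above, the three cases can be written down as a small check over the pass listing. This is a sketch under assumptions, not LLVM code: Placement, examplePipeline and classifyInsertion are hypothetical names, and passes are modelled as the plain strings printed by -debug-pass=Structure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical labels for the three situations described in the reply.
enum class Placement {
    BeforeLiveVariables, // machine code is still SSA; the pass must preserve SSA form
    ForbiddenWindow,     // between LiveVariables and LiveIntervals; do not edit code here
    AfterLiveIntervals   // the pass must update the live intervals by hand
};

// The pass order quoted in the thread, without the modulo scheduling pass.
std::vector<std::string> examplePipeline() {
    return {"Remove unreachable machine basic blocks",
            "Live Variable Analysis",
            "Eliminate PHI nodes for register allocation",
            "Two-Address instruction pass",
            "Process Implicit Definitions.",
            "MachineDominator Tree Construction",
            "Machine Natural Loop Construction",
            "Slot index numbering",
            "Live Interval Analysis",
            "MachineDominator Tree Construction",
            "Machine Natural Loop Construction",
            "Simple Register Coalescing",
            "Calculate spill weights",
            "Live Stack Slot Analysis",
            "Virtual Register Map",
            "Linear Scan Register Allocator"};
}

// Classify inserting a new pass immediately before pipeline[index]
// (index == pipeline.size() means appending at the end).
Placement classifyInsertion(const std::vector<std::string>& pipeline, std::size_t index) {
    const auto lv = std::find(pipeline.begin(), pipeline.end(), "Live Variable Analysis");
    const auto li = std::find(pipeline.begin(), pipeline.end(), "Live Interval Analysis");
    assert(lv != pipeline.end() && li != pipeline.end() && lv < li);
    assert(index <= pipeline.size());
    const std::size_t lvPos = static_cast<std::size_t>(lv - pipeline.begin());
    const std::size_t liPos = static_cast<std::size_t>(li - pipeline.begin());
    if (index <= lvPos) return Placement::BeforeLiveVariables;
    if (index <= liPos) return Placement::ForbiddenWindow;
    return Placement::AfterLiveIntervals;
}
```

In this model, the position used in the original post (immediately before "Slot index numbering", index 7) falls in the forbidden window, while index 1 or lower corresponds to the SSA-preserving placement recommended above.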