Hello LLVM developers,
I have a few questions regarding the passes that are run after instruction
selection and before register allocation. I am writing a scheduling pass
(modulo scheduling). Before I ask my questions, I will first try to explain
the approach I am taking.
- Currently, I am running the passes in the following order.
(-debug-pass=Structure output)
Remove unreachable machine basic blocks
Live Variable Analysis
Eliminate PHI nodes for register allocation
Two-Address instruction pass
Process Implicit Definitions.
MachineDominator Tree Construction
Machine Natural Loop Construction
Modulo scheduing <== modulo scheduling pass inserted here
Slot index numbering
Live Interval Analysis
MachineDominator Tree Construction
Machine Natural Loop Construction
Simple Register Coalescing
Calculate spill weights
Live Stack Slot Analysis
Virtual Register Map
Linear Scan Register Allocator
- The scheduling pass can schedule only single basic block loops. It only
looks for loops that have 1 or 2 basic blocks (the number of BBs depends on
whether or not the loop header and the latch are the same MBB). Basic blocks
outside the loop remain unchanged except for the ones that preceed and
succeed the loop. Also, basic blocks for prologue and epilogue are added to
the CFG.
- Prior to scheduling, redundant moves that were generated by the
phi-elimination pass and two-address instruction pass are removed and the
basic block in the loop is simplified as much as possible. For example, the
header BB of a loop is transformed as follows (note that information in
LiveVariables is not updated, so there may exist inconsistencies):
BB2: preheader, BB3: header & latch, BB4: exit
(before transformation)
BB#2: derived from LLVM BB %entry.bb_crit_edge
Predecessors according to CFG: BB#0
%reg1025<def> = MOVr %reg1034<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1024<def> = MOVr %reg1033<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1036<def> = MOVi 0, pred:14, pred:%reg0, opt:%reg0
%reg1038<def> = MOVr %reg1024<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1039<def> = MOVr %reg1025<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1040<def> = MOVr %reg1036<kill>, pred:14, pred:%reg0,
opt:%reg0
Successors according to CFG: BB#3
BB#3: derived from LLVM BB %bb
Predecessors according to CFG: BB#2 BB#3
%reg1026<def> = MOVr %reg1038<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1027<def> = MOVr %reg1039<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1028<def> = MOVr %reg1040<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1030<def> = MOVr %reg1027<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1037<def>, %reg1030<def> = LDR_POST %reg1030, %reg0, 4,
pred:14,
pred:%reg0
%reg1029<def> = ADDrr %reg1037<kill>, %reg1028, pred:14,
pred:%reg0,
opt:%reg0
%reg1031<def> = SUBri %reg1026<kill>, 1, pred:14,
pred:%reg0,
opt:%reg0
CMPzri %reg1031, 0, pred:14, pred:%reg0, %CPSR<imp-def>
%reg1038<def> = MOVr %reg1031<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1039<def> = MOVr %reg1030<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1040<def> = MOVr %reg1029<kill>, pred:14, pred:%reg0,
opt:%reg0
Bcc <BB#3>, pred:1, pred:%CPSR<kill>
Successors according to CFG: BB#4 BB#3
BB#4: derived from LLVM BB %bb.bb2_crit_edge
Predecessors according to CFG: BB#3
%reg1041<def> = MOVr %reg1028<kill>, pred:14, pred:%reg0,
opt:%reg0
Successors according to CFG: BB#5
(after transformation)
BB#3:
%reg1028<def> = MOVr %reg1040<kill>, pred:14, pred:%reg0,
opt:%reg0
%reg1037<def>, %reg1039<def> = LDR_POST %reg1039, %reg0, 4,
pred:14,
pred:%reg0
%reg1040<def> = ADDrr %reg1037<kill>, %reg1028, pred:14,
pred:%reg0,
opt:%reg0
%reg1038<def> = SUBri %reg1038<kill>, 1, pred:14,
pred:%reg0,
opt:%reg0
CMPzri %reg1038, 0, pred:14, pred:%reg0, %CPSR<imp-def>
Bcc <BB#3>, pred:1, pred:%CPSR<kill>
$138 = void
Here are my questions:
1. Which passes after the scheduling pass can be run without modification? I
suspect LiveIntervalAnalysis will not be able to handle the transformed BB
judging from the way it handles two-address code and phijoins. Will the
other passes need to be changed as well?
2. Is the scheduling pass inserted in the right position? Currently the
scheduling pass is run right before Slot index numbering and LiveInterval
analysis, since I thought it would required a lot of work to fix the indexes
and intervals if the scheduling pass were run after these two passes.
3. If the scheduling pass does local register allocation too, is there a way
to tell the register allocation pass that is run later not to touch it?
Any advice, comments and suggestions are appreciated.
Thank you.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20100811/29c23dc3/attachment.html>
Jakob Stoklund Olesen
2010-Aug-12 02:51 UTC
[LLVMdev] Need advice on writing scheduling pass
On Aug 11, 2010, at 12:14 PM, Akira Hatanaka wrote:> Remove unreachable machine basic blocks > Live Variable Analysis > Eliminate PHI nodes for register allocation > Two-Address instruction pass > Process Implicit Definitions. > MachineDominator Tree Construction > Machine Natural Loop Construction > Modulo scheduing <== modulo scheduling pass inserted here > Slot index numbering > Live Interval Analysis > MachineDominator Tree Construction > Machine Natural Loop Construction > Simple Register Coalescing > Calculate spill weights > Live Stack Slot Analysis > Virtual Register Map > Linear Scan Register Allocator[...]> Here are my questions: > 1. Which passes after the scheduling pass can be run without modification? I suspect LiveIntervalAnalysis will not be able to handle the transformed BB judging from the way it handles two-address code and phijoins. Will the other passes need to be changed as well?"Simple Register Coalescing" can handle any code, but the live intervals must be correct.> 2. Is the scheduling pass inserted in the right position? Currently the scheduling pass is run right before Slot index numbering and LiveInterval analysis, since I thought it would required a lot of work to fix the indexes and intervals if the scheduling pass were run after these two passes.I recommend that you do not edit machine code between "Live Variable Analysis" and "Live Interval Analysis". LiveIntervals cannot handle general code, it requires something that is SSA form except for the specific edits from phi-elim and 2-addr. It also requires kill flags and the live variable analysis information to be correct. If you insert your pass before LiveVariables, you must preserve SSA form. If you insert your pass after LiveIntervals, you must update the intervals manually and correctly. If you don't, everything breaks. It's a pain, sorry!> 3. If the scheduling pass does local register allocation too, is there a way to tell the register allocation pass that is run later not to touch it?Yes, simply replace the virtual registers with the allocated physical registers. Then the register allocator won't touch them. Remember to create live intervals for the physical registers. That is how the register allocator detects interference.> Any advice, comments and suggestions are appreciated.It is much easier to edit machine code while it is in SSA form. That is before LiveVariables. /jakob