Fernando Magno Quintao Pereira
2006-Apr-29 22:42 UTC
[LLVMdev] Register allocation in LLVM
Hello, all, I want to implement the register allocation algorithm described in the paper "Register Allocation via Coloring of Chordal Graphs, APLAS'05" in LLVM. This is a graph coloring algorithm that can find an optimal coloring of the interference graph in most of the cases. I've downloaded LLVM last week, and started studying the code. Basically, I have to implement: 1) A new register allocation pass, similar to the class RA in RegAllocLocal.cpp, for instance; 2) Replace the phi deconstruction algorithm, which I found in the class PNE (PHIElimination.cpp); I would like to implement an algorithm that uses XOR instructions instead of copy instructions to destroy phi functions. It is the algorithm described in "Optimal register allocation for SSA-form programs in polynomial time, Inf. Process. Lett, 98(4)", or in "Register Allocation for Programs in SSA-Form, CC 2006". Basically, this algorithm takes into consideration the results of the register allocation phase, and adds one permutation of registers to each edge that reaches the basic block where phi functions were destroyed. With three xor, we can implement a swap without an extra register. For instance, if I have (after RegAlloc), these PHI functions at the header of block B_d: a(r1) <- (a1(r1), a2(r2)) b(r2) <- (b1(r2), b2(r1)) assuming that a2, b2 come from block B_s, at the end of block B_s I would have to add the permutation: (r1, r2) <- perm (r2, r1) which I can implement with six Xor instructions. Well, I would like to get some comments about the best way to implement this in LLVM. Should I change the function "copyRegToReg" in X86RegisterInfo.cpp? I am afraid this function is used in other situations where copy instructions are expected. On the other hand, if I create a separate function, and change PHIElimination.cpp, I am afraid to lose retargability. Also, if possible, I would like to know if, besides the source code of the register allocation classes (which is very well commented, and very clean), if there is any online description of the register allocation interface. For instance, concerning register allocation: - To send registers to memory: RegInfo->storeRegToStackSlot(MBB, I, PhysReg, FrameIndex, RC); - To bring registers from memory: RegInfo->loadRegFromStackSlot(MBB, MI, PhysReg, FrameIndex, RC); - Given instruction i, virtual v, and machine reg m, allocate m to v at i: MI->SetMachineOperandReg(i, physReg); And PHI deconstruction: - to remove instructions from Basic Blocks: MachineInstr *MPhi = MBB.remove(MBB.begin()); delete MPhi; - to create new instructions: BuildMI, http://llvm.org/docs/CodeGenerator.html - to discover the origin block of each operand in the phi function: MachineBasicBlock &opBlock *MPhi->getOperand(i).getMachineBasicBlock(); Thank you in advance, Fernando
On Sat, 29 Apr 2006, Fernando Magno Quintao Pereira wrote:> I want to implement the register allocation algorithm described in the > paper "Register Allocation via Coloring of Chordal Graphs, APLAS'05" in > LLVM. This is a graph coloring algorithm that can find an optimal coloring > of the interference graph in most of the cases. I've downloaded LLVM last > week, and started studying the code.Cool, that looks like a nice algorithm!> Basically, I have to implement: > > 1) A new register allocation pass, similar to the class RA in > RegAllocLocal.cpp, for instance;Yup.> 2) Replace the phi deconstruction algorithm, which I found in the class > PNE (PHIElimination.cpp); I would like to implement an algorithm that > uses XOR instructions instead of copy instructions to destroy phi > functions. It is the algorithm described in "Optimal register allocation > for SSA-form programs in polynomial time, Inf. Process. Lett, 98(4)", or > in "Register Allocation for Programs in SSA-Form, CC 2006". Basically, > this algorithm takes into consideration the results of the register > allocation phase, and adds one permutation of registers to each edge that > reaches the basic block where phi functions were destroyed. With three > xor, we can implement a swap without an extra register. For instance, if I > have (after RegAlloc), these PHI functions at the header of block B_d: > a(r1) <- (a1(r1), a2(r2)) > b(r2) <- (b1(r2), b2(r1)) > assuming that a2, b2 come from block B_s, at the end of block B_s I would > have to add the permutation: (r1, r2) <- perm (r2, r1) > which I can implement with six Xor instructions.Ok, seems complicated, but potentially cute. Note that most copy instructions inserted by the phi elimination phase are coallesced away: replacing them with xor instructions would prevent that.> Well, I would like to get some comments about the best way to implement > this in LLVM. Should I change the function "copyRegToReg" in > X86RegisterInfo.cpp?No, certainly not.> I am afraid this function is used in other > situations where copy instructions are expected.Yup, that it would.> On the other hand, if I create a separate function, and change > PHIElimination.cpp, I am afraid to lose retargability.If you need to do this, I'd suggest adding a new virtual method "insertRegisterXor" or something, which works like copyRegToReg. It should be very straight-forward to implement for all targets. I would suggest starting out by implementing it on whatever target you are on, and make it abort on all others. When it comes time to make it work on other targets, the various target maintainers will help you out.> Also, if possible, I would like to know if, besides the source code of > the register allocation classes (which is very well commented, and very > clean), if there is any online description of the register allocation > interface.Unfortunately, not really. The high-level overview of the code generator is here, but it is lacking many details: http://llvm.org/docs/CodeGenerator.html Any contributions to make the documentation better are very welcome! The best way to learn stuff is to look for examples in the existing passes and by asking questions here.> For instance, concerning register allocation: > > - To send registers to memory: > RegInfo->storeRegToStackSlot(MBB, I, PhysReg, FrameIndex, RC); > - To bring registers from memory: > RegInfo->loadRegFromStackSlot(MBB, MI, PhysReg, FrameIndex, RC); > - Given instruction i, virtual v, and machine reg m, allocate m to v at i: > MI->SetMachineOperandReg(i, physReg); > And PHI deconstruction: > > - to remove instructions from Basic Blocks: > MachineInstr *MPhi = MBB.remove(MBB.begin()); > delete MPhi; > > - to create new instructions: > BuildMI, http://llvm.org/docs/CodeGenerator.html > > - to discover the origin block of each operand in the phi function: > MachineBasicBlock &opBlock > *MPhi->getOperand(i).getMachineBasicBlock();Yup! -Chris -- http://nondot.org/sabre/ http://llvm.org/
On Apr 30, 2006, at 10:42 PM, Chris Lattner wrote:> On Sat, 29 Apr 2006, Fernando Magno Quintao Pereira wrote: >> I want to implement the register allocation algorithm described in >> the >> paper "Register Allocation via Coloring of Chordal Graphs, >> APLAS'05" in >> LLVM. This is a graph coloring algorithm that can find an optimal >> coloring >> of the interference graph in most of the cases. I've downloaded >> LLVM last >> week, and started studying the code. > > Cool, that looks like a nice algorithm! > >> Basically, I have to implement: >> >> 1) A new register allocation pass, similar to the class RA in >> RegAllocLocal.cpp, for instance; > > Yup. > >> 2) Replace the phi deconstruction algorithm, which I found in the >> class >> PNE (PHIElimination.cpp); I would like to implement an algorithm that >> uses XOR instructions instead of copy instructions to destroy phi >> functions. It is the algorithm described in "Optimal register >> allocation >> for SSA-form programs in polynomial time, Inf. Process. Lett, 98 >> (4)", or >> in "Register Allocation for Programs in SSA-Form, CC 2006". >> Basically, >> this algorithm takes into consideration the results of the register >> allocation phase, and adds one permutation of registers to each >> edge that >> reaches the basic block where phi functions were destroyed. With >> three >> xor, we can implement a swap without an extra register. For >> instance, if I >> have (after RegAlloc), these PHI functions at the header of block >> B_d: >> a(r1) <- (a1(r1), a2(r2)) >> b(r2) <- (b1(r2), b2(r1)) >> assuming that a2, b2 come from block B_s, at the end of block B_s >> I would >> have to add the permutation: (r1, r2) <- perm (r2, r1) >> which I can implement with six Xor instructions. > > Ok, seems complicated, but potentially cute. Note that most copy > instructions inserted by the phi elimination phase are coallesced > away: replacing them with xor instructions would prevent that.Also note that the 'xor trick' does not work if both registers happen to have the same contents (it zeros them). It would also be fairly tedious to lower the 'xor trick' to a swap instruction for machines that have them (since x86 has one, that would be most machines). Such an instruction can happen in rename, which makes it about the least resource intensive instruction besides nop.> >> Well, I would like to get some comments about the best way to >> implement this in LLVM. Should I change the function >> "copyRegToReg" in X86RegisterInfo.cpp? > > No, certainly not. > >> I am afraid this function is used in other situations where copy >> instructions are expected. > > Yup, that it would. > >> On the other hand, if I create a separate function, and change >> PHIElimination.cpp, I am afraid to lose retargability. > > If you need to do this, I'd suggest adding a new virtual method > "insertRegisterXor" or something, which works like copyRegToReg. > It should be very straight-forward to implement for all targets. I > would suggest starting out by implementing it on whatever target > you are on, and make it abort on all others. When it comes time to > make it work on other targets, the various target maintainers will > help you out. > >> Also, if possible, I would like to know if, besides the source >> code of the register allocation classes (which is very well >> commented, and very clean), if there is any online description of >> the register allocation interface. > > Unfortunately, not really. The high-level overview of the code > generator is here, but it is lacking many details: > http://llvm.org/docs/CodeGenerator.html > > Any contributions to make the documentation better are very welcome! > > The best way to learn stuff is to look for examples in the existing > passes and by asking questions here. > >> For instance, concerning register allocation: >> >> - To send registers to memory: >> RegInfo->storeRegToStackSlot(MBB, I, PhysReg, FrameIndex, RC); >> - To bring registers from memory: >> RegInfo->loadRegFromStackSlot(MBB, MI, PhysReg, FrameIndex, RC); >> - Given instruction i, virtual v, and machine reg m, allocate m to >> v at i: >> MI->SetMachineOperandReg(i, physReg); >> And PHI deconstruction: >> >> - to remove instructions from Basic Blocks: >> MachineInstr *MPhi = MBB.remove(MBB.begin()); >> delete MPhi; >> >> - to create new instructions: >> BuildMI, http://llvm.org/docs/CodeGenerator.html >> >> - to discover the origin block of each operand in the phi function: >> MachineBasicBlock &opBlock >> *MPhi->getOperand(i).getMachineBasicBlock(); > > Yup! > > -Chris > > -- > http://nondot.org/sabre/ > http://llvm.org/ > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev