Fernando Magno Quintao Pereira
2007-Jul-18 17:57 UTC
[LLVMdev] New Register Allocation Algorithm
Dear LLVMers, we here in the UCLA compiler's lab have an implementation of a very cool register allocator in LLVM. It finds an optimal register assignment for the x86 machine, via live range splitting, that is, if it is possible to have a register assignment without register spilling, our allocator can find it. We have the short and long versions of our paper here: http://compilers.cs.ucla.edu/fernando/projects/puzzles/ Our algorithm is polynomial time, and up to now, it is adding about 15% more to the total compilation time of LLVM. It does not iterate like the current implementation of RegAllocLinearScan (LN), and does not have to worry about conflicts between virtuals and physical registers, two things that bring the running time of RegAllocLinearScan down. I am not an expert in C++, but I think that with a careful implementation, it would even be possible to match the compilation time of LN. Indeed, for the very large benchmarks, the algorithm sometimes is faster. We have not implemented instruction folding yet. Nothing prevent us of implementing it, it is just a matter of time (and patience). Currently the code that our algorithm produces is faster than LN, if I use -disable-spill-fusing, and sometimes it is faster even when I let LN to use instruction folding. Of course, because the algorithm is very new, there are a lot of things yet to improve, and we would like to hear comments from you guys. We did not do any global coalescing, and our spilling heuristics is the simplest possible: when having to decide among register to spill, evict those that have he longest interval. I had implemented a pass to build the dominator tree of MachineBasicBlocks, but my pass uses the naive quadratic algorithm, and I have this in my list of things to improve. Also, our algorithm only works for x86, because I hard coded the relations between aliased registers in the target independent part of the implementation. This is also something to improve. To implement the allocator, I had to change the implementation of LLVM code here and there: adding a function to insert swaps in MRegisterInfo, changing the coalescing routine in LiveIntervalAnalysis, etc. Anyway, if you guys could take a look into our paper, I think you would like it. The idea is quite new, and if possible, we would like to produce a version that could be distributed with LLVM. But in this case, the work is huge, for the quality of my implementation is way below the quality of the implementation of the current linear scan algorithm. All the best, Fernando
Hi Fernando, On Jul 18, 2007, at 10:57 AM, Fernando Magno Quintao Pereira wrote:> > Dear LLVMers, > > we here in the UCLA compiler's lab have an implementation of a > very > cool register allocator in LLVM. It finds an optimal register > assignment > for the x86 machine, via live range splitting, that is, if it is > possible > to have a register assignment without register spilling, our > allocator can > find it. > We have the short and long versions of our paper here: > > http://compilers.cs.ucla.edu/fernando/projects/puzzles/ > > Our algorithm is polynomial time, and up to now, it is adding > about > 15% more to the total compilation time of LLVM. It does not iterate > like > the current implementation of RegAllocLinearScan (LN), and does not > have > to worry about conflicts between virtuals and physical registers, two > things that bring the running time of RegAllocLinearScan down. I am > not an > expert in C++, but I think that with a careful implementation, it > would > even be possible to match the compilation time of LN. Indeed, for > the very > large benchmarks, the algorithm sometimes is faster.I've only read the abstract. But it does sound very interesting! From the web page, it appears it's not currently passing 100% of the llvm tests. What's your plan to get it there?> We have not implemented instruction folding yet. Nothing > prevent us of > implementing it, it is just a matter of time (and patience). > Currently the > code that our algorithm produces is faster than LN, if I use > -disable-spill-fusing, and sometimes it is faster even when I let > LN to > use instruction folding.Folding is a fairly important (especially for X86). But it's not particularly difficult to implement given the hooks are there.> Of course, because the algorithm is very new, there are a lot of > things yet to improve, and we would like to hear comments from you > guys. > We did not do any global coalescing, and our spilling heuristics is > the > simplest possible: when having to decide among register to spill, > evict > those that have he longest interval. I had implemented a pass to > build the > dominator tree of MachineBasicBlocks, but my pass uses the naive > quadratic > algorithm, and I have this in my list of things to improve. Also, our > algorithm only works for x86, because I hard coded the relations > between > aliased registers in the target independent part of the > implementation. > This is also something to improve. To implement the allocator, I > had to > change the implementation of LLVM code here and there: adding a > function > to insert swaps in MRegisterInfo, changing the coalescing routine in > LiveIntervalAnalysis, etc. >Do you have a plan to improve the MachineBasicBlock dominator pass? This is something we can use right away. :-)> Anyway, if you guys could take a look into our paper, I think you > would like it. The idea is quite new, and if possible, we would > like to > produce a version that could be distributed with LLVM. But in this > case, > the work is huge, for the quality of my implementation is way below > the > quality of the implementation of the current linear scan algorithm.I'll take a detailed look as soon as possible. Thanks, Evan> > All the best, > > Fernando > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Fernando Magno Quintao Pereira
2007-Jul-19 05:22 UTC
[LLVMdev] New Register Allocation Algorithm
> From the web page, it appears it's not currently passing 100% of the > llvm tests. What's your plan to get it there?There are some benchmarks that I cannot compile, but they are very few. I am still improving the implementation, and each day I get some more compiled. Some parts of the register allocator are very complex, like implementing swaps for registers that may alias, reusing colors across the dominance tree, etc, and bugs are still likely to linger in the code for a while, but I will be working on the allocator, and if someone else is interested, I can help in understanding the code and the concepts that underline it.> Do you have a plan to improve the MachineBasicBlock dominator pass? > This is something we can use right away. :-)Yes, I really want to improve that pass, but not right away. Quadratic structures are very bad, you know. Currently, to compute the dominator tree takes longer than to perform register allocation. I will fix that as soon as I have time. Thanks for looking into it, Fernando
Possibly Parallel Threads
- [LLVMdev] New Register Allocation Algorithm
- [LLVMdev] LiveInterval spilling (was LiveInterval Splitting & SubRegisters)
- [LLVMdev] LiveInterval spilling (was LiveInterval Splitting & SubRegisters)
- [LLVMdev] Recalculating live intervals
- [LLVMdev] Spilling register and frame indices