thr3ads.net - llvm dev - [LLVMdev] New Register Allocation Algorithm [Jul 2007]

If this information is useful, please help other people find it:
Share via:

Fernando Magno Quintao Pereira

2007-Jul-18 17:57 UTC

[LLVMdev] New Register Allocation Algorithm

Dear LLVMers,

     we here in the UCLA compiler's lab have an implementation of a very 
cool register allocator in LLVM. It finds an optimal register assignment 
for the x86 machine, via live range splitting, that is, if it is possible 
to have a register assignment without register spilling, our allocator can 
find it.
We have the short and long versions of our paper here:

http://compilers.cs.ucla.edu/fernando/projects/puzzles/

     Our algorithm is polynomial time, and up to now, it is adding about 
15% more to the total compilation time of LLVM. It does not iterate like 
the current implementation of RegAllocLinearScan (LN), and does not have 
to worry about conflicts between virtuals and physical registers, two 
things that bring the running time of RegAllocLinearScan down. I am not an 
expert in C++, but I think that with a careful implementation, it would 
even be possible to match the compilation time of LN. Indeed, for the very 
large benchmarks, the algorithm sometimes is faster.
     We have not implemented instruction folding yet. Nothing prevent us of 
implementing it, it is just a matter of time (and patience). Currently the 
code that our algorithm produces is faster than LN, if I use 
-disable-spill-fusing, and sometimes it is faster even when I let LN to 
use instruction folding.
     Of course, because the algorithm is very new, there are a lot of 
things yet to improve, and we would like to hear comments from you guys. 
We did not do any global coalescing, and our spilling heuristics is the 
simplest possible: when having to decide among register to spill, evict 
those that have he longest interval. I had implemented a pass to build the 
dominator tree of MachineBasicBlocks, but my pass uses the naive quadratic 
algorithm, and I have this in my list of things to improve. Also, our 
algorithm only works for x86, because I hard coded the relations between 
aliased registers in the target independent part of the implementation. 
This is also something to improve. To implement the allocator, I had to 
change the implementation of LLVM code here and there: adding a function 
to insert swaps in MRegisterInfo, changing the coalescing routine in 
LiveIntervalAnalysis, etc.

     Anyway, if you guys could take a look into our paper, I think you 
would like it. The idea is quite new, and if possible, we would like to 
produce a version that could be distributed with LLVM. But in this case, 
the work is huge, for the quality of my implementation is way below the 
quality of the implementation of the current linear scan algorithm.

All the best,

Fernando

Evan Cheng

2007-Jul-19 04:09 UTC

head link

[LLVMdev] New Register Allocation Algorithm

Hi Fernando,

On Jul 18, 2007, at 10:57 AM, Fernando Magno Quintao Pereira wrote:
>
> Dear LLVMers,
>
>      we here in the UCLA compiler's lab have an implementation of a  
> very
> cool register allocator in LLVM. It finds an optimal register  
> assignment
> for the x86 machine, via live range splitting, that is, if it is  
> possible
> to have a register assignment without register spilling, our  
> allocator can
> find it.
> We have the short and long versions of our paper here:
>
> http://compilers.cs.ucla.edu/fernando/projects/puzzles/
>
>      Our algorithm is polynomial time, and up to now, it is adding  
> about
> 15% more to the total compilation time of LLVM. It does not iterate  
> like
> the current implementation of RegAllocLinearScan (LN), and does not  
> have
> to worry about conflicts between virtuals and physical registers, two
> things that bring the running time of RegAllocLinearScan down. I am  
> not an
> expert in C++, but I think that with a careful implementation, it  
> would
> even be possible to match the compilation time of LN. Indeed, for  
> the very
> large benchmarks, the algorithm sometimes is faster.
I've only read the abstract. But it does sound very interesting!

 From the web page, it appears it's not currently passing 100% of the  
llvm tests. What's your plan to get it there?

>      We have not implemented instruction folding yet. Nothing  
> prevent us of
> implementing it, it is just a matter of time (and patience).  
> Currently the
> code that our algorithm produces is faster than LN, if I use
> -disable-spill-fusing, and sometimes it is faster even when I let  
> LN to
> use instruction folding.
Folding is a fairly important (especially for X86). But it's not  
particularly difficult to implement given the hooks are there.
>      Of course, because the algorithm is very new, there are a lot of
> things yet to improve, and we would like to hear comments from you  
> guys.
> We did not do any global coalescing, and our spilling heuristics is  
> the
> simplest possible: when having to decide among register to spill,  
> evict
> those that have he longest interval. I had implemented a pass to  
> build the
> dominator tree of MachineBasicBlocks, but my pass uses the naive  
> quadratic
> algorithm, and I have this in my list of things to improve. Also, our
> algorithm only works for x86, because I hard coded the relations  
> between
> aliased registers in the target independent part of the  
> implementation.
> This is also something to improve. To implement the allocator, I  
> had to
> change the implementation of LLVM code here and there: adding a  
> function
> to insert swaps in MRegisterInfo, changing the coalescing routine in
> LiveIntervalAnalysis, etc.
>
Do you have a plan to improve the MachineBasicBlock dominator pass?  
This is something we can use right away. :-)
>      Anyway, if you guys could take a look into our paper, I think you
> would like it. The idea is quite new, and if possible, we would  
> like to
> produce a version that could be distributed with LLVM. But in this  
> case,
> the work is huge, for the quality of my implementation is way below  
> the
> quality of the implementation of the current linear scan algorithm.
I'll take a detailed look as soon as possible. Thanks,

Evan
>
> All the best,
>
> Fernando
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Fernando Magno Quintao Pereira

2007-Jul-19 05:22 UTC

head link

[LLVMdev] New Register Allocation Algorithm

> From the web page, it appears it's not currently passing 100% of the
> llvm tests. What's your plan to get it there?
There are some benchmarks that I cannot compile, but they are very few. I 
am still improving the implementation, and each day I get some more 
compiled. Some parts of the register allocator are very complex, like 
implementing swaps for registers that may alias, reusing colors across the 
dominance tree, etc, and bugs are still likely to linger in the code for a 
while, but I will be working on the allocator, and if someone else is 
interested, I can help in understanding the code and the concepts that 
underline it.
> Do you have a plan to improve the MachineBasicBlock dominator pass?
> This is something we can use right away. :-)
Yes, I really want to improve that pass, but not right away. Quadratic 
structures are very bad, you know. Currently, to compute the dominator 
tree takes longer than to perform register allocation. I will fix that as 
soon as I have time.

Thanks for looking into it,

Fernando

Possibly Parallel Threads

Search for more seemingly similar threads

llvm dev - Jul 2007 - [LLVMdev] New Register Allocation Algorithm

[LLVMdev] New Register Allocation Algorithm

[LLVMdev] New Register Allocation Algorithm

[LLVMdev] New Register Allocation Algorithm

Possibly Parallel Threads