Roman Levenstein
2009-Jan-09 09:36 UTC
[LLVMdev] Is it possible to use the SimpleRegisterCoalescing pass in an iterative way?
Hi, I'm implementing some variations of graph-coloring register allocators for LLVM. Many of them perform their phases (e.g. coalescing, graph simplification, spilling, color selection) in an iterative way. Since LLVM provides an implementation of the coalescing in the SimpleRegisterCoalescing class already, I would like to reuse it (even though I could of course create my own coalescing implementation). But this class is a MachineFunctionPass. I'm wondering, if it is possible to use this pass from another pass (i.e. regalloc) in an iterative way: I need to invoke it multiple times for the same MachineFunction at certain places in my regalloc pass implementation. May be some sort of analysis update can be performed? Or is it designed to be used just once per MachineFunction? Also overall, it would be interesting to know how LLVM supports such kind of iterative algorithms? What are the ways to formulate the independent phases of those algorithms as reusable passes that could be used in an iterative way? Thanks, -Roman
Evan Cheng
2009-Jan-09 19:55 UTC
[LLVMdev] Is it possible to use the SimpleRegisterCoalescing pass in an iterative way?
SimpleRegisterCoalescer is a super class of RegisterCoalescer. A RegisterCoalescer can be run both as a pass or directly. Perhaps the answer is as simple as implementing coalesceFunction. Evan On Jan 9, 2009, at 1:36 AM, Roman Levenstein wrote:> Hi, > > I'm implementing some variations of graph-coloring register > allocators for LLVM. > Many of them perform their phases (e.g. coalescing, graph > simplification, spilling, color selection) in an iterative way. Since > LLVM provides an implementation of the coalescing in the > SimpleRegisterCoalescing class already, I would like to reuse it (even > though I could of course create my own coalescing implementation). But > this class is a MachineFunctionPass. > > I'm wondering, if it is possible to use this pass from another pass > (i.e. regalloc) in an iterative way: > I need to invoke it multiple times for the same MachineFunction at > certain places in my regalloc pass implementation. > May be some sort of analysis update can be performed? Or is it > designed to be used just once per MachineFunction? > > Also overall, it would be interesting to know how LLVM supports such > kind of iterative algorithms? What are the ways to formulate the > independent phases of those algorithms as reusable passes that could > be used in an iterative way? > > Thanks, > -Roman
David Greene
2009-Jan-12 22:00 UTC
[LLVMdev] Is it possible to use the SimpleRegisterCoalescing pass in an iterative way?
On Friday 09 January 2009 03:36, Roman Levenstein wrote:> Hi, > > I'm implementing some variations of graph-coloring register allocators for > LLVM. Many of them perform their phases (e.g. coalescing, graph > simplification, spilling, color selection) in an iterative way. Since > LLVM provides an implementation of the coalescing in the > SimpleRegisterCoalescing class already, I would like to reuse it (even > though I could of course create my own coalescing implementation). But > this class is a MachineFunctionPass. > > I'm wondering, if it is possible to use this pass from another pass > (i.e. regalloc) in an iterative way: > I need to invoke it multiple times for the same MachineFunction at > certain places in my regalloc pass implementation. > May be some sort of analysis update can be performed? Or is it > designed to be used just once per MachineFunction?Hi Roman, I did this a while back when implementing iterative register coalescing. My guess is you're trying to do the same with the Generaslized Graph Coloring algorithm,. It's not easy to accomplish within the LLVM framework. I've made a number of structural changes to our code here and it's in a bit of a state of bitrot due to upstream changes we haven't merged in yet. I got some of this work merged upstream in the RegisterCoalescer and RegallocQuery interfaces. RegallocQuery is supposed to be an opaque communication conduit between register allocators and coalescers such that the allocator can update the coalescer when it makes changes and vice versa. When I did the full implementation I found I had to make more changes to RegallocQuery. I have not pushed those upstream for various reasons, including that at the time the register coalescing code was in a great state of flux and it was difficult to track. It will be a bit of work to get it all up-to-date for merging. Evan is right that coalesceFunction is the primary interface to invoke the coalescer. The iterative coalescer I wrote implements that function. The iterative coalescer is also a FunctionPass which does nothing. It's simply there to fulfill the requirement that the register allocator depends on a coalescer. This way I could use the SimpleRegisterCoalescer or the new iterative coalescer in a pluggable manner. The generalized graph coloring allocator holds a reference to the coalescer. SimpleRegisterCoalescing does nothing in its implementation of coalesceFunction. This is all very far from ideal. What we really need is a PassManager that knows about multually dependent, iterative algorithms. I thought about this a bit back in the day and came up with the idea of a hierarchy of PassManagers. A set of iterative passes could live in a PassManager under the top-level PassManager and the child PassManager could specify dependencies for all of the passes it manages. I didn't get very far with this so I don't know it it's a viable solution. The iterative coalescer does its own correctness and profitability analysis but reuses the code from SimpleRegisterCoalescing to do the actual LLVM IR and dataflow updates. That's non-trivial work. The code to do this in SimpleRegisterCoalescing is a bit...well...convoluted. :) I did a significant amount of work to separate that code out from the dataflow and profitability analysis that determines when to coalesce copies. This I think is very valuable work and I'd like to push it back. It's always good to separate analyses and updates as much as possible so that things can be reused. But I can't promise that I'll be able to merge all of the work I've done. We have quite a bit going on here right now and I've been moved to other bits of work. I'll see if I can get some things accomplished out-of-band. -Dave
Roman Levenstein
2009-Jan-13 08:54 UTC
[LLVMdev] Is it possible to use the SimpleRegisterCoalescing pass in an iterative way?
Hi Dave, Thanks a lot for such an elaborative answer! 2009/1/12 David Greene <dag at cray.com>:> On Friday 09 January 2009 03:36, Roman Levenstein wrote: >> Hi, >> >> I'm implementing some variations of graph-coloring register allocators for >> LLVM. Many of them perform their phases (e.g. coalescing, graph >> simplification, spilling, color selection) in an iterative way. Since >> LLVM provides an implementation of the coalescing in the >> SimpleRegisterCoalescing class already, I would like to reuse it (even >> though I could of course create my own coalescing implementation). But >> this class is a MachineFunctionPass. >> >> I'm wondering, if it is possible to use this pass from another pass >> (i.e. regalloc) in an iterative way: >> I need to invoke it multiple times for the same MachineFunction at >> certain places in my regalloc pass implementation. >> May be some sort of analysis update can be performed? Or is it >> designed to be used just once per MachineFunction? > > Hi Roman, > > I did this a while back when implementing iterative register coalescing. My > guess is you're trying to do the same with the Generaslized Graph Coloring > algorithm,.You are absolutely right.> It's not easy to accomplish within the LLVM framework. I've made a number of > structural changes to our code here and it's in a bit of a state of bitrot due > to upstream changes we haven't merged in yet. > > I got some of this work merged upstream in the RegisterCoalescer and > RegallocQuery interfaces. RegallocQuery is supposed to be an opaque > communication conduit between register allocators and coalescers such that > the allocator can update the coalescer when it makes changes and vice versa.Yes. Initially I didn't notice RegallocQuery in the header files. But after Even pointed out that it can be used, I had a closer look at it.> When I did the full implementation I found I had to make more changes to > RegallocQuery.Yes. This was exactly my conclusion. Namely, it would require quite some work to implement a coalescer this way using this interface. Too many pieces are missing yet.> I have not pushed those upstream for various reasons, > including that at the time the register coalescing code was in a great state > of flux and it was difficult to track. It will be a bit of work to get it all > up-to-date for merging.I see.> Evan is right that coalesceFunction is the primary interface to invoke the > coalescer. The iterative coalescer I wrote implements that function. The > iterative coalescer is also a FunctionPass which does nothing. It's simply > there to fulfill the requirement that the register allocator depends on a > coalescer. This way I could use the SimpleRegisterCoalescer or the new > iterative coalescer in a pluggable manner.Yes. Using coalescers in a pluggable manner would be the ideal solution.> The generalized graph coloring > allocator holds a reference to the coalescer. SimpleRegisterCoalescing does > nothing in its implementation of coalesceFunction.The idea with the empty pass is really nice, but looks a bit like a hack ;-)> This is all very far from ideal.;-)> What we really need is a PassManager that > knows about multually dependent, iterative algorithms. I thought about this a > bit back in the day and came up with the idea of a hierarchy of PassManagers. > A set of iterative passes could live in a PassManager under the top-level > PassManager and the child PassManager could specify dependencies for all of > the passes it manages. I didn't get very far with this so I don't know it > it's a viable solution.I was also thinking about something of this kind. But I was too lazy to come up with a practical solution for it.> The iterative coalescer does its own correctness and profitability analysis > but reuses the code from SimpleRegisterCoalescing to do the actual LLVM IR and > dataflow updates. That's non-trivial work.Oh, yes.> The code to do this in SimpleRegisterCoalescing is a bit...well...convoluted. :)I'd say that the whole SimpleRegisterCoalescing is a bit...well...overcomplicated :-) It does a very, very good job doing aggressive coalescing (and taking into account subtle subregs related things and updating the LiveIntervals information), but IMHO it became so complex that it is almost unmaintainable. I've seen different coalescers for graph coloring (Chitin, Chaitin-Briggs, iterated register coalescing, chordal graph based ones) and linear scan (Wimmer-based) register allocators. But SimpleRegisterCoalescing in LLVM (I particulary like "Simple" in its name ;-) is by far the most complex one I've seen so far. I think, only Evan can completely understand it by now. And there are still subtle bugs in it being found every week or two and fixed in the mainline. It is not a very good sign... Not that I have a constructive solution for it, but may at some point it should be split into independent parts that are more understandable? Or may be it should be even completely redesigned to become more understandable?> I did a significant amount of work to separate that code out from the dataflow and > profitability analysis that determines when to coalesce copies.> This I think is very valuable work and I'd like to push it back. It's always > good to separate analyses and updates as much as possible so that things can > be reused.Absolutely!> But I can't promise that I'll be able to merge all of the work I've done. We > have quite a bit going on here right now and I've been moved to other bits of > work. I'll see if I can get some things accomplished out-of-band.It would be very good, if you could push this code back to LLVM and make it available for others. I hope very much that you'll find some time for it! On my side, after looking at alternatives, I decided to use my own custom coalescers for now. They are small, clean and do what I need. And I have even a limited reuse among the three GC-based register allocators I'm working on (Chitin, Chitin-Briggs, iterated register coalescing). It is not an ideal solution; but for now I'm more concerned about the quality and speed of my register allocators. Once I'm happy with that, I'll probably have a closer look and refactor the code to make it more reusable as a part of a framework. -Roman
Roman Levenstein
2009-Jan-30 07:51 UTC
[LLVMdev] Is it possible to use the SimpleRegisterCoalescing pass in an iterative way?
Hi Dave, Coming back to your mail about the RegisterCoalescer and RegallocQuery interfaces, I'd like to ask a few questions. 2009/1/12 David Greene <dag at cray.com>:> On Friday 09 January 2009 03:36, Roman Levenstein wrote: >> Hi, >> >> I'm implementing some variations of graph-coloring register allocators for >> LLVM. Many of them perform their phases (e.g. coalescing, graph >> simplification, spilling, color selection) in an iterative way. Since >> LLVM provides an implementation of the coalescing in the >> SimpleRegisterCoalescing class already, I would like to reuse it (even >> though I could of course create my own coalescing implementation). But >> this class is a MachineFunctionPass. >> >> I'm wondering, if it is possible to use this pass from another pass >> (i.e. regalloc) in an iterative way: >> I need to invoke it multiple times for the same MachineFunction at >> certain places in my regalloc pass implementation. >> May be some sort of analysis update can be performed? Or is it >> designed to be used just once per MachineFunction? > > Hi Roman, > > I did this a while back when implementing iterative register coalescing. My > guess is you're trying to do the same with the Generaslized Graph Coloring > algorithm,. > > It's not easy to accomplish within the LLVM framework. I've made a number of > structural changes to our code here and it's in a bit of a state of bitrot due > to upstream changes we haven't merged in yet. > > I got some of this work merged upstream in the RegisterCoalescer and > RegallocQuery interfaces. RegallocQuery is supposed to be an opaque > communication conduit between register allocators and coalescers such that > the allocator can update the coalescer when it makes changes and vice versa. > > When I did the full implementation I found I had to make more changes to > RegallocQuery. I have not pushed those upstream for various reasons, > including that at the time the register coalescing code was in a great state > of flux and it was difficult to track. It will be a bit of work to get it all > up-to-date for merging. > > Evan is right that coalesceFunction is the primary interface to invoke the > coalescer. The iterative coalescer I wrote implements that function. The > iterative coalescer is also a FunctionPass which does nothing. It's simply > there to fulfill the requirement that the register allocator depends on a > coalescer. This way I could use the SimpleRegisterCoalescer or the new > iterative coalescer in a pluggable manner. The generalized graph coloring > allocator holds a reference to the coalescer. SimpleRegisterCoalescing does > nothing in its implementation of coalesceFunction.I understand the overall approach. I even tried to define my own coalescer by deriving it from the RegisterCoalescer. This was OK. But then I got stuck, because I don't understand how I can select this coalescer in the command-line of llc. As far as I can tell, there is no registry for the coalescers (similar to what exists for register allocators and which allow you to select a register allocator by providing the --regalloc=register_allocator_name command-line option ) and therefore there is no way to select a specific register coalescer among all defined coalescers. Is my understanding correct? If so, do we need to add such a registry for this purpose? Dave, how have you solved this in your source tree?> The iterative coalescer does its own correctness and profitability analysis > but reuses the code from SimpleRegisterCoalescing to do the actual LLVM IR and > dataflow updates. That's non-trivial work. > The code to do this in SimpleRegisterCoalescing is a bit...well...convoluted. :)I learned it by hart recently. My initial implementation uses my own custom coalescer. But the problem is that it misses many opportunities compared to the "simple" register coalescer. In particular, some cases, where live intervals do not interfere even though their live-ranges overlap. And this may happen e.g. for some live intervals that were produced from phi-nodes. There are also some tricky cases related to commuting operands and so on. Doing all that correctly would probably end up with a code that is as "simple" as the SimpleRegisterCoalescer ;-)> I did a significant amount of work to separate that code out from the dataflow and > profitability analysis that determines when to coalesce copies.This is also the direction that I'm thinking more and more about. I think that if it would be possible to separate/extract an API that would provide the possibility to check for possibility of coalescing of two given intervals (a,b) and to perform the coalescing of two particular intervals (a,b) then it would become much more reusable and can serve as a basis for other coalescers. Those coalescers/register allocators would then decide on their own about the candidates for coalescing based on their own profitability analysis and internal logic and then ask this common code if it is possible to coalesce given intervals and to perform this coalescing and do dataflow updates... I think it is rather possible to extract it from the current implementation, which currently covers too many things at once. For example, it tries to do coalescing for all move-related intervals and it does it based on its own metrics and its own logic. And in many cases the code assumes that only these metrics are used. So the main task is to separate the code that performs coalescing for a pair of intervals from the code that decides which pairs and in which order should be coalesced. Is it more or less inline with your thinking? Is it what you managed to do?> This I think is very valuable work and I'd like to push it back. It's always > good to separate analyses and updates as much as possible so that things can > be reused.> But I can't promise that I'll be able to merge all of the work I've done. We > have quite a bit going on here right now and I've been moved to other bits of > work. I'll see if I can get some things accomplished out-of-band.Dave, BTW, I'd be very interested in having a look at it even of it is not to be pushed back to LLVM by you in the near future. May be I can work on it to make it more prepared for the inclusion into mainline. Or may be it will provide me with a better insight about how the "ideal" coalescer interface should look like in LLVM. So, if you could provide me with the snippets of your coalescer related code, especially the dissected SimpleRegisterCoalescer, I'd be very grateful to you. -Roman