On Tue, 17 Apr 2007, David Greene wrote:> Evan Cheng wrote: >> Obviously, smart heuristics can make a big difference here (estimated >> register pressures, etc.) But the more important thing is how the >> passes down stream can recover from the earlier mistakes. By this, we >> mean live range splitting and re-materialization. > > Can you explain this a little more? Do you mean that neither live > range splitting nor rematerialization are practically possible > with LLVM right now?Anything is possible. :) In practice, there are good ways and bad ways to build things. For example, Evan recently implemented a really simple form of remat that works on instructions that have no register inputs (e.g. things like "load immediate"). That said, we want to implement the full generality of remat and splitting and have them work together well in the same framework. The key to this is getting the right data structure.> If that's the case, what are the primary issues?Evan is the best to respond to this at this point :)> These are things I will definitely have to do here so I'm quite > interested in how this will all work. I'm more than happy to do > a bunch of heavy lifting getting things into shape.Great! -Chris -- http://nondot.org/sabre/ http://llvm.org/
On Apr 17, 2007, at 2:24 PM, Chris Lattner wrote:> On Tue, 17 Apr 2007, David Greene wrote: >> Evan Cheng wrote: >>> Obviously, smart heuristics can make a big difference here >>> (estimated >>> register pressures, etc.) But the more important thing is how the >>> passes down stream can recover from the earlier mistakes. By >>> this, we >>> mean live range splitting and re-materialization. >> >> Can you explain this a little more? Do you mean that neither live >> range splitting nor rematerialization are practically possible >> with LLVM right now? > > Anything is possible. :) > > In practice, there are good ways and bad ways to build things. For > example, Evan recently implemented a really simple form of remat that > works on instructions that have no register inputs (e.g. things like > "load immediate"). > > That said, we want to implement the full generality of remat and > splitting > and have them work together well in the same framework. The key to > this > is getting the right data structure. > >> If that's the case, what are the primary issues? > > Evan is the best to respond to this at this point :)There are a lot of implementation details. Many of which I don't have clear understanding yet. Take splitting for example. The allocator needs to pick a point and recompute conflicts. Can we do that with the given infrastructure? Perhaps. But not without pain. The current data structure is lacking detailed def-use info to properly determine the correct split point. The register allocator even treat "fixed" (i.e. physical register) intervals separately from other active ones. The point is, the current code needs a lot of massaging. If you are interested (after your first rounds of refactoring), please consider a replacement for LiveInterval that keeps more accurate info that can be plugged into the same existing allocator. The next step is to do away with the r2rmap and rep(). Thanks, Evan> >> These are things I will definitely have to do here so I'm quite >> interested in how this will all work. I'm more than happy to do >> a bunch of heavy lifting getting things into shape. > > Great! > > -Chris > > -- > http://nondot.org/sabre/ > http://llvm.org/ > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
--- Evan Cheng <evan.cheng at apple.com> wrote:> > On Apr 17, 2007, at 2:24 PM, Chris Lattner wrote: > > > On Tue, 17 Apr 2007, David Greene wrote: > >> Evan Cheng wrote: > >>> Obviously, smart heuristics can make a big difference here > >>> (estimated > >>> register pressures, etc.) But the more important thing is how the > >>> passes down stream can recover from the earlier mistakes. By > >>> this, we > >>> mean live range splitting and re-materialization. > >> > >> Can you explain this a little more? Do you mean that neither live > >> range splitting nor rematerialization are practically possible > >> with LLVM right now? > > > > Anything is possible. :) > > > > In practice, there are good ways and bad ways to build things. For > > example, Evan recently implemented a really simple form of remat > that > > works on instructions that have no register inputs (e.g. things > like > > "load immediate"). > > > > That said, we want to implement the full generality of remat and > > splitting > > and have them work together well in the same framework. The key to > > > this > > is getting the right data structure. > > > >> If that's the case, what are the primary issues? > > > > Evan is the best to respond to this at this point :) > > There are a lot of implementation details. Many of which I don't have > > clear understanding yet. Take splitting for example. The allocator > needs to pick a point and recompute conflicts. Can we do that with > the given infrastructure? Perhaps. But not without pain. The current > data structure is lacking detailed def-use info to properly determine > > the correct split point. The register allocator even treat > "fixed" (i.e. physical register) intervals separately from other > active ones. > > The point is, the current code needs a lot of massaging. If you are > interested (after your first rounds of refactoring), please consider > a replacement for LiveInterval that keeps more accurate info that can > be plugged into the same existing allocator. The next step is to do > away with the r2rmap and rep().While implementing Wimmer's linear scan, which is heavily based on splitting of live intervals, I already extended LiveInterval to contain more detailed info about all def-use occurrences. I also implemented the functionality for splitting a live interval at any place. Additionally, the logic for selecting an "optimal" splitting position is done according to Wimmer's paper. Classes that I changed were: LiveInterval, LiveIntervalAnalysis, RegAllocLinearScan. To avoid conflicts with recent changes by Evan, I did the modifications on my copies of these classes which I named WimmerLiveInterval, WimmerLiveIntervalAnalysis, WimmerRegAllocLinearScan. I'm currently working on documenting the code and adding comments as well as some refactoring to make it cleaner. I plan to contribute the code rather soon, hopefully in May. In case of interest, I could post the code for my WimmerLiveInterval and WimmerLiveIntervalAnalysis classes even earlier. Overall, this register allocator is rather functional already, at least for X86. The remaining issues with the allocator are: - Better coalescing decisions (here some insights from Fernando's paper and may be some others could help) - Reducing the number of moves between splitted parts of live intervals, when they are allocated to different physical registers. - Use of spill weights information and other heuristics for selection of a live interval to spill - currently this is not taken into account at all, as in Wimmer's paper. - Handling of rematerialization - Better handling of memory folding - Wimmer's linear scan does it during register allocation time and only conditionally, and not in LiveIntervalAnalysis as it is done now. It distinguishes between SHOULD and MUST occurrences of the register in instructions. But current LLVM infrastructure for memory folding does it unconditionally. This requires probably some minor changes, e.g. providing different functions for asking if folding is potentially possible and for doing memory folding later, if the decision to do so is taken. -Roman __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
Evan Cheng wrote:> the given infrastructure? Perhaps. But not without pain. The current > data structure is lacking detailed def-use info to properly determine > the correct split point. The register allocator even treat > "fixed" (i.e. physical register) intervals separately from other > active ones. > > The point is, the current code needs a lot of massaging. If you are > interested (after your first rounds of refactoring), please consider > a replacement for LiveInterval that keeps more accurate info that can > be plugged into the same existing allocator. The next step is to do > away with the r2rmap and rep().This gets back to my questions about what LiveIntervals lacks. - What "data structure" are you referring to above? - What specific def-use info is needed that we don't have access to currently? - A couple of weeks ago I asked about the comment in LiveIntervalAnalysis referring to "conservative" computation. What's conservative about it? What information does it lack that a traditional live variable/live range analysis provides? The answers I got back said essentially that it's not conservative at all and can't be improved upon. I'm getting a different sense from you so I'd like to understand this better. -Dave