We have two features for register allocation we'd like to contribute if folks think they are worthwhile. We want to get a read on whether they will be useful to people. The first features backschedules reloads during the spilling phase. As reloads are generated, we have some very simple code to try to schedule them as far ahead of the use as possible. The second features modifies linearscan to try to spread register usage out a bit. Rather than always grabbing the first free register in the allocatable list, it remembers the last few registers recently assigned and does not reuse them unless there are no other registers available. This tends to help the backscheduling code by distributing register usage and providing more scheduling freedom. It also can induce spilling where none was there before if the allocator has "just enough" registers. We haven't noticed any serious performance problems in practice. With both patches, we have seen performance improvements on some codes. I know there's some work on post-ra scheduling going on which would probably supercede the reload backscheduling code. If that's coming soon, there's probably not much point in contributing it. The "round-robin" register assignment would help and post-ra scheduler. What's the community's opinion on whether these two features are worth committing to the public repository? -Dave
On Jul 23, 2009, at 12:42 PM, David Greene wrote:> We have two features for register allocation we'd like to contribute > if folks > think they are worthwhile. We want to get a read on whether they > will be > useful to people. > > The first features backschedules reloads during the spilling phase. > As > reloads are generated, we have some very simple code to try to > schedule them > as far ahead of the use as possible.Ok.> > The second features modifies linearscan to try to spread register > usage out a > bit. Rather than always grabbing the first free register in the > allocatable > list, it remembers the last few registers recently assigned and does > not reuse > them unless there are no other registers available. This tends to > help the > backscheduling code by distributing register usage and providing more > scheduling freedom. It also can induce spilling where none was > there before > if the allocator has "just enough" registers. We haven't noticed > any serious > performance problems in practice.Ok. As with any heuristics change, some tests will benefit, some will suffer. I am ok with both sets of changes assuming there are ways to control them.> > With both patches, we have seen performance improvements on some > codes. > > I know there's some work on post-ra scheduling going on which would > probably > supercede the reload backscheduling code. If that's coming soon, > there's > probably not much point in contributing it. The "round-robin" > register > assignment would help and post-ra scheduler.Post-ra scheduling has been working for a while. The reason it's not turned on for x86 is it's not helping much (1 or 2%) while the compile time cost is too high (~9% codegen time). I assume you guys are doing your experiments using AMD processors. It could be Intel's uArch is just not benefiting from the load scheduling. Round-robin register assignment probably will help post-ra scheduling. However, for small functions it may end up increase the number of registers used. That can be bad for performance.> > What's the community's opinion on whether these two features are worth > committing to the public repository?I welcome the features as long as we can add them as llc-beta first. Once we have some more testing across all the platforms, we can then decide whether they can be turned on. Is that ok? Evan> > -Dave > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Thursday 23 July 2009 18:07, Evan Cheng wrote:> Ok. As with any heuristics change, some tests will benefit, some will > suffer. I am ok with both sets of changes assuming there are ways to > control them.Yep, we have flags.> Post-ra scheduling has been working for a while. The reason it's not > turned on for x86 is it's not helping much (1 or 2%) while the compile > time cost is too high (~9% codegen time). I assume you guys are doing > your experiments using AMD processors. It could be Intel's uArch is > just not benefiting from the load scheduling.Yes, I can imagine there would be differences here. The memory architectures are quite different.> Round-robin register assignment probably will help post-ra scheduling. > However, for small functions it may end up increase the number of > registers used. That can be bad for performance.Correct. As I said, we haven't noticed any degredation. But our code base is very different from yours. :)> > What's the community's opinion on whether these two features are worth > > committing to the public repository? > > I welcome the features as long as we can add them as llc-beta first. > Once we have some more testing across all the platforms, we can then > decide whether they can be turned on. Is that ok?Fine with me. How do I do this? Just put them under some flags that default to false? -Dave
David Greene
2009-Jul-24 04:17 UTC
[LLVMdev] Two Regalloc Enhancements[MESSAGE NOT SCANNED]
On Friday 24 July 2009 06:46:24 Mark Shannon wrote:> Hi David, > > What effect is there on compile time?None. These are very simple changes. At most each reload takes a wee bit longer because the code walks backward through the BB to schedule. -Dave
Mark Shannon
2009-Jul-24 11:46 UTC
[LLVMdev] Two Regalloc Enhancements[MESSAGE NOT SCANNED]
Hi David, What effect is there on compile time? David Greene wrote:> We have two features for register allocation we'd like to contribute if folks > think they are worthwhile. We want to get a read on whether they will be > useful to people. > > The first features backschedules reloads during the spilling phase. As > reloads are generated, we have some very simple code to try to schedule them > as far ahead of the use as possible. > > The second features modifies linearscan to try to spread register usage out a > bit. Rather than always grabbing the first free register in the allocatable > list, it remembers the last few registers recently assigned and does not reuse > them unless there are no other registers available. This tends to help the > backscheduling code by distributing register usage and providing more > scheduling freedom. It also can induce spilling where none was there before > if the allocator has "just enough" registers. We haven't noticed any serious > performance problems in practice. > > With both patches, we have seen performance improvements on some codes. > > I know there's some work on post-ra scheduling going on which would probably > supercede the reload backscheduling code. If that's coming soon, there's > probably not much point in contributing it. The "round-robin" register > assignment would help and post-ra scheduler. > > What's the community's opinion on whether these two features are worth > committing to the public repository? > > -Dave > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
Alireza.Moshtaghi at microchip.com
2009-Jul-24 17:08 UTC
[LLVMdev] Two Regalloc Enhancements
These seem to be the exact opposite of what the PIC16 port requires. However, since the behavior is controllable I don't think they will break our port. Just a little background, PIC16 only has one register; so preloading it is just going to cause too many useless spills... Regards, Ali> -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]On> Behalf Of David Greene > Sent: Thursday, July 23, 2009 12:42 PM > To: llvmdev at cs.uiuc.edu > Subject: [LLVMdev] Two Regalloc Enhancements > > We have two features for register allocation we'd like to contributeif> folks > think they are worthwhile. We want to get a read on whether they willbe> useful to people. > > The first features backschedules reloads during the spilling phase.As> reloads are generated, we have some very simple code to try toschedule> them > as far ahead of the use as possible. > > The second features modifies linearscan to try to spread registerusage> out a > bit. Rather than always grabbing the first free register in the > allocatable > list, it remembers the last few registers recently assigned and doesnot> reuse > them unless there are no other registers available. This tends tohelp> the > backscheduling code by distributing register usage and providing more > scheduling freedom. It also can induce spilling where none was there > before > if the allocator has "just enough" registers. We haven't noticed any > serious > performance problems in practice. > > With both patches, we have seen performance improvements on somecodes.> > I know there's some work on post-ra scheduling going on which would > probably > supercede the reload backscheduling code. If that's coming soon,there's> probably not much point in contributing it. The "round-robin"register> assignment would help and post-ra scheduler. > > What's the community's opinion on whether these two features are worth > committing to the public repository? > > -Dave > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reasonably Related Threads
- [LLVMdev] Two Regalloc Enhancements
- [LLVMdev] Two Regalloc Enhancements
- [LLVMdev] Two Regalloc Enhancements[MESSAGE NOT SCANNED]
- [LLVMdev] Enabling MI Scheduler on x86 (was Experimental Evaluation of the Schedulers in LLVM 3.3)
- AjaxScaffold 2.2.0 released with graceful JS degredation