thr3ads.net - llvm dev - [LLVMdev] Two Regalloc Enhancements[MESSAGE NOT SCANNED] [Jul 2009]

If this information is useful, please help other people find it:
Share via:

David Greene

2009-Jul-23 19:42 UTC

[LLVMdev] Two Regalloc Enhancements

We have two features for register allocation we'd like to contribute if
folks
think they are worthwhile.  We want to get a read on whether they will be 
useful to people.

The first features backschedules reloads during the spilling phase.  As 
reloads are generated, we have some very simple code to try to schedule them 
as far ahead of the use as possible.

The second features modifies linearscan to try to spread register usage out a 
bit.  Rather than always grabbing the first free register in the allocatable 
list, it remembers the last few registers recently assigned and does not reuse 
them unless there are no other registers available.  This tends to help the 
backscheduling code by distributing register usage and providing more 
scheduling freedom.  It also can induce spilling where none was there before 
if the allocator has "just enough" registers.  We haven't noticed
any serious
performance problems in practice.

With both patches, we have seen performance improvements on some codes.

I know there's some work on post-ra scheduling going on which would probably
supercede the reload backscheduling code.  If that's coming soon,
there's
probably not much point in contributing it.  The "round-robin"
register
assignment would help and post-ra scheduler.

What's the community's opinion on whether these two features are worth
committing to the public repository?

                                   -Dave

Evan Cheng

2009-Jul-23 23:07 UTC

head link

[LLVMdev] Two Regalloc Enhancements

On Jul 23, 2009, at 12:42 PM, David Greene wrote:
> We have two features for register allocation we'd like to contribute  
> if folks
> think they are worthwhile.  We want to get a read on whether they  
> will be
> useful to people.
>
> The first features backschedules reloads during the spilling phase.   
> As
> reloads are generated, we have some very simple code to try to  
> schedule them
> as far ahead of the use as possible.
Ok.
>
> The second features modifies linearscan to try to spread register  
> usage out a
> bit.  Rather than always grabbing the first free register in the  
> allocatable
> list, it remembers the last few registers recently assigned and does  
> not reuse
> them unless there are no other registers available.  This tends to  
> help the
> backscheduling code by distributing register usage and providing more
> scheduling freedom.  It also can induce spilling where none was  
> there before
> if the allocator has "just enough" registers.  We haven't
noticed
> any serious
> performance problems in practice.
Ok. As with any heuristics change, some tests will benefit, some will  
suffer. I am ok with both sets of changes assuming there are ways to  
control them.
>
> With both patches, we have seen performance improvements on some  
> codes.
>
> I know there's some work on post-ra scheduling going on which would  
> probably
> supercede the reload backscheduling code.  If that's coming soon,  
> there's
> probably not much point in contributing it.  The "round-robin"  
> register
> assignment would help and post-ra scheduler.
Post-ra scheduling has been working for a while. The reason it's not  
turned on for x86 is it's not helping much (1 or 2%) while the compile  
time cost is too high (~9% codegen time). I assume you guys are doing  
your experiments using AMD processors. It could be Intel's uArch is  
just not benefiting from the load scheduling.

Round-robin register assignment probably will help post-ra scheduling.  
However, for small functions it may end up increase the number of  
registers used. That can be bad for performance.
>
> What's the community's opinion on whether these two features are
worth
> committing to the public repository?
I welcome the features as long as we can add them as llc-beta first.  
Once we have some more testing across all the platforms, we can then  
decide whether they can be turned on. Is that ok?

Evan
>
>                                   -Dave
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

David Greene

2009-Jul-23 23:23 UTC

head link

[LLVMdev] Two Regalloc Enhancements

On Thursday 23 July 2009 18:07, Evan Cheng wrote:
> Ok. As with any heuristics change, some tests will benefit, some will
> suffer. I am ok with both sets of changes assuming there are ways to
> control them.
Yep, we have flags.
> Post-ra scheduling has been working for a while. The reason it's not
> turned on for x86 is it's not helping much (1 or 2%) while the compile
> time cost is too high (~9% codegen time). I assume you guys are doing
> your experiments using AMD processors. It could be Intel's uArch is
> just not benefiting from the load scheduling.
Yes, I can imagine there would be differences here.  The memory architectures 
are quite different.
> Round-robin register assignment probably will help post-ra scheduling.
> However, for small functions it may end up increase the number of
> registers used. That can be bad for performance.
Correct.  As I said, we haven't noticed any degredation.  But our code base
is
very different from yours.  :)
> > What's the community's opinion on whether these two features
are worth
> > committing to the public repository?
>
> I welcome the features as long as we can add them as llc-beta first.
> Once we have some more testing across all the platforms, we can then
> decide whether they can be turned on. Is that ok?
Fine with me.  How do I do this?  Just put them under some flags that default 
to false?

                                  -Dave

David Greene

2009-Jul-24 04:17 UTC

head link

[LLVMdev] Two Regalloc Enhancements[MESSAGE NOT SCANNED]

On Friday 24 July 2009 06:46:24 Mark Shannon wrote:> Hi David,
>
> What effect is there on compile time?
None.  These are very simple changes.  At most each
reload takes a wee bit longer because the code walks
backward through the BB to schedule.

                                      -Dave

Mark Shannon

2009-Jul-24 11:46 UTC

head link

[LLVMdev] Two Regalloc Enhancements[MESSAGE NOT SCANNED]

Hi David,

What effect is there on compile time?

David Greene wrote:> We have two features for register allocation we'd like to contribute if
folks
> think they are worthwhile.  We want to get a read on whether they will be 
> useful to people.
> 
> The first features backschedules reloads during the spilling phase.  As 
> reloads are generated, we have some very simple code to try to schedule
them
> as far ahead of the use as possible.
> 
> The second features modifies linearscan to try to spread register usage out
a
> bit.  Rather than always grabbing the first free register in the
allocatable
> list, it remembers the last few registers recently assigned and does not
reuse
> them unless there are no other registers available.  This tends to help the
> backscheduling code by distributing register usage and providing more 
> scheduling freedom.  It also can induce spilling where none was there
before
> if the allocator has "just enough" registers.  We haven't
noticed any serious
> performance problems in practice.
> 
> With both patches, we have seen performance improvements on some codes.
> 
> I know there's some work on post-ra scheduling going on which would
probably
> supercede the reload backscheduling code.  If that's coming soon,
there's
> probably not much point in contributing it.  The "round-robin"
register
> assignment would help and post-ra scheduler.
> 
> What's the community's opinion on whether these two features are
worth
> committing to the public repository?
> 
>                                    -Dave
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Alireza.Moshtaghi at microchip.com

2009-Jul-24 17:08 UTC

head link

[LLVMdev] Two Regalloc Enhancements

These seem to be the exact opposite of what the PIC16 port requires.
However, since the behavior is controllable I don't think they will
break our port.

Just a little background, PIC16 only has one register; so preloading it
is just going to cause too many useless spills...

Regards,
Ali
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at
cs.uiuc.edu]
On> Behalf Of David Greene
> Sent: Thursday, July 23, 2009 12:42 PM
> To: llvmdev at cs.uiuc.edu
> Subject: [LLVMdev] Two Regalloc Enhancements
> 
> We have two features for register allocation we'd like to contribute
if> folks
> think they are worthwhile.  We want to get a read on whether they will
be> useful to people.
> 
> The first features backschedules reloads during the spilling phase.
As> reloads are generated, we have some very simple code to try to
schedule> them
> as far ahead of the use as possible.
> 
> The second features modifies linearscan to try to spread register
usage> out a
> bit.  Rather than always grabbing the first free register in the
> allocatable
> list, it remembers the last few registers recently assigned and does
not> reuse
> them unless there are no other registers available.  This tends to
help> the
> backscheduling code by distributing register usage and providing more
> scheduling freedom.  It also can induce spilling where none was there
> before
> if the allocator has "just enough" registers.  We haven't
noticed any
> serious
> performance problems in practice.
> 
> With both patches, we have seen performance improvements on some
codes.> 
> I know there's some work on post-ra scheduling going on which would
> probably
> supercede the reload backscheduling code.  If that's coming soon,
there's> probably not much point in contributing it.  The "round-robin"
register> assignment would help and post-ra scheduler.
> 
> What's the community's opinion on whether these two features are
worth
> committing to the public repository?
> 
>                                    -Dave
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Possibly Parallel Threads

Search for more seemingly similar threads

llvm dev - Jul 2009 - [LLVMdev] Two Regalloc Enhancements[MESSAGE NOT SCANNED]

[LLVMdev] Two Regalloc Enhancements

[LLVMdev] Two Regalloc Enhancements

[LLVMdev] Two Regalloc Enhancements

[LLVMdev] Two Regalloc Enhancements[MESSAGE NOT SCANNED]

[LLVMdev] Two Regalloc Enhancements[MESSAGE NOT SCANNED]

[LLVMdev] Two Regalloc Enhancements

Possibly Parallel Threads