thr3ads.net - search: "costperus"

Displaying 13 results from an estimated 13 matches for "costperus".

Dynamically determine the CostPerUse value in the register allocator.

2020 May 30

I don't know the history behind the word CostPerUse, so I may be missing the background associated with it. It seems to be a misnomer for what it is intended to do. At first sight, the word indicates that the cost is a function of the uses of the register: the more uses, the higher the cost. How do we want to define the value of CostPerUse? Should it be...

Dynamically determine the CostPerUse value in the register allocator.

2020 May 29

Hi All, For the AMDGPU architecture, during RA, we prefer to have a cost associated with the registers (CostPerUse) based on a target entity (for instance, the calling convention of the current MachineFunction). Presently CostPerUse is a one-time static value (either zero or a positive value) generated through TableGen. The current implementation doesn't allow us to control the register cost on the fly. The A...
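The difference between a static cost table and a cost chosen per function can be sketched in a few lines. This is only a toy model of the idea raised in the post, not LLVM code: the register names, the `CONVENTION_OVERRIDES` table, and the `cost_per_use` helper are all hypothetical.

```python
# Toy model only: none of these names are LLVM APIs.

# Static table, standing in for values fixed once at build time.
STATIC_COST_PER_USE = {"v0": 0, "v1": 0, "v2": 1, "v3": 1}

# Hypothetical per-calling-convention overrides (assumed for illustration).
CONVENTION_OVERRIDES = {"kernel": {"v1": 2}}


def cost_per_use(reg, calling_convention):
    """Return the override for this convention, else the static value."""
    overrides = CONVENTION_OVERRIDES.get(calling_convention, {})
    return overrides.get(reg, STATIC_COST_PER_USE[reg])
```

With this shape, the same register can carry a different cost depending on the function being compiled, while registers without an override keep their static value.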

[LLVMdev] Greedy register allocation

2011 May 03

On May 3, 2011, at 12:03 PM, David A. Greene wrote: >> >> The greedy allocator is trying to pick registers so inner loops are as >> small as possible, but that is not always the right thing to do. > > How does it balance that against spill cost? I added the CostPerUse field to the register descriptions. The allocator will try to minimize the spill weight assigned to registers with a CostPerUse. It does it by swapping physical register assignments, it won't do it if it requires extra spilling. This is actually the cause of the n-body regression. The benchma...
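As a rough illustration of the behaviour described in this post, the sketch below moves the heaviest virtual registers onto cheaper physical registers, but only when the cheaper register is free for the whole live interval, so no extra spilling can result. It is a simplified model under stated assumptions, not the greedy allocator's actual algorithm: it handles only moves into a free register rather than true swaps, and the function names, interval representation, and example register costs are hypothetical.

```python
def overlaps(a, b):
    """True if half-open live intervals a and b intersect."""
    return a[0] < b[1] and b[0] < a[1]


def prefer_cheap_registers(intervals, weights, assignment, cost_per_use):
    """Reassign virtual registers to cheaper physical registers.

    A move happens only if the cheaper register has no interfering
    occupant, so nothing is evicted and nothing new is spilled.
    """
    result = dict(assignment)
    # Visit the highest spill weights first: they benefit most.
    for vreg in sorted(weights, key=weights.get, reverse=True):
        current = result[vreg]
        for phys in sorted(cost_per_use, key=cost_per_use.get):
            if cost_per_use[phys] >= cost_per_use[current]:
                break  # no strictly cheaper candidate remains
            busy = any(
                result[other] == phys
                and overlaps(intervals[vreg], intervals[other])
                for other in result
                if other != vreg
            )
            if not busy:
                result[vreg] = phys
                break
    return result


def weighted_cost(weights, assignment, cost_per_use):
    """Total spill weight placed on registers, scaled by their cost."""
    return sum(weights[v] * cost_per_use[assignment[v]] for v in assignment)
```

For example, with `a` and `b` live over the same range and `c` live later, `c` can move into the cheap register that `b` has already vacated, while `a` stays put because the cheap register is occupied during its interval.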

[LLVMdev] Greedy register allocation

2011 May 03

...Stoklund Olesen <stoklund at 2pi.dk> writes: >>> The greedy allocator is trying to pick registers so inner loops are as >>> small as possible, but that is not always the right thing to do. >> >> How does it balance that against spill cost? > > I added the CostPerUse field to the register descriptions. The > allocator will try to minimize the spill weight assigned to registers > with a CostPerUse. It does it by swapping physical register > assignments, it won't do it if it requires extra spilling. CostPerUse models the encoding size of the regist...

[LLVMdev] Greedy register allocation

2011 May 03

...t 2pi.dk> writes: > >>>> The greedy allocator is trying to pick registers so inner loops are as >>>> small as possible, but that is not always the right thing to do. >>> >>> How does it balance that against spill cost? >> >> I added the CostPerUse field to the register descriptions. The >> allocator will try to minimize the spill weight assigned to registers >> with a CostPerUse. It does it by swapping physical register >> assignments, it won't do it if it requires extra spilling. > > CostPerUse models the encod...

[LLVMdev] Greedy register allocation

2011 May 03

...worth it to undo LICM just for that? In this > case, probably. In general, no. Ah, so you're saying the regression is due to the inner loop icache footprint increasing. Ok, that makes total sense to me. I agree this is a difficult thing to get right in a general sort of way. Perhaps the CostPerUse (or whatever heuristics use it) can factor in the loop body size so that tight loops are favored for smaller encodings. -Dave

[LLVMdev] Greedy register allocation

2011 May 03

Jakob Stoklund Olesen <stoklund at 2pi.dk> writes: >> Yikes! Do we know why these codes got so much worse? Even 5% is a big >> deal on x86. > > On x86-64, n-body and puzzle have the exact same instructions as with > linear scan. The only difference is the choice of registers. This > causes some loops to be a few bytes longer or shorter which can easily > change

[LLVMdev] Greedy register allocation

2011 May 03

On May 3, 2011, at 9:19 AM, David A. Greene wrote: > Jakob Stoklund Olesen <stoklund at 2pi.dk> writes: > >> +10.0% SingleSource/Benchmarks/CoyoteBench/huffbench >> +12.0% SingleSource/Benchmarks/McGill/chomp >> +18.0% SingleSource/Benchmarks/BenchmarkGame/n-body >> +45.5% SingleSource/Benchmarks/BenchmarkGame/puzzle >> +10.0%

[LLVMdev] Greedy register allocation

2011 May 04

...or that? In this >> case, probably. In general, no. > > Ah, so you're saying the regression is due to the inner loop icache > footprint increasing. Ok, that makes total sense to me. I agree this > is a difficult thing to get right in a general sort of way. Perhaps the > CostPerUse (or whatever heuristics use it) can factor in the loop body > size so that tight loops are favored for smaller encodings. It is almost certainly that the inner loop doesn't fit in the processor's predecode loop buffer. Modern Intel x86 chips have a buffer that can hold a very small number...

LLVM Weekly - #230, May 28th 2018

2018 May 28

...g/rL332920). * The llvm-exegesis analysis output now uses HTML in order to produce a more readable report. [r332979](https://reviews.llvm.org/rL332979). * The DEBUG macro has been removed now that all in-tree users have been updated to `LLVM_DEBUG`. [r333091](https://reviews.llvm.org/rL333091). * The CostPerUse attribute is now set on RISC-V registers which are unlikely to be accessible in compressed (16-bit) instructions. [r333132](https://reviews.llvm.org/rL333132). * The LoopInstSimplify pass has been restored and will see improvements and integration into the loop pass pipeline. [r333250](https://re...

[LLVMdev] Greedy register allocation

2011 May 04

...>> case, probably. In general, no. >> >> Ah, so you're saying the regression is due to the inner loop icache >> footprint increasing. Ok, that makes total sense to me. I agree this >> is a difficult thing to get right in a general sort of way. Perhaps the >> CostPerUse (or whatever heuristics use it) can factor in the loop body >> size so that tight loops are favored for smaller encodings. > > It is almost certainly that the inner loop doesn't fit in the processor's predecode loop buffer. Modern Intel x86 chips have a buffer that can hold a ver...

[LLVMdev] [global-isel] Random comments on Proposal for a global instruction selector

2013 Aug 09

On 9 Aug 2013, at 00:18, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: > I am hoping that this proposal will generate a lot of feedback, and there are many different topics to discuss. When replying to this email, please change the subject header to something more specific, but keep the [global-isel] tag. Subject changed, but I'm not sure if it helps... Overall, I really like this

[LLVMdev] [global-isel] Proposal for a global instruction selector

2013 Aug 08

I am hoping that this proposal will generate a lot of feedback, and there are many different topics to discuss. When replying to this email, please change the subject header to something more specific, but keep the [global-isel] tag. Thanks, /jakob Proposal for a Global Instruction Selector It is becoming evident that we need a replacement for the SelectionDAG-based instruction selector. This
