search for: whatwever

Displaying 4 results from an estimated 4 matches for "whatwever".

Did you mean: whatever
2011 May 03
[LLVMdev] Greedy register allocation
...do LICM just for that? In this case, probably. In general, no. Ah, so you're saying the regression is due to the inner loop icache footprint increasing. Ok, that makes total sense to me. I agree this is a difficult thing to get right in a general sort of way. Perhaps the CostPerUse (or whatwever heuristics use it) can factor in the loop body size so that tight loops are favored for smaller encodings. -Dave
2011 May 03
[LLVMdev] Greedy register allocation
On May 3, 2011, at 3:23 PM, David A. Greene wrote:
> Jakob Stoklund Olesen <stoklund at 2pi.dk> writes:
>
>>>> The greedy allocator is trying to pick registers so inner loops are as
>>>> small as possible, but that is not always the right thing to do.
>>>
>>> How does it balance that against spill cost?
>>
>> I added the
2011 May 04
[LLVMdev] Greedy register allocation
...s
>> case, probably. In general, no.
>
> Ah, so you're saying the regression is due to the inner loop icache
> footprint increasing. Ok, that makes total sense to me. I agree this
> is a difficult thing to get right in a general sort of way. Perhaps the
> CostPerUse (or whatwever heuristics use it) can factor in the loop body
> size so that tight loops are favored for smaller encodings.

It is almost certainly that the inner loop doesn't fit in the processor's predecode loop buffer. Modern Intel X86 chips have a buffer that can hold a very small number of instruction...
2011 May 04
[LLVMdev] Greedy register allocation
...bably. In general, no.
>>
>> Ah, so you're saying the regression is due to the inner loop icache
>> footprint increasing. Ok, that makes total sense to me. I agree this
>> is a difficult thing to get right in a general sort of way. Perhaps the
>> CostPerUse (or whatwever heuristics use it) can factor in the loop body
>> size so that tight loops are favored for smaller encodings.
>
> It is almost certainly that the inner loop doesn't fit in the processor's predecode loop buffer. Modern Intel X86 chips have a buffer that can hold a very small number...