similar to: Register Allocators Documentation

2018 Sep 11
2
linear-scan RA
Yes, I quite liked the things I've read about the PBQP allocator. Given what the hardware folks have to go through to get 1% improvements in scalar code, spending 20% (or whatever) compile time (under control of a flag) seems like nothing. And falling back on "average code" is a little disingenuous. People looking for performance don't care about average code; they care about
2011 Apr 26
2
[LLVMdev] Register pairing in PBQP
Hi. I'm currently investigating LLVM's implementation of PBQP as part of a bachelor's thesis I'm doing on register allocation for regular architectures. In particular, I'm looking at the possibility of improving the spill rate of PBQP for a particular DSP architecture by using register pairing. From reading the source code of lib/CodeGen/RegAllocPBQP.cpp, I conclude that support for
2018 Sep 11
2
linear-scan RA
> On Sep 10, 2018, at 5:25 PM, Matthias Braun <mbraun at apple.com> wrote: >> On Sep 10, 2018, at 5:11 PM, Preston Briggs <preston.briggs at gmail.com> wrote: >> >> The phi instruction is irrelevant; just the way I think about things. >> The question is if the allocator believes that t0 and t2
2018 Sep 11
2
linear-scan RA
Hi, Using Chaitin's approach, removing a copy via coalescing could expose more opportunities for coalescing. So he would iteratively rebuild the interference graph and check for more opportunities. Chaitin was also careful to make sure that the source and destination of a copy didn't interfere unnecessarily (because of the copy alone); that is, his approach to interference was very
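A minimal sketch of the rebuild-and-coalesce loop described in this post, under loud assumptions: straight-line code only (no control flow), instructions modeled as hypothetical `(dst, srcs, is_copy)` tuples, and no check on colorability. The names `build_interference` and `coalesce` are illustrative and are not taken from Chaitin's paper or from LLVM. The one detail it tries to capture is that a copy's destination is not made to interfere with its source by the copy alone.

```python
def build_interference(code, live_out):
    """Backward scan over straight-line code; returns a set of frozenset pairs."""
    edges = set()
    live = set(live_out)
    for dst, srcs, is_copy in reversed(code):
        if dst is not None:
            for other in live:
                # Copy-aware precision: the destination of a copy does not
                # interfere with its source merely because of the copy.
                if other != dst and not (is_copy and other == srcs[0]):
                    edges.add(frozenset((dst, other)))
            live.discard(dst)
        live.update(srcs)
    return edges


def coalesce(code, live_out):
    """Coalesce one copy at a time, rebuilding the graph after each merge."""
    code = list(code)
    live_out = set(live_out)
    while True:
        edges = build_interference(code, live_out)
        # Pick the first copy whose source and destination do not interfere.
        target = next(
            ((dst, srcs[0]) for dst, srcs, is_copy in code
             if is_copy and dst != srcs[0]
             and frozenset((dst, srcs[0])) not in edges),
            None,
        )
        if target is None:
            return code
        old, new = target

        def rename(reg):
            return new if reg == old else reg

        code = [
            (None if d is None else rename(d), tuple(rename(s) for s in srcs), c)
            for d, srcs, c in code
        ]
        # The coalesced copy is now "x = copy x"; drop it.
        code = [ins for ins in code if not (ins[2] and ins[0] == ins[1][0])]
        live_out = {rename(r) for r in live_out}


example = [
    ("a", (), False),         # a = ...
    ("b", ("a",), True),      # b = copy a
    ("c", ("a",), True),      # c = copy a
    (None, ("b", "c"), False),  # use b, c
]
```

In `example`, `b` and `c` interfere at first, so a graph that was only merged (never rebuilt) would carry that edge over to the merged `a`/`b` node and block the second copy. Rebuilding after the first merge removes the edge, which is the kind of newly exposed opportunity the post refers to.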
2018 Sep 11
2
linear-scan RA
> On Sep 11, 2018, at 11:42 AM, Quentin Colombet <quentin.colombet at gmail.com> wrote: > > On Tue, Sep 11, 2018 at 11:23, Preston Briggs <preston.briggs at gmail.com> wrote: >> >> Yes, I quite liked the things I've read about the PBQP allocator. >> >> Given what the hardware folks have to go through to get 1% improvements in scalar code,
2018 Sep 10
2
linear-scan RA
How precise is the interference checking (to my mind, a great weakness of linear scan)? Is there a way to do coalescing (the great strength of coloring)? I ask these questions because we (guys I work with) see loops where there's a little register juggling that seems unnecessary. Is there a paper that describes what y'all do? Thanks, Preston On Mon, Sep 10, 2018 at 9:57 AM, Matthias
2014 Mar 09
2
[LLVMdev] Evaluating the register allocators
Hello, I'm trying to evaluate the different register allocation algorithms in LLVM using the same level of optimizations. In version 3.3, the available register allocators are "basic, fast, greedy and pbqp". However, I'm facing the following issues: 1) I can't run basic and PBQP allocators using the command line flags of the dragonegg
2011 Dec 09
2
[LLVMdev] Spilling predicate registers
s/llvm-commits/llvmdev/ On Dec 9, 2011, at 12:58 PM, Arnold Schwaighofer wrote: > >> As Jakob pointed out to me, the core problem is that the current >> register scavenger implementation will only give you one register; for >> the PowerPC case, and it looks like for your case as well, we might >> really need two registers. In the short term, a reasonable solution
2018 Sep 10
2
linear-scan RA
> The underlying liveness data structure is a list of ranges where each vreg is alive > (ranges in terms of instructions numbered). I remember a couple of later linear scan > papers describing the same thing (Traub et al. being the first if I remember correctly). > That should be as accurate as you can get in terms of liveness information. It depends on the details. For example, given
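As a rough illustration of the "list of ranges" representation mentioned in the quoted text, here is a sketch of an interference test between two virtual registers. It assumes each live range is a sorted list of half-open `(start, end)` segments over instruction numbers; the function name `overlaps` and the segment encoding are hypothetical and are not LLVM's actual data structures.

```python
def overlaps(a, b):
    """Return True if two live ranges share any instruction number.

    a, b: sorted lists of half-open (start, end) segments.
    """
    i = j = 0
    while i < len(a) and j < len(b):
        a_start, a_end = a[i]
        b_start, b_end = b[j]
        if a_start < b_end and b_start < a_end:
            return True
        # Advance whichever segment finishes first.
        if a_end <= b_end:
            i += 1
        else:
            j += 1
    return False


# A range with a lifetime hole between 4 and 10.
with_hole = [(0, 4), (10, 12)]
```

With this encoding, a value that lives entirely inside the hole, such as `[(4, 10)]`, does not overlap `with_hole`, while `[(3, 5)]` does. Note that a pure overlap test like this one reports interference for a copy's source and destination whenever their segments overlap, even if both hold the same value.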
2011 Dec 09
0
[LLVMdev] Spilling predicate registers
> I am not sure extending the scavenger is the right way to go about this. > > There are two different situations where we might need extra registers to > spill something: > > 1. When spilling a weird register class like predicate registers, we > already know during register allocation that we will need a scratch GPR > to assist with the spill. > > 2. When spilling
2018 Sep 11
2
linear-scan RA
The phi instruction is irrelevant; just the way I think about things. The question is if the allocator believes that t0 and t2 interfere. Perhaps the coalescing example was too simple. In the general case, we can't coalesce without a notion of interference. My worry is that looking at interference by ranges of instruction numbers leads to inaccuracies when a range is introduced by a copy.
2017 Jan 21
12
[GlobalISel] Quick Status
Hi all, Following the thread from http://lists.llvm.org/pipermail/llvm-dev/2017-January/109029.html, I am sending this email to give a status on GlobalISel progress and situation. We are pushing GlobalISel from a prototype to a production-quality framework. We welcome help with patches, reviews, feedback and so on. As explained during the last developer meeting, we are aiming at
2011 Sep 19
6
[LLVMdev] Greedy Register Allocation in LLVM 3.0
I just uploaded a blog post outlining the new register allocation algorithm in LLVM 3.0. http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html Please direct comments here. /jakob
2011 Sep 26
0
[LLVMdev] Greedy Register Allocation in LLVM 3.0
Hi Jakob, Thanks for a very interesting description of the new register allocation algorithm in LLVM!!! Could you elaborate a bit on the following topics: 1) Do you have any plans to publish something more formal and detailed about this algorithm, e.g. a paper or something similar?  It would be nice to better understand how this algorithm relates to well-known algorithms described in the
2014 Oct 13
2
[LLVMdev] Problem of stack slot coloring
Hi, Can anyone help me with the stack slot coloring optimization? The corresponding file is /lib/codegen/stackslotcoloring.cpp. This optimization is said to overlay stack slots to reduce frame size, after the register allocation phase. This transformation pass relies on the LiveStack analysis pass. However, when checking the source code, it seems the LiveStack analysis has not been
2019 Feb 21
2
How to get Greedy RA to not spill results of trivially rematerializable instructions
I have encountered a rather odd situation with Greedy where it will end up spilling a register that was populated with a zero (with a trivially rematerializable load-immediate instruction). In fact, it spills 3 such values (LICM moves stuff out of a loop, register coalescer replaces copies with load-immediates and then Greedy spills them). I personally can't think of a situation where a spill
2019 Feb 21
2
How to get Greedy RA to not spill results of trivially rematerializable instructions
I do have a reproducer, but it's not for the faint of heart :) This is from a large and messy C file (Perlbench's regexec.c), reduced by bugpoint down to 1050 lines of IR. Perhaps I can paste it on pastebin. Just for fun, I added some debug dumps for machine instructions that spill registers (i.e. return non-zero from MachineInstr::getFoldedSpillSize()) that are fed by load-immediates and
2011 Sep 27
5
[LLVMdev] Greedy Register Allocation in LLVM 3.0
On Sep 26, 2011, at 4:22 AM, Leo Romanoff wrote: > Hi Jakob, > > Thanks for a very interesting description of the new register allocation algorithm in LLVM!!! > > Could you elaborate a bit on the following topics: > > 1) Do you have any plans to publish something more formal and detailed about this algorithm, e.g. a paper or something similar? It would be nice to better
2014 Oct 14
2
[LLVMdev] Problem of stack slot coloring
Hal's advice helped me a lot to understand the implementation much better. Thanks so much! So, now I am able to state my problem more clearly: 1) There are two kinds of locals, i.e., the local variables originating from the source code (like C/C++), and the compiler-generated temporaries. After the instruction selection phase, the former are seen as frame indexes, while the latter are seen as
2019 Feb 21
2
How to get Greedy RA to not spill results of trivially rematerializable instructions
Thanks for the reduced test case, I’ll try to take a look by the end of the week. > On Feb 20, 2019, at 6:53 PM, Nemanja Ivanovic <nemanja.i.ibm at gmail.com> wrote: > > Finally managed to reduce this to something manageable: https://godbolt.org/z/Hw529k > > On line 40 of the output, we have a load-immediate to put zero into R3. Then we