thr3ads.net - similar to: "Dynamically determine the CostPerUse value in the register allocator."

Displaying 20 results from an estimated 200 matches similar to: "Dynamically determine the CostPerUse value in the register allocator."

Dynamically determine the CostPerUse value in the register allocator.

2020 May 30

I don't know the history behind the term CostPerUse, so I may be missing the background associated with it. It seems to be a misnomer for what it is intended to mean. At first sight, the name suggests that the cost is a function of the uses of the register: the more uses, the higher the cost. How do we want to define the value of CostPerUse? Should it be a function of uses, or just of the target? On Sat, May 30,

[LLVMdev] Greedy register allocation

2011 May 03

On May 3, 2011, at 12:03 PM, David A. Greene wrote: >> >> The greedy allocator is trying to pick registers so inner loops are as >> small as possible, but that is not always the right thing to do. > > How does it balance that against spill cost? I added the CostPerUse field to the register descriptions. The allocator will try to minimize the spill weight assigned to

[LLVMdev] Greedy register allocation

2011 May 03

Jakob Stoklund Olesen <stoklund at 2pi.dk> writes: >>> The greedy allocator is trying to pick registers so inner loops are as >>> small as possible, but that is not always the right thing to do. >> >> How does it balance that against spill cost? > > I added the CostPerUse field to the register descriptions. The > allocator will try to minimize the

[LLVMdev] Data sharing between two ALUs and avoiding illegal copies

2012 Oct 26

Hi, I'm working on support for the latest generation of AMD GPUs (Southern Islands) in the R600 backend, and I need some advice on how to handle interactions between two different ALUs. The processors on Southern Islands GPUs are grouped into compute units, which contain 1 Scalar ALU (sALU) and 64 Vector ALUs (vALU). The sALU is mainly responsible for flow control (implemented using

Question about thinLTO

2017 Jul 13

On Thu, Jul 13, 2017 at 8:37 AM, Christudasan D <xander.cd at gmail.com> wrote: > > Hi Teresa, > > Yes, we plan to have our code at CG directly. > We use our own linker. That's the pain. We might only get a partial > benefit of thinLTO which occurs at compile time. > There is no compile-time only benefit of ThinLTO. You'll need the linker to interface with the

Question about thinLTO

2017 Jul 13

On Thu, Jul 13, 2017 at 2:54 AM, Christudasan D <xander.cd at gmail.com> wrote: > Thank you Teresa. > > Yes, I would like to save the IR (*.bc and/or *.ll) after all > optimizations (especially thinLTO) are done and call *llc* separately. > Is there any specific document available online to see more about this > feature and various command-line switches that a compiler

[LLVMdev] Greedy register allocation

2011 May 03

On May 3, 2011, at 3:23 PM, David A. Greene wrote: > Jakob Stoklund Olesen <stoklund at 2pi.dk> writes: > >>>> The greedy allocator is trying to pick registers so inner loops are as >>>> small as possible, but that is not always the right thing to do. >>> >>> How does it balance that against spill cost? >> >> I added the

[LAA] RtCheck on pointers of different address spaces.

2020 Jul 26

Hi Stefanos, Attached is the testcase. I tried to reduce it further, but the problem goes away when I remove more instructions. There is a nested loop and the fault occurs while processing the inner loop (for.body). To reproduce the crash: opt -O3 testcase.ll -o out.ll > `groupChecks()` will only try to group pointers that are on the same alias set. If that’s true, the RT check

[LLVMdev] Greedy register allocation

2011 May 03

Jakob Stoklund Olesen <stoklund at 2pi.dk> writes: >> Yikes! Do we know why these codes got so much worse? Even 5% is a big >> deal on x86. > > On x86-64, n-body and puzzle have the exact same instructions as with > linear scan. The only difference is the choice of registers. This > causes some loops to be a few bytes longer or shorter which can easily > change

Question about thinLTO

2017 Jul 12

On Wed, Jul 12, 2017 at 10:19 AM, Teresa Johnson <tejohnson at google.com> wrote: > Hi Christu, > > Thanks for the note! > > On Wed, Jul 12, 2017 at 9:56 AM, Christudasan D via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Hello, >> >> >> >> My impression on *thinLTO* when I first heard of it, (EuroLLVM2015) was >> about

[LLVMdev] Greedy register allocation

2011 May 03

On May 3, 2011, at 9:19 AM, David A. Greene wrote: > Jakob Stoklund Olesen <stoklund at 2pi.dk> writes: > >> +10.0% SingleSource/Benchmarks/CoyoteBench/huffbench >> +12.0% SingleSource/Benchmarks/McGill/chomp >> +18.0% SingleSource/Benchmarks/BenchmarkGame/n-body >> +45.5% SingleSource/Benchmarks/BenchmarkGame/puzzle >> +10.0%

[LLVMdev] Greedy register allocation

2011 May 03

Jakob Stoklund Olesen <stoklund at 2pi.dk> writes: >> That was my initial reaction. Splitting should have at least >> rematerialized the value just before header2. That should significantly >> improve things. This is a classic motivational case for live range >> splitting. > > Well, not really. Note that there are plenty of registers available > and no

Question about thinLTO

2017 Jul 12

Hello, My impression of *thinLTO* when I first heard of it (EuroLLVM2015) was that it was about achieving Cross Module Optimization (CMO) at the IR level: parallel front-end compilation and initial optimization first, then a thin-link of the individual input units, more optimization by calling opt again on the combined IR, and finally the target codegen using llc. A transformation similar to the

How to describe the RegisterInfo?

2016 Aug 23

Hi Escha, Great to have your comment! Do you have any specific reason for not doing it like this? I am not sure whether I understand your point correctly. By "just model one thread", do you mean "only considering ONE of the 8/16 working lanes that run in lock-step"? In my case, maybe I only need to define r0~r127 as i32 registers (each r#

[LLVMdev] Greedy register allocation

2011 May 04

On May 3, 2011, at 4:08 PM, David A. Greene wrote: >> >> It's just that an REX prefix is required on some instructions when >> %xmm8 is used. Is it worth it to undo LICM just for that? In this >> case, probably. In general, no. > > Ah, so you're saying the regression is due to the inner loop icache > footprint increasing. Ok, that makes total sense to

[LAA] RtCheck on pointers of different address spaces.

2020 Jul 26

Hello, I have a question related to the RT check on pointers during the Loop Access Analysis pass. There is a testcase with loop code that consists of 4 different memory operations referring to two global objects in different address spaces: one from global constant (address space 4, addr_size = 64) and the other from local, LDS (address space 3, addr_size = 32). (Details of various address spaces

Asssistance

2015 Mar 03

Hi to All, I am building a package in R and whenever I run command "R CMD build OAR" in the terminal, I get the following error: * checking for file 'OAR/DESCRIPTION' ... OK * preparing 'OAR': * checking DESCRIPTION meta-information ... ERROR Malformed Depends or Suggests or Imports or Enhances field. Offending entries: R (>=3.0.2) Entries must be names of packages optionally

equivalent to SAS genmod code in R?

2009 Feb 13

Hello, I have to run a general linear mixed model which looks at 2 dependent variables at the same time (var1 divided by var2). I have tried to search for such a model structure, but since I just started using R my search was not successful, especially since I only have an old SAS GENMOD code structure from my project supervisor as an indication. My question is now, does there exist a code

null model for a single species?

2011 Mar 07

Dear List members, I would like to test whether an observed occupancy of lakes in a landscape has occurred randomly (by chance) or not. How can I do that? The problem is that it concerns only a single species and I would like to use binary data only. At first I thought of generating null models and test the observed occupancy against the randomly generated one. However, this needs more than one

How to describe the RegisterInfo?

2016 Aug 23

Yes, the arch is just as you said, something like AMD GPU, but Intel GPUs don't have separate register files for 'scalar/vector'. In fact, my idea of defining the register tuples was borrowed from SIRegisterInfo.td in AMD GPU. But it seems that AMD GPU mainly supports i32/i64 register types, while Intel GPU also supports byte/short register types. So I have to start defining the registers from
