thr3ads.net - search: "regcoalescing"

Displaying 20 results from an estimated 20 matches for "regcoalescing".

2012 Apr 27

[LLVMdev] PreRASched

Hi, I wonder when the preRASched pass is planned to be available? I wonder how one would best try to implement a pass in between RegCoalescer and RA. After RegCoalescer, the LiveVariables information seems broken (there are no Kills anywhere), and LiveVariables can't be rerun after SSA form is left. So, how could one rebuild LiveIntervals? For register allocation purposes - what would be the

2009 Nov 15

[LLVMdev] Very slow performance of lli on x86

On Nov 14, 2009, at 11:52 PM, Prasanth J wrote:
> step 4:
> running monolith.bc for 10000 iterations using lli tool and measured the time.

How are you doing this?

-eric

2009 Nov 16

[LLVMdev] Very slow performance of lli on x86

...uce - Number of loop terminating conds optimized
1 machine-licm - Number of machine instructions hoisted out of loops
4 phielim - Number of atomic phis lowered
2 regalloc - Number of copies coalesced
27 regalloc - Number of iterations performed
3 regcoalescing - Number of cross class joins performed
44 regcoalescing - Number of identity moves eliminated after coalescing
1 regcoalescing - Number of instructions re-materialized
40 regcoalescing - Number of interval joins performed
2 scalar-evolution - Number of loops with predicta...
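The statistics in this excerpt follow a "count, pass name, description" layout, so the regcoalescing counters can be pulled out mechanically. The following is only a minimal sketch: the `parse_stats` helper and the `SAMPLE` text are illustrative names introduced here (not part of LLVM), and the line layout is assumed from the excerpt itself.

```python
import re
from collections import defaultdict

# Statistic lines taken from the excerpt, one per line (layout assumed:
# "<count> <pass name> - <description>").
SAMPLE = """\
3 regcoalescing - Number of cross class joins performed
44 regcoalescing - Number of identity moves eliminated after coalescing
1 regcoalescing - Number of instructions re-materialized
40 regcoalescing - Number of interval joins performed
"""

LINE_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+-\s+(.*)$")

def parse_stats(text):
    """Group statistic counters by pass name (hypothetical helper)."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # silently skip lines that do not look like a statistic
            count, pass_name, description = match.groups()
            stats[pass_name][description.strip()] = int(count)
    return dict(stats)

stats = parse_stats(SAMPLE)
for description, count in stats["regcoalescing"].items():
    print(f"{count:>4}  {description}")
```

Grouping by pass name makes it easy to compare only the regcoalescing counters between two runs, for example the -O0 and -O3 builds mentioned in the thread.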

2011 Mar 26

[LLVMdev] Possible missed optimization?

...coalescer should work since the regclasses overlap completely. Cross class coalescing also has some heuristics to prevent it from creating very small register classes. It is possible that it doesn't want to use PTRREGS because it only has 3 registers. You can look at the output of -debug-only=regcoalescing to see what is going on. /jakob

2010 Sep 04

[LLVMdev] Possible missed optimization?

...same result. GCC is emitting longer code, but since LLVM is so much nearer to the optimal code sequence I wanted to reach it. In LLVM, copies are coalesced away as much as possible before registers are allocated, so the allocation order wouldn't affect it. Try looking at the output of -debug-only=regcoalescing to see what is going wrong.

2011 Mar 26

[LLVMdev] Possible missed optimization?

Hello Jakob, thanks for the reply. The three regclasses involved here are all subsets of each other and aren't disjoint. These are the basic descriptions of the regclasses involved to show what I mean:
DREGS: R31R30, R29R28 down to R1R0 (16 regs)
DLDREGS: R31R30, R29R28 down to R17R16 (8 regs)
PTRREGS: R31R30, R29R28, R27R26 (3 regs)
All classes intersect each other
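The nesting of the three register classes described here can be checked with plain sets. This is only an illustrative sketch: the `pair_names` helper is a hypothetical name introduced for this example, and the register pair names are generated from the ranges given in the post rather than taken from any target description file.

```python
def pair_names(high_start, high_stop):
    """Return pair names from R<high_start>R<high_start-1> down to
    R<high_stop>R<high_stop-1> (hypothetical helper for this sketch)."""
    return {f"R{hi}R{hi - 1}" for hi in range(high_start, high_stop - 1, -2)}

DREGS = pair_names(31, 1)     # R31R30 down to R1R0
DLDREGS = pair_names(31, 17)  # R31R30 down to R17R16
PTRREGS = pair_names(31, 27)  # R31R30, R29R28, R27R26

# Each class is a strict subset of the next larger one.
print(len(DREGS), len(DLDREGS), len(PTRREGS))
print(PTRREGS < DLDREGS < DREGS)
print(sorted(DREGS & DLDREGS & PTRREGS))
```

Because the classes are nested, the intersection of all three is simply PTRREGS, which is the 3-register class the coalescer heuristics mentioned in the thread may be reluctant to produce.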

2009 Nov 15

[LLVMdev] Very slow performance of lli on x86

Hi all, LLVM is built without debug enabled. Also, I am not forcing lli to use interpreter mode, so I don't think the cause is a debug build or interpreter mode.
*step 1:* compiled the 3 files (generic_replica.c, xacc.c and dacc.c) with clang-cc to LLVM bytecode files using the -emit-llvm-bc and (-O0/-O3) options
*step 2:* bytecode obtained from step 1 (generic_replica.bc, xacc.bc and

2011 Mar 26

[LLVMdev] Possible missed optimization?

> You can look at the output of -debug-only=regcoalescing to see what is
> going on.

This is the debug output I've got; some information is a bit cryptic for me, so next is what I understood:

********** SIMPLE REGISTER COALESCING **********
********** Function: foo
********** JOINING INTERVALS ***********
entry:
16L %vreg0<def> =...

2010 Sep 09

[LLVMdev] Possible missed optimization? 2.0

...mixing upper and lower parts from a start we can save r8 in the first example and a later move, notice that the second version stores directly the result of a.low*b.low into R15:R14. I'm unsure if this is related to http://llvm.org/bugs/show_bug.cgi?id=8112 I've attached a txt file with the regcoalescing output in case it's useful, as requested in the previous emails. Thanks

2010 Sep 04

[LLVMdev] Possible missed optimization?

Indeed, I've marked it as commutable:

let isCommutable = 1, isTwoAddress = 1 in
def XORRdRr : FRdRr<0b0010, 0b01, (outs GPR8:$dst), (ins GPR8:$src1, GPR8:$src2),
                    "xor\t$dst, $src2",
                    [(set GPR8:$dst, (xor GPR8:$src1, GPR8:$src2))]>;

2010 Sep 04

[LLVMdev] Possible missed optimization?

I've noticed this pattern happening with other operators as well, but used xor in this example. As I said before, I tried different register allocation orders, but it always produces the same result. GCC is emitting longer code, but since LLVM is so much nearer to the optimal code sequence I wanted to reach it.

2010 Sep 09

[LLVMdev] Possible missed optimization? 2.0

...g upper and lower parts from a start we can save r8 in the first example and a later move, notice that the second version stores directly the result of a.low*b.low into R15:R14. I'm unsure if this is related to http://llvm.org/bugs/show_bug.cgi?id=8112
> I've attached a txt file with the regcoalescing output in case it's useful like requested in the previous emails.

I haven't looked closely, but on the surface it doesn't sound like a coalescing issue. It sounds like you want different scheduling, or even different selection DAGs. Does the -view-*-dags output look correct?

2012 May 09

[LLVMdev] instructions requiring specific physical registers for operands

Jim,

> An instruction that uses R0 and R1 as fixed input registers and R2 for output could define itself using those register classes:
> def myInst : baseclass<…, (outs GPRr2:$dst), (ins GPRr0:$src1, GPRr1:$src2), …>
> Use those reg classes in the pattern to match also, and things should just work. The register allocator can take care of any reg-to-reg copies that are required.

As

2011 Mar 28

[LLVMdev] Possible missed optimization?

On Mar 26, 2011, at 4:09 PM, Borja Ferrer wrote:
> You can look at the output of -debug-only=regcoalescing to see what is going on.
>
> This is the debug output i've got, some information is a bit cryptic for me so next is what i understood:
>
> ********** SIMPLE REGISTER COALESCING **********
> ********** Function: foo
> ********** JOINING INTERVALS ***********
> entry:
>...

2012 May 09

[LLVMdev] instructions requiring specific physical registers for operands

On May 9, 2012, at 4:27 AM, Anton Korobeynikov wrote:
> Hello Jonas,
>
>> I wonder, what would be the best solution for instructions that require
>> operands in a particular register, and even gives the result in a particular
>> register?
> You need to custom select such instruction. See e.g. div / idiv on x86
> as an example.

That's often easiest, yes;

2010 Sep 05

[LLVMdev] Possible missed optimization?

...is emitting longer code, but since LLVM is so nearer to the optimal code sequence i wanted to reach it.
>
> In LLVM, copies are coalesced away as much as possible before registers are allocated, so the allocation order wouldn't affect it.
>
> Try looking at the output of -debug-only=regcoalescing to see what is going wrong.

If you want to take a look at this yourself, the issue is easy to reproduce with Thumb1:

$ cat > test.c
typedef unsigned long long t;
t foo(t a, t b) {
  t a4 = b^a^18;
  return a4;
}
$ clang -cc1 -triple thumbv5-u-u -S -O2 test.c -o -
[...]
eors r1, r3
mov r3,...

2012 Oct 25

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

...Vincent Lejeune wrote:
> Hi,
>
> I don't know if my LLVM IR code is faulty, or if I spotted a bug in the RegisterCoalescing pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below.
>
> The interesting part is the vreg6/vreg48 reduction: before RegCoalescing, the machine code is:
>
> // BEFORE LOOP
> ... Some COPYs....
> 400B %vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2
> 416B %vreg48<def> = COPY %vreg3<kill>; R600_Reg128:%vreg48,%vreg3
> 432B %vreg49<def> = COPY %vreg13<kill>; R600_Reg...

2012 Oct 25

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

> > PHIElim and TwoAddress passes leave SSA form.
> May be a missed something in your code but %vreg48 seems to be there
> after PHI elimination. PHIElim tags those kind of registers as being
> PHIJoin regs, updating LiveVariables pass, so the regcoalescer is aware
> of them (some SSA info is still alive but the reg coalescer will
> invalidate that information after

2012 Oct 25

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

When examining the debug output of regalloc, it seems that joining 32bits reg also joins 128 parent reg. If I look at the : %vreg34<def> = COPY %vreg6:sel_y; R600_Reg32:%vreg34 R600_Reg128:%vreg6 instructions ; it gets joined to : 928B%vreg34<def> = COPY %vreg48:sel_y; when vreg6 and vreg48 are joined. It's right. But joining the following copy

2012 Oct 24

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

Hi,

I don't know if my LLVM IR code is faulty, or if I spotted a bug in the RegisterCoalescing pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below.

The interesting part is the vreg6/vreg48 reduction: before RegCoalescing, the machine code is:

// BEFORE LOOP
... Some COPYs....
400B %vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2
416B %vreg48<def> = COPY %vreg3<kill>; R600_Reg128:%vreg48,%vreg3
432B %vreg49<def> = COPY %vreg13<kill>; R600_Reg32:%vreg49,%vreg13
Succes...
