Displaying 20 results from an estimated 20 matches for "regcoalescing".
2012 Apr 27
1
[LLVMdev] PreRASched
Hi,
I wonder when the preRASched pass is planned to be available?
I wonder how one would best try to implement a pass in between RegCoalescer and RA. After RegCoalescer, the LiveVariables information seems broken (there are no Kills anywhere), and LiveVariables can't be rerun after SSA form is left. So, how could one rebuild LiveIntervals? For register allocation purposes - what would be the
2009 Nov 15
0
[LLVMdev] Very slow performance of lli on x86
On Nov 14, 2009, at 11:52 PM, Prasanth J wrote:
> step 4:
> running monolith.bc for 10000 iterations using lli tool and measured the time.
How are you doing this?
-eric
2009 Nov 16
1
[LLVMdev] Very slow performance of lli on x86
...uce - Number of loop terminating conds optimized
1 machine-licm - Number of machine instructions hoisted out of loops
4 phielim - Number of atomic phis lowered
2 regalloc - Number of copies coalesced
27 regalloc - Number of iterations performed
3 regcoalescing - Number of cross class joins performed
44 regcoalescing - Number of identity moves eliminated after coalescing
1 regcoalescing - Number of instructions re-materialized
40 regcoalescing - Number of interval joins performed
2 scalar-evolution - Number of loops with predicta...
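The statistics listing in the snippet above follows a regular "count, name, dash, description" layout, so the regcoalescing counters can be pulled out mechanically. The following is a minimal Python sketch of that idea, not part of the original thread: the function name `parse_stats` and the regular expression are assumptions made for illustration, and the sample text contains only the complete lines from the snippet (the truncated first and last lines are left out).

```python
import re
from collections import defaultdict

# Complete lines copied from the snippet above; truncated lines are omitted.
SAMPLE = """\
1 machine-licm - Number of machine instructions hoisted out of loops
4 phielim - Number of atomic phis lowered
2 regalloc - Number of copies coalesced
27 regalloc - Number of iterations performed
3 regcoalescing - Number of cross class joins performed
44 regcoalescing - Number of identity moves eliminated after coalescing
1 regcoalescing - Number of instructions re-materialized
40 regcoalescing - Number of interval joins performed
"""

# Assumed line shape: <count> <name> - <description>
STAT_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+-\s+(.*)$")


def parse_stats(text):
    """Group counters by the name that precedes the ' - ' separator."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            count, name, description = match.groups()
            stats[name][description.strip()] = int(count)
    return dict(stats)


if __name__ == "__main__":
    for description, count in parse_stats(SAMPLE)["regcoalescing"].items():
        print(f"{count:5d}  {description}")
```

Grouping by name makes it easy to compare the coalescer's counters between two runs, for example before and after changing a register class definition.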
2011 Mar 26
0
[LLVMdev] Possible missed optimization?
...coalescer should work since the regclasses overlap completely.
Cross class coalescing also has some heuristics to prevent it from creating very small register classes. It is possible that it doesn't want to use PTRREGS because it only has 3 registers.
You can look at the output of -debug-only=regcoalescing to see what is going on.
/jakob
2010 Sep 04
3
[LLVMdev] Possible missed optimization?
...same result. GCC is emitting longer code, but since LLVM is so much nearer to the optimal code sequence, I wanted to reach it.
In LLVM, copies are coalesced away as much as possible before registers are allocated, so the allocation order wouldn't affect it.
Try looking at the output of -debug-only=regcoalescing to see what is going wrong.
2011 Mar 26
2
[LLVMdev] Possible missed optimization?
Hello Jakob, thanks for the reply. The three regclasses involved here are
all subsets of one another and aren't disjoint. These are the basic
descriptions of the regclasses involved to show what I mean:
DREGS: R31R30, R29R28 down to R1R0 (16 regs)
DLDREGS: R31R30, R29R28 down to R17R16 (8 regs)
PTRREGS: R31R30, R29R28, R27R26 (3 regs)
All classes intersect each other
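The nesting described in this snippet can be checked with plain sets. This is a small Python sketch, not taken from the thread: the helper `pair_class` is hypothetical, and it assumes that "down to" means every register pair between the two listed endpoints, stepping by two, which matches the register counts given in the message.

```python
def pair_class(high_first, high_last):
    """Return pair names such as 'R31R30', from high_first down to high_last in steps of two."""
    return {f"R{hi}R{hi - 1}" for hi in range(high_first, high_last - 1, -2)}


DREGS = pair_class(31, 1)     # R31R30, R29R28 down to R1R0
DLDREGS = pair_class(31, 17)  # R31R30, R29R28 down to R17R16
PTRREGS = pair_class(31, 27)  # R31R30, R29R28, R27R26

if __name__ == "__main__":
    print(len(DREGS), len(DLDREGS), len(PTRREGS))
    # Proper-subset chain: each smaller class is fully contained in the larger one.
    print(PTRREGS < DLDREGS < DREGS)
```

Under that assumption the three classes form a strict chain, so any register usable as PTRREGS is also valid for DLDREGS and DREGS, while the reverse does not hold.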
2009 Nov 15
5
[LLVMdev] Very slow performance of lli on x86
Hi all,
LLVM is built without debug enabled. Also, I am not forcing lli to use
interpreter mode, so I don't think the reason is a debug build
or interpreter mode.
*step 1:*
compiled the 3 files (generic_replica.c, xacc.c and dacc.c) with clang-cc to
llvm bytecode files using the -emit-llvm-bc and (-O0/-O3) options
*step 2:*
bytecode obtained from step 1 (generic_replica.bc, xacc.bc and
2011 Mar 26
2
[LLVMdev] Possible missed optimization?
>
> You can look at the output of -debug-only=regcoalescing to see what is
> going on.
>
This is the debug output I've got. Some of the information is a bit cryptic to me,
so next is what I understood:
********** SIMPLE REGISTER COALESCING **********
********** Function: foo
********** JOINING INTERVALS ***********
entry:
16L %vreg0<def> =...
2010 Sep 09
2
[LLVMdev] Possible missed optimization? 2.0
...mixing upper and lower parts from the start we can save r8 in the first
example and a later move. Notice that the second version stores the result of
a.low*b.low directly into R15:R14. I'm unsure if this is related to
http://llvm.org/bugs/show_bug.cgi?id=8112
I've attached a txt file with the regcoalescing output in case it's useful,
as requested in the previous emails.
Thanks
-------------- next part --------------
*******...
2010 Sep 04
1
[LLVMdev] Possible missed optimization?
Indeed, I've marked it as commutable:
let isCommutable = 1,
    isTwoAddress = 1 in
def XORRdRr : FRdRr<0b0010,
                    0b01,
                    (outs GPR8:$dst),
                    (ins GPR8:$src1, GPR8:$src2),
                    "xor\t$dst, $src2",
                    [(set GPR8:$dst, (xor GPR8:$src1, GPR8:$src2))]>;
2010 Sep 04
0
[LLVMdev] Possible missed optimization?
I've noticed this pattern happening with other operators as well, but used
xor in this example. As I said before, I tried different register
allocation orders, but it always produces the same result. GCC is
emitting longer code, but since LLVM is so much nearer to the optimal code
sequence, I wanted to reach it.
2010 Sep 09
0
[LLVMdev] Possible missed optimization? 2.0
...g upper and lower parts from the start we can save r8 in the first example and a later move. Notice that the second version stores the result of a.low*b.low directly into R15:R14. I'm unsure if this is related to http://llvm.org/bugs/show_bug.cgi?id=8112
> I've attached a txt file with the regcoalescing output in case it's useful, as requested in the previous emails.
I haven't looked closely, but on the surface it doesn't sound like a coalescing issue.
It sounds like you want different scheduling, or even different selection DAGs.
Does the -view-*-dags output look correct?
2012 May 09
0
[LLVMdev] instructions requiring specific physical registers for operands
Jim,
> An instruction that uses R0 and R1 as fixed input registers and R2 for output could define itself using those register classes:
> def myInst : baseclass<…, (outs GPRr2:$dst), (ins GPRr0:$src1, GPRr1:$src2), …>
> Use those reg classes in the pattern to match as well, and things should just work. The register allocator can take care of any reg-to-reg copies that are required.
As
2011 Mar 28
0
[LLVMdev] Possible missed optimization?
On Mar 26, 2011, at 4:09 PM, Borja Ferrer wrote:
> You can look at the output of -debug-only=regcoalescing to see what is going on.
>
> This is the debug output I've got. Some of the information is a bit cryptic to me, so next is what I understood:
>
> ********** SIMPLE REGISTER COALESCING **********
> ********** Function: foo
> ********** JOINING INTERVALS ***********
> entry:
>...
2012 May 09
2
[LLVMdev] instructions requiring specific physical registers for operands
On May 9, 2012, at 4:27 AM, Anton Korobeynikov wrote:
> Hello Jonas,
>
>> I wonder, what would be the best solution for instructions that require
>> operands in a particular register, and even give the result in a particular
>> register?
> You need to custom select such instruction. See e.g. div / idiv on x86
> as an example.
That's often easiest, yes;
2010 Sep 05
0
[LLVMdev] Possible missed optimization?
...is emitting longer code, but since LLVM is so much nearer to the optimal code sequence, I wanted to reach it.
>
> In LLVM, copies are coalesced away as much as possible before registers are allocated, so the allocation order wouldn't affect it.
>
> Try looking at the output of -debug-only=regcoalescing to see what is going wrong.
If you want to take a look at this yourself, the issue is easy to
reproduce with Thumb1:
$ cat > test.c
typedef unsigned long long t;
t foo(t a, t b)
{
    t a4 = b^a^18;
    return a4;
}
$ clang -cc1 -triple thumbv5-u-u -S -O2 test.c -o -
[...]
eors r1, r3
mov r3,...
2012 Oct 25
0
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
...Vincent Lejeune wrote:
> Hi,
>
> I don't know if my LLVM IR code is faulty or if I've spotted a bug in the RegisterCoalescing pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below.
>
> The interesting part is the vreg6/vreg48 reduction: before RegCoalescing, the machine code is:
>
> // BEFORE LOOP
> ... Some COPYs....
> 400B %vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2
> 416B %vreg48<def> = COPY %vreg3<kill>; R600_Reg128:%vreg48,%vreg3
> 432B %vreg49<def> = COPY %vreg13<kill>; R600_Reg...
2012 Oct 25
2
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
>
> PHIElim and TwoAddress passes leave SSA form.
> Maybe I missed something in your code, but %vreg48 seems to be there
> after PHI elimination. PHIElim tags those kinds of registers as being
> PHIJoin regs, updating the LiveVariables pass, so the regcoalescer is aware
> of them (some SSA info is still alive, but the reg coalescer will
> invalidate that information after
2012 Oct 25
0
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
When examining the debug output of regalloc, it seems that joining a 32-bit reg also joins its 128-bit parent reg.
If I look at the instruction:
%vreg34<def> = COPY %vreg6:sel_y; R600_Reg32:%vreg34 R600_Reg128:%vreg6
it gets joined to:
928B %vreg34<def> = COPY %vreg48:sel_y;
when vreg6 and vreg48 are joined. That is correct.
But joining the following copy
2012 Oct 24
3
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
Hi,
I don't know if my LLVM IR code is faulty or if I've spotted a bug in the RegisterCoalescing pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below.
The interesting part is the vreg6/vreg48 reduction: before RegCoalescing, the machine code is:
// BEFORE LOOP
... Some COPYs....
400B %vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2
416B %vreg48<def> = COPY %vreg3<kill>; R600_Reg128:%vreg48,%vreg3
432B %vreg49<def> = COPY %vreg13<kill>; R600_Reg32:%vreg49,%vreg13
Succes...