Alex Susu via llvm-dev
2016-Jun-28 19:43 UTC
[llvm-dev] Instruction selection problem with type i64 - mistaken as v8i64?
Hello.
I am writing a back end that combines the existing BPF LLVM back end with the
Mips MSA vector extensions (from the Mips back end).
I have encountered an error when compiling with llc: the instruction selector
uses a vector register where a scalar register of type i64 is expected.
I have the following fragment of LLVM IR:
vector.body.preheader:                  ; preds = %min.iters.checked
  br label %vector.body

vector.body:                            ; preds = %vector.body.preheader, %vector.body
  %index = phi i64 [ %index.next, %vector.body ], [ 0, %vector.body.preheader ]
  %vec.phi = phi <8 x i64> [ %0, %vector.body ], [ zeroinitializer, %vector.body.preheader ]
The ASM code generated from it is the following:
LBB0_3: // %vector.body.preheader
REGVEC0 = 0
mov r0, 0
std -48(r10), r0
std -128(r10), REGVEC0
jmp LBB0_4
LBB0_4: // %vector.body
ldd REGVEC0, -128(r10)
ldd r0, -48(r10)
I am surprised that the BPF scalar instructions ldd and std use the vector
register REGVEC0, which has type v8i64.
For example, the TableGen definition of the LOAD instruction, taken from
BPFInstrInfo.td, is:
class LOADi64<bits<2> SizeOp, string OpcodeStr, PatFrag OpNode>
    : LOAD<SizeOp, OpcodeStr, [(set i64:$dst, (OpNode ADDRri:$addr))]>;
So I am surprised that the instruction selector accepts the vector register
REGVEC0 as a match for operand i64:$dst. REGVEC0 has type v8i64, as defined
below (inspired by lib/Target/Mips/MipsRegisterInfo.td):
def MSA128D: RegisterClass<"Connex", [v8i64], 512,
                           (sequence "Wd%u", 0, 31)>;
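The expectation stated above can be illustrated with a toy Python model (not LLVM code) of the rule that a register class lists the value types it may hold. Only MSA128D and its v8i64 type come from the definition above; the scalar class name GPR and its register range R0..R10 are assumptions made for illustration.

```python
# Toy model only: these tables are hand-written for illustration and are
# neither generated from TableGen nor read by LLVM.
REGISTER_CLASSES = {
    # Assumed scalar class (name and register range are hypothetical).
    "GPR": {"types": {"i64"}, "regs": [f"R{i}" for i in range(11)]},
    # Mirrors the MSA128D definition quoted above.
    "MSA128D": {"types": {"v8i64"}, "regs": [f"Wd{i}" for i in range(32)]},
}


def classes_for_type(value_type):
    """Return the names of the register classes that list value_type."""
    return sorted(
        name for name, rc in REGISTER_CLASSES.items()
        if value_type in rc["types"]
    )


def register_can_hold(reg, value_type):
    """True if reg belongs to some class that lists value_type."""
    return any(
        reg in rc["regs"] and value_type in rc["types"]
        for rc in REGISTER_CLASSES.values()
    )


print(classes_for_type("i64"))          # ['GPR']
print(register_can_hold("Wd0", "i64"))  # False
```

Under this model, an i64 operand such as i64:$dst should never be assigned a Wd register, which is why the ldd/std lines on REGVEC0 look wrong.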
Can anybody suggest how I can fix this problem?
Below are a few possibly useful lines from the output of llc, related to the
instruction selection and register allocation of the above piece of code:
===== Instruction selection ends:
Selected selection DAG: BB#3 'foo:vector.body.preheader'
SelectionDAG has 11 nodes:
t0: ch = EntryToken
t1: i64 = MOV_ri TargetConstant:i64<0>
t3: ch = CopyToReg t0, Register:i64 %vreg23, t1
t11: v8i64 = VLOAD_D TargetConstant:i64<0>
t6: ch = CopyToReg t0, Register:v8i64 %vreg24, t11
t8: ch = TokenFactor t3, t6
t9: ch = JMP BasicBlock:ch<vector.body 0xa61440>, t8
[...]
Spilling live registers at end of block.
Spilling %vreg31 in %R0 to stack slot #5
Spilling %vreg32 in %Wd0 to stack slot #6
BB#3: derived from LLVM BB %vector.body.preheader
Predecessors according to CFG: BB#2
%Wd0<def> = VLOAD_D 0
%R0<def> = MOV_ri 0
STD %R0<kill>, <fi#5>, 0
STD %Wd0<kill>, <fi#6>, 0
JMP <BB#4>
Successors according to CFG: BB#4(0)
[...]
>> JMP <BB#5>
Regs: R0 R1=%vreg31* R2=%vreg0 Wd0=%vreg32* Wd1
<< JMP <BB#5>
Spilling live registers at end of block.
Spilling %vreg31 in %R1 to stack slot #5
Spilling %vreg32 in %Wd0 to stack slot #6
BB#4: derived from LLVM BB %vector.body
Predecessors according to CFG: BB#3 BB#4
%Wd0<def> = LDD <fi#6>, 0
%R0<def> = LDD <fi#5>, 0
INLINEASM <es:int index;
for (index = 0; index < N - (N % 8); index += 8) {.
_BEGIN_KERNEL(BatchNumber);
EXECUTE_IN_ALL(> [sideeffect] [attdialect]
INLINEASM <es:connex->writeDataToArray(&C[index], /*numVectors*/ 1, /*offset*/ 3);> [sideeffect] [attdialect]
%Wd1<def> = LD_D 3; mem:LD64[inttoptr (i64 3 to <8 x i64>*)](align=8)
%Wd0<def> = ADDV_D %Wd1<kill>, %Wd0<kill>
INLINEASM <es: );.
_END_KERNEL(BatchNumber);
connex->executeKernel(TEST_PREFIX + to_string((long long int)BatchNumber));
connex->executeKernel("waitfor");
connex->readReduction();
[...]
BB#6: derived from LLVM BB %for.body.preheader8
Predecessors according to CFG: BB#1 BB#2 BB#5
%R0<def> = LDD <fi#3>, 0
%R1<def> = MOV_ri 0
STD %R0<kill>, <fi#7>, 0
STD %R1<kill>, <fi#8>, 0
JMP <BB#7>
Successors according to CFG: BB#7(0)
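In the dump above, the STD and LDD instructions that name %Wd0 appear directly after the "Spilling ... to stack slot" lines. A small script can flag such lines in a longer dump. This is a minimal sketch: the opcode set and the %Wd register prefix are assumptions taken from this dump, and nothing here is part of llc.

```python
import re

# Assumptions for this sketch: opcode names and the vector register prefix
# are copied from the dump in this message.
SCALAR_MEM_OPCODES = {"LDD", "STD"}
VECTOR_REG = re.compile(r"%Wd\d+")
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def suspicious_lines(dump):
    """Return dump lines where a scalar load/store names a vector register."""
    hits = []
    for line in dump.splitlines():
        tokens = set(IDENTIFIER.findall(line))
        if SCALAR_MEM_OPCODES & tokens and VECTOR_REG.search(line):
            hits.append(line.strip())
    return hits


DUMP = """\
%Wd0<def> = VLOAD_D 0
%R0<def> = MOV_ri 0
STD %R0<kill>, <fi#5>, 0
STD %Wd0<kill>, <fi#6>, 0
%Wd0<def> = LDD <fi#6>, 0
%R0<def> = LDD <fi#5>, 0
"""

for line in suspicious_lines(DUMP):
    print(line)
```

Whole identifiers are matched rather than substrings, so VLOAD_D is not mistaken for a scalar opcode.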
Thank you,
Alex
Daniel Sanders via llvm-dev
2016-Jun-29 12:59 UTC
[llvm-dev] Instruction selection problem with type i64 - mistaken as v8i64?
Hi,
I vaguely remember hitting something like this when I was implementing MSA.
IIRC, there was an optimization (in DAGCombine or somewhere around there) that
was folding CopyToReg instructions into the load without checking whether the
new register class was acceptable. I remember adding a target hook to limit
this optimization based on the EVT's involved but I'm not sure if that's the
patch that I upstreamed or if it was just an initial attempt at fixing it. I
had a quick look for a likely hook in the Mips backend and couldn't find it so
I'm probably remembering an initial attempt.
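The kind of guard Daniel describes can be sketched as a toy Python model. Every name here is hypothetical: allow_fold stands in for whatever target hook a back end might provide, and none of these functions are LLVM APIs. The point is only that the fold is refused when the destination register class does not list the type of the value being defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegClass:
    """Hypothetical stand-in for a register class and the types it lists."""
    name: str
    types: frozenset


GPR = RegClass("GPR", frozenset({"i64"}))            # assumed scalar class
MSA128D = RegClass("MSA128D", frozenset({"v8i64"}))  # from the original post


def allow_fold(value_type, dest_class):
    """Hypothetical target hook: may value_type live directly in dest_class?"""
    return value_type in dest_class.types


def fold_copy_into_def(def_type, copy_dest_class):
    """Return the class the defining instruction writes to after folding,
    or None when the copy has to be kept."""
    if allow_fold(def_type, copy_dest_class):
        return copy_dest_class
    return None


print(fold_copy_into_def("i64", GPR).name)   # GPR
print(fold_copy_into_def("i64", MSA128D))    # None
```

Without the allow_fold check, the second call would hand an i64-defining instruction a vector register, which matches the symptom reported in this thread.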
Alex Susu via llvm-dev
2016-Jul-16 20:00 UTC
[llvm-dev] Instruction selection problem with type i64 - mistaken as v8i64?
Hello, Daniel,
I was about to argue that, since it works in Mips and not in BPF, it has to be
related to the back end code and not the common source code. But I updated my
local LLVM to the latest 3.9 version at the beginning of July and the bug
disappeared, so I guess somebody fixed this problem in the common code, probably
around llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (I didn't manage to ).
If you know exactly which change it was, please point me to it.
Thank you,
Alex