Alex Susu via llvm-dev
2016-Jun-28 19:43 UTC
[llvm-dev] Instruction selection problem with type i64 - mistaken as v8i64?
Hello.

I am writing a back end in which I combined the existing BPF LLVM back end with the Mips MSA vector extensions (from the Mips back end). I have encountered an error when compiling with llc: the instruction selector uses a vector register instead of a scalar register with type i64.

I have the following part of an LLVM IR program:

    vector.body.preheader:                            ; preds = %min.iters.checked
      br label %vector.body

    vector.body:                                      ; preds = %vector.body.preheader, %vector.body
      %index = phi i64 [ %index.next, %vector.body ], [ 0, %vector.body.preheader ]
      %vec.phi = phi <8 x i64> [ %0, %vector.body ], [ zeroinitializer, %vector.body.preheader ]

The ASM code generated from it is the following:

    LBB0_3: // %vector.body.preheader
        REGVEC0 = 0
        mov r0, 0
        std -48(r10), r0
        std -128(r10), REGVEC0
        jmp LBB0_4
    LBB0_4: // %vector.body
        ldd REGVEC0, -128(r10)
        ldd r0, -48(r10)

I am surprised that the BPF scalar instructions ldd and std use the vector register REGVEC0, which has type v8i64. For example, the TableGen definition of the LOAD instruction taken from BPFInstrInfo.td is:

    class LOADi64<bits<2> SizeOp, string OpcodeStr, PatFrag OpNode>
        : LOAD<SizeOp, OpcodeStr, [(set i64:$dst, (OpNode ADDRri:$addr))]>;

So I am surprised that the instruction selector accepts the vector register REGVEC0 as a match for the operand i64:$dst. REGVEC0 has type v8i64, as defined below (inspired by lib/Target/Mips/MipsRegisterInfo.td):

    def MSA128D: RegisterClass<"Connex", [v8i64], 512,
                               (sequence "Wd%u", 0, 31)>;

Can anybody help with an idea of what I can do to fix this problem?

Below are a few possibly useful lines from the output of llc, related to the instruction selection and register allocation of the above piece of code:

    ===== Instruction selection ends:
    Selected selection DAG: BB#3 'foo:vector.body.preheader'
    SelectionDAG has 11 nodes:
      t0: ch = EntryToken
      t1: i64 = MOV_ri TargetConstant:i64<0>
      t3: ch = CopyToReg t0, Register:i64 %vreg23, t1
      t11: v8i64 = VLOAD_D TargetConstant:i64<0>
      t6: ch = CopyToReg t0, Register:v8i64 %vreg24, t11
      t8: ch = TokenFactor t3, t6
      t9: ch = JMP BasicBlock:ch<vector.body 0xa61440>, t8

    [...]

    Spilling live registers at end of block.
    Spilling %vreg31 in %R0 to stack slot #5
    Spilling %vreg32 in %Wd0 to stack slot #6
    BB#3: derived from LLVM BB %vector.body.preheader
        Predecessors according to CFG: BB#2
        %Wd0<def> = VLOAD_D 0
        %R0<def> = MOV_ri 0
        STD %R0<kill>, <fi#5>, 0
        STD %Wd0<kill>, <fi#6>, 0
        JMP <BB#4>
        Successors according to CFG: BB#4(0)

    [...]

    >> JMP <BB#5>
    Regs: R0 R1=%vreg31* R2=%vreg0 Wd0=%vreg32* Wd1
    << JMP <BB#5>
    Spilling live registers at end of block.
    Spilling %vreg31 in %R1 to stack slot #5
    Spilling %vreg32 in %Wd0 to stack slot #6
    BB#4: derived from LLVM BB %vector.body
        Predecessors according to CFG: BB#3 BB#4
        %Wd0<def> = LDD <fi#6>, 0
        %R0<def> = LDD <fi#5>, 0
        INLINEASM <es:int index;
            for (index = 0; index < N - (N % 8); index += 8) {.
            _BEGIN_KERNEL(BatchNumber);
            EXECUTE_IN_ALL(> [sideeffect] [attdialect]
        INLINEASM <es:connex->writeDataToArray(&C[index], /*numVectors*/ 1, /*offset*/ 3);> [sideeffect] [attdialect]
        %Wd1<def> = LD_D 3; mem:LD64[inttoptr (i64 3 to <8 x i64>*)](align=8)
        %Wd0<def> = ADDV_D %Wd1<kill>, %Wd0<kill>
        INLINEASM <es: );.
            _END_KERNEL(BatchNumber);
            connex->executeKernel(TEST_PREFIX + to_string((long long int)BatchNumber));
            connex->executeKernel("waitfor");
            connex->readReduction();

    [...]

    BB#6: derived from LLVM BB %for.body.preheader8
        Predecessors according to CFG: BB#1 BB#2 BB#5
        %R0<def> = LDD <fi#3>, 0
        %R1<def> = MOV_ri 0
        STD %R0<kill>, <fi#7>, 0
        STD %R1<kill>, <fi#8>, 0
        JMP <BB#7>
        Successors according to CFG: BB#7(0)

Thank you,
Alex
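In the log above, the STD and LDD instructions that touch Wd0 appear directly after the "Spilling live registers at end of block" messages, while the selected DAG still pairs VLOAD_D with a v8i64 register. That suggests they may be spill and reload code rather than the result of pattern matching. In LLVM such code is emitted by the target's TargetInstrInfo::storeRegToStackSlot and loadRegFromStackSlot hooks. The thread does not confirm that this was the cause, so the following is only a toy sketch of the idea of choosing the spill opcode by register class. The RegClass enum and both functions are hypothetical and are not LLVM APIs, and ST_D is an assumed name for the vector store (the log only shows the vector load LD_D).

```cpp
#include <stdexcept>
#include <string>

// Toy model only: a hypothetical stand-in for the two register classes in
// the thread (scalar i64 registers R0.. and v8i64 vector registers Wd0..).
enum class RegClass { GPR, MSA128D };

// Pick the store mnemonic used to spill a register of class `rc`.
// STD is the 64-bit scalar store seen in the log; ST_D is an assumed
// name for the matching vector store.
std::string spillStoreOpcode(RegClass rc) {
    switch (rc) {
    case RegClass::GPR:     return "STD";
    case RegClass::MSA128D: return "ST_D";
    }
    throw std::logic_error("cannot spill this register class");
}

// Pick the load mnemonic used to reload a spilled register of class `rc`.
// LDD and LD_D both appear in the log.
std::string reloadOpcode(RegClass rc) {
    switch (rc) {
    case RegClass::GPR:     return "LDD";
    case RegClass::MSA128D: return "LD_D";
    }
    throw std::logic_error("cannot reload this register class");
}
```

With a dispatch like this, a spill of Wd0 would never be lowered to the scalar STD; a back end whose hook handles only the scalar class would be expected to fail loudly instead of silently reusing the scalar opcode.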
Daniel Sanders via llvm-dev
2016-Jun-29 12:59 UTC
[llvm-dev] Instruction selection problem with type i64 - mistaken as v8i64?
Hi,

I vaguely remember hitting something like this when I was implementing MSA. IIRC, there was an optimization (in DAGCombine or somewhere around there) that was folding CopyToReg instructions into the load without checking whether the new register class was acceptable. I remember adding a target hook to limit this optimization based on the EVTs involved, but I'm not sure whether that's the patch I upstreamed or whether it was just an initial attempt at fixing it. I had a quick look for a likely hook in the Mips backend and couldn't find one, so I'm probably remembering an initial attempt.
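The kind of guard described above, refusing to fold a copy unless the destination register class can hold the value's type, can be sketched in isolation. This is a toy model under stated assumptions: ValueType, RegisterClass, classSupports and canFoldCopy are hypothetical names invented for illustration and do not correspond to the actual LLVM hook, which the thread does not identify.

```cpp
#include <algorithm>
#include <vector>

// Toy model only: hypothetical stand-ins for LLVM value types and
// register classes, not the real LLVM data structures.
enum class ValueType { i64, v8i64 };

struct RegisterClass {
    std::vector<ValueType> legalTypes;  // value types this class may hold
};

// True if a value of type `vt` may live in a register of class `rc`.
bool classSupports(const RegisterClass &rc, ValueType vt) {
    return std::find(rc.legalTypes.begin(), rc.legalTypes.end(), vt) !=
           rc.legalTypes.end();
}

// Guard for folding a copy into the instruction that defines the value:
// allow the fold only when the destination class accepts the value's type.
bool canFoldCopy(ValueType defType, const RegisterClass &destClass) {
    return classSupports(destClass, defType);
}
```

For the two classes in this thread, a scalar class holding only i64 and a vector class holding only v8i64, the check rejects folding an i64 definition into a vector-class destination and vice versa.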
Alex Susu via llvm-dev
2016-Jul-16 20:00 UTC
[llvm-dev] Instruction selection problem with type i64 - mistaken as v8i64?
Hello, Daniel,

I was about to argue that, since it works in Mips and not in BPF, it has to be related to the back end code and not the common source code. But I updated my local LLVM to the latest 3.9 version at the beginning of July and the bug disappeared, so I guess somebody fixed this problem in the common code, probably around llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (I didn't manage to ). If you know exactly what the change is, please point me to it.

Thank you,
Alex