thr3ads.net - llvm dev - [llvm-dev] Instruction selection problem with type i64

Alex Susu via llvm-dev

2016-Jun-28 19:43 UTC

[llvm-dev] Instruction selection problem with type i64 - mistaken as v8i64?

Hello.
     I am writing a back end in which I combined the existing BPF LLVM back end with the
Mips MSA vector extensions (from the Mips back end).
     I have encountered an error when compiling with llc: the instruction selector uses a
vector register instead of a scalar register of type i64.

     I have the following part of an LLVM IR program:
       vector.body.preheader:                            ; preds = %min.iters.checked
         br label %vector.body

       vector.body:                                      ; preds = %vector.body.preheader, %vector.body
         %index = phi i64 [ %index.next, %vector.body ], [ 0, %vector.body.preheader ]
         %vec.phi = phi <8 x i64> [ %0, %vector.body ], [ zeroinitializer, %vector.body.preheader ]

     The ASM code generated from it is the following:
LBB0_3:                                 // %vector.body.preheader
         REGVEC0 = 0
         mov     r0, 0
         std     -48(r10), r0
         std     -128(r10), REGVEC0
         jmp     LBB0_4
LBB0_4:                                 // %vector.body
         ldd     REGVEC0, -128(r10)
         ldd     r0, -48(r10)

     I am surprised that the BPF scalar instructions ldd and std use the vector register
REGVEC0, which has type v8i64.
     For example, the TableGen definition of the LOAD instruction taken from
BPFInstrInfo.td is:
       class LOADi64<bits<2> SizeOp, string OpcodeStr, PatFrag OpNode>
           : LOAD<SizeOp, OpcodeStr, [(set i64:$dst, (OpNode ADDRri:$addr))]>;

     So I am surprised that the instruction selector matches operand i64:$dst with the
vector register REGVEC0, which has type v8i64 as defined below (inspired by
lib/Target/Mips/MipsRegisterInfo.td):
       def MSA128D: RegisterClass<"Connex", [v8i64], 512,
                            (sequence "Wd%u", 0, 31)>;

     Can anybody help with an idea of what I can do to fix this problem?

     Below are a few possibly useful lines from the output of llc, related to the
instruction selection and register allocation of the above piece of code:
===== Instruction selection ends:
Selected selection DAG: BB#3 'foo:vector.body.preheader'
SelectionDAG has 11 nodes:
   t0: ch = EntryToken
         t1: i64 = MOV_ri TargetConstant:i64<0>
       t3: ch = CopyToReg t0, Register:i64 %vreg23, t1
         t11: v8i64 = VLOAD_D TargetConstant:i64<0>
       t6: ch = CopyToReg t0, Register:v8i64 %vreg24, t11
     t8: ch = TokenFactor t3, t6
   t9: ch = JMP BasicBlock:ch<vector.body 0xa61440>, t8

[...]

Spilling live registers at end of block.
Spilling %vreg31 in %R0 to stack slot #5
Spilling %vreg32 in %Wd0 to stack slot #6
BB#3: derived from LLVM BB %vector.body.preheader
     Predecessors according to CFG: BB#2
         %Wd0<def> = VLOAD_D 0
         %R0<def> = MOV_ri 0
         STD %R0<kill>, <fi#5>, 0
         STD %Wd0<kill>, <fi#6>, 0
         JMP <BB#4>
     Successors according to CFG: BB#4(0)

[...]

 >> JMP <BB#5>
Regs: R0 R1=%vreg31* R2=%vreg0 Wd0=%vreg32* Wd1
<< JMP <BB#5>
Spilling live registers at end of block.
Spilling %vreg31 in %R1 to stack slot #5
Spilling %vreg32 in %Wd0 to stack slot #6
BB#4: derived from LLVM BB %vector.body
     Predecessors according to CFG: BB#3 BB#4
         %Wd0<def> = LDD <fi#6>, 0
         %R0<def> = LDD <fi#5>, 0
         INLINEASM <es:int index;
for (index = 0;  index < N - (N % 8); index += 8) {.
     _BEGIN_KERNEL(BatchNumber);
         EXECUTE_IN_ALL(> [sideeffect] [attdialect]
         INLINEASM <es:connex->writeDataToArray(&C[index], /*numVectors*/ 1, /*offset*/ 3);> [sideeffect] [attdialect]
         %Wd1<def> = LD_D 3; mem:LD64[inttoptr (i64 3 to <8 x i64>*)](align=8)
         %Wd0<def> = ADDV_D %Wd1<kill>, %Wd0<kill>
         INLINEASM <es:       );.
     _END_KERNEL(BatchNumber);
   connex->executeKernel(TEST_PREFIX + to_string((long long int)BatchNumber));
   connex->executeKernel("waitfor");
   connex->readReduction();

[...]


BB#6: derived from LLVM BB %for.body.preheader8
     Predecessors according to CFG: BB#1 BB#2 BB#5
         %R0<def> = LDD <fi#3>, 0
         %R1<def> = MOV_ri 0
         STD %R0<kill>, <fi#7>, 0
         STD %R1<kill>, <fi#8>, 0
         JMP <BB#7>
     Successors according to CFG: BB#7(0)



   Thank you,
     Alex
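[Editorial sketch, not part of the original message.] The mismatch described above can be spotted mechanically in the llc debug output. The following is a minimal Python sketch, assuming that LDD, STD and MOV_ri are the scalar (i64) opcodes and that vector registers are the ones named Wd<N>, as in the dump quoted in this message. The function name find_mismatches and the sample string are hypothetical and only serve the illustration.

```python
import re

# Assumptions for illustration only: the scalar (i64) opcodes seen in the
# dump, and the naming scheme of the vector registers (Wd0..Wd31).
SCALAR_OPCODES = {"LDD", "STD", "MOV_ri"}
REG_OPERAND = re.compile(r"%(\w+)<(?:def|kill)>")
VECTOR_REG = re.compile(r"Wd\d+$")

# A few machine-instruction lines copied from the dump in the message.
SAMPLE_DUMP = """
%Wd0<def> = VLOAD_D 0
%R0<def> = MOV_ri 0
STD %R0<kill>, <fi#5>, 0
STD %Wd0<kill>, <fi#6>, 0
JMP <BB#4>
%Wd0<def> = LDD <fi#6>, 0
%R0<def> = LDD <fi#5>, 0
"""


def find_mismatches(dump):
    """Return (opcode, vector_registers) for scalar opcodes that touch Wd registers."""
    hits = []
    for line in dump.splitlines():
        text = line.strip()
        if not text:
            continue
        # The opcode is the first token after " = " if the line defines a
        # register, otherwise the first token of the line.
        rhs = text.split(" = ", 1)[1] if " = " in text else text
        opcode = rhs.split()[0]
        if opcode not in SCALAR_OPCODES:
            continue
        vector_regs = [r for r in REG_OPERAND.findall(text) if VECTOR_REG.match(r)]
        if vector_regs:
            hits.append((opcode, vector_regs))
    return hits


if __name__ == "__main__":
    for opcode, regs in find_mismatches(SAMPLE_DUMP):
        print(opcode, "uses vector register(s):", ", ".join(regs))
```

On the sample lines, the two flagged instructions are the STD and LDD that refer to <fi#6>, which is the stack slot reported for %Wd0 in the "Spilling" messages of the dump.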

Daniel Sanders via llvm-dev

2016-Jun-29 12:59 UTC

[llvm-dev] Instruction selection problem with type i64 - mistaken as v8i64?

Hi,

I vaguely remember hitting something like this when I was implementing MSA.
IIRC, there was an optimization (in DAGCombine or somewhere around there) that
was folding CopyToReg instructions into the load without checking whether the
new register class was acceptable. I remember adding a target hook to limit this
optimization based on the EVTs involved, but I'm not sure whether that's the
patch that I upstreamed or whether it was just an initial attempt at fixing it.
I had a quick look for a likely hook in the Mips backend and couldn't find it,
so I'm probably remembering an initial attempt.
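
[Editorial sketch, not part of the original message.] The kind of guard described above can be illustrated with a toy model in Python. It does not use or reproduce any LLVM API: RegClass, unguarded_fold and guarded_fold are hypothetical names, and the two register classes only mirror the types mentioned in this thread (i64 for the scalar registers, v8i64 for MSA128D).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegClass:
    """Toy register class: a name plus the value types it may hold."""
    name: str
    value_types: frozenset


# Hypothetical classes mirroring the thread: scalar i64 and vector v8i64.
GPR = RegClass("GPR", frozenset({"i64"}))
MSA128D = RegClass("MSA128D", frozenset({"v8i64"}))


def unguarded_fold(load_type, load_class, dest_class):
    """Fold the copy into the load unconditionally (the problematic shape)."""
    return dest_class


def guarded_fold(load_type, load_class, dest_class):
    """Fold only if the destination class can hold the loaded value type."""
    if load_type in dest_class.value_types:
        return dest_class
    return load_class


if __name__ == "__main__":
    # An i64 load whose result is copied towards a v8i64 register class.
    print(unguarded_fold("i64", GPR, MSA128D).name)
    print(guarded_fold("i64", GPR, MSA128D).name)
```

The only point of the sketch is the membership test in guarded_fold: the fold is skipped when the value type is not listed for the new register class, so the load keeps its original class.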

Alex Susu via llvm-dev

2016-Jul-16 20:00 UTC

[llvm-dev] Instruction selection problem with type i64 - mistaken as v8i64?

Hello, Daniel,
     I was about to argue that, since it works in Mips and not in BPF, it has to be
related to the back end code and not to the common source code. But I just updated my
local LLVM to the latest 3.9 version at the beginning of July and the bug disappeared, so
I guess somebody fixed this problem in the common code, probably around
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (I didn't manage to ).
     If you know exactly what the change is, please point me to it.

   Thank you,
     Alex

