Displaying 20 results from an estimated 20000 matches similar to: "VSHUFFLE Splitting"
2017 Jul 06
2
Error in v64i32 type in x86 backend
Hello,
i am experimenting with the increase in register/ vector width to 64
elements of 32 bits instead of 16 in x86 backend.
for eg.
i have a loop with 65 iterations;
if my IR generates v64i32 and 1 scalar, still the backend breaks the v64i32
into 4 v16i32. i want it to retain v64i32. like if there are 128 elements
in loop then it should break it into 2 v64i32 instructions.
in order to do this i
2017 Jul 07
2
Error in v64i32 type in x86 backend
Thank You.
On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com>
wrote:
> Yes, that error is from instruction selection. I think your legalization
> changes worked fine.
>
> ~Craig
>
> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> also i further run the following command;
2017 Jul 07
2
Error in v64i32 type in x86 backend
also i further run the following command;
llc -debug filer-knl_o3.ll
and its output is attached here. by looking at the output can we say that
legalization runs fine and the error is due to instruction selection/
pattern matching which is not yet implemented?
so do i need to worry and try to correct it at this stage or should i move
forward to implement instruction selection/ pattern matching?
2017 Jul 08
2
Error in v64i32 type in x86 backend
Thank you;
i have changed as follows.is it fine now?
def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR2048:$src1,
VR2048:$src2),
"VADD_256B\t{$src, $dst|$dst, $src}", [(set VR2048:$dst,
(add VR2048:$src1, VR2048:$src2))]]>;
Also here i have changed class RI to I. Does it make any difference?
On Sat, Jul 8, 2017 at 9:38 AM, Craig Topper
2017 Jul 08
2
Error in v64i32 type in x86 backend
Thank you. i understood how avx512 vector instructions are written in
x86instravx512. i need to define my vector instructions so i wrote;
def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins
i32mem:$src),
"vmov_256B_rm\t{$src, $dst|$dst, $src}",
[(set VR2048:$dst, (v64i32 (scalar_to_vector (loadi32
addr:$src))))],
2017 Jul 07
2
Error in v64i32 type in x86 backend
Have you read http://llvm.org/docs/WritingAnLLVMBackend.html and
http://llvm.org/docs/CodeGenerator.html ?
http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector
describes how to define a store instruction.
-Eli
On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
> Please correct me i m stuck at this point.
>
> On Jul 6, 2017 5:18 PM, "hameeza ahmed"
2017 Jul 08
5
Error in v64i32 type in x86 backend
Thank You.
I have seen the opcode is 8 bits and all the combinations are already used
in llvm x86.
Now what to do?
On Sat, Jul 8, 2017 at 10:57 AM, Craig Topper <craig.topper at gmail.com>
wrote:
> Yes its an opcode conflict. You'll have to look through Intel documents
> and find an unused opcode. I've only added instructions based on a real
> spec so I don't know
2017 Aug 07
2
VBROADCAST Implementation Issues
Hello,
I did as you said,
Please tell me whether the following correct now??
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, _.KRCWM:$mask_wb),
(VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2),
"GATHER_256B\t{$src2, {$dst}{${mask}}|${dst} {${mask}},
$src2}"),
[(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32
(GatherNode
2017 Aug 06
2
VBROADCAST Implementation Issues
i want to implement gather for v64i32. i wrote following code.
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins
i2048mem:$src),
"GATHER_256B\t{$src, $dst|$dst, $src}",
[(set VR_2048:$dst, (v64i32 (masked_gather
addr:$src)))],
IIC_MOV_MEM>, TA;
def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B
2017 Aug 07
3
VBROADCAST Implementation Issues
Thank You. Still getting errors.I have modified my instructions as you said
as follows:
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, VK64WM:$mask_wb),
(ins VR_2048:$src1, VK64WM:$mask, i2048mem:$src2),
"GATHER_256B\t{$src2, {$dst} {${mask}}|${dst}
{${mask}}, $src2}",
[(set VR_2048:$dst, VK64WM:$mask_wb, (v64i32
(masked_gather
2017 Jul 12
2
Using new types v32f32, v32f64 in llvm backend not possible
I would be very grateful if you specify whether there is some way to
allocate registers (different order) / from different register sets to the
same instruction based on the vector width/ no of iterations.
I have tried several alternatives but could not succeed.
Also I have asked this question many times but no one responds.
Is there something wrong with this??
Kindly guide me.
Thank You
On
2018 Apr 10
1
64 bit mask in x86vshuffle instruction
Please tell me whether the following implementation is correct.....
My target supports 64 bit mask means immediate(0-2^63)
I have implemented it but i dont know whether its correct or not. Please
see the changes below that i have made in x86isellowering.cpp
static SDValue lower2048BitVectorShuffle(const SDLoc &DL, ArrayRef<int>
Mask,
MVT VT,
2017 Jul 11
2
Using new types v32f32, v32f64 in llvm backend not possible
Thank you so much. it run fine.
Can you please resolve following issue;
I now have support for v2048i32
but my backend supports v64i32
so ultimately v2048i32 needs to be split into 32 v64i32 instructions. the
only difference between 2 is if its orginally v2048i32 i want my registers
assignment from REG_A set. if its v64i32 originally, then i want registers
from set REG_B.
How to accomplish
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
Sorry to ask but what does it mean to put both?
On Tue, Sep 5, 2017 at 4:01 AM, Craig Topper <craig.topper at gmail.com> wrote:
> Leave TA. Put both.
>
> ~Craig
>
> On Mon, Sep 4, 2017 at 4:00 PM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> You are right. But when i defined my instruction as follows:
>> def P_256B_VADD : I<0xE1,
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
You are right. But when i defined my instruction as follows:
def P_256B_VADD : I<0xE1, MRMDestReg, (outs VRP_2048:$dst), (ins
VRP_2048:$src1, VRPIM_2048:$src2),"P_256B_VADD\t{$src1, $src2, $dst|$dst,
$src1, $src2}", [(set VRP_2048:$dst, (add (v64i32 VRP_2048:$src1), (v64i32
VRP_2048:$src2)))]>, VEX_4V;
I get opcode conflicts? Then what to do?
On Tue, Sep 5, 2017 at 3:51 AM,
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
Thank You.
I used EVEX_4V with all the instructions. I replaced TA and EVEX both with
EVEX_4V. Now, I am getting following error:
llvm-tblgen: /utils/TableGen/X86RecognizableInstr.cpp:687: void
llvm::X86Disassembler::RecognizableInstr::emitInstructionSpecifier():
Assertion `numPhysicalOperands >= 2 + additionalOperands &&
numPhysicalOperands <= 4 + additionalOperands &&
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
Thank You,
I changed TA to EVEX or EVEX_4V. But now i am getting following error:
Invalid prefix!
UNREACHABLE executed at
/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:647!
On Tue, Sep 5, 2017 at 4:36 AM, Craig Topper <craig.topper at gmail.com> wrote:
> Not all instructions can use EVEX_4V. Move instructions in particular
> cannot because they don't have 2 sources.
>
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
I was getting same error when i keep both EVEX/EVEX_4V and TA. So, i
restored my original instructions and for that i have to include
bool HasTA = TSFlags & X86II::TA; in x86MCCodeEmitter.cpp
then used this condition;
if(HasTA)
++SrcRegNum;
in order to emit binary correctly.
Is it right?
On Tue, Sep 5, 2017 at 5:45 AM, Craig Topper <craig.topper at gmail.com> wrote:
>
2017 Jul 07
2
Unhandled reg/opcode register encoding VR2048 Error in backend
Hello,
I m working towards backend.
Here i need to define vector load and stores for 64 i32 elements. so in
x86instrinfo.td i wrote;
def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins
i32mem:$src),
"vmov_256B_rm\t{$src, $dst|$dst, $src}",
[(set VR2048:$dst, (v64i32 (scalar_to_vector (loadi32
addr:$src))))],
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
Thank You.
My add instruction has TA as follows:
def P_256B_VADD : I<0xE1, MRMDestReg, (outs VRP_2048:$dst), (ins
VRP_2048:$src1, VRPIM_2048:$src2),"P_256B_VADD\t{$src1, $src2, $dst|$dst,
$src1, $src2}", [(set VRP_2048:$dst, (add (v64i32 VRP_2048:$src1), (v64i32
VRP_2048:$src2)))]>, TA;
so i defined;
bool HasTA = TSFlags & X86II::TA; in x86MCCodeEmitter.cpp
then used