thr3ads.net - similar to: "VBROADCAST Implementation Issues"

Displaying 20 results from an estimated 200 matches similar to: "VBROADCAST Implementation Issues"

2017 Aug 07

VBROADCAST Implementation Issues

Hello, I did as you said, Please tell me whether the following correct now?? def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, _.KRCWM:$mask_wb), (VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2), "GATHER_256B\t{$src2, {$dst}{${mask}}|${dst} {${mask}}, $src2}"), [(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32 (GatherNode

VBROADCAST Implementation Issues

2017 Aug 06

VBROADCAST Implementation Issues

i want to implement gather for v64i32. i wrote following code. def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins i2048mem:$src), "GATHER_256B\t{$src, $dst|$dst, $src}", [(set VR_2048:$dst, (v64i32 (masked_gather addr:$src)))], IIC_MOV_MEM>, TA; def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B

[PATCH] D70246: [InstCombine] remove identity shuffle simplification for mask with undefs

2019 Dec 09

[PATCH] D70246: [InstCombine] remove identity shuffle simplification for mask with undefs

Sanjay, I'm looking at some missed optimizations caused by D70246. Here's a test case: define <4 x float> @f(i32 %t32, <4 x float>* %t24) { .entry: %t43 = insertelement <3 x i32> undef, i32 %t32, i32 2 %t44 = bitcast <3 x i32> %t43 to <3 x float> %t45 = shufflevector <3 x float> %t44, <3 x float> undef, <4 x i32> <i32 0, i32 undef,

Node deletion during DAG Combination ?

2018 Jun 20

Node deletion during DAG Combination ?

Hi, I'm trying to optimize the 'extract_vector_elt' for my SIMD microcontroller. The idea is, during DAG combination, to merge load/extract sequence into an architecture specific node. During Instruction Selection, this specific node will be target selected to an architecture specific instruction. By 'combination of DAG nodes' I understand 'replacing a set of DAG nodes by

Instruction selection problems due to SelectionDAGBuilder

2016 Aug 02

Instruction selection problems due to SelectionDAGBuilder

Hello. I'm having problems at instruction selection with my back end with the following basic-block due to a vector add with immediate constant vector (obtained by vectorizing a simple C program doing vector sum map): vector.ph: ; preds = %vector.memcheck50 %.splatinsert = insertelement <8 x i64> undef, i64 %i.07.unr, i32 0

64 bit mask in x86vshuffle instruction

2018 Apr 10

64 bit mask in x86vshuffle instruction

Please tell me whether the following implementation is correct..... My target supports 64 bit mask means immediate(0-2^63) I have implemented it but i dont know whether its correct or not. Please see the changes below that i have made in x86isellowering.cpp static SDValue lower2048BitVectorShuffle(const SDLoc &DL, ArrayRef<int> Mask, MVT VT,

Please explain "do.call" in this context, or critique to "stack this list faster"

2010 Sep 04

Please explain "do.call" in this context, or critique to "stack this list faster"

I've been doing some consulting with students who seem to come to R from SAS. They are usually pre-occupied with do loops and it is tough to persuade them to trust R lists rather than keeping 100s of named matrices floating around. Often it happens that there is a list with lots of matrices or data frames in it and we need to "stack those together". I thought it would be a simple

[SelectionDAG] Assertion due to MachineMemOperand flags difference.

2017 Oct 13

[SelectionDAG] Assertion due to MachineMemOperand flags difference.

Hello, I've hit an assertion in SelectionDAG where we try to merge 2 loads that have the same operands but their MMO flags differ. One is dereferenceable and one is not. I'm not sure what the underlying issue here is: 1) MDSDNode with the same operands should have the same flags set on their respective MMO. The fact the flags differ when the opcode,types,operands and address-space are

[LLVMdev] question on table gen TIED_TO constraint

2012 Jul 09

[LLVMdev] question on table gen TIED_TO constraint

I need to implement an instruction which has 2 read-write registers, so I added let Constraints = "$src1 = $dst, $mask = $mask_wb" in { ... def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb), (ins VR128:$src1, v128mem:$src2, VR128:$mask), ... } There is a problem since MRMSrcMem assumes the 2nd physical operand is a memory operand. See the section about

[LLVMdev] question on table gen TIED_TO constraint

2012 Jul 10

[LLVMdev] question on table gen TIED_TO constraint

Yes, there is an easy way to fix this. MRMSrcMem assumes register, memory, vvvv register if VEX_4VOp3 is true and assumes register, vvvv register, memory if VEX_4V is true. I just need to change the flag from VEX_4VOp3 to VEX_4V. There are a few places where we assume only the 2nd operand can be tied-to: Desc->getOperandConstraint(1, MCOI::TIED_TO) != -1 (hard-coded index 1) I will fix those

[LLVMdev] question on table gen TIED_TO constraint

2012 Jul 10

[LLVMdev] question on table gen TIED_TO constraint

I don't think changing to VEX_4VOp3 to VEX_4V is the right fix. I think the fix is to increment CurOp twice at the start for these instructions so that only the input operands are used for encoding. Also, I just submitted a patch to revert the operand order for these instructions in the assembler/disassembler. Destination register should appear on the right and the mask should appear on the

[LLVMdev] question on table gen TIED_TO constraint

2012 Jul 10

[LLVMdev] question on table gen TIED_TO constraint

On Jul 9, 2012, at 4:15 PM, Manman Ren <mren at apple.com> wrote: > > I need to implement an instruction which has 2 read-write registers, so I added > let Constraints = "$src1 = $dst, $mask = $mask_wb" in { > ... > def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb), > (ins VR128:$src1, v128mem:$src2, VR128:$mask), > ... > }

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

2014 Sep 18

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

I tried to add an 'OptForSize' requirement to a pattern in X86InstrSSE.td, but it appears to be ignored. However, the condition was detected when specified as a predicate. So this doesn't work: def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))), (VMOVDDUPrm addr: $src)>, *Requires<[OptForSize**]>*; But this does: * let Predicates = [OptForSize]

Error in v64i32 type in x86 backend

2017 Jul 06

Error in v64i32 type in x86 backend

Hello, i am experimenting with the increase in register/ vector width to 64 elements of 32 bits instead of 16 in x86 backend. for eg. i have a loop with 65 iterations; if my IR generates v64i32 and 1 scalar, still the backend breaks the v64i32 into 4 v16i32. i want it to retain v64i32. like if there are 128 elements in loop then it should break it into 2 v64i32 instructions. in order to do this i

Error in v64i32 type in x86 backend

2017 Jul 07

Error in v64i32 type in x86 backend

Have you read http://llvm.org/docs/WritingAnLLVMBackend.html and http://llvm.org/docs/CodeGenerator.html ? http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector describes how to define a store instruction. -Eli On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote: > Please correct me i m stuck at this point. > > On Jul 6, 2017 5:18 PM, "hameeza ahmed"

Error in v64i32 type in x86 backend

2017 Jul 07

Error in v64i32 type in x86 backend

also i further run the following command; llc -debug filer-knl_o3.ll and its output is attached here. by looking at the output can we say that legalization runs fine and the error is due to instruction selection/ pattern matching which is not yet implemented? so do i need to worry and try to correct it at this stage or should i move forward to implement instruction selection/ pattern matching?

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

2014 Sep 19

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

> -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Tom Stellard > Sent: 19 September 2014 01:36 > To: Sanjay Patel > Cc: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] predicates vs. requirements [TableGen, > X86InstrInfo.td] > > On Thu, Sep 18, 2014 at 03:25:07PM -0600, Sanjay Patel wrote: >

Error in v64i32 type in x86 backend

2017 Jul 07

Error in v64i32 type in x86 backend

Thank You. On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com> wrote: > Yes, that error is from instruction selection. I think your legalization > changes worked fine. > > ~Craig > > On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> also i further run the following command;

Using new types v32f32, v32f64 in llvm backend not possible

2017 Jul 11

Using new types v32f32, v32f64 in llvm backend not possible

Thank you so much. it run fine. Can you please resolve following issue; I now have support for v2048i32 but my backend supports v64i32 so ultimately v2048i32 needs to be split into 32 v64i32 instructions. the only difference between 2 is if its orginally v2048i32 i want my registers assignment from REG_A set. if its v64i32 originally, then i want registers from set REG_B. How to accomplish

Error in v64i32 type in x86 backend

2017 Jul 08

Error in v64i32 type in x86 backend

Thank you. i understood how avx512 vector instructions are written in x86instravx512. i need to define my vector instructions so i wrote; def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins i32mem:$src), "vmov_256B_rm\t{$src, $dst|$dst, $src}", [(set VR2048:$dst, (v64i32 (scalar_to_vector (loadi32 addr:$src))))],

similar to: VBROADCAST Implementation Issues