Displaying 20 results from an estimated 200 matches similar to: "VBROADCAST Implementation Issues"
2017 Aug 07
2
VBROADCAST Implementation Issues
Hello,
I did as you said,
Please tell me whether the following correct now??
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, _.KRCWM:$mask_wb),
(VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2),
"GATHER_256B\t{$src2, {$dst}{${mask}}|${dst} {${mask}},
$src2}"),
[(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32
(GatherNode
2017 Aug 06
2
VBROADCAST Implementation Issues
i want to implement gather for v64i32. i wrote following code.
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins
i2048mem:$src),
"GATHER_256B\t{$src, $dst|$dst, $src}",
[(set VR_2048:$dst, (v64i32 (masked_gather
addr:$src)))],
IIC_MOV_MEM>, TA;
def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B
2019 Dec 09
2
[PATCH] D70246: [InstCombine] remove identity shuffle simplification for mask with undefs
Sanjay,
I'm looking at some missed optimizations caused by D70246. Here's a test case:
define <4 x float> @f(i32 %t32, <4 x float>* %t24) {
.entry:
%t43 = insertelement <3 x i32> undef, i32 %t32, i32 2
%t44 = bitcast <3 x i32> %t43 to <3 x float>
%t45 = shufflevector <3 x float> %t44, <3 x float> undef, <4 x i32>
<i32 0, i32 undef,
2018 Jun 20
2
Node deletion during DAG Combination ?
Hi,
I'm trying to optimize the 'extract_vector_elt' for my SIMD microcontroller.
The idea is, during DAG combination, to merge load/extract sequence into an architecture specific node.
During Instruction Selection, this specific node will be target selected to an architecture specific instruction.
By 'combination of DAG nodes' I understand 'replacing a set of DAG nodes by
2016 Aug 02
2
Instruction selection problems due to SelectionDAGBuilder
Hello.
I'm having problems at instruction selection with my back end with the following
basic-block due to a vector add with immediate constant vector (obtained by vectorizing a
simple C program doing vector sum map):
vector.ph: ; preds = %vector.memcheck50
%.splatinsert = insertelement <8 x i64> undef, i64 %i.07.unr, i32 0
2018 Apr 10
1
64 bit mask in x86vshuffle instruction
Please tell me whether the following implementation is correct.....
My target supports 64 bit mask means immediate(0-2^63)
I have implemented it but i dont know whether its correct or not. Please
see the changes below that i have made in x86isellowering.cpp
static SDValue lower2048BitVectorShuffle(const SDLoc &DL, ArrayRef<int>
Mask,
MVT VT,
2010 Sep 04
4
Please explain "do.call" in this context, or critique to "stack this list faster"
I've been doing some consulting with students who seem to come to R
from SAS. They are usually pre-occupied with do loops and it is tough
to persuade them to trust R lists rather than keeping 100s of named
matrices floating around.
Often it happens that there is a list with lots of matrices or data
frames in it and we need to "stack those together". I thought it
would be a simple
2017 Oct 13
2
[SelectionDAG] Assertion due to MachineMemOperand flags difference.
Hello,
I've hit an assertion in SelectionDAG where we try to merge 2 loads
that have the same operands but their MMO flags differ. One is
dereferenceable and one is not. I'm not sure what the underlying issue
here is:
1) MDSDNode with the same operands should have the same flags set on
their respective MMO. The fact the flags differ when the
opcode,types,operands and address-space are
2012 Jul 09
2
[LLVMdev] question on table gen TIED_TO constraint
I need to implement an instruction which has 2 read-write registers, so I added
let Constraints = "$src1 = $dst, $mask = $mask_wb" in {
...
def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb),
(ins VR128:$src1, v128mem:$src2, VR128:$mask),
...
}
There is a problem since MRMSrcMem assumes the 2nd physical operand is a memory operand.
See the section about
2012 Jul 10
2
[LLVMdev] question on table gen TIED_TO constraint
Yes, there is an easy way to fix this.
MRMSrcMem assumes register, memory, vvvv register if VEX_4VOp3 is true and assumes register, vvvv register, memory if VEX_4V is true.
I just need to change the flag from VEX_4VOp3 to VEX_4V. There are a few places where we assume only the 2nd operand can be tied-to:
Desc->getOperandConstraint(1, MCOI::TIED_TO) != -1 (hard-coded index 1)
I will fix those
2012 Jul 10
0
[LLVMdev] question on table gen TIED_TO constraint
I don't think changing to VEX_4VOp3 to VEX_4V is the right fix. I think the
fix is to increment CurOp twice at the start for these instructions so that
only the input operands are used for encoding.
Also, I just submitted a patch to revert the operand order for these
instructions in the assembler/disassembler. Destination register should
appear on the right and the mask should appear on the
2012 Jul 10
0
[LLVMdev] question on table gen TIED_TO constraint
On Jul 9, 2012, at 4:15 PM, Manman Ren <mren at apple.com> wrote:
>
> I need to implement an instruction which has 2 read-write registers, so I added
> let Constraints = "$src1 = $dst, $mask = $mask_wb" in {
> ...
> def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb),
> (ins VR128:$src1, v128mem:$src2, VR128:$mask),
> ...
> }
2014 Sep 18
3
[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]
I tried to add an 'OptForSize' requirement to a pattern in X86InstrSSE.td,
but it appears to be ignored. However, the condition was detected when
specified as a predicate.
So this doesn't work:
def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))), (VMOVDDUPrm addr:
$src)>,
*Requires<[OptForSize**]>*;
But this does:
* let Predicates = [OptForSize]
2017 Jul 06
2
Error in v64i32 type in x86 backend
Hello,
i am experimenting with the increase in register/ vector width to 64
elements of 32 bits instead of 16 in x86 backend.
for eg.
i have a loop with 65 iterations;
if my IR generates v64i32 and 1 scalar, still the backend breaks the v64i32
into 4 v16i32. i want it to retain v64i32. like if there are 128 elements
in loop then it should break it into 2 v64i32 instructions.
in order to do this i
2017 Jul 07
2
Error in v64i32 type in x86 backend
Have you read http://llvm.org/docs/WritingAnLLVMBackend.html and
http://llvm.org/docs/CodeGenerator.html ?
http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector
describes how to define a store instruction.
-Eli
On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
> Please correct me i m stuck at this point.
>
> On Jul 6, 2017 5:18 PM, "hameeza ahmed"
2017 Jul 07
2
Error in v64i32 type in x86 backend
also i further run the following command;
llc -debug filer-knl_o3.ll
and its output is attached here. by looking at the output can we say that
legalization runs fine and the error is due to instruction selection/
pattern matching which is not yet implemented?
so do i need to worry and try to correct it at this stage or should i move
forward to implement instruction selection/ pattern matching?
2014 Sep 19
2
[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Tom Stellard
> Sent: 19 September 2014 01:36
> To: Sanjay Patel
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] predicates vs. requirements [TableGen,
> X86InstrInfo.td]
>
> On Thu, Sep 18, 2014 at 03:25:07PM -0600, Sanjay Patel wrote:
>
2017 Jul 07
2
Error in v64i32 type in x86 backend
Thank You.
On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com>
wrote:
> Yes, that error is from instruction selection. I think your legalization
> changes worked fine.
>
> ~Craig
>
> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> also i further run the following command;
2017 Jul 11
2
Using new types v32f32, v32f64 in llvm backend not possible
Thank you so much. it run fine.
Can you please resolve following issue;
I now have support for v2048i32
but my backend supports v64i32
so ultimately v2048i32 needs to be split into 32 v64i32 instructions. the
only difference between 2 is if its orginally v2048i32 i want my registers
assignment from REG_A set. if its v64i32 originally, then i want registers
from set REG_B.
How to accomplish
2017 Jul 08
2
Error in v64i32 type in x86 backend
Thank you. i understood how avx512 vector instructions are written in
x86instravx512. i need to define my vector instructions so i wrote;
def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins
i32mem:$src),
"vmov_256B_rm\t{$src, $dst|$dst, $src}",
[(set VR2048:$dst, (v64i32 (scalar_to_vector (loadi32
addr:$src))))],