Displaying 20 results from an estimated 67 matches for "vector_shuffl".

2009 Feb 11
1
[LLVMdev] Prevent node from being combined
...to know if there is any way to prevent a node from being combined. TargetLowering::PerformDAGCombine() is only called if DAGCombiner cannot combine a specific node. It seems that there is no chance to stop it from combining a node. I need the shuffle mask in the machine instruction but sometimes if a vector_shuffle can only return LHS or RHS, it's removed/combined so that I cannot match vector_shuffle in the instruction selector. If the vector_shuffle is combined, I have to write the instruction selector like this: def SUBvv: MyInst<(ins REG:$src0, imm:$mask0, REG:$src1, imm:$mask1), [su...
2010 Aug 04
2
[LLVMdev] x86 Vector Shuffle Patterns
...ent over the old system and provides the context that code generation for AVX needs. This is great! I'm asking because I'm having some trouble converting some AVX patterns over to the new system. I'm getting this error from tblgen: VyPERM2F128PDirrmi: (set:isVoid VR256:v4i64:$dst, (vector_shuffle:v4i64 VR256:v4i64:$src1, (ld:v4i64 addr:iPTR:$src2)<<P:Predicate_unindexedload>><<P:Predicate_load>><<P:Predicate_memop>>)<<P:Predicate_vperm2f128>><<X:SHUFFLE_get_vperm2f128_imm>>) llvm/lib/Target/X86/X86InstrSIMD.td:1705:6: error: In Vy...
2009 Jan 05
2
[LLVMdev] Look-ahead instruction selection
In .td file, if the pattern to match the DAG is: (vector_shuffle (mul build_vector, build_vector)) is it possible to return 'mul' (SDNode*) instead of returning the first 'vector_shuffle'? It seems to me that the default instruction selector can only return the 'root' node of the pattern. Alex. -------------- next part --------------...
2009 Dec 02
5
[LLVMdev] Selecting Vector Shuffle of Different Types
...My attempt looks something like this: defm EXTRACTF128 : avx_fp_extract_vector_osta_node_mri_256<0x19, MRMDestReg, MRMDestMem, "extractf128", undef, X86f32, X86i32i8, // rr [(set VR128:$dst, (v4f32 (vector_shuffle (v8f32 undef), (v8f32 VR256:$src1), VEXTRACTF128_shuffle_mask:$src2)))]>; (This is simplified for the sake of exposition but this gets the idea across). TableGen reports a type contradiction: VEXTRACTF128_256mri: (st:...
2009 Jan 05
0
[LLVMdev] Look-ahead instruction selection
On Mon, Jan 5, 2009 at 2:32 PM, Alex <alex.lavoro.propio at gmail.com> wrote: > In .td file, if the pattern to match the DAG is: > > (vector_shuffle (mul build_vector, build_vector)) > > is it possible to return 'mul' (SDNode*) instead of returning the first > 'vector_shuffle'? > > It seems to me that the default instruction selector can only return the > 'root' node of the pattern. The simplest thing...
2016 Mar 18
3
generate vectorized code
...a813b0 [ORD=9] [ID=16] 0x6a85388: v4i32 = add 0x6a81098, 0x6a81e00 [ORD=8] [ID=15] 0x6a81098: v4i32 = add 0x6a81bf0, 0x6a84168 [ORD=6] [ID=12] 0x6a81bf0: v4i32,ch = CopyFromReg 0x6a2b7f0, 0x6a819e0 [ORD=5] [ID=8] 0x6a819e0: v4i32 = Register %vreg4 [ID=1] 0x6a84168: v4i32 = vector_shuffle 0x6a81bf0, 0x6a857a8<2,3,u,u> [ORD=5] [ID=10] 0x6a81bf0: v4i32,ch = CopyFromReg 0x6a2b7f0, 0x6a819e0 [ORD=5] [ID=8] 0x6a819e0: v4i32 = Register %vreg4 [ID=1] 0x6a857a8: v4i32 = undef [ID=2] 0x6a81e00: v4i32 = vector_shuffle 0x6a81098, 0x6a857a8<1,u,u,u> [O...
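The two shuffles in the dump above (masks <2,3,u,u> and <1,u,u,u>), each followed by an add, appear to form a log-step horizontal reduction of a v4i32. The following is a minimal sketch of that reading in plain C++, not LLVM code. The names `Vec4`, `Mask4`, `kUndef`, `shuffle`, `add`, and `horizontalSum` are hypothetical, undef lanes are modelled as zero, and reading the result from lane 0 is an assumption because the excerpt is truncated before any extract.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<int, 4>;
using Mask4 = std::array<int, 4>;
constexpr int kUndef = -1; // stands in for a 'u' mask entry (assumption)

// Two-input shuffle: mask indices 0..3 pick from a, 4..7 pick from b.
Vec4 shuffle(const Vec4& a, const Vec4& b, const Mask4& mask) {
    Vec4 out{}; // undef lanes are left as zero in this sketch
    for (int i = 0; i < 4; ++i) {
        const int idx = mask[i];
        if (idx == kUndef) continue;
        assert(idx >= 0 && idx < 8 && "mask index out of range");
        out[i] = idx < 4 ? a[idx] : b[idx - 4];
    }
    return out;
}

// Lane-wise addition, like the v4i32 add nodes in the dump.
Vec4 add(const Vec4& a, const Vec4& b) {
    Vec4 out{};
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];
    return out;
}

// Mirrors the dump: add(v, shuffle<2,3,u,u>), then add(step1, shuffle<1,u,u,u>).
int horizontalSum(const Vec4& v) {
    const Vec4 undef{};
    const Vec4 step1 = add(v, shuffle(v, undef, Mask4{2, 3, kUndef, kUndef}));
    const Vec4 step2 =
        add(step1, shuffle(step1, undef, Mask4{1, kUndef, kUndef, kUndef}));
    return step2[0]; // only lane 0 holds the full sum
}
```

For example, `horizontalSum(Vec4{1, 2, 3, 4})` folds lanes 2 and 3 onto lanes 0 and 1, then folds lane 1 onto lane 0.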
2012 Mar 02
1
[LLVMdev] vector shuffle emulation/expand in backend?
...s? I've enabled vector registers with addRegisterClass(MVT::v2i32, TCE::V2I32RegsRegisterClass); addRegisterClass(MVT::v2f32, TCE::V2F32RegsRegisterClass); and created patterns for most vector instructions, including insert, extract and build. I've tried to say setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v2i32, Expand); setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v2f32, Expand); but this does not seem to do anything, I still get LLVM ERROR: Cannot select: 0x1fde870: v2i32 = vector_shuffle 0x1fdda70, 0x1fdea80<1,0> [ID=38] 0x1fdda70: v2i32 = add 0x1fddf70, 0x20540e0 [ORD=2811] [I...
2008 Oct 07
2
[LLVMdev] Making Sense of ISel DAG Output
...lection. Some parts of selection know how to clean up > nodes that become dead during selection, but my guess is that > it's missing some cases. Ok, as far as I can tell, here's what's happening. I have the following pattern: let AddedComplexity = 40 in { def : Pat<(v2f64 (vector_shuffle (v2f64 (scalar_to_vector (loadf64 addr: $src1))), (v2f64 (scalar_to_vector (loadf64 addr: $src2))), SHUFP_shuffle_mask:$sm)), (SHUFPDrri (v2f64 (MOVSD2PDrm addr:$src1)), (v2f64 (MOVSD2PDrm add...
2009 May 08
2
[LLVMdev] Question on tablegen
Dan, Thanks a lot. Using a modifier in the assembly string works for this case. I am trying to solve a related problem. I am trying to print out a set of "mov" ops for the vector_shuffle node. Since the source of the "mov" is from one of the sources to vector_shuffle, depending on the mask, I am not sure what assembly string to emit. For example, if I have d <- vector_shuffle s1, s2, <0,3> I want to emit mov d.x, s1.x mov d.y, s2.y For this, I need some thin...
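The mapping from a shuffle mask to per-lane "mov" operations described above can be sketched in plain C++. This only illustrates the index arithmetic and is not an LLVM AsmPrinter hook. The helper name `expandShuffleToMovs`, the x/y/z/w lane names, and the convention that a negative mask entry means an undefined lane are assumptions for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: expands "d <- vector_shuffle s1, s2, <mask>" into one
// "mov" per destination lane. Mask entries 0..N-1 select lanes of s1 and
// N..2N-1 select lanes of s2, where N is the number of lanes.
std::vector<std::string> expandShuffleToMovs(const std::vector<int>& mask) {
    static const char laneNames[] = {'x', 'y', 'z', 'w'};
    const int numLanes = static_cast<int>(mask.size());
    assert(numLanes >= 1 && numLanes <= 4 && "sketch handles at most 4 lanes");

    std::vector<std::string> movs;
    for (int dstLane = 0; dstLane < numLanes; ++dstLane) {
        const int idx = mask[dstLane];
        if (idx < 0) continue; // undefined lane: nothing to emit
        assert(idx < 2 * numLanes && "mask index out of range");
        const bool fromSecond = idx >= numLanes;
        const int srcLane = fromSecond ? idx - numLanes : idx;
        movs.push_back(std::string("mov d.") + laneNames[dstLane] + ", " +
                       (fromSecond ? "s2." : "s1.") + laneNames[srcLane]);
    }
    return movs;
}
```

Calling `expandShuffleToMovs({0, 3})` yields the two strings from the message, "mov d.x, s1.x" and "mov d.y, s2.y", since index 3 in a two-lane shuffle refers to lane 1 of the second source.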
2009 May 08
0
[LLVMdev] Question on tablegen
Manjunath, I had a very similar problem and I solved it using a custom vector shuffle and addition instead of mov. For example, vector_shuffle s1, s2, <0,3> is mapped to a custom instruction where I transform the swizzle to a 32-bit integer mask and an inverted mask. So I have dst, src0, src1, imm1, imm2 And I have my asm look similar to: Add dst, src0.imm1, src1.imm2 and then in the asm printer I intercept vector_shuffle and I con...
2009 May 19
1
[LLVMdev] TableGen pattern
Hello, I am trying to convert the subtree (vector_shuffle v2f32, v2f32 (build_vector imm1, imm2)) to a machine instruction that takes 2 v2f32's and 2 immediates. I tried the following table gen pattern : (set v2f32Reg:$dst, (vector_shuffle v2f32Reg:$src1, v2f32Reg:$src2, (build_vector imm:$c1, imm:...
2018 Apr 09
1
llvm-dev Digest, Vol 166, Issue 22
...lt t21, Constant:i32<0> // [c] t25: v2i16 = BUILD_VECTOR t27, t22 // [a c] t18: ch,glue = CopyToReg t0, Register:v2i16 %m0, t25 t19: ch = RTN t18 t20: ch = RTN_REG_HOLDER t19, Register:v2i16 %m0, t18:1 Creating new node: t28: v2i16 = undef Creating new node: t29: v2i16 = vector_shuffle<0,0> t26, undef:v2i16 After reduceBuildVecToShuffle SelectionDAG has 16 nodes: t0: ch = EntryToken t2: v4i16,ch = CopyFromReg t0, Register:v4i16 %0 // [a b c d] t27: i16 = extract_vector_elt t26, Constant:i32<0> t21: v2i16 = extract_subvector t2, Constant:i32<...
2016 Mar 18
2
generate vectorized code
...v4i32 = add 0x6a81098, 0x6a81e00 [ORD=8] [ID=15] >> 0x6a81098: v4i32 = add 0x6a81bf0, 0x6a84168 [ORD=6] [ID=12] >> 0x6a81bf0: v4i32,ch = CopyFromReg 0x6a2b7f0, 0x6a819e0 [ORD=5] [ID=8] >> 0x6a819e0: v4i32 = Register %vreg4 [ID=1] >> 0x6a84168: v4i32 = vector_shuffle 0x6a81bf0, 0x6a857a8<2,3,u,u> [ORD=5] [ID=10] >> 0x6a81bf0: v4i32,ch = CopyFromReg 0x6a2b7f0, 0x6a819e0 [ORD=5] [ID=8] >> 0x6a819e0: v4i32 = Register %vreg4 [ID=1] >> 0x6a857a8: v4i32 = undef [ID=2] >> 0x6a81e00: v4i32 = vector_shuffle 0x...
2008 Oct 07
0
[LLVMdev] Making Sense of ISel DAG Output
...ow to clean up >> nodes that become dead during selection, but my guess is that >> it's missing some cases. > > Ok, as far as I can tell, here's what's happening. > > I have the following pattern: > > let AddedComplexity = 40 in { > def : Pat<(v2f64 (vector_shuffle (v2f64 (scalar_to_vector (loadf64 > addr: > $src1))), > (v2f64 (scalar_to_vector (loadf64 > addr: > $src2))), > SHUFP_shuffle_mask:$sm)), > (SHUFPDrri (v2f64 (MOVSD2PDrm addr:$src1)), >...
2009 Dec 03
2
[LLVMdev] Duplicate Label in Generates ISel
...86/X86GenDAGISel.inc:91821: error: duplicate case value llvm/lib/Target/X86/X86GenDAGISel.inc:91442: error: previously used here This seems to happen because of a pattern I added for VEXTRACTF128 which uses extract_subreg: [(set DSTREGCLASS:$dst, (DSTTYPE (extract_subreg (vector_shuffle (SRCTYPE undef), (SRCTYPE SRCREGCLASS:$src1), VEXTRACTF128_shuffle_mask:$src2), x86_subreg_128bit)))], def x86_subreg_128bit : PatLeaf<(i32 1)>; Curiously, I have analogous patterns for VINSERTF128 that use...
2008 Oct 03
0
[LLVMdev] Making Sense of ISel DAG Output
On Fri, October 3, 2008 9:10 am, David Greene wrote: > On Thursday 02 October 2008 19:32, Dan Gohman wrote: > >> Looking at your dump() output above, it looks like the pre-selection >> loads have multiple uses, so even though you've managed to match a >> larger pattern that incorporates them, they still need to exist to >> satisfy some other users. > > Yes,
2008 Oct 03
3
[LLVMdev] Making Sense of ISel DAG Output
On Thursday 02 October 2008 19:32, Dan Gohman wrote: > Looking at your dump() output above, it looks like the pre-selection > loads have multiple uses, so even though you've managed to match a > larger pattern that incorporates them, they still need to exist to > satisfy some other users. Yes, I looked at that too. It looks like these other uses end up being chains to
2016 Mar 17
2
generate vectorized code
On Wed, Mar 16, 2016 at 6:38 PM, Mehdi Amini <mehdi.amini at apple.com> wrote: > > On Mar 16, 2016, at 5:38 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > > On Wed, Mar 16, 2016 at 11:48 AM, Mehdi Amini <mehdi.amini at apple.com> > wrote: > >> Hi Rail, >> >> Two hints to begin with: >> >> 1) Make sure your example is
2009 Jan 06
1
[LLVMdev] Look-ahead instruction selection
...s? PS: For GPU guys, I am trying to match the writemask operation after an arithmetic operation. Eli Friedman-2 wrote: > > On Mon, Jan 5, 2009 at 2:32 PM, Alex <alex.lavoro.propio at gmail.com> wrote: >> In .td file, if the pattern to match the DAG is: >> >> (vector_shuffle (mul build_vector, build_vector)) >> >> is it possible to return 'mul' (SDNode*) instead of returning the first >> 'vector_shuffle'? >> >> It seems to me that the default instruction selector can only return the >> 'root' node of the patt...
2016 Mar 23
1
interpretation of dag output
...se> 0x2672810: v4i32 = Register %vreg4 [ID=-3] 0x2672a20: v4i32,ch = CopyFromReg 0x26438b0, 0x2672810 [ORD=5] [ID=-3] 0x26761c8: v4i32 = undef [ID=-3] 0x2672a20: <multiple use> 0x2672a20: <multiple use> 0x26761c8: <multiple use> 0x2674b88: v4i32 = vector_shuffle 0x2672a20, 0x26761c8<2,3,u,u> [ORD=5] [ID=-3] 0x2671ec8: v4i32 = add 0x2672a20, 0x2674b88 [ORD=6] [ID=-3] 0x2672600: i32 = Register %R11 [ID=-3] 0x26438b0: <multiple use> 0x26760c0: i32 = TargetFrameIndex<2> [ID=-3] 0x2674fa8: ch = lifetime.end...