search for: pshufb

Displaying 15 results from an estimated 15 matches for "pshufb".

2009 Dec 18
2
[LLVMdev] AVX Shuffles & PatLeaf Help Needed
...ns using only these types. Yeah, I figured that out after thinking a bit more. However, I think in this case we only want to lower to vNi32 since there are no immediate-mask shuffles in X86 that operate on smaller element types. Doing it at the byte level would just be more confusing, I think. PSHUFB is really a completely different instruction than PSHUFD, for example. -Dave
2009 Dec 18
0
[LLVMdev] AVX Shuffles & PatLeaf Help Needed
...> Yeah, I figured that out after thinking a bit more. However, I think in this > case we only want to lower to vNi32 since there are no immediate-mask shuffles > in X86 that operate on smaller element types. Doing it at the byte level > would just be more confusing, I think. > > PSHUFB is really a completely different instruction than PSHUFD, for example. Aside from consuming one of its inputs, which is a regalloc problem, it isn't really different. It's just a one-input immediate shuffle, where the immediate is not encoded in the instruction. From the perspective of t...
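The point made in this reply, that PSHUFB behaves like a one-input shuffle whose mask lives in a register or memory operand rather than in the instruction encoding, can be illustrated with a scalar model. The following is only a sketch: the names `Bytes16`, `pshufb_model`, and `pshufd_imm_to_byte_mask` are hypothetical and do not come from LLVM or from the thread. It models the 128-bit form of PSHUFB and shows how a PSHUFD-style 8-bit immediate can be expanded into an equivalent byte-level mask.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar model of the 128-bit PSHUFB: each mask byte selects a source byte
// through its low four bits, or produces zero when its high bit is set.
Bytes16 pshufb_model(const Bytes16 &src, const Bytes16 &mask) {
    Bytes16 out{};
    for (std::size_t i = 0; i < 16; ++i) {
        out[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
    }
    return out;
}

// Expand a PSHUFD-style immediate (two selector bits per 32-bit lane)
// into a byte mask that pshufb_model can consume.
Bytes16 pshufd_imm_to_byte_mask(std::uint8_t imm) {
    Bytes16 mask{};
    for (unsigned lane = 0; lane < 4; ++lane) {
        unsigned sel = (imm >> (2 * lane)) & 0x3;  // source lane for this destination lane
        for (unsigned b = 0; b < 4; ++b) {
            unsigned idx = 4 * sel + b;            // byte b of the selected lane
            assert(idx < 16);                      // always inside the 16-byte register
            mask[4 * lane + b] = static_cast<std::uint8_t>(idx);
        }
    }
    return mask;
}
```

With this model, `pshufb_model(src, pshufd_imm_to_byte_mask(0x1B))` reverses the four 32-bit lanes of `src`, which is the same result the lane-reversing PSHUFD immediate 0x1B would give.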
2010 Oct 28
2
[LLVMdev] llvm 2.8 fixes?
...s? The bug is only present in the release branch, and I have a fix for it (see attachment). IMHO the bug is quite bad, and workarounds are actually ugly and generate inefficient code (couldn't really come up with anything which actually generated correct code which didn't require at least a pshufb). For reference, this is the bug in question: http://llvm.org/bugs/show_bug.cgi?id=8381 Here's also a short example: define <8 x i16> @broadcast_16(<8 x i16> %var1, <8 x i16> %var2) { entry: %0 = shufflevector <8 x i16> %var2, <8 x i16> undef, <8 x i32>...
2010 Oct 28
0
[LLVMdev] llvm 2.8 fixes?
...resent in the release branch, and I > have a fix for it (see attachment). IMHO the bug is quite bad, and > workarounds are actually ugly and generate inefficient code (couldn't > really come up with anything which actually generated correct code which > didn't require at least a pshufb). We have never done "dot" releases off the branch, but there is nothing that prevents doing it in the future, we just need someone to step up to be the release manager for the branch, define policies around what goes in, etc. -Chris
2012 Sep 05
0
[LLVMdev] branch on vector compare?
...seems to issue a pextrd for each element. For x64, it seems > to be the same for either. I suppose it's all academic seeing as the > ptest patch looks good. Yes <4 x i8> cast looks like a good idea. Just be careful though if you also need to target cpus without ssse3, IIRC without pshufb this will create some horrible code (could have been with older llvm version though). Though if you don't have ssse3 you also won't have pextrd, which means more shuffling to extract the values if you sign-extend them to <4 x i32> too (if you're targeting altivec, probably no such...
2012 Sep 04
2
[LLVMdev] branch on vector compare?
Roland Scheidegger <sroland <at> vmware.com> writes: > This looks quite similar to something I filed a bug on (12312). Michael > Liao submitted fixes for this, so I think > if you change it to > %16 = fcmp ogt <4 x float> %15, %cr > %17 = sext <4 x i1> %16 to <4 x i32> > %18 = bitcast <4 x i32> %17 to i128 > %19 = icmp ne i128 %18, 0
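The IR sequence quoted in this result (compare, sign-extend each `i1` to `i32`, bitcast to `i128`, test against zero) amounts to asking whether any lane passed the comparison. A scalar sketch of that logic follows. The function name `any_lane_greater` is hypothetical, and the 128-bit integer is modelled by OR-ing the four 32-bit lanes together, which gives the same answer as comparing the whole value with zero.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of: fcmp ogt, sext <4 x i1> to <4 x i32>, bitcast to i128,
// icmp ne 0. Returns true when at least one lane of a is greater than b.
bool any_lane_greater(const std::array<float, 4> &a,
                      const std::array<float, 4> &b) {
    std::uint32_t combined = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        // C++ '>' is false when either operand is NaN, matching an ordered compare.
        std::uint32_t lane = (a[i] > b[i]) ? 0xFFFFFFFFu : 0u;
        // Sign-extending an i1 can only give all-zeros or all-ones.
        assert(lane == 0u || lane == 0xFFFFFFFFu);
        combined |= lane;
    }
    // The i128 is nonzero exactly when some lane mask is nonzero.
    return combined != 0;
}
```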
2009 Dec 18
0
[LLVMdev] AVX Shuffles & PatLeaf Help Needed
Hello, David > Can you expand on this with an example?  There seems to be an awful lot of > shuffle patterns and predicates in PPCInstrAltivec.td.  What do you mean by, > "Canonicalize to byte ops?"  Can you walk me through how that works with > Altivec? The basic idea is quite simple - lower everything to vNi8 and write all the patterns using only these types. -- With
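As a rough illustration of what "lower everything to vNi8" could mean for a shuffle mask, the sketch below widens an element-level mask into a byte-level one. It is an assumption-laden example, not code from LLVM or from the AltiVec backend: the helper name `widen_mask_to_bytes` is hypothetical, and `-1` is used to mark an undefined mask element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Widen an element-level shuffle mask to a byte-level mask.
// Each element index m becomes bytes_per_elt consecutive byte indices;
// an undefined element (-1) becomes bytes_per_elt undefined bytes.
std::vector<int> widen_mask_to_bytes(const std::vector<int> &mask,
                                     int bytes_per_elt) {
    assert(bytes_per_elt > 0);
    std::vector<int> out;
    out.reserve(mask.size() * static_cast<std::size_t>(bytes_per_elt));
    for (int m : mask) {
        for (int b = 0; b < bytes_per_elt; ++b) {
            out.push_back(m < 0 ? -1 : m * bytes_per_elt + b);
        }
    }
    return out;
}
```

For example, the two-element mask `{1, 0}` on 16-bit elements widens to the byte mask `{2, 3, 0, 1}`, so a single set of byte-level patterns can describe shuffles that were originally written on wider element types.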
2011 Oct 17
0
[LLVMdev] LLVM Build Bot failure on llmv-x86_64-ubuntu
...xmm1[1] movzwl __unnamed_2+8(%rip), %eax movd %eax, %xmm2 movzwl __unnamed_2+4(%rip), %eax movd %eax, %xmm1 punpckldq %xmm2, %xmm1 # xmm1 = xmm1[0],xmm2[0],xmm1[1],xmm2[1] punpckldq %xmm1, %xmm0 # xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] movd __unnamed_3(%rip), %xmm1 movss %xmm1, %xmm0 pshufb .LCPI0_0(%rip), %xmm0 movq %xmm0, __unnamed_2(%rip) ret .Ltmp1: .size __unnamed_1, .Ltmp1-__unnamed_1 .Ltmp2: .cfi_endproc .Leh_func_end0: .section ".note.GNU-stack","",@progbits Odd? Joe Abbey Software Architect Arxan Technologies, Inc. 1305 Cumberland Ave, Ste 215...
2009 Dec 17
3
[LLVMdev] AVX Shuffles & PatLeaf Help Needed
On Thursday 17 December 2009 17:16, Nate Begeman wrote: > David, this is probably the wrong approach, based on the accreted awfulness > of the X86 shuffle lowering code, Ha! I have no issue believing this statement. :) > The correct approach is probably a rewrite based around what > AltiVec does: Canonicalize to byte ops, and write all the patterns once > rather than having to
2011 Mar 27
2
[LLVMdev] Long-Term ISel Design
...What we eliminate is this: void X86TargetLowering::LowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) { ... isShuffleMaskLegal(...) } bool X86TargetLowering::isShuffleMaskLegal(const SmallVectorImpl<int> &M, EVT VT) const { // FIXME: pshufb, blends, shifts. return (VT.getVectorNumElements() == 2 || ShuffleVectorSDNode::isSplatMask(&M[0], VT) || isMOVLMask(M, VT) || isSHUFPMask(M, VT) || ... } We get rid of this call to isSHUFPMask, which currently happens during legalize. Instead of tryi...
2019 Dec 09
2
[PATCH] D70246: [InstCombine] remove identity shuffle simplification for mask with undefs
...stCombine/X86/clmulqdq.ll > llvm/test/Transforms/InstCombine/X86/x86-avx2.ll > llvm/test/Transforms/InstCombine/X86/x86-avx512.ll > llvm/test/Transforms/InstCombine/X86/x86-f16c.ll > llvm/test/Transforms/InstCombine/X86/x86-pack.ll > llvm/test/Transforms/InstCombine/X86/x86-pshufb.ll > llvm/test/Transforms/InstCombine/X86/x86-sse.ll > llvm/test/Transforms/InstCombine/X86/x86-sse41.ll > llvm/test/Transforms/InstCombine/X86/x86-sse4a.ll > llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll > llvm/test/Transforms/InstCombine/X86/x86-vpermil.ll >...
2020 Aug 31
2
Proposal to remove MMX support.
On Mon, Aug 31, 2020 at 3:02 PM Eli Friedman <efriedma at quicinc.com> wrote: > Broadly speaking, I see two problems with implicitly enabling MMX > emulation on a target that has SSE2: > > > > 1. The interaction with inline asm. Inline asm can still have MMX > operands/results/clobbers, and can still put the processor in MMX mode. If > code is mixing MMX
2011 Apr 09
0
[LLVMdev] Long-Term ISel Design
...up as C++ code in X86ISelDagToDag, which would give us >> all of the problems we had before by moving to X86ISD nodes. > > bool X86TargetLowering::isShuffleMaskLegal(const SmallVectorImpl<int> &M, > EVT VT) const { > // FIXME: pshufb, blends, shifts. > return (VT.getVectorNumElements() == 2 || > ShuffleVectorSDNode::isSplatMask(&M[0], VT) || > isMOVLMask(M, VT) || > isSHUFPMask(M, VT) || > ... > } > > We get rid of this call to isSHUFPMask, which currently happen...
2011 Mar 18
0
[LLVMdev] Long-Term ISel Design
On Mar 17, 2011, at 9:32 AM, David A. Greene wrote: > Chris Lattner <clattner at apple.com> writes: >>> 1. We have special target-specific operators for certain shuffles in X86, >>> such as X86unpckl. > >> It also eliminates a lot of fragility. Before doing this, X86 >> legalize would have to be very careful to specifically form shuffles >> that
2011 Mar 17
2
[LLVMdev] Long-Term ISel Design
Chris Lattner <clattner at apple.com> writes: >> 1. We have special target-specific operators for certain shuffles in X86, >> such as X86unpckl. > It also eliminates a lot of fragility. Before doing this, X86 > legalize would have to be very careful to specifically form shuffles > that it knew isel would turn into (e.g.) unpck operations. Now > instead of