search for: pinsrd

Displaying 8 results from an estimated 8 matches for "pinsrd".

2015 Jul 27
3
[LLVMdev] i1* function argument on x86-64
I am running into a problem with 'i1*' as a function argument, which seems to have appeared since I switched to LLVM 3.6 (but it could have another cause, of course). If I look at the assembly that MCJIT generates for an x86-64 target, I see that the 'i1*' array is treated as a sequence of 1-bit-wide elements (I guess that's correct). However, I used to call the function
2012 Feb 17
0
[LLVMdev] Folding an insertelt chain
On Feb 17, 2012, at 12:50 AM, Ivan Llopard wrote:
> Hello,
>
> I've added a little combining operation in DAGCombiner to fold a chain of insertelt nodes if that chain is proved to fully overwrite the very first source vector. In which case, I supposed a build_vector is better. It seems to be safe but I don't know if it is correctly implemented or if it is already done somewhere
2012 Feb 17
3
[LLVMdev] Folding an insertelt chain
Hello, I've added a small combine in DAGCombiner that folds a chain of insertelt nodes if the chain is proven to fully overwrite the very first source vector. In that case, I assumed a build_vector would be better. It seems to be safe, but I don't know whether it is correctly implemented or whether it is already done somewhere else. Please find the patch attached. Regards, Ivan
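The fold described in this post can be illustrated with a small model. This is only a sketch of the idea, not the attached patch (which is not shown here): the function name `fold_insert_chain` and the `(lane, value)` tuple representation are hypothetical, and the real combine works on SelectionDAG nodes rather than Python lists.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: one insertelt is (lane index, value name).
Insert = Tuple[int, str]


def fold_insert_chain(num_lanes: int, chain: List[Insert]) -> Optional[List[str]]:
    """Return the lanes of an equivalent build_vector, or None if the chain
    leaves at least one lane of the original source vector visible."""
    lanes: List[Optional[str]] = [None] * num_lanes
    for index, value in chain:  # later inserts override earlier ones
        if not 0 <= index < num_lanes:
            return None  # out-of-range index: be conservative, do not fold
        lanes[index] = value
    if any(lane is None for lane in lanes):
        return None  # some lane still comes from the source vector
    return [lane for lane in lanes if lane is not None]
```

For a four-lane vector, a chain that writes lanes 0 through 3 folds to a single build_vector of the four inserted values, while a chain that writes only lanes 0 and 1 is left alone because lanes 2 and 3 still depend on the source vector.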
2016 Aug 05
3
enabling interleaved access loop vectorization
...i,%rcx,4), %xmm4
paddd %xmm3, %xmm4
movdqu 8(%rdi,%rcx,4), %xmm3
paddd %xmm4, %xmm3
movdqa %xmm1, %xmm4
paddq %xmm4, %xmm4
movdqa %xmm0, %xmm5
paddq %xmm5, %xmm5
movd %xmm5, %rcx
pextrq $1, %xmm5, %rdx
movd %xmm4, %r8
pextrq $1, %xmm4, %r9
movd (%rdi,%rcx,4), %xmm4 # xmm4 = mem[0],zero,zero,zero
pinsrd $1, (%rdi,%rdx,4), %xmm4
pinsrd $2, (%rdi,%r8,4), %xmm4
pinsrd $3, (%rdi,%r9,4), %xmm4
paddd %xmm3, %xmm4
movdqu %xmm4, (%rsi,%rax,4)
addq $4, %rax
paddq %xmm2, %xmm0
paddq %xmm2, %xmm1
cmpq $256, %rax # imm = 0x100
jne .LBB0_3
But the real point is that with interleaved access enable...
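The movd followed by three pinsrd instructions in this snippet builds one four-lane vector out of four separate memory loads. The model below is a rough sketch of that pattern only: the helper names `movd_load`, `pinsrd`, and `gather4` are illustrative, memory is modeled as a Python list of 32-bit values, and the surrounding index arithmetic from the loop is left out.

```python
from typing import List, Sequence

MASK32 = 0xFFFFFFFF


def movd_load(value: int) -> List[int]:
    """Model of movd from memory: lane 0 gets the dword, other lanes are zeroed."""
    return [value & MASK32, 0, 0, 0]


def pinsrd(vec: Sequence[int], lane: int, value: int) -> List[int]:
    """Model of pinsrd: replace one 32-bit lane, leave the other three unchanged."""
    out = list(vec)
    out[lane & 3] = value & MASK32  # only the low two bits of the immediate select the lane
    return out


def gather4(memory: Sequence[int], indices: Sequence[int]) -> List[int]:
    """Gather four dwords from non-contiguous indices with one movd and three pinsrd."""
    vec = movd_load(memory[indices[0]])
    for lane in (1, 2, 3):
        vec = pinsrd(vec, lane, memory[indices[lane]])
    return vec
```

Each lane costs its own load and insert in this scheme, whereas contiguous elements can be fetched with a single vector load such as the movdqu in the same snippet.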
2016 May 26
2
enabling interleaved access loop vectorization
Interleaved access is not enabled on X86 yet. We looked at this feature and came to the conclusion that interleaving (as loads + shuffles) is not always profitable on X86. We should provide the right cost, which depends on the number of shuffles. The number of shuffles depends on the permutations (shuffle mask). And even if we estimate the number of shuffles, the shuffles are not generated in place. Vectorizer
2016 Aug 05
2
enabling interleaved access loop vectorization
...%xmm1, %xmm4
> > paddq %xmm4, %xmm4
> > movdqa %xmm0, %xmm5
> > paddq %xmm5, %xmm5
> > movd %xmm5, %rcx
> > pextrq $1, %xmm5, %rdx
> > movd %xmm4, %r8
> > pextrq $1, %xmm4, %r9
> > movd (%rdi,%rcx,4), %xmm4 # xmm4 = mem[0],zero,zero,zero
> > pinsrd $1, (%rdi,%rdx,4), %xmm4
> > pinsrd $2, (%rdi,%r8,4), %xmm4
> > pinsrd $3, (%rdi,%r9,4), %xmm4
> > paddd %xmm3, %xmm4
> > movdqu %xmm4, (%rsi,%rax,4)
> > addq $4, %rax
> > paddq %xmm2, %xmm0
> > paddq %xmm2, %xmm1
> > cmpq $256, %rax # im...
2016 May 26
0
enabling interleaved access loop vectorization
On 26 May 2016 at 19:12, Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Is there a compile-time and/or potential runtime cost that makes
> enableInterleavedAccessVectorization() default to 'false'?
>
> I notice that this is set to true for ARM, AArch64, and PPC.
>
> In particular, I'm wondering if there's a reason it's not enabled for
2013 Jun 24
1
[LLVMdev] DebugInfo: Missing non-trivially-copyable parameters in SelectionDAG
...dev/llvm/build/clang/debug/bin/./FileCheck /usr/local/google/home/blaikie/dev/llvm/src/test/CodeGen/X86/sse41.ll -check-prefix=X64
--
Exit Code: 1
Command Output (stderr):
--
/usr/local/google/home/blaikie/dev/llvm/src/test/CodeGen/X86/sse41.ll:10:8: error: expected string not found in input
; X32: pinsrd $1, 4(%esp), %xmm0
       ^
<stdin>:4:12: note: scanning from here
_pinsrd_1: ## @pinsrd_1
           ^
<stdin>:7:2: note: possible intended match here
 pinsrd $1, %eax, %xmm0
 ^
--
********************
FAIL: LLVM :: CodeGen/X86/vec_set-F.ll (50 of 51)
******************** TEST 'LL...