thr3ads.net - search: "vpsrlq"

Displaying 6 results from an estimated 6 matches for "vpsrlq".

[LLVMdev] Can LLVM vectorize <2 x i32> type

2015 Jun 24

[LLVMdev] Can LLVM vectorize <2 x i32> type

Hi, Is LLVM be able to generate code for the following code? %mul = mul <2 x i32> %1, %2, where %1 and %2 are <2 x i32> type. I am running it on a Haswell processor with LLVM-3.4.2. It seems that it will generates really complicated code with vpaddq, vpmuludq, vpsllq, vpsrlq. Thanks, Zhi -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150624/3ac7c2f4/attachment.html>

[LLVMdev] Can LLVM vectorize <2 x i32> type

2015 Jun 26

[LLVMdev] Can LLVM vectorize <2 x i32> type

...i128 %mskS54_D = icmp ne i128 %BCS54_D, 0 br i1 %mskS54_D, label %middle.block, label %vector.ph Now the assembly for the above IR code is: # BB#4: # %for.cond.preheader vmovdqa 144(%rsp), %xmm0 # 16-byte Reload vpmuludq %xmm7, %xmm0, %xmm2 vpsrlq $32, %xmm7, %xmm4 vpmuludq %xmm4, %xmm0, %xmm4 vpsllq $32, %xmm4, %xmm4 vpaddq %xmm4, %xmm2, %xmm2 vpsrlq $32, %xmm0, %xmm4 vpmuludq %xmm7, %xmm4, %xmm4 vpsllq $32, %xmm4, %xmm4 vpaddq %xmm4, %xmm2, %xmm2 vpextrq $1, %xmm2, %rax cltq vmovq %rax,...

KNL Assembly Code for Matrix Multiplication

2017 Jul 01

KNL Assembly Code for Matrix Multiplication

...indexes here zmm22=8, 9, 10, 11, 12,13,14,15. not the values loaded from these locations. and zmm2 contains constant 4000. so, vpmuludq zmm14, zmm10, zmm2 ; will multiply the indexes values with 4000, as for array b the stride is 4000. zmm14= 3200, 3600, 40000, ............28000. now as you said vpsrlq zmm15, zmm10, 32 ; will shift zmm10(=zmm22) each 64 bit element by 32bit so zmm15=? (can you compute the value of zmm15 here)? On Sat, Jul 1, 2017 at 4:45 AM, Craig Topper <craig.topper at gmail.com> wrote: > If you see a comment after an instruction that contains LCP in the &gt...

Lets do a 1.3.2 release

2016 Jan 18

Lets do a 1.3.2 release

Dave Yeo wrote: > Seems that the default binutils on OS/2 is too old to support AVX2, > attached patch works around this. Not the best solution as best would be > configure tests, but simple. Are you sure that these binutils support AVX and FMA? (Currently libFLAC doesn't contain AVX and FMA instructions). If they aren't supported then it's better to include them too into

Lets do a 1.3.2 release

2016 Jan 18

Lets do a 1.3.2 release

...ot much up on assembly, make[4]: Entering directory `K:/usr/local/src/flac/src/libFLAC' CC lpc_intrin_avx2.lo R:/tmp/ccwvrScM.s: Assembler messages: R:/tmp/ccwvrScM.s:495: Error: operand type mismatch for `vbroadcastss' ... R:/tmp/ccwvrScM.s:8773: Error: operand type mismatch for `vpsrlq' R:/tmp/ccwvrScM.s:8778: Error: no such instruction: `vpermd %ymm1,%ymm5,%ymm0' R:/tmp/ccwvrScM.s:8859: Error: operand type mismatch for `vpmovzxdq' ... Best to be safe so updated patch attached. I've also opened a ticket, http://trac.netlabs.org/rpm/ticket/165#ticket Dave -------...

[LLVMdev] [llvm-commits] r192750 - Enable MI Sched for x86.

2013 Oct 15

[LLVMdev] [llvm-commits] r192750 - Enable MI Sched for x86.

.../X86/avx-arith.ll Tue Oct 15 18:33:07 2013 >> @@ -240,15 +240,15 @@ define <16 x i16> @vpmullw(<16 x i16> %i >> ; CHECK-NEXT: vpmuludq %xmm >> ; CHECK-NEXT: vpsllq $32, %xmm >> ; CHECK-NEXT: vpaddq %xmm >> -; CHECK-NEXT: vpmuludq %xmm >> ; CHECK-NEXT: vpsrlq $32, %xmm >> ; CHECK-NEXT: vpmuludq %xmm >> ; CHECK-NEXT: vpsllq $32, %xmm >> +; CHECK-NEXT: vpaddq %xmm >> +; CHECK-NEXT: vpmuludq %xmm >> ; CHECK-NEXT: vpsrlq $32, %xmm >> ; CHECK-NEXT: vpmuludq %xmm >> ; CHECK-NEXT: vpsllq $32, %xmm >> ; CHECK-NEXT...

search for: vpsrlq