
Displaying 4 results from an estimated 4 matches for "vpmuludq".

2015 Jun 26
2
[LLVMdev] Can LLVM vectorize <2 x i32> type
...tcast <2 x i64> %sextS54_D to i128
%mskS54_D = icmp ne i128 %BCS54_D, 0
br i1 %mskS54_D, label %middle.block, label %vector.ph
Now the assembly for the above IR code is:
# BB#4:                                 # %for.cond.preheader
    vmovdqa 144(%rsp), %xmm0            # 16-byte Reload
    vpmuludq %xmm7, %xmm0, %xmm2
    vpsrlq $32, %xmm7, %xmm4
    vpmuludq %xmm4, %xmm0, %xmm4
    vpsllq $32, %xmm4, %xmm4
    vpaddq %xmm4, %xmm2, %xmm2
    vpsrlq $32, %xmm0, %xmm4
    vpmuludq %xmm7, %xmm4, %xmm4
    vpsllq $32, %xmm4, %xmm4
    vpaddq %xmm4, %xmm2, %xmm2
    vpextrq $1, %xmm...
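The vpmuludq/vpsrlq/vpsllq/vpaddq run in the listing above has the shape of a full 64-bit multiply assembled from 32x32-bit partial products, since vpmuludq only multiplies the low 32 bits of each 64-bit lane. The following is a minimal Python sketch of one lane of that sequence. It assumes that %xmm0 and %xmm7 hold the two operands (called `a` and `b` here) and that each lane is treated as an unsigned 64-bit value. The function names are hypothetical and the snippet does not come from the thread.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def vpmuludq_lane(a, b):
    """One 64-bit lane of vpmuludq: unsigned product of the low 32 bits of a and b."""
    return (a & MASK32) * (b & MASK32)


def mul64_via_vpmuludq(a, b):
    """Low 64 bits of a * b, built from three 32x32 -> 64 products.

    a plays the role of %xmm0 and b the role of %xmm7 (an assumption).
    Both must be in the range [0, 2**64).
    """
    acc = vpmuludq_lane(a, b)                          # vpmuludq %xmm7, %xmm0, %xmm2
    b_hi = b >> 32                                     # vpsrlq $32, %xmm7, %xmm4
    cross = vpmuludq_lane(a, b_hi)                     # vpmuludq %xmm4, %xmm0, %xmm4
    cross = (cross << 32) & MASK64                     # vpsllq $32, %xmm4, %xmm4
    acc = (acc + cross) & MASK64                       # vpaddq %xmm4, %xmm2, %xmm2
    a_hi = a >> 32                                     # vpsrlq $32, %xmm0, %xmm4
    cross = vpmuludq_lane(a_hi, b)                     # vpmuludq %xmm7, %xmm4, %xmm4
    cross = (cross << 32) & MASK64                     # vpsllq $32, %xmm4, %xmm4
    return (acc + cross) & MASK64                      # vpaddq %xmm4, %xmm2, %xmm2
```

The a_hi * b_hi product is never computed because it would be shifted left by 64 bits and so contributes nothing to the low 64 bits of the result.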
2015 Jun 24
2
[LLVMdev] Can LLVM vectorize <2 x i32> type
Hi, Is LLVM able to generate code for the following? %mul = mul <2 x i32> %1, %2, where %1 and %2 are of type <2 x i32>. I am running it on a Haswell processor with LLVM-3.4.2. It seems to generate really complicated code with vpaddq, vpmuludq, vpsllq, and vpsrlq. Thanks, Zhi
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
Thank you. So that means:
    vmovdqa64 zmm22, zmmword ptr [rip + .LCPI0_0] # zmm22 = [8,9,10,11,12,13,14,15]
zmm22 will contain 64-bit constant values, which are the indexes here (zmm22 = 8, 9, 10, 11, 12, 13, 14, 15), not the values loaded from those locations, and zmm2 contains the constant 4000. So
    vpmuludq zmm14, zmm10, zmm2
will multiply the index values by 4000, since the stride for array b is 4000: zmm14 = 32000, 36000, 40000, ..., 60000. Now, as you said,
    vpsrlq zmm15, zmm10, 32
will shift each 64-bit element of zmm10 (= zmm22) right by 32 bits, so zmm15 = ? (Can you compute the value of zmm15 here?)...
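The lane arithmetic described in this post can be checked with a short Python sketch. It takes the post's own premises as assumptions: zmm10 holds the same eight 64-bit indexes as zmm22 (8 through 15), and zmm2 holds 4000 in every lane. The helper names are hypothetical stand-ins for the instructions, not a real API.

```python
MASK32 = (1 << 32) - 1


def vpmuludq(xs, ys):
    """Per 64-bit lane: unsigned product of the low 32 bits of each input."""
    return [(x & MASK32) * (y & MASK32) for x, y in zip(xs, ys)]


def vpsrlq(xs, count):
    """Per 64-bit lane: logical right shift by count bits."""
    return [x >> count for x in xs]


zmm10 = list(range(8, 16))      # assumed equal to zmm22 = [8, 9, ..., 15]
zmm2 = [4000] * 8               # assumed broadcast of the stride 4000

zmm14 = vpmuludq(zmm10, zmm2)   # index * stride for each lane
zmm15 = vpsrlq(zmm10, 32)       # upper 32 bits of each index
```

Under those assumptions `zmm14` is 32000, 36000, and so on up to 60000, and every lane of `zmm15` is 0, because each index fits in the low 32 bits of its lane.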
2013 Oct 15
0
[LLVMdev] [llvm-commits] r192750 - Enable MI Sched for x86.
...================================================
>> --- llvm/trunk/test/CodeGen/X86/avx-arith.ll (original)
>> +++ llvm/trunk/test/CodeGen/X86/avx-arith.ll Tue Oct 15 18:33:07 2013
>> @@ -240,15 +240,15 @@ define <16 x i16> @vpmullw(<16 x i16> %i
>> ; CHECK-NEXT: vpmuludq %xmm
>> ; CHECK-NEXT: vpsllq $32, %xmm
>> ; CHECK-NEXT: vpaddq %xmm
>> -; CHECK-NEXT: vpmuludq %xmm
>> ; CHECK-NEXT: vpsrlq $32, %xmm
>> ; CHECK-NEXT: vpmuludq %xmm
>> ; CHECK-NEXT: vpsllq $32, %xmm
>> +; CHECK-NEXT: vpaddq %xmm
>> +; CHECK-NEXT: vpmu...