search for: vpsllq

Displaying 7 results from an estimated 7 matches for "vpsllq".

2015 Jun 24
2
[LLVMdev] Can LLVM vectorize <2 x i32> type
Hi, Is LLVM able to generate code for the following? %mul = mul <2 x i32> %1, %2, where %1 and %2 are of type <2 x i32>. I am running it on a Haswell processor with LLVM-3.4.2. It seems to generate really complicated code with vpaddq, vpmuludq, vpsllq, vpsrlq. Thanks, Zhi
2015 Jun 26
2
[LLVMdev] Can LLVM vectorize <2 x i32> type
...el %middle.block, label %vector.ph Now the assembly for the above IR code is: # BB#4: # %for.cond.preheader vmovdqa 144(%rsp), %xmm0 # 16-byte Reload vpmuludq %xmm7, %xmm0, %xmm2 vpsrlq $32, %xmm7, %xmm4 vpmuludq %xmm4, %xmm0, %xmm4 vpsllq $32, %xmm4, %xmm4 vpaddq %xmm4, %xmm2, %xmm2 vpsrlq $32, %xmm0, %xmm4 vpmuludq %xmm7, %xmm4, %xmm4 vpsllq $32, %xmm4, %xmm4 vpaddq %xmm4, %xmm2, %xmm2 vpextrq $1, %xmm2, %rax cltq vmovq %rax, %xmm4 vmovq %xmm2, %rax cltq vmovq %rax, %xmm5...
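The vpmuludq / vpsrlq / vpsllq / vpaddq sequence in the snippet above looks like the usual way of building a 64-bit lane multiply out of 32x32->64 multiplies: low(a)*low(b) plus the two cross products shifted left by 32. The following is a minimal sketch of that arithmetic on a single 64-bit lane, not the code LLVM itself uses. The helper names (`pmuludq`, `psrlq`, `psllq`, `paddq`, `mul64_lane`) are hypothetical scalar stand-ins for the instructions.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def pmuludq(a, b):
    """Model of vpmuludq on one lane: multiply the low 32 bits of each
    operand as unsigned values, producing a full 64-bit product."""
    return (a & MASK32) * (b & MASK32)


def psrlq(a, n):
    """Model of vpsrlq on one lane: logical right shift of a 64-bit value."""
    return (a & MASK64) >> n


def psllq(a, n):
    """Model of vpsllq on one lane: left shift, truncated to 64 bits."""
    return (a << n) & MASK64


def paddq(a, b):
    """Model of vpaddq on one lane: 64-bit wrapping addition."""
    return (a + b) & MASK64


def mul64_lane(a, b):
    """Low 64 bits of a*b, using the same instruction order as the snippet."""
    lo_lo = pmuludq(a, b)                  # low(a) * low(b)
    b_hi = psrlq(b, 32)                    # high half of b
    cross1 = psllq(pmuludq(a, b_hi), 32)   # (low(a) * high(b)) << 32
    acc = paddq(lo_lo, cross1)
    a_hi = psrlq(a, 32)                    # high half of a
    cross2 = psllq(pmuludq(a_hi, b), 32)   # (high(a) * low(b)) << 32
    return paddq(acc, cross2)


if __name__ == "__main__":
    a, b = 0x123456789ABCDEF0, 0x0FEDCBA987654321
    print(hex(mul64_lane(a, b)), hex((a * b) & MASK64))
```

The high(a)*high(b) term is omitted because it would be shifted left by 64 bits and so contributes nothing to the low 64 bits of the product.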
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
...5]. it will now become [0,0,0,0,8,9,10,11]...Am I correct? Please explain the purpose of this step. vpmuludq zmm15, zmm15, zmm2 ; similarly, I don't understand the need for this step. vpsllq zmm15, zmm15, 32 ; I don't understand the need for this step. vpaddq zmm14, zmm14, zmm3 ; vpaddq zmm14, zmm15, zmm14 ; I don't understand the need for this step...
2015 Jul 24
0
[LLVMdev] SIMD for sdiv <2 x i64>
...%zextS39_D to i128 %mskS39_D = icmp ne i128 %BCS39_D, 0 br i1 %mskS39_D, label %if.then11, label %if.else -------------------------------------------- Assembly ----------------------------------------------------------------- # BB#3: # %if.then.i.i.i.i.i.i vpsllq $3, %xmm0, %xmm0 vpextrq $1, %xmm0, %rbx movq %rbx, %rdi vmovaps %xmm2, 96(%rsp) # 16-byte Spill vmovaps %xmm5, 64(%rsp) # 16-byte Spill vmovdqa %xmm6, 16(%rsp) # 16-byte Spill callq _Znam movq %rax, 128(%rsp) movq 16(%r12), %rsi...
2015 Jul 24
1
[LLVMdev] SIMD for sdiv <2 x i64>
...icmp ne i128 %BCS39_D, 0 > br i1 %mskS39_D, label %if.then11, label %if.else > > -------------------------------------------- Assembly > ----------------------------------------------------------------- > > # BB#3: # %if.then.i.i.i.i.i.i > vpsllq $3, %xmm0, %xmm0 > vpextrq $1, %xmm0, %rbx > movq %rbx, %rdi > vmovaps %xmm2, 96(%rsp) # 16-byte Spill > vmovaps %xmm5, 64(%rsp) # 16-byte Spill > vmovdqa %xmm6, 16(%rsp) # 16-byte Spill > callq _Znam > movq %rax, 128...
2015 Jul 24
2
[LLVMdev] SIMD for sdiv <2 x i64>
On 07/24/2015 03:42 AM, Benjamin Kramer wrote: >> On 24.07.2015, at 08:06, zhi chen <zchenhn at gmail.com> wrote: >> >> It seems that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there are no alternative AVX/2 instructions for int64? The same thing
2013 Oct 15
0
[LLVMdev] [llvm-commits] r192750 - Enable MI Sched for x86.
...=========== >> --- llvm/trunk/test/CodeGen/X86/avx-arith.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/avx-arith.ll Tue Oct 15 18:33:07 2013 >> @@ -240,15 +240,15 @@ define <16 x i16> @vpmullw(<16 x i16> %i >> ; CHECK-NEXT: vpmuludq %xmm >> ; CHECK-NEXT: vpsllq $32, %xmm >> ; CHECK-NEXT: vpaddq %xmm >> -; CHECK-NEXT: vpmuludq %xmm >> ; CHECK-NEXT: vpsrlq $32, %xmm >> ; CHECK-NEXT: vpmuludq %xmm >> ; CHECK-NEXT: vpsllq $32, %xmm >> +; CHECK-NEXT: vpaddq %xmm >> +; CHECK-NEXT: vpmuludq %xmm >> ; CHECK-NEXT: vp...