Displaying 7 results from an estimated 7 matches for "pslld".
2013 Feb 26
2
[LLVMdev] passing vector of booleans to functions
...fine <4 x float> @masked_add_1(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) {
entry:
%add = fadd <4 x float> %a, %b
%sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
ret <4 x float> %sel
}
I will get:
addps %xmm1, %xmm2
pslld $31, %xmm0
blendvps %xmm2, %xmm1
movaps %xmm1, %xmm0
ret
While this is correct and works, I'm unhappy with the pslld. Apparently,
LLVM uses a <4 x i32> to hold the <4 x i1> while the LSB holds the mask
bit. But blendvps expects the MSB as mask bit and therefore the shift.
OK...
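The role of the shift described above can be sketched in plain C. This is only a scalar model of one 32-bit lane at a time, not what LLVM emits, and it assumes (as the post observes) that a true mask lane arrives with its least significant bit set. The helper names lane_shl31, lane_blend and masked_add_model are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/* Scalar model of one lane of "pslld $31": shift the 32-bit lane left
   by 31, which moves bit 0 (the LSB) into bit 31 (the MSB). */
static uint32_t lane_shl31(uint32_t lane) {
    return lane << 31;
}

/* Scalar model of one lane of blendvps: the selection depends only on
   the MSB of the mask lane. */
static float lane_blend(uint32_t mask_lane, float if_set, float if_clear) {
    return (mask_lane & 0x80000000u) ? if_set : if_clear;
}

/* Hypothetical model of masked_add_1 from the post: out[i] is a[i] + b[i]
   where the mask lane is true, and a[i] otherwise.
   Assumption: a true lane is passed with bit 0 set. */
static void masked_add_model(const uint32_t mask[4], const float a[4],
                             const float b[4], float out[4]) {
    for (int i = 0; i < 4; ++i) {
        uint32_t msb_mask = lane_shl31(mask[i]); /* LSB -> MSB */
        out[i] = lane_blend(msb_mask, a[i] + b[i], a[i]);
    }
}
```

Without the shift, a mask lane holding the value 1 has a clear MSB, so the blend model would always pick the unmodified a[i]. That is why the shift appears before blendvps in the generated code.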
2013 Feb 26
0
[LLVMdev] passing vector of booleans to functions
...%mask, <4 x float> %a, <4 x float>
%b) {
> entry:
> %add = fadd <4 x float> %a, %b
> %sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
> ret <4 x float> %sel
> }
>
> I will get:
>
> addps %xmm1, %xmm2
> pslld $31, %xmm0
> blendvps %xmm2, %xmm1
> movaps %xmm1, %xmm0
> ret
>
> While this is correct and works, I'm unhappy with the pslld. Apparently,
> LLVM uses a <4 x i32> to hold the <4 x i1> while the LSB holds the mask
> bit. But blendvps expects the MSB as mas...
2014 Apr 10
3
[LLVMdev] Test failures with 3.4.1
...lchain/branches/llvm-toolchain-3.4-3.4+205824/test/CodeGen/X86/vec_shift4.ll
--
Exit Code: 1
Command Output (stderr):
--
/home/sylvestre/dev/debian/pkg-llvm/llvm-toolchain/branches/llvm-toolchain-3.4-3.4+205824/test/CodeGen/X86/vec_shift4.ll:6:10:
error: expected string not found in input
; CHECK: pslld
^
<stdin>:1:2: note: scanning from here
.file "<stdin>"
^
<stdin>:8:3: note: possible intended match here
vpsllvd %xmm1, %xmm0, %xmm0
^
--
********************
FAIL: LLVM :: CodeGen/X86/vshift-4.ll (4843 of 9333)
******************** TEST 'LLVM :: Cod...
2013 Feb 26
1
[LLVMdev] passing vector of booleans to functions
...entry:
> > %add = fadd <4 x float> %a, %b
> > %sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
> > ret <4 x float> %sel
> >
> > }
> >
> > I will get:
> >
> > addps %xmm1, %xmm2
> > pslld $31, %xmm0
> > blendvps %xmm2, %xmm1
> > movaps %xmm1, %xmm0
> > ret
> >
> > While this is correct and works, I'm unhappy with the pslld. Apparently,
> > LLVM uses a <4 x i32> to hold the <4 x i1> while the LSB holds the mask
> > bit....
2014 Apr 09
2
[LLVMdev] Test failures with 3.4.1
Hello,
Trying the 3.4.1 branch, I get the following test failures:
LLVM :: CodeGen/X86/2009-06-05-VZextByteShort.ll
LLVM :: CodeGen/X86/fma4-intrinsics-x86_64.ll
LLVM :: CodeGen/X86/fp-fast.ll
LLVM :: CodeGen/X86/vec_shift4.ll
LLVM :: CodeGen/X86/vshift-4.ll
I am testing on 64-bit Debian testing.
Does it ring a bell?
Sylvestre
2005 Jul 20
1
MMX IDCT for theora-exp
Hello,
I'm attaching an IDCT MMX patch. I reused the IDCT from theora-a3-MMXd.zip.
It should work on 64-bit x86 platforms too.
Here are the most used functions when playing a video of jet aircraft (Gripen).
Ogg logical stream 310b2968 is Theora 720x480 29.97 fps video
Encoded frame content is 720x480 with 0x0 offset
I can play this video with roughly 200-300 dropped frames on an Athlon XP 1700+
CPU load (with
2013 Oct 15
0
[LLVMdev] [llvm-commits] r192750 - Enable MI Sched for x86.
.../X86/x86-shifts.ll (original)
>> +++ llvm/trunk/test/CodeGen/X86/x86-shifts.ll Tue Oct 15 18:33:07 2013
>> @@ -6,8 +6,8 @@
>> define <4 x i32> @shl4(<4 x i32> %A) nounwind {
>> entry:
>> ; CHECK: shl4
>> -; CHECK: padd
>> ; CHECK: pslld
>> +; CHECK: padd
>> ; CHECK: ret
>> %B = shl <4 x i32> %A, < i32 2, i32 2, i32 2, i32 2>
>> %C = shl <4 x i32> %A, < i32 1, i32 1, i32 1, i32 1>
>> @@ -67,8 +67,8 @@ entry:
>> define <8 x i16> @shl8(<8 x i16> %...