thr3ads.net - search: "pslld"

Displaying 7 results from an estimated 7 matches for "pslld".

[LLVMdev] passing vector of booleans to functions

2013 Feb 26

...fine <4 x float> @masked_add_1(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) {
entry:
  %add = fadd <4 x float> %a, %b
  %sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
  ret <4 x float> %sel
}

I will get:

  addps %xmm1, %xmm2
  pslld $31, %xmm0
  blendvps %xmm2, %xmm1
  movaps %xmm1, %xmm0
  ret

While this is correct and works, I'm unhappy with the pslld. Apparently, LLVM uses a <4 x i32> to hold the <4 x i1>, with the LSB holding the mask bit. But blendvps expects the MSB as the mask bit, hence the shift. OK...
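The reason for the shift can be sketched in a few lines of Python. This is only a lane-by-lane model of the behaviour described in the post, not LLVM's lowering code: the helper names `pslld` and `blendv` and the input values are chosen for illustration, and the mask is assumed to sit in bit 0 of each 32-bit lane as the poster describes.

```python
MASK32 = 0xFFFFFFFF

def pslld(lanes, count):
    """Model of a packed 32-bit left shift: bits shifted past bit 31 are dropped."""
    return [(x << count) & MASK32 for x in lanes]

def blendv(dst, src, mask):
    """Model of a variable blend: per lane, take src if bit 31 of the mask lane is set."""
    return [s if (m >> 31) & 1 else d for d, s, m in zip(dst, src, mask)]

# Hypothetical inputs: each i1 mask element lives in the LSB of a 32-bit lane.
mask_lsb = [1, 0, 1, 0]
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
added = [x + y for x, y in zip(a, b)]

# Without the shift the MSB is clear in every lane, so nothing is selected.
without_shift = blendv(a, added, mask_lsb)

# Shifting left by 31 moves the LSB into the MSB, where the blend looks for it.
with_shift = blendv(a, added, pslld(mask_lsb, 31))
```

In this model `without_shift` equals `a` unchanged, while `with_shift` takes the sum in lanes 0 and 2 only, which is the result the `select` in the IR asks for.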

[LLVMdev] passing vector of booleans to functions

2013 Feb 26

...%mask, <4 x float> %a, <4 x float> %b) {
> entry:
>   %add = fadd <4 x float> %a, %b
>   %sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
>   ret <4 x float> %sel
> }
>
> I will get:
>
>   addps %xmm1, %xmm2
>   pslld $31, %xmm0
>   blendvps %xmm2, %xmm1
>   movaps %xmm1, %xmm0
>   ret
>
> While this is correct and works, I'm unhappy with the pslld. Apparently,
> LLVM uses a <4 x i32> to hold the <4 x i1> while the LSB holds the mask
> bit. But blendvps expects the MSB as mas...

[LLVMdev] Test failures with 3.4.1

2014 Apr 10

...lchain/branches/llvm-toolchain-3.4-3.4+205824/test/CodeGen/X86/vec_shift4.ll
--
Exit Code: 1
Command Output (stderr):
--
/home/sylvestre/dev/debian/pkg-llvm/llvm-toolchain/branches/llvm-toolchain-3.4-3.4+205824/test/CodeGen/X86/vec_shift4.ll:6:10: error: expected string not found in input
; CHECK: pslld
         ^
<stdin>:1:2: note: scanning from here
 .file "<stdin>"
 ^
<stdin>:8:3: note: possible intended match here
 vpsllvd %xmm1, %xmm0, %xmm0
  ^
--
********************
FAIL: LLVM :: CodeGen/X86/vshift-4.ll (4843 of 9333)
******************** TEST 'LLVM :: Cod...

[LLVMdev] passing vector of booleans to functions

2013 Feb 26

...entry:
> > %add = fadd <4 x float> %a, %b
> > %sel = select <4 x i1> %mask, <4 x float> %add, <4 x float> %a
> > ret <4 x float> %sel
> >
> > }
> >
> > I will get:
> >
> > addps %xmm1, %xmm2
> > pslld $31, %xmm0
> > blendvps %xmm2, %xmm1
> > movaps %xmm1, %xmm0
> > ret
> >
> > While this is correct and works, I'm unhappy with the pslld. Apparently,
> > LLVM uses a <4 x i32> to hold the <4 x i1> while the LSB holds the mask
> > bit....

[LLVMdev] Test failures with 3.4.1

2014 Apr 09

Hello,

Trying the 3.4.1 branch, I get the following tests failing:

LLVM :: CodeGen/X86/2009-06-05-VZextByteShort.ll
LLVM :: CodeGen/X86/fma4-intrinsics-x86_64.ll
LLVM :: CodeGen/X86/fp-fast.ll
LLVM :: CodeGen/X86/vec_shift4.ll
LLVM :: CodeGen/X86/vshift-4.ll

I am testing on Debian testing (64-bit). Does it ring a bell?

Sylvestre

MMX IDCT for theora-exp

2005 Jul 20

Hello,

I'm attaching an IDCT MMX patch. I reused the IDCT from theora-a3-MMXd.zip. It should work on 64-bit x86 platforms too.

Here are the most used functions when playing a video with jet aircraft (Gripen):

Ogg logical stream 310b2968 is Theora 720x480 29.97 fps video
Encoded frame content is 720x480 with 0x0 offset

I can play this video with like 200-300 frame drops on Athlon XP 1700+ CPU load (with

[LLVMdev] [llvm-commits] r192750 - Enable MI Sched for x86.

2013 Oct 15

.../X86/x86-shifts.ll (original)
>> +++ llvm/trunk/test/CodeGen/X86/x86-shifts.ll Tue Oct 15 18:33:07 2013
>> @@ -6,8 +6,8 @@
>> define <4 x i32> @shl4(<4 x i32> %A) nounwind {
>> entry:
>> ; CHECK: shl4
>> -; CHECK: padd
>> ; CHECK: pslld
>> +; CHECK: padd
>> ; CHECK: ret
>> %B = shl <4 x i32> %A, < i32 2, i32 2, i32 2, i32 2>
>> %C = shl <4 x i32> %A, < i32 1, i32 1, i32 1, i32 1>
>> @@ -67,8 +67,8 @@ entry:
>> define <8 x i16> @shl8(<8 x i16> %...
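The diff above only swaps two CHECK lines, which matters because plain CHECK directives are matched in order. Below is a rough Python sketch of that ordering rule. It is a deliberate simplification (plain substring search only, none of FileCheck's regex or whitespace handling), the function name `first_unmatched` is made up for illustration, and the assembly text is a hypothetical output in which the shift is scheduled before the add, not the real compiler output.

```python
def first_unmatched(patterns, text):
    """Return the first pattern that cannot be found after the previous match,
    or None if every pattern matches in order."""
    pos = 0
    for pat in patterns:
        idx = text.find(pat, pos)
        if idx < 0:
            return pat
        pos = idx + len(pat)  # the next pattern must appear after this match
    return None

# Hypothetical assembly for shl4: the shift now comes before the add.
asm = (
    "shl4:\n"
    "\tpslld\t$2, %xmm1\n"
    "\tpaddd\t%xmm0, %xmm0\n"
    "\tret\n"
)

old_checks = ["shl4", "padd", "pslld", "ret"]  # order before the patch
new_checks = ["shl4", "pslld", "padd", "ret"]  # order after the patch

old_result = first_unmatched(old_checks, asm)
new_result = first_unmatched(new_checks, asm)
```

Under these assumptions the old order fails on `pslld`, because the search resumes after the `padd` match, while the new order matches every pattern and returns `None`.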