search for: punpcklwd

Displaying 6 results from an estimated 6 matches for "punpcklwd".

2010 Aug 02
0
[LLVMdev] Register Allocation ERROR! Ran out of registers during register allocation!
...d ffmpeg-0.6 using Clang, error output: CC libavcodec/x86/mpegvideo_mmx.o fatal error: error in backend: Ran out of registers during register allocation! Please check your inline asm statement for invalid constraints: INLINEASM <es:movd %eax, %xmm3 pshuflw $$0, %xmm3, %xmm3 punpcklwd %xmm3, %xmm3 pxor %xmm7, %xmm7 pxor %xmm4, %xmm4 movdqa ($2), %xmm5 pxor %xmm6, %xmm6 psubw ($3), %xmm6 mov $$-128, %eax .align 1 << 4 1: movdqa ($1, %eax), %xmm0 movdqa %xmm0, %xmm1 pabsw %xmm0, %xmm0...
2010 Oct 28
2
[LLVMdev] llvm 2.8 fixes?
...w_bug.cgi?id=8381 Here's also a short example: define <8 x i16> @broadcast_16(<8 x i16> %var1, <8 x i16> %var2) { entry: %0 = shufflevector <8 x i16> %var2, <8 x i16> undef, <8 x i32> zeroinitializer ret <8 x i16> %0 } Which miscompiles badly to punpcklwd %xmm0, %xmm1 pshufd $0, %xmm1, %xmm0 ret (This happens for all similar broadcast shuffles, except if the reg containing the vector to shuffle happens to be xmm0 just by luck). Roland -------------- next part -------------- A non-text attachment was scrubbed... Name: shuf_fix.diff Type: tex...
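To see why the two-instruction sequence quoted above is wrong for a broadcast, here is a minimal Python sketch that models the two instructions on lists of eight 16-bit lanes. The helper names, the lane labels, and the placement of %var1 in xmm0 and %var2 in xmm1 are assumptions made for illustration. The contrasting "self-interleave" sequence is also an assumption and is not taken from shuf_fix.diff.

```python
def punpcklwd128(dst, src):
    """Model of SSE2 punpcklwd: interleave the low four words of dst and src."""
    out = []
    for i in range(4):
        out += [dst[i], src[i]]
    return out

def pshufd(src, imm):
    """Model of pshufd: pick four dwords (word pairs) using 2-bit fields of imm."""
    dwords = [src[2 * i:2 * i + 2] for i in range(4)]
    out = []
    for i in range(4):
        out += dwords[(imm >> (2 * i)) & 3]
    return out

# Assumed register assignment: %var1 in xmm0, %var2 in xmm1 (lane 0 first).
var1 = ["a%d" % i for i in range(8)]
var2 = ["b%d" % i for i in range(8)]
expected = [var2[0]] * 8  # shufflevector with a zeroinitializer mask

# Sequence from the report: punpcklwd %xmm0, %xmm1 ; pshufd $0, %xmm1, %xmm0
xmm1 = punpcklwd128(var2, var1)
miscompiled = pshufd(xmm1, 0)

# Hypothetical contrast: interleave xmm1 with itself before the pshufd.
self_interleaved = pshufd(punpcklwd128(var2, var2), 0)

print(miscompiled)       # lanes alternate between var2[0] and var1[0]
print(self_interleaved)  # every lane is var2[0]
```

In this model the reported sequence mixes lane 0 of the unrelated first argument into every other lane, which is consistent with the remark that the result is only right when the vector to shuffle happens to be in xmm0.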
2005 Jul 20
1
MMX IDCT for theora-exp
...1) I(2) I(3) is the transpose of r0 I(1) r2 r3. + J(4) J(5) J(6) J(7) is the transpose of r4 r5 r6 r7. + + Since r1 is free at entry, we calculate the Js first. */ + + + +#define Transpose ASM("\n#Transpose\n" \ + \ + " movq "r4","r1"\n" \ + " punpcklwd "r5","r4"\n" \ + " movq "r0","I(0)"\n" \ + " punpckhwd "r5","r1"\n" \ + " movq "r6","r0"\n" \ + " punpcklwd "r7","r6"\n" \ + " movq...
2005 Aug 17
2
MMX loop filter for theora-exp
..." /* 0 0 0 0 b a 9 8 */ \ +"movd (%0,%%esi),%%mm3\n" /* 0 0 0 0 f e d c */ \ +"punpcklbw %%mm1,%%mm0\n" /* mm0 = 7 3 6 2 5 1 4 0 */ \ +"punpcklbw %%mm3,%%mm2\n" /* mm2 = f b e a d 9 c 8 */ \ +"movq %%mm0,%%mm1\n" /* mm1 = 7 3 6 2 5 1 4 0 */ \ +"punpcklwd %%mm2,%%mm1\n" /* mm1 = d 9 5 1 c 8 4 0 */ \ +"punpckhwd %%mm2,%%mm0\n" /* mm0 = f b 7 3 e a 6 2 */ \ +"pxor %%mm7,%%mm7\n" \ +"movq %%mm1,%%mm5\n" /* mm5 = d 9 5 1 c 8 4 0 */ \ +"punpckhbw %%mm7,%%mm5\n" /* mm5 = 0 d 0 9 0 5 0 1 = pix[1]*/ \ +"pu...
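The register comments in the loop-filter snippet above can be checked with a small Python model of the 64-bit MMX word unpacks. This is only a sketch: the helper names are hypothetical, and each byte is represented by its one-character label from the comments, listed low byte first internally and printed high byte first to match the comment layout.

```python
def words(byte_labels):
    """Group a low-to-high list of byte labels into 16-bit words (byte pairs)."""
    return [tuple(byte_labels[i:i + 2]) for i in range(0, len(byte_labels), 2)]

def punpcklwd(dst, src):
    """Model of MMX punpcklwd: interleave the two low words of dst and src."""
    return [dst[0], src[0], dst[1], src[1]]

def punpckhwd(dst, src):
    """Model of MMX punpckhwd: interleave the two high words of dst and src."""
    return [dst[2], src[2], dst[3], src[3]]

def show(reg):
    """Render a register high byte first, as the comments in the snippet do."""
    flat = [b for w in reg for b in w]
    return " ".join(reversed(flat))

# State after the two punpcklbw instructions, per the snippet's comments.
mm0 = words(list("04152637"))  # shown as 7 3 6 2 5 1 4 0
mm2 = words(list("8c9daebf"))  # shown as f b e a d 9 c 8

mm1 = punpcklwd(mm0, mm2)      # movq mm0,mm1 then punpcklwd mm2,mm1
mm0_hi = punpckhwd(mm0, mm2)   # punpckhwd mm2,mm0

print(show(mm1))     # d 9 5 1 c 8 4 0
print(show(mm0_hi))  # f b 7 3 e a 6 2
```

Both printed rows agree with the comments next to the punpcklwd and punpckhwd lines in the patch excerpt.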
2004 Aug 24
5
MMX/mmxext optimisations
Quite some speed improvement indeed. Attached is the updated patch to apply to svn/trunk. j -------------- next part -------------- A non-text attachment was scrubbed... Name: theora-mmx.patch.gz Type: application/x-gzip Size: 8648 bytes Desc: not available Url : http://lists.xiph.org/pipermail/theora-dev/attachments/20040824/5a5f2731/theora-mmx.patch-0001.bin
2006 May 25
2
Compilation issues with s390
Hi all, I'm trying to compile Asterisk on the mainframe (s390 / s390x) and I am running into issues. I was wondering if somebody could give me a hand? I'm thinking that I should be able to do this. I have noticed that Debian even has binary RPMs out for Asterisk now. I'm trying to do this on SuSE SLES8 (with the 2.4 kernel). What I see is an issue that arch=s390 isn't