Displaying 6 results from an estimated 6 matches for "punpcklwd".
2010 Aug 02
0
[LLVMdev] Register Allocation ERROR! Ran out of registers during register allocation!
...d
ffmpeg-0.6 using Clang, error output:
CC libavcodec/x86/mpegvideo_mmx.o
fatal error: error in backend: Ran out of registers during register allocation!
Please check your inline asm statement for invalid constraints:
INLINEASM <es:movd %eax, %xmm3
pshuflw $$0, %xmm3, %xmm3
punpcklwd %xmm3, %xmm3
pxor %xmm7, %xmm7
pxor %xmm4, %xmm4
movdqa ($2), %xmm5
pxor %xmm6, %xmm6
psubw ($3), %xmm6
mov $$-128, %eax
.align 1 << 4
1:
movdqa ($1, %eax), %xmm0
movdqa %xmm0, %xmm1
pabsw %xmm0, %xmm0...
2010 Oct 28
2
[LLVMdev] llvm 2.8 fixes?
...w_bug.cgi?id=8381
Here's also a short example:
define <8 x i16> @broadcast_16(<8 x i16> %var1, <8 x i16> %var2) {
entry:
%0 = shufflevector <8 x i16> %var2, <8 x i16> undef, <8 x i32> zeroinitializer
ret <8 x i16> %0
}
Which miscompiles badly to
punpcklwd %xmm0, %xmm1
pshufd $0, %xmm1, %xmm0
ret
(This happens for all similar broadcast shuffles, except if the reg
containing the vector to shuffle happens to be xmm0 just by luck).
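
To see why the quoted sequence is wrong, here is a small Python emulation of the two instructions. It is only a sketch, not code from the thread: it assumes the usual x86-64 calling convention (%var1 arrives in xmm0, %var2 in xmm1), registers are modelled as lists of eight 16-bit words with index 0 as the lowest word, and the "intended" sequence (punpcklwd %xmm1, %xmm1) is my guess at what correct code would look like, not something stated in the post.

```python
def punpcklwd(dst, src):
    """AT&T 'punpcklwd src, dst': interleave the low four words, dst word first."""
    out = []
    for d, s in zip(dst[:4], src[:4]):
        out += [d, s]
    return out


def pshufd(src, imm):
    """AT&T 'pshufd $imm, src, dst': pick four dwords of src by 2-bit fields of imm."""
    dwords = [src[2 * i:2 * i + 2] for i in range(4)]
    out = []
    for i in range(4):
        out += dwords[(imm >> (2 * i)) & 3]
    return out


# Hypothetical test values; assumed register assignment: var1 -> xmm0, var2 -> xmm1.
var1 = [100 + i for i in range(8)]
var2 = [200 + i for i in range(8)]

# Sequence quoted in the post: punpcklwd %xmm0, %xmm1 ; pshufd $0, %xmm1, %xmm0
miscompiled = pshufd(punpcklwd(var2, var1), 0)

# Assumed intended sequence: punpcklwd %xmm1, %xmm1 ; pshufd $0, %xmm1, %xmm0
intended = pshufd(punpcklwd(var2, var2), 0)

print(miscompiled)  # word 0 of var2 alternating with word 0 of var1
print(intended)     # word 0 of var2 in all eight lanes
```

With the hardcoded xmm0 source, every odd lane picks up word 0 of %var1 instead of %var2, which matches the remark that the result is only right when the vector to shuffle already sits in xmm0.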
Roland
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shuf_fix.diff
Type: tex...
2005 Jul 20
1
MMX IDCT for theora-exp
...1) I(2) I(3) is the transpose of r0 I(1) r2 r3.
+ J(4) J(5) J(6) J(7) is the transpose of r4 r5 r6 r7.
+
+ Since r1 is free at entry, we calculate the Js first. */
+
+
+
+#define Transpose ASM("\n#Transpose\n" \
+ \
+ " movq "r4","r1"\n" \
+ " punpcklwd "r5","r4"\n" \
+ " movq "r0","I(0)"\n" \
+ " punpckhwd "r5","r1"\n" \
+ " movq "r6","r0"\n" \
+ " punpcklwd "r7","r6"\n" \
+ " movq...
2005 Aug 17
2
MMX loop filter for theora-exp
..." /* 0 0 0 0 b a 9 8 */ \
+"movd (%0,%%esi),%%mm3\n" /* 0 0 0 0 f e d c */ \
+"punpcklbw %%mm1,%%mm0\n" /* mm0 = 7 3 6 2 5 1 4 0 */ \
+"punpcklbw %%mm3,%%mm2\n" /* mm2 = f b e a d 9 c 8 */ \
+"movq %%mm0,%%mm1\n" /* mm1 = 7 3 6 2 5 1 4 0 */ \
+"punpcklwd %%mm2,%%mm1\n" /* mm1 = d 9 5 1 c 8 4 0 */ \
+"punpckhwd %%mm2,%%mm0\n" /* mm0 = f b 7 3 e a 6 2 */ \
+"pxor %%mm7,%%mm7\n" \
+"movq %%mm1,%%mm5\n" /* mm5 = d 9 5 1 c 8 4 0 */ \
+"punpckhbw %%mm7,%%mm5\n" /* mm5 = 0 d 0 9 0 5 0 1 = pix[1]*/ \
+"pu...
2004 Aug 24
5
MMX/mmxext optimisations
quite some speed improvement indeed.
attached the updated patch to apply to svn/trunk.
j
-------------- next part --------------
A non-text attachment was scrubbed...
Name: theora-mmx.patch.gz
Type: application/x-gzip
Size: 8648 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/theora-dev/attachments/20040824/5a5f2731/theora-mmx.patch-0001.bin
2006 May 25
2
Compilation issues with s390
Hi all,
I'm trying to compile asterisk on the mainframe (s390 / s390x) and I am
running into issues. I was wondering if somebody could give a hand?
I'm thinking that I should be able to do this. I have noticed that Debian
even has binary RPMs out for Asterisk now. I'm trying to do this on SuSE
SLES8 (with the 2.4 kernel).
What I see is, an issue that arch=s390 isn't