search for: paddw

Displaying 5 results from an estimated 5 matches for "paddw".

2004 Aug 24
5
MMX/mmxext optimisations
quite some speed improvement indeed. attached the updated patch to apply to svn/trunk. j -------------- next part -------------- A non-text attachment was scrubbed... Name: theora-mmx.patch.gz Type: application/x-gzip Size: 8648 bytes Desc: not available Url : http://lists.xiph.org/pipermail/theora-dev/attachments/20040824/5a5f2731/theora-mmx.patch-0001.bin
2005 Aug 17
2
MMX loop filter for theora-exp
..."
+"punpckhbw %%mm0,%%mm3\n"
+"punpcklbw %%mm0,%%mm2\n"
+"psubw %%mm5,%%mm3\n"
+"psubw %%mm4,%%mm2\n"
+ /* mm3:mm2 = (_pix[_ystride*2]-_pix[_ystride]); */
+"PMULLW (V3),%%mm3\n" /* *3 */
+"PMULLW (V3),%%mm2\n" /* *3 */
+"paddw %%mm7,%%mm3\n" /* highpart */
+"paddw %%mm6,%%mm2\n" /* lowpart of _pix[0]-_pix[_ystride*3]+3*(_pix[_ystride*2]-_pix[_ystride]); */
+"paddw (V4),%%mm3\n" /* add 4 */
+"paddw (V4),%%mm2\n" /* add 4 */
+"psraw $3,%%mm3\n" /* >>3 f coefs high */...
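The comments in the snippet above describe the arithmetic that the paddw/psraw sequence performs on each 16-bit lane. A minimal scalar sketch of that per-pixel computation follows; the function name `loop_filter_coef` and its parameter names are hypothetical and not taken from the patch, with `p0`..`p3` standing for `_pix[0]`, `_pix[_ystride]`, `_pix[_ystride*2]` and `_pix[_ystride*3]`.

```c
/* Hypothetical scalar helper, not part of the patch: one lane of the
 * filter-coefficient computation described in the asm comments. */
static int loop_filter_coef(int p0, int p1, int p2, int p3)
{
    /* _pix[0] - _pix[_ystride*3] + 3*(_pix[_ystride*2] - _pix[_ystride]) */
    int f = p0 - p3 + 3 * (p2 - p1);

    /* "add 4" then ">>3": a shift by 3 with a rounding bias.
     * psraw is an arithmetic shift; in C, >> on a negative int is
     * implementation-defined, though mainstream compilers also shift
     * arithmetically. */
    return (f + 4) >> 3;
}
```

The MMX version computes the same value for eight pixels at a time, split into a high and a low group of four 16-bit words (mm3 and mm2).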
2005 Jul 20
1
MMX IDCT for theora-exp
...t; \
+ " movq " r1","r5"\n" \
+ " pmulhw " r2","r1"\n" \
+ " movq " I(1)","r3"\n" \
+ " pmulhw " r7","r5"\n" \
+ " movq " C(1)","r0"\n" \
+ " paddw " r2","r4"\n" \
+ " paddw " r7","r6"\n" \
+ " paddw " r1","r2"\n" \
+ " movq " J(7)","r1"\n" \
+ " paddw " r5","r7"\n" \
+ " movq " r0"...
2009 Oct 13
3
Proposal for replacing asm code with intrinsics
...l is to replace all functions in assembly with compiler intrinsics, each of which compiles into 1-2 assembly instructions and is much easier to maintain. For example, _mm_sad_epu8(__m128i, __m128i) will be compiled into a PSADBW instruction with compiler-allocated registers. And code like: psadbw mm4,mm5 paddw mm0,mm4 can be rewritten as: __m64 mm0, mm4, mm5, mm6, mm7; // of course using meaningful names mm0 = _mm_add_pi16(mm0, _mm_sad_pu8(mm4, mm5)); The compiler will replace the variables with actual registers, ensuring better allocation and scheduling of them. So, the benefits are: 1) Easier to read & unde...
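The psadbw/paddw pair from the proposal can be sketched with intrinsics roughly as follows. This is only an illustration under stated assumptions: it uses the 128-bit SSE2 forms (`_mm_sad_epu8`, `_mm_add_epi16`) instead of the 64-bit MMX registers in the post, and the helper names `sad_accumulate` and `sad_total` are hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: add the sum of absolute differences of two
 * 16-byte blocks to a running accumulator. */
static inline __m128i sad_accumulate(__m128i acc,
                                     const uint8_t *a, const uint8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* _mm_sad_epu8 maps to PSADBW, _mm_add_epi16 to PADDW; the compiler
     * picks the registers. */
    return _mm_add_epi16(acc, _mm_sad_epu8(va, vb));
}

/* Hypothetical helper: PSADBW leaves one partial sum in the low 16 bits
 * of each 64-bit half, so the total is word 0 plus word 4. */
static inline unsigned sad_total(__m128i acc)
{
    return (unsigned)_mm_extract_epi16(acc, 0) +
           (unsigned)_mm_extract_epi16(acc, 4);
}
```

Because the accumulation uses 16-bit adds, each partial sum wraps after 65535; a caller summing many blocks would need to widen the accumulator periodically.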
2012 Nov 28
0
[LLVMdev] [llvm-commits] [dragonegg] r168787 - in /dragonegg/trunk: src/x86/Target.cpp src/x86/x86_builtins test/validator/c/copysignp.c
...Builder.CreateAnd(IntRHS, SignMask);
> + Value *Abs = Builder.CreateAnd(IntLHS, ConstantExpr::getNot(SignMask));
> + Value *IntRes = Builder.CreateOr(Abs, Sign);
> + Result = Builder.CreateBitCast(IntRes, VecTy);
> + return true;
> + }
> case paddb:
> case paddw:
> case paddd:
>
> Modified: dragonegg/trunk/src/x86/x86_builtins
> URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/x86/x86_builtins?rev=168787&r1=168786&r2=168787&view=diff
> ==============================================================================...
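The And/Not/Or sequence in the quoted diff is the usual bit-mask way of building a copysign: keep every bit of the left operand except the sign, and take only the sign bit from the right operand. A minimal scalar sketch in C, assuming 32-bit IEEE-754 `float` and assuming that `Sign` in the diff holds the result of the truncated first `CreateAnd`; the function name `copysign_bits` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar analogue of the vector code in the diff. */
static float copysign_bits(float magnitude, float sign_source)
{
    const uint32_t sign_mask = UINT32_C(0x80000000);
    uint32_t lhs, rhs, res;
    float out;

    /* memcpy plays the role of the bitcast to an integer type. */
    memcpy(&lhs, &magnitude, sizeof lhs);
    memcpy(&rhs, &sign_source, sizeof rhs);

    uint32_t sign     = rhs & sign_mask;   /* sign bit of the right operand */
    uint32_t abs_bits = lhs & ~sign_mask;  /* left operand with sign cleared */
    res = abs_bits | sign;

    memcpy(&out, &res, sizeof out);        /* bitcast back to float */
    return out;
}
```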