search for: mm7

Displaying 20 results from an estimated 23 matches for "mm7".

Did you mean: mm
2004 Aug 24
5
MMX/mmxext optimisations
quite some speed improvement indeed. attached the updated patch to apply to svn/trunk. j -------------- next part -------------- A non-text attachment was scrubbed... Name: theora-mmx.patch.gz Type: application/x-gzip Size: 8648 bytes Desc: not available Url : http://lists.xiph.org/pipermail/theora-dev/attachments/20040824/5a5f2731/theora-mmx.patch-0001.bin
2005 Aug 17
2
MMX loop filter for theora-exp
...iped so we can load them directly to MMX regs + +TODO: some instruction stalls can be avoided + +*/ + +static void loop_filter_v_mmx(unsigned char *_pix,int _ystride,int *_bv){ + int y; + _pix-=_ystride*2; + +__asm__ __volatile__( +"pxor %%mm0,%%mm0\n" /* mm0 = 0 */ +"movq (%0),%%mm7\n" /* mm7 = _pix[0..8] */ +"lea (%1,%1,2),%%esi\n" /* esi = _ystride*3 */ +"movq (%0,%%esi),%%mm4\n" /* mm4 = _pix[0..8]+_ystride*3] */ +"movq %%mm7,%%mm6\n" /* mm6 = _pix[0..8] */ +"punpcklbw %%mm0,%%mm6\n" /* expand unsigned _pix[0..3] to 16 bits */ +&...
2004 Sep 10
2
An assembly optimization and fix
..._error_3 - mov edx, eax ; edx = error_4 - paddd mm1, mm5 ; [CR] total_error_3 += abs(error_3) ; total_error_2 += abs(error_2) - neg edx ; edx = -error_4 - cmovns eax, edx ; eax = abs(error_4) - movd mm5, eax ; mm5 = 0:abs(error_4) - paddd mm2, mm5 ; total_error_4 += abs(error_4) + movd mm7, [ebx] ; mm7 = 0:error_0 + add ebx, byte 4 + movq mm6, mm7 ; mm6 = 0:error_0 + psubd mm7, mm3 ; mm7 = :error_1 + punpckldq mm6, mm7 ; mm6 = error_1:error_0 + movq mm5, mm6 ; mm5 = error_1:error_0 + movq mm7, mm6 ; mm7 = error_1:error_0 + psubd mm5, mm3 ; mm5 = error_2: + movq mm3, mm6...
2010 Oct 20
2
[LLVMdev] llvm register reload/spilling around calls
...gt; added, however the calling code did not change at all... > > Look in X86InstrControl.td. The call instructions are all prefixed > by: > > let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2, > FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, > XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, > XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS], > > This is the fixed list of call-clobbered registers. It should really > be controlled by the calling convention of the called function > instead. > > The...
2010 Oct 20
0
[LLVMdev] llvm register reload/spilling around calls
...20.10.2010 05:00, Jakob Stoklund Olesen wrote: >> Look in X86InstrControl.td. The call instructions are all prefixed >> by: >> >> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2, >> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, >> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, >> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS], >> >> This is the fixed list of call-clobbered registers. It should really >> be controlled by the calling convention of the called function >>...
2010 Oct 20
1
[LLVMdev] llvm register reload/spilling around calls
...Jakob Stoklund Olesen wrote: >>> Look in X86InstrControl.td. The call instructions are all prefixed >>> by: >>> >>> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2, >>> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, >>> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, >>> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS], >>> >>> This is the fixed list of call-clobbered registers. It should really >>> be controlled by the calling convention of the called...
2010 Oct 20
0
[LLVMdev] llvm register reload/spilling around calls
...gt; added, however the calling code did not change at all... Look in X86InstrControl.td. The call instructions are all prefixed by: let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS], This is the fixed list of call-clobbered registers. It should really be controlled by the calling convention of the called function instead. The WINCALL* ins...
2007 Jun 19
3
[LLVMdev] TargetRegisterClass for Physical Register
...that it's in multiple classes). Does ValueType have something to do with that? In the same file, the VR64 register class has the following definition: def VR64 : RegisterClass<"X86", [v8i8, v4i16, v2i32, v1i64], 64, [MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7]>; So there are multiple ValueTypes here (the scalar registers each only have one corresponding to the bit size of the register). But still, if I have physical register MM2, that completely determines its register class. Is there some other architecture where the physical register name/numbe...
2010 Oct 20
3
[LLVMdev] llvm register reload/spilling around calls
Thanks for giving it a look! On 19.10.2010 23:21, Jakob Stoklund Olesen wrote: > On Oct 19, 2010, at 11:40 AM, Roland Scheidegger wrote: > >> So I saw that the code is doing lots of register >> spilling/reloading. Now I understand that due to calling >> conventions, there's not really a way to avoid this - I tried using >> coldcc but apparently the backend
2008 Sep 03
2
[LLVMdev] Codegen/Register allocation question.
...ef,dead>, %FP5<imp-def,dead>, %FP6<imp-def,dead>, %ST0<imp-def,dead>, %ST1<imp-def,dead>, %MM0<imp-def,dead>, %MM1<imp-def,dead>, %MM2<imp-def,dead>, %MM3<imp-def,dead>, %MM4<imp-def,dead>, %MM5<imp-def,dead>, %MM6<imp-def,dead>, %MM7<imp-def,dead>, %XMM0<imp-def,dead>, %XMM1<imp-def,dead>, %XMM2<imp-def,dead>, %XMM3<imp-def,dead>, %XMM4<imp-def,dead>, %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>, %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10&l...
2017 Dec 11
2
New x86 instruction with opcode 0x0F 0x7A
Hi all, I'm trying to simulate an extended x86 architecture on gem5 with several new instructions. My hardware setup is done and now I'd like llvm to accept the existence of the new instruction passed in inline assembly and output the correct opcode and registers. I chose the two-byte opcode 0x0F 0x7A and I would like the instruction to have the same operands and return values as CVTPS2PI
2008 Sep 04
0
[LLVMdev] Codegen/Register allocation question.
...;imp-def,dead>, %FP6<imp-def,dead>, > %ST0<imp-def,dead>, %ST1<imp-def,dead>, %MM0<imp-def,dead>, > %MM1<imp-def,dead>, %MM2<imp-def,dead>, %MM3<imp-def,dead>, > %MM4<imp-def,dead>, %MM5<imp-def,dead>, %MM6<imp-def,dead>, > %MM7<imp-def,dead>, %XMM0<imp-def,dead>, %XMM1<imp-def,dead>, > %XMM2<imp-def,dead>, %XMM3<imp-def,dead>, %XMM4<imp-def,dead>, > %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>, > %XMM8<imp-def,dead>, %XMM9<imp-def,dea...
2007 Jun 18
2
[LLVMdev] TargetRegisterClass for Physical Register
How do I get the TargetRegisterClass for a physical register? SSARegMap::getRegClass only works for virtual registers. -Dave
2007 Jun 19
0
[LLVMdev] TargetRegisterClass for Physical Register
Take a look at getPhysicalRegisterRegClass( const MRegisterInfo *MRI, MVT::ValueType VT, unsigned reg) in ScheduleDAG.cpp. -- Christopher Lamb On Jun 18, 2007, at 4:52 PM, David A. Greene wrote: > How do I get the TargetRegisterClass for a physical register? > SSARegMap::getRegClass only works for virtual registers. > >
2002 Jan 22
1
glm.predict?
...178, 177, 176, 205, 153, 228, 227, 147, 173, 157, 214, 167, 140, 179, 204, 184, 151, 115, 173, 208, 135, 175, 136, 121, 189, 148, 174), .Names = c("Lead1.mm1", "Lead1.mm2", "Lead1.mm3", "Lead1.mm4", "Lead1.mm5", "Lead1.mm6", "Lead1.mm7", "Lead1.mm8", "Lead1.mm9", "Lead1.mm10", "Lead1.mm11", "Lead1.mm12", "Lead1.mm13", "Lead1.mm14", "Lead1.mm15", "Lead1.mm16", "Lead1.mm17", "Lead1.mm18", "Lead1.mm19", &qu...
2009 Aug 30
3
experimental patch for libtheora1.1beta3
...r_SOURCES = getopt.c getopt1.c getopt.h ? patches/patch-lib_x86_mmxencfrag_c $OpenBSD$ --- lib/x86/mmxencfrag.c.orig Wed Aug 12 22:26:45 2009 +++ lib/x86/mmxencfrag.c Wed Aug 12 23:43:30 2009 @@ -496,7 +496,10 @@ static unsigned oc_int_frag_satd_thresh_mmxext(const u "movq 0x78(%[buf]),%%mm7\n\t" /*The sums produced by OC_HADAMARD_ABS_ACCUM_8x4 each have an extra 4 added to them, and a factor of two removed; correct the final sum here.*/ - "lea -32(%[ret],%[ret]),%[ret]\n\t" + /* Not working "lea -32(%[ret],%[ret]),%[ret]\n\t" */ + /* Like...
2009 Oct 13
3
Proposal for replacing asm code with intrinsics
...c which compiles into 1-2 assembly instructions and are much easier to maintain. For example: _mm_sad_epu8(__m128, __m128) will be compiled in PSADBW instruction with compiler-allocated registers. And code like: psadbw mm4,mm5 paddw mm0,mm4 Can be re-written into _m64 mm0, mm4, mm5, mm6, mm7; //of course using meaningful names mm0= _mm_add_epi16(mm0, _mm_sad_pu8(mm4, mm5)); Compiler will replace variables with actual registers, ensuring better allocation and scheduling of them. So, benefits are: 1) Easier to read & understand code which can use same variable names as generic versi...
2007 Jun 26
3
[LLVMdev] Live Intervals Question
...d>, %FP4<imp-def,dead>, %FP5<imp-def,dead>, %FP6<imp-def,dead>, %ST(0)<imp-def,dead>, %MM0<imp-def,dead>, %MM1<imp-def,dead>, %MM2<imp-def,dead>, %MM3<imp-def,dead>, %MM4<imp-def,dead>, %MM5<imp-def,dead>, %MM6<imp-def,dead>, %MM7<imp-def,dead>, %XMM0<imp-def,dead>, %XMM1<imp-def,dead>, %XMM2<imp-def,dead>, %XMM3<imp-def,dead>, %XMM4<imp-def,dead>, %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>, %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM1...
2007 Jun 26
0
[LLVMdev] Live Intervals Question
...mp-def,dead>, %FP5<imp-def,dead>, > %FP6<imp-def,dead>, %ST(0)<imp-def,dead>, %MM0<imp-def,dead>, > %MM1<imp-def,dead>, %MM2<imp-def,dead>, %MM3<imp-def,dead>, > %MM4<imp-def,dead>, %MM5<imp-def,dead>, %MM6<imp-def,dead>, > %MM7<imp-def,dead>, %XMM0<imp-def,dead>, %XMM1<imp-def,dead>, > %XMM2<imp-def,dead>, %XMM3<imp-def,dead>, %XMM4<imp-def,dead>, > %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>, > %XMM8<imp-def,dead>, %XMM9<imp-def,dea...
2007 Jun 26
4
[LLVMdev] Live Intervals Question
...5<imp-def,dead>, > > %FP6<imp-def,dead>, %ST(0)<imp-def,dead>, %MM0<imp-def,dead>, > > %MM1<imp-def,dead>, %MM2<imp-def,dead>, %MM3<imp-def,dead>, > > %MM4<imp-def,dead>, %MM5<imp-def,dead>, %MM6<imp-def,dead>, > > %MM7<imp-def,dead>, %XMM0<imp-def,dead>, %XMM1<imp-def,dead>, > > %XMM2<imp-def,dead>, %XMM3<imp-def,dead>, %XMM4<imp-def,dead>, > > %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>, > > %XMM8<imp-def,dead>, %XMM9...