thr3ads.net - search: "__m64"

Displaying 18 results from an estimated 18 matches for "__m64".

[LLVMdev] [clang] SSE2 intrinsics (emmintrin.h): _mm_movpi64_pi64 should be _mm_movpi64_epi64?

2013 Nov 22

...hat uses some SSE2 intrinsics and builds with gcc46, but not clang: clang can't find _mm_movpi64_epi64(), while gcc46 defines it in its lib/gcc46/gcc/.../4.6.3/include/emmintrin.h: extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_movpi64_epi64 (__m64 __A) { return _mm_set_epi64 ((__m64)0LL, __A); } extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_set_epi64x (long long __q1, long long __q0) { return __extension__ (__m128i)(__v2di){ __q0, __q1 }; } Now, Clang in /usr/include/clan...

mmx optimization

2005 Apr 19

...pred_mb are arrays of short int and not unsigned char. I cannot therefore use psadbw, because it works on 8 bit data. I've currently rewritten the function in this way: si32 sad_4x4 (macroblock_t * mb, ui8 x, ui8 y) { zeros = _mm_setzero_si64 (); ones = _mm_set1_pi16 (1); orig = *((__m64*) &mb->orig_mb[corner_x][corner_y]); pred = *((__m64*) &mb->pred_mb[corner_x][corner_y]); diff = _m_psubw (orig, pred); cmp = _m_pcmpgtw (zeros, diff); sign = _m_paddw (ones, cmp); sign = _m_paddw (sign, cmp); sad = _m_pmaddwd (diff, sign); orig = *((__m64*) &a...
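The sign trick in this snippet is easier to follow one lane at a time. Below is a minimal scalar sketch in C of what the psubw / pcmpgtw / paddw / pmaddwd sequence appears to compute, assuming four 16-bit lanes per __m64. The helper name sad4_i16 is hypothetical and is not taken from the original post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the MMX sequence quoted above.
 * Assumes 4 lanes of int16_t, as in one __m64 register. */
static int32_t sad4_i16(const int16_t *orig, const int16_t *pred)
{
    int32_t sad = 0;

    assert(orig != 0 && pred != 0);
    for (int i = 0; i < 4; ++i) {
        /* psubw: per-lane 16-bit subtraction */
        int16_t diff = (int16_t)(orig[i] - pred[i]);
        /* pcmpgtw(zeros, diff): all ones (-1) where diff < 0, else 0 */
        int16_t cmp = (int16_t)((0 > diff) ? -1 : 0);
        /* paddw(ones, cmp), then paddw(sign, cmp): 1 + 2*cmp is +1 or -1 */
        int16_t sign = (int16_t)(1 + cmp + cmp);
        /* pmaddwd: multiply lanes and accumulate; diff * sign == |diff| */
        sad += (int32_t)diff * sign;
    }
    return sad;
}
```

Note that the real pmaddwd only sums adjacent pairs into two 32-bit halves, so the vector version still needs a final add of the two halves; the loop above folds that step into the accumulation.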

[LLVMdev] GCC DejaGNU regressions

2009 Jul 25

The GCC DejaGNU testsuite has discovered some regressions. Here's one; this was reduced from testsuite/gcc.apple/4656532.c: typedef long long __m64 __attribute__ ((__vector_size__ (8), __may_alias__)); static __inline __m64 __attribute__((__always_inline__, __nodebug__)) _mm_slli_si64 (__m64 __m, int __count) { } __m64 x, y; void t1(int n) { y = _mm_slli_si64(x, n); } Compiled with LLVM-GCC (v76963) on Darwin/x86, this generates an ICE...

[LLVMdev] Making GEP into vector illegal?

2008 Oct 14

...does allow you to use a slightly more elaborate > workaround with a union, though. Hum what's your take on this then: /* The Intel API is flexible enough that we must allow aliasing with other vector types, and their scalar components. */ /* APPLE LOCAL 4505813 */ typedef long long __m64 __attribute__ ((__vector_size__ (8), __may_alias__)); :-)

[LLVMdev] Making GEP into vector illegal?

2008 Oct 15

...elaborate >> workaround with a union, though. > > Hum what's your take on this then: > > /* The Intel API is flexible enough that we must allow aliasing with > other > vector types, and their scalar components. */ > /* APPLE LOCAL 4505813 */ > typedef long long __m64 __attribute__ ((__vector_size__ (8), > __may_alias__)); This is actually completely different AFAIK, this allows things like: ((float*)&myvec4)[2] which is exactly what the proposal wants to continue supporting in the IR. -Chris
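A small sketch of the kind of access discussed in this thread, written in C with the GCC/Clang vector extension. The type float4 and the helpers get_lane and set_lane are hypothetical names chosen for illustration; only the attribute spelling comes from the quoted typedef.

```c
#include <assert.h>

/* Hypothetical 4-lane float vector, declared the same way as the quoted
 * __m64 typedef: a vector type that is also marked __may_alias__. */
typedef float float4 __attribute__((__vector_size__(16), __may_alias__));

/* Read one lane through a plain float pointer, i.e. ((float*)&v)[i]. */
static float get_lane(const float4 *v, int i)
{
    assert(i >= 0 && i < 4);
    return ((const float *)v)[i];
}

/* Write one lane through a plain float pointer. */
static void set_lane(float4 *v, int i, float x)
{
    assert(i >= 0 && i < 4);
    ((float *)v)[i] = x;
}
```

This only illustrates the source-level pattern; whether the IR should keep allowing a GEP into a vector is the separate question the thread is debating.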

[LLVMdev] Win64 bugs

2009 Aug 05

> ...any problems yet with my hack because I haven't used other callee-saved registers so far. Anyway, I'm looking forward to your fix!
I've committed the first series of patches to ToT to unbreak win64, basically:
1. Honour register save area
2. Enable proper passing of __m128 and __m64 arguments
3. Minor cleanups here and there
The callee-saved problem is still unfixed; I'm working on a general solution.
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

[LLVMdev] stack alignment restriction

2010 Dec 29

Hi, Is there a way to enforce a different alignment for values on the stack, compared to other basic types? In particular, I would like characters to be stored at a 2-byte boundary. Thanks, dz

[LLVMdev] stack alignment restriction

2010 Dec 29

> ...t on values on stack as compared to other basic types. Particularly, I would like characters to be stored at 2 byte boundary.
Check out examples in the lib/Target/* directories. For instance, in X86CallingConv.td we have things like this:
def CC_X86_64_C : CallingConv<[
  ...
  // __m64 vectors get 8-byte stack slots that are 8-byte aligned.
  CCIfType<[x86mmx,v1i64], CCAssignToStack<8, 8>>
]>;
The second parameter to CCAssignToStack is the alignment for that type.
-bw
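To make the (size, alignment) pair concrete, here is a minimal sketch in C of how a stack-slot allocator could apply those two numbers. It is a hypothetical model for illustration only, not LLVM's actual implementation; stack_state and assign_to_stack are invented names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical allocator state: the next free byte offset in the frame. */
typedef struct {
    size_t next_offset;
} stack_state;

/* Round the current offset up to `align` (a power of two), reserve `size`
 * bytes there, and return the offset that was assigned. */
static size_t assign_to_stack(stack_state *st, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t offset = (st->next_offset + align - 1) & ~(align - 1);
    st->next_offset = offset + size;
    return offset;
}
```

Under this model, a rule like the quoted one (size 8, alignment 8) places a value at the next multiple of 8, while a size 1, alignment 2 rule would give each character its own 2-byte-aligned slot, which is what the original question asks for.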

[LLVMdev] stack alignment restriction

2010 Dec 29

>> ...mpared to other basic types. Particularly, I would like characters to be stored at 2 byte boundary.
> Check out examples in the lib/Target/* directories. For instance, in X86CallingConv.td we have things like this:
> def CC_X86_64_C : CallingConv<[
>   ...
>   // __m64 vectors get 8-byte stack slots that are 8-byte aligned.
>   CCIfType<[x86mmx,v1i64], CCAssignToStack<8, 8>>
> ]>;
> The second parameter to CCAssignToStack is the alignment for that type.
> -bw

Proposal to remove MMX support.

2020 Aug 30

I recently diagnosed a bug in someone else's software, which turned out to be due to incorrect MMX intrinsics usage: if you use any of the x86 intrinsics that accept or return __m64 values, then you, the *programmer* are required to call _mm_empty() before using any x87 floating point instructions or leaving the function. I was aware that this was required at the assembly-level, but not that the compiler forced users to deal with this when using intrinsics. This is a real nas...
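The requirement described here can be shown with a short C sketch. It assumes an x86 compiler that defines __MMX__ and ships mmintrin.h, and falls back to a plain loop elsewhere; add4_i16 and mean4 are hypothetical helper names, not code from the post.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Add four 16-bit lanes. On the MMX path the function clears the MMX
 * state with _mm_empty() before returning, as the post says it must. */
static void add4_i16(const int16_t *a, const int16_t *b, int16_t *out)
{
    assert(a != 0 && b != 0 && out != 0);
#if defined(__MMX__)
    __m64 va, vb, vr;
    memcpy(&va, a, sizeof va);      /* load 4 x int16_t */
    memcpy(&vb, b, sizeof vb);
    vr = _mm_add_pi16(va, vb);      /* paddw */
    memcpy(out, &vr, sizeof vr);
    _mm_empty();                    /* emms: release the x87 register file */
#else
    for (int i = 0; i < 4; ++i)     /* portable fallback, no MMX state */
        out[i] = (int16_t)(a[i] + b[i]);
#endif
}

/* Floating-point work that may use x87 (long double on many x86 ABIs);
 * it is only safe to call this after the MMX state has been emptied. */
static double mean4(const int16_t *v)
{
    long double sum = 0.0L;
    for (int i = 0; i < 4; ++i)
        sum += v[i];
    return (double)(sum / 4.0L);
}
```

Calling add4_i16 and then mean4 is fine here because the MMX path ends with _mm_empty(); dropping that call is the kind of mistake the post describes.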

[LLVMdev] Making GEP into vector illegal?

2008 Oct 15

...ith a union, though. >> >> Hum what's your take on this then: >> >> /* The Intel API is flexible enough that we must allow aliasing with >> other >> vector types, and their scalar components. */ >> /* APPLE LOCAL 4505813 */ >> typedef long long __m64 __attribute__ ((__vector_size__ (8), >> __may_alias__)); > > This is actually completely different AFAIK, That statement was that: > float4 a; > float* ptr_z = (float*)(&a) + 3; ``violates strict aliasing`` That assertion is wrong. The docs says: @item may_alias Accesses...

[LLVMdev] Making GEP into vector illegal?

2008 Oct 14

On Tue, Oct 14, 2008 at 1:34 PM, Daniel M Gessel <gessel at apple.com> wrote: > In Joe programmer language (i.e. C ;) ), are we basically talking > about disallowing: > > float4 a; > float* ptr_z = &a.z; > > ? That's my reading as well; the argument for not allowing it is just to make optimization easier. We don't allow addressing individual bits either,

[LLVMdev] Win64 bugs

2009 Aug 05

Hi Anton, Thanks a lot for the heads up. I hadn't run into any problems yet with my hack because I haven't used other callee-saved registers so far. Anyway, I'm looking forward to your fix! Kind regards, Nicolas -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Anton Korobeynikov Sent: zaterdag 1 augustus 2009

[LLVMdev] Win64 bugs

2009 Aug 06

[LLVMdev] Win64 bugs

2009 Aug 01

[LLVMdev] Win64 bugs

Hello, Nicolas > The attached patch is a workaround for the XMM misalignment issue. Basically > it uses the fallback method of saving and restoring registers on the stack, > which does work correctly with alignment. If I recall correctly it also > doesn't save any registers unnecessarily, but I could be wrong about that. Please don't use this patch, it's completely wrong.

Proposal for replacing asm code with intrinsics

2009 Oct 13

...nefits are:
1) Easier to read and understand code, which can use the same variable names as the generic C version
2) Single source code for gcc, msvc and the Intel compiler (all of them support the same syntax)
3) Easier migration to SSE2 (which handles 128 bits vs. 64 with MMX) by replacing __m64 with __m128
4) 64-bit code generation support
5) The compiler can reschedule instructions based on the target CPU to deliver better performance without manual tuning.
I did several tests with high-quality, manually optimized assembly in the past and then replaced it with intrinsics, which resulted in 3-5% better per...

[LLVMdev] Making GEP into vector illegal?

2008 Oct 14

In Joe programmer language (i.e. C ;) ), are we basically talking about disallowing: float4 a; float* ptr_z = &a.z; ? Won't programmers just resort to: float4 a; float* ptr_z = (float*)(&a) + 3; ? On Oct 14, 2008, at 3:55 PM, Mon Ping Wang wrote: > Hi, > > Something like a sequential type makes sense especially in light of > what Duncan is point out. I agree

Proposal to remove MMX support.

2020 Aug 31

...ev at lists.llvm.org> > *Subject:* [EXT] [llvm-dev] Proposal to remove MMX support. > > > > I recently diagnosed a bug in someone else's software, which turned out to > be due to incorrect MMX intrinsics usage: if you use any of the x86 > intrinsics that accept or return __m64 values, then you, the *programmer* are > required to call _mm_empty() before using any x87 floating point > instructions or leaving the function. I was aware that this was required at > the assembly-level, but not that the compiler forced users to deal with > this when using intrinsics....
