search for: movq2dq

Displaying 13 results from an estimated 13 matches for "movq2dq".

2008 Aug 01
0
[LLVMdev] Generating movq2dq using IRBuilder
Hi Dan, Yes, they could be represented with insertelement and extractelement, but I don't think those actually generate optimal code using movq2dq and the like; otherwise bugs 2584 and 2585 would already be fixed. Anyway, I'm encouraged to get involved myself. I'm quite experienced with MMX and SSE, but I'm still trying to learn more about how LLVM does instruction selection. By the way, I noticed that movq2dq and s...
2008 Jul 31
2
[LLVMdev] Generating movq2dq using IRBuilder
Hi all, How do I generate the movq2dq SSE2 instruction using the IRBuilder? There is no zext from 64-bit to 128-bit (corresponding to an MMX-to-XMM register transfer) as far as I can tell. So I've tried inserting an i64 into a v2i64, which generates valid code, but with a number of stores and loads on the stack instead of a single mo...
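
For context, a sketch of the IR construction this message describes, written against a present-day LLVM C++ API (signatures have drifted since this 2008 thread, and the function name makeWiden is illustrative): insert an i64 into lane 0 of a zeroed v2i64, the pattern one would hope selects to a single movq2dq.

    #include "llvm/IR/DerivedTypes.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Module.h"

    using namespace llvm;

    // Builds: define <2 x i64> @widen(i64) -- insertelement into a zero vector.
    Function *makeWiden(Module &M) {
      LLVMContext &C = M.getContext();
      auto *V2I64 = FixedVectorType::get(Type::getInt64Ty(C), 2);
      auto *FT = FunctionType::get(V2I64, {Type::getInt64Ty(C)}, false);
      Function *F = Function::Create(FT, Function::ExternalLinkage, "widen", M);
      IRBuilder<> B(BasicBlock::Create(C, "entry", F));
      // i64 -> lane 0 of <2 x i64> zeroinitializer: the zext-to-128-bit idiom.
      Value *Vec = B.CreateInsertElement(Constant::getNullValue(V2I64),
                                         F->getArg(0), B.getInt32(0));
      B.CreateRet(Vec);
      return F;
    }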
2008 Jul 31
0
[LLVMdev] Generating movq2dq using IRBuilder
...to 32 and 64 to 32 using movd. This also seems related to Bug 2585. Thanks again.

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Nicolas Capens
Sent: Thursday, 31 July, 2008 16:03
To: 'LLVM Developers Mailing List'
Subject: [LLVMdev] Generating movq2dq using IRBuilder

Hi all, How do I generate the movq2dq SSE2 instruction using the IRBuilder? There is no zext from 64-bit to 128-bit (corresponding to an MMX-to-XMM register transfer) as far as I can tell. So I've tried inserting an i64 into a v2i64, which generates valid code, but with a...
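
The reverse transfers asked about here already have direct SSE2 intrinsic forms at the C level; a minimal sketch, assuming a standard emmintrin.h (the wrapper names are illustrative):

    #include <emmintrin.h>  // SSE2; also pulls in the MMX mmintrin.h

    // 128 -> 64 bit: movdq2q moves the low qword of an XMM register into MMX.
    static __m64 low_qword(__m128i a) { return _mm_movepi64_pi64(a); }

    // 128 -> 32 bit: movd moves the low dword into a general-purpose register.
    static int low_dword(__m128i a) { return _mm_cvtsi128_si32(a); }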
2008 Jul 31
5
[LLVMdev] Generating movq2dq using IRBuilder
On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
> In the same breath I'd also like to kindly ask if someone could have
> a look at the reverse operations, namely trunc from 128 to 64 bit
> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
> seems related to Bug 2585. Thanks again.

The operations you're describing can be represented as insertelement and
2008 Aug 01
1
[LLVMdev] Generating movq2dq using IRBuilder
...efit using SSE. Or am I missing something? Cheers, Nicolas

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Stefanus Du Toit
Sent: Thursday, 31 July, 2008 23:51
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] Generating movq2dq using IRBuilder

On 31-Jul-08, at 2:38 PM, Dan Gohman wrote:
> On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
>> In the same breath I'd also like to kindly ask if someone could have
>> a look at the reverse operations, namely trunc from 128 to 64 bit
>> using movdq2q, a...
2010 Aug 31
0
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
...successful code is doing an aggregate copy field-by-field while the failing code has lowered this to a memcpy. I would certainly expect the memcpy expansion to be smart enough to avoid using MM registers, though; that's a serious bug if it isn't.

movd %xmm0, %rax
movd %rax, %mm0
movq2dq %mm0, %xmm1
movq2dq %mm0, %xmm2
punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0]
movq 16(%rsp), %rax
movd %rax, %mm0
movq2dq %mm0, %xmm0
punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0]

On Aug 31, 2010, at 11:18 AM PDT, Argyrios Kyrtzidis wrote:
> Hi,
>
> I've attached...
2010 Aug 31
2
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
...e is doing an aggregate copy field-by-field while the failing code has lowered this to a memcpy. I would certainly expect the memcpy expansion to be smart enough to avoid using MM registers, though; that's a serious bug if it isn't.
>
> movd %xmm0, %rax
> movd %rax, %mm0
> movq2dq %mm0, %xmm1
> movq2dq %mm0, %xmm2
> punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0]
> movq 16(%rsp), %rax
> movd %rax, %mm0
> movq2dq %mm0, %xmm0
> punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0]
>
> On Aug 31, 2010, at 11:18 AM PDT, Argyrios Kyrtzidis...
2010 Aug 31
5
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
...ebCore5mouniEPNS_15GraphicsContextEPNS_30GraphicsContextPlatformPrivateERKNS_9FloatRectERNS_10FloatPointES8_
movss 8(%rsp), %xmm0
movss 12(%rsp), %xmm1
subss 20(%rsp), %xmm1
subss 16(%rsp), %xmm0
insertps $16, %xmm1, %xmm0 ## xmm0 = xmm0[0],xmm1[0],xmm0[2,3]
movd %xmm0, %rax
movd %rax, %mm0
movq2dq %mm0, %xmm1
movq2dq %mm0, %xmm2
punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0]
movq 16(%rsp), %rax
movd %rax, %mm0
movq2dq %mm0, %xmm0
punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0]
addq $24, %rsp
ret
2008 Jul 31
0
[LLVMdev] Generating movq2dq using IRBuilder
On 31-Jul-08, at 2:38 PM, Dan Gohman wrote:
> On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
>> In the same breath I'd also like to kindly ask if someone could have
>> a look at the reverse operations, namely trunc from 128 to 64 bit
>> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
>> seems related to Bug 2585. Thanks again.
>
> The operations
2010 Aug 31
0
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
...e the
>> failing code has lowered this to a memcpy. I would certainly
>> expect the memcpy expansion to be smart enough to avoid using MM
>> registers, though; that's a serious bug if it isn't.
>>
>> movd %xmm0, %rax
>> movd %rax, %mm0
>> movq2dq %mm0, %xmm1
>> movq2dq %mm0, %xmm2
>> punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0]
>> movq 16(%rsp), %rax
>> movd %rax, %mm0
>> movq2dq %mm0, %xmm0
>> punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0]
>>
>>
>> On Aug 31, 2010, a...
2020 Aug 31
2
Proposal to remove MMX support.
...emulation, it should be a no-op,
> but if there's MMX asm, we need to actually clear the register file.

Moving data between the register files in order to call an inline asm is not a correctness issue, however, just a potential performance issue. The compiler will insert movdq2q and movq2dq instructions as needed to copy the data (introduced in SSE2). If this is slow on current CPUs, then your code will be slow... but if such code is being used in a performance-critical location now, it really shouldn't still be using MMX, so I don't think this is a serious issue. For _mm_emp...
2013 Nov 22
0
[LLVMdev] [clang] SSE2 intrinsics (emmintrin.h): _mm_movpi64_pi64 should be _mm_movpi64_epi64?
..._inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_movpi64_pi64(__m64 __a)
{
  return (__m128i){ (long long)__a, 0 };
}

Microsoft (http://msdn.microsoft.com/en-us/library/has3d153(v=vs.90).aspx) defines these two:

  _mm_movepi64_pi64   MOVDQ2Q   Move
  _mm_movpi64_epi64   MOVQ2DQ   Move

That is:

  __m64 _mm_movepi64_pi64 (__m128i a);    MOVDQ2Q   r0 := a0;
  __m128i _mm_movpi64_epi64 (__m64 a);    MOVQ2DQ   r0 := a0; r1 := 0X0;

Cf. Intel's manual [1]:

  _mm_movepi64_pi64   Move   MOVDQ2Q
  _mm_movpi64_epi64   Move   MOVDQ2Q

__m64 _mm_movepi64_pi64(__m128i a) Re...
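
A small self-check of the semantics under discussion, assuming x86-64 with SSE2; the expected output follows the MOVQ2DQ description above, which zeroes the upper qword:

    #include <emmintrin.h>
    #include <cstdio>

    int main() {
      __m64 a = _mm_cvtsi64_m64(0x1122334455667788LL);
      __m128i v = _mm_movpi64_epi64(a);  // the MOVQ2DQ intrinsic
      long long lo = _mm_cvtsi128_si64(v);
      long long hi = _mm_cvtsi128_si64(_mm_unpackhi_epi64(v, v));
      _mm_empty();  // clear MMX state before any later x87 use
      std::printf("lo=%llx hi=%llx\n", lo, hi);  // expect lo=1122334455667788, hi=0
      return 0;
    }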
2020 Aug 30
3
Proposal to remove MMX support.
I recently diagnosed a bug in someone else's software, which turned out to be due to incorrect MMX intrinsics usage: if you use any of the x86 intrinsics that accept or return __m64 values, then you, the *programmer*, are required to call _mm_empty() before using any x87 floating point instructions or leaving the function. I was aware that this was required at the assembly level, but not that
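
A minimal sketch of the rule being described, assuming x86-64 (the helper name is illustrative): a function that touches __m64 values must execute _mm_empty() before any x87 floating-point code runs or the function returns.

    #include <mmintrin.h>
    #include <cstdio>

    // Packs two 32-bit ints through an MMX register and reads the pair
    // back as one 64-bit value. Because __m64 aliases the x87 register
    // file, _mm_empty() must run before the function returns.
    static long long via_mmx(int lo, int hi) {
      __m64 v = _mm_set_pi32(hi, lo);    // touches MMX state
      long long r = _mm_cvtm64_si64(v);  // ((long long)hi << 32) | (unsigned)lo
      _mm_empty();                       // required: reset the x87 tag word
      return r;
    }

    int main() {
      std::printf("%llx\n", via_mmx(0x11111111, 0x22222222));
      double d = 1.5;  // floating point is safe again here
      std::printf("%f\n", d);
      return 0;
    }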