Displaying 13 results from an estimated 13 matches for "movq2dq".
2008 Aug 01
0
[LLVMdev] Generating movq2dq using IRBuilder
Hi Dan,
Yes, they could be represented with insertelement and extractelement, but I
don't think they actually generate optimal code using movq2dq and such;
otherwise, both bugs 2584 and 2585 would already be fixed.
Anyway, I'm actually already encouraged to get involved myself. I'm quite
experienced with MMX and SSE but I'm still trying to learn more about how
LLVM does instruction selection and such.
By the way, I noticed that movq2dq and s...
2008 Jul 31
2
[LLVMdev] Generating movq2dq using IRBuilder
Hi all,
How do I generate the movq2dq SSE2 instruction using the IRBuilder? There is
no zext from 64-bit to 128-bit (corresponding to MMX to XMM register
transfer) as far as I can tell. So I've tried inserting an i64 into a v2i64,
which generates valid code, but with a number of stores and loads on the
stack instead of a single mo...
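For reference, a minimal sketch of the approach described above, written
against the current LLVM C++ API (this thread predates it; names like
Builder and Val are placeholders, not code from the thread): insert an i64
into the low lane of a zeroed v2i64, which the x86 backend can ideally
select to a single register-to-register move rather than a stack round-trip.

```cpp
#include "llvm/IR/IRBuilder.h"

// Sketch: widen an i64 to a v2i64 by inserting it into a zero vector.
// Ideally the backend selects this to one movq2dq-style move instead
// of a sequence of stores and loads through the stack.
llvm::Value *widenI64ToV2I64(llvm::IRBuilder<> &Builder, llvm::Value *Val) {
  auto *VecTy = llvm::FixedVectorType::get(Builder.getInt64Ty(), 2);
  llvm::Value *Zero = llvm::Constant::getNullValue(VecTy);
  return Builder.CreateInsertElement(Zero, Val, Builder.getInt64(0));
}
```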
2008 Jul 31
0
[LLVMdev] Generating movq2dq using IRBuilder
...to 32 and 64 to 32 using movd. This also seems related to Bug 2585.
Thanks again.
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Nicolas Capens
Sent: Thursday, 31 July, 2008 16:03
To: 'LLVM Developers Mailing List'
Subject: [LLVMdev] Generating movq2dq using IRBuilder
Hi all,
How do I generate the movq2dq SSE2 instruction using the IRBuilder? There is
no zext from 64-bit to 128-bit (corresponding to MMX to XMM register
transfer) as far as I can tell. So I've tried inserting an i64 into a v2i64,
which generates valid code, but with a...
2008 Jul 31
5
[LLVMdev] Generating movq2dq using IRBuilder
On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
> In the same breath I’d also like to kindly ask if someone could have
> a look at the reverse operations, namely trunc from 128 to 64 bit
> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
> seems related to Bug 2585. Thanks again.
The operations you're describing can be represented as insertelement
and
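The reverse direction referred to here can be sketched the same way (same
hypothetical IRBuilder setup as above): extractelement takes the low i64
lane out of a v2i64, and an explicit trunc narrows further to i32.

```cpp
// Sketch: narrow a v2i64 to i64 (and optionally to i32) using
// extractelement + trunc; ideally this selects to movdq2q/movd
// rather than a store/reload through the stack.
llvm::Value *narrowV2I64(llvm::IRBuilder<> &Builder, llvm::Value *Vec,
                         bool ToI32) {
  llvm::Value *Lo = Builder.CreateExtractElement(Vec, Builder.getInt64(0));
  return ToI32 ? Builder.CreateTrunc(Lo, Builder.getInt32Ty()) : Lo;
}
```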
2008 Aug 01
1
[LLVMdev] Generating movq2dq using IRBuilder
...efit using
SSE. Or am I missing something?
Cheers,
Nicolas
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Stefanus Du Toit
Sent: Thursday, 31 July, 2008 23:51
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] Generating movq2dq using IRBuilder
On 31-Jul-08, at 2:38 PM, Dan Gohman wrote:
> On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
>> In the same breath I'd also like to kindly ask if someone could have
>> a look at the reverse operations, namely trunc from 128 to 64 bit
>> using movdq2q, a...
2010 Aug 31
0
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
...successful code is doing an aggregate copy field-by-field while the
failing code has lowered this to a memcpy. I would certainly expect
the memcpy expansion to be smart enough to avoid using MM registers,
though; that's a serious bug if it isn't.
movd %xmm0, %rax
movd %rax, %mm0
movq2dq %mm0, %xmm1
movq2dq %mm0, %xmm2
punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0]
movq 16(%rsp), %rax
movd %rax, %mm0
movq2dq %mm0, %xmm0
punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0]
On Aug 31, 2010, at 11:18 AM PDT, Argyrios Kyrtzidis wrote:
> Hi,
>
> I've attached...
2010 Aug 31
2
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
...e is doing an aggregate copy field-by-field while the failing code has lowered this to a memcpy. I would certainly expect the memcpy expansion to be smart enough to avoid using MM registers, though; that's a serious bug if it isn't.
>
> movd %xmm0, %rax
> movd %rax, %mm0
> movq2dq %mm0, %xmm1
> movq2dq %mm0, %xmm2
> punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0]
> movq 16(%rsp), %rax
> movd %rax, %mm0
> movq2dq %mm0, %xmm0
> punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0]
>
>
> On Aug 31, 2010, at 11:18 AM PDT, Argyrios Kyrtzidis...
2010 Aug 31
5
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
...ebCore5mouniEPNS_15GraphicsContextEPNS_30GraphicsContextPlatformPrivateERKNS_9FloatRectERNS_10FloatPointES8_
movss 8(%rsp), %xmm0
movss 12(%rsp), %xmm1
subss 20(%rsp), %xmm1
subss 16(%rsp), %xmm0
insertps $16, %xmm1, %xmm0 ## xmm0 = xmm0[0],xmm1[0],xmm0[2,3]
movd %xmm0, %rax
movd %rax, %mm0
movq2dq %mm0, %xmm1
movq2dq %mm0, %xmm2
punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0]
movq 16(%rsp), %rax
movd %rax, %mm0
movq2dq %mm0, %xmm0
punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0]
addq $24, %rsp
ret
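Purely for illustration (the actual WebCore source is not quoted in the
thread; FloatRect appears only in the mangled symbol above), the pattern
under discussion is an aggregate copy like the following, which the
optimizer may either expand field-by-field or lower to a memcpy whose
expansion should avoid MM registers:

```cpp
// Hypothetical stand-in for the aggregate copy being discussed; the
// real code is WebCore's and is not shown here. Whether this stays a
// field-by-field copy or becomes a memcpy is the optimizer's choice.
struct FloatRect { float x, y, w, h; };

void copyRect(FloatRect &dst, const FloatRect &src) {
  dst = src;
}
```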
2008 Jul 31
0
[LLVMdev] Generating movq2dq using IRBuilder
On 31-Jul-08, at 2:38 PM, Dan Gohman wrote:
> On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
>> In the same breath I’d also like to kindly ask if someone could have
>> a look at the reverse operations, namely trunc from 128 to 64 bit
>> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
>> seems related to Bug 2585. Thanks again.
>
> The operations
2010 Aug 31
0
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
...e the
>> failing code has lowered this to a memcpy. I would certainly
>> expect the memcpy expansion to be smart enough to avoid using MM
>> registers, though; that's a serious bug if it isn't.
>>
>> movd %xmm0, %rax
>> movd %rax, %mm0
>> movq2dq %mm0, %xmm1
>> movq2dq %mm0, %xmm2
>> punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0]
>> movq 16(%rsp), %rax
>> movd %rax, %mm0
>> movq2dq %mm0, %xmm0
>> punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0]
>>
>>
>> On Aug 31, 2010, a...
2020 Aug 31
2
Proposal to remove MMX support.
...emulation, it should be a no-op,
> but if there’s MMX asm, we need to actually clear the register file.
>
> Moving data between the register files in order to call an inline asm is
not a correctness issue, however, just a potential performance issue. The
compiler will insert movdq2q and movq2dq instructions as needed to copy the
data (introduced in SSE2). If this is slow on current CPUs, then your code
will be slow...but, if such code is being used in a performance-critical
location now, it really shouldn't still be using MMX, so I don't think this
is a serious issue.
For _mm_emp...
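As a concrete (hypothetical) illustration of the inline-asm case described
above: GCC/Clang's "y" constraint pins an operand to an MMX register, so if
the value otherwise lives in an XMM register, the compiler must bridge the
two register files with movdq2q/movq2dq around the asm statement.

```cpp
#include <mmintrin.h>

// Illustrative only: the "y" constraint forces operands into MMX
// registers. If v/count otherwise live in XMM registers, the compiler
// inserts movdq2q/movq2dq to move data between the register files.
__m64 shift_left(__m64 v, __m64 count) {
  asm("psllq %1, %0" : "+y"(v) : "y"(count));
  return v;
}
```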
2013 Nov 22
0
[LLVMdev] [clang] SSE2 intrinsics (emmintrin.h): _mm_movpi64_pi64 should be _mm_movpi64_epi64?
..._inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_movpi64_pi64(__m64 __a)
{
  return (__m128i){ (long long)__a, 0 };
}
Microsoft (http://msdn.microsoft.com/en-us/library/has3d153(v=vs.90).aspx)
defines these two:
_mm_movepi64_pi64 MOVDQ2Q Move
_mm_movpi64_epi64 MOVQ2DQ Move
That is:
__m64 _mm_movepi64_pi64 (__m128i a);
MOVDQ2Q
r0 := a0 ;
__m128i _mm_movpi64_epi64 (__m64 a);
MOVQ2DQ
r0 := a0 ; r1 := 0X0 ;
Cf. Intel's manual [1]:
_mm_movepi64_pi64 Move MOVDQ2Q
_mm_movpi64_epi64 Move MOVDQ2Q
__m64 _mm_movepi64_pi64(__m128i a)
Re...
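For context, a usage sketch of the pair of SSE2 intrinsics being compared
(both are real emmintrin.h intrinsics; the function names widen/narrow are
placeholders): _mm_movpi64_epi64 zero-extends an __m64 into an __m128i
(MOVQ2DQ), and _mm_movepi64_pi64 moves the low 64 bits back (MOVDQ2Q).

```cpp
#include <emmintrin.h>  // SSE2: __m128i and the move intrinsics

// Usage sketch of the pair discussed above. _mm_movpi64_epi64 widens
// __m64 -> __m128i with zero-extension (MOVQ2DQ); _mm_movepi64_pi64
// narrows __m128i -> __m64 by taking the low 64 bits (MOVDQ2Q).
__m128i widen(__m64 a) { return _mm_movpi64_epi64(a); }
__m64 narrow(__m128i a) { return _mm_movepi64_pi64(a); }
```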
2020 Aug 30
3
Proposal to remove MMX support.
I recently diagnosed a bug in someone else's software, which turned out to
be due to incorrect MMX intrinsics usage: if you use any of the x86
intrinsics that accept or return __m64 values, then you, the *programmer*, are
required to call _mm_empty() before using any x87 floating point
instructions or leaving the function. I was aware that this was required at
the assembly-level, but not that
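A minimal sketch of the rule being described (standard intrinsics; the
specific MMX operation chosen here is illustrative): any MMX intrinsic
taints the shared x87 register state, so _mm_empty() must run before any
x87 floating-point code executes or before the function returns.

```cpp
#include <mmintrin.h>

// Sketch of the _mm_empty() contract described above. The MMX add
// taints the x87 state; _mm_empty() (EMMS) must run before any x87
// floating-point use or before returning, or FP results are undefined.
long long sum_lanes(__m64 a, __m64 b) {
  __m64 r = _mm_add_pi32(a, b);        // MMX operation
  long long out = _mm_cvtm64_si64(r);  // move result to a GPR (x86-64)
  _mm_empty();                         // clear MMX/x87 state
  return out;
}
```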