thr3ads.net - similar to: "[LLVMdev] [clang] SSE2 intrinsics (emmintrin.h): _mm_movpi64_pi64 should be _mm_movpi64

Displaying 18 results from an estimated 18 matches similar to: "[LLVMdev] [clang] SSE2 intrinsics (emmintrin.h): _mm_movpi64_pi64 should be _mm_movpi64_epi64?"

[LLVMdev] Generating movq2dq using IRBuilder

2008 Jul 31

In the same breath I'd also like to kindly ask if someone could have a look at the reverse operations, namely trunc from 128 to 64 bit using movdq2q, and 128 to 32 and 64 to 32 using movd. This also seems related to Bug 2585. Thanks again. From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Nicolas Capens Sent: Thursday, 31 July, 2008 16:03 To:

[LLVMdev] Generating movq2dq using IRBuilder

2008 Jul 31

On 31-Jul-08, at 2:38 PM, Dan Gohman wrote: > On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote: >> In the same breath I’d also like to kindly ask if someone could have >> a look at the reverse operations, namely trunc from 128 to 64 bit >> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also >> seems related to Bug 2585. Thanks again. > > The operations

[LLVMdev] Generating movq2dq using IRBuilder

2008 Jul 31

On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote: > In the same breath I’d also like to kindly ask if someone could have > a look at the reverse operations, namely trunc from 128 to 64 bit > using movdq2q, and 128 to 32 and 64 to 32 using movd. This also > seems related to Bug 2585. Thanks again. The operations you're describing can be represented as insertelement and

[LLVMdev] Generating movq2dq using IRBuilder

2008 Aug 01

Hi Dan, Yes, they could be represented with insertelement and extractelement, but I don't think they actually generate optimal code using movq2dq and such. Else both bugs 2584 and 2585 would be fixed. Anyway, I'm actually already encouraged to get involved myself. I'm quite experienced with MMX and SSE but I'm still trying to learn more about how LLVM does instruction selection

Proposal to remove MMX support.

2020 Aug 31

On Mon, Aug 31, 2020 at 3:02 PM Eli Friedman <efriedma at quicinc.com> wrote: > Broadly speaking, I see two problems with implicitly enabling MMX > emulation on a target that has SSE2: > > > > 1. The interaction with inline asm. Inline asm can still have MMX > operands/results/clobbers, and can still put the processor in MMX mode. If > code is mixing MMX

[LLVMdev] Generating movq2dq using IRBuilder

2008 Aug 01

Hi Stefanus, I'm not sure if using MMX instructions when doing operations on 64-bit vectors is so terrible? With x86-64 you have double the registers, but it comes at the cost of longer instruction encodings. So there's probably no benefit using SSE. Or am I missing something? Cheers, Nicolas -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at

[LLVMdev] Generating movq2dq using IRBuilder

2008 Jul 31

Hi all, How do I generate the movq2dq SSE2 instruction using the IRBuilder? There is no zext from 64-bit to 128-bit (corresponding to MMX to XMM register transfer) as far as I can tell. So I've tried inserting an i64 into a v2i64, which generates valid code but uses a number of stores and loads on the stack instead of a single movq2dq. Looking through the code, I found a pattern for

Proposal to remove MMX support.

2020 Aug 30

I recently diagnosed a bug in someone else's software, which turned out to be due to incorrect MMX intrinsics usage: if you use any of the x86 intrinsics that accept or return __m64 values, then you, the *programmer* are required to call _mm_empty() before using any x87 floating point instructions or leaving the function. I was aware that this was required at the assembly-level, but not that
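To illustrate the rule described in this post, here is a minimal sketch in C. It assumes an x86 target with MMX enabled and a compiler that provides `<mmintrin.h>`. The helper names `add4_i16` and `scaled_sum` are hypothetical and do not come from the thread. The sketch shows `_mm_empty()` being called after the last `__m64` operation and before any floating-point work that might be compiled to x87 instructions.

```c
#include <assert.h>
#include <mmintrin.h>   /* MMX intrinsics; x86 targets only */
#include <string.h>

/* Hypothetical helper: adds four 16-bit lanes using __m64 values
   and stores the four sums in out. */
static void add4_i16(const short a[4], const short b[4], short out[4]) {
    assert(a && b && out);

    /* _mm_set_pi16 takes the highest lane first, the lowest lane last. */
    __m64 va = _mm_set_pi16(a[3], a[2], a[1], a[0]);
    __m64 vb = _mm_set_pi16(b[3], b[2], b[1], b[0]);
    __m64 sum = _mm_add_pi16(va, vb);

    /* Copy the result out to ordinary memory while still in MMX mode. */
    memcpy(out, &sum, sizeof sum);

    /* Leave MMX mode before returning, so callers can safely use
       floating-point code that may be lowered to x87 instructions. */
    _mm_empty();
}

/* Hypothetical caller: does floating-point math after the MMX helper. */
static double scaled_sum(const short a[4], const short b[4]) {
    short out[4];
    add4_i16(a, b, out);

    double total = 0.0;
    for (int i = 0; i < 4; ++i) {
        total += out[i];
    }
    return total * 0.5;
}
```

Calling `_mm_empty()` inside the helper, rather than leaving it to each caller, keeps the MMX state from escaping the function. That is one possible design choice, not something the thread prescribes.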

mmx optimization

2005 Apr 19

Hi, I've been looking through the archives of the mailing list and I've seen that you have rewritten a lot of functions using mmx to make them faster. I'm currently trying to optimize some code, but I'm having some problems, because I work with 16 bits per component and not 8 like theora. I know that it is off topic, but I'm posting to ask for a little help. I've got

[LLVMdev] GCC DejaGNU regressions

2009 Jul 25

The GCC DejaGNU testsuite has discovered some regressions. Here's one; this was reduced from testsuite/gcc.apple/4656532.c: typedef long long __m64 __attribute__ ((__vector_size__ (8), __may_alias__)); static __inline __m64 __attribute__((__always_inline__, __nodebug__)) _mm_slli_si64 (__m64 __m, int __count) { } __m64 x, y; void t1(int n) { y = _mm_slli_si64(x, n); } Compiled with

[LLVMdev] Making GEP into vector illegal?

2008 Oct 14

On Oct 14, 2008, at 1:54 PM, Eli Friedman wrote: > Maybe... although note that with gcc vector intrinsics, this violates > strict aliasing. gcc does allow you to use a slightly more elaborate > workaround with a union, though. Hum what's your take on this then: /* The Intel API is flexible enough that we must allow aliasing with other vector types, and their scalar
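The union workaround mentioned in this snippet might look like the sketch below. It assumes GCC or Clang with the `vector_size` vector extension. The names `v4si_view`, `get_lane`, and `set_lane` are hypothetical and not taken from the thread. The idea is to read or write a single element through a union member instead of casting a pointer to the vector into a pointer to its scalar type.

```c
#include <assert.h>

/* GCC/Clang vector extension: four 32-bit ints in one 16-byte vector. */
typedef int v4si __attribute__((__vector_size__(16)));

/* Hypothetical view type: the same 16 bytes seen as a vector or as an array. */
typedef union {
    v4si vec;
    int  lane[4];
} v4si_view;

/* Reads one lane through the union rather than through an int* cast. */
static int get_lane(v4si v, int i) {
    assert(i >= 0 && i < 4);
    v4si_view u;
    u.vec = v;
    return u.lane[i];
}

/* Returns a copy of v with lane i replaced by value. */
static v4si set_lane(v4si v, int i, int value) {
    assert(i >= 0 && i < 4);
    v4si_view u;
    u.vec = v;
    u.lane[i] = value;
    return u.vec;
}
```

For example, with `v4si v = {10, 20, 30, 40};`, `get_lane(v, 3)` reads the last element, and `set_lane(v, 2, 7)` returns a vector whose third element is 7 while the others are unchanged.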

[LLVMdev] Making GEP into vector illegal?

2008 Oct 15

On Oct 14, 2008, at 4:30 PM, Mike Stump wrote: > On Oct 14, 2008, at 1:54 PM, Eli Friedman wrote: >> Maybe... although note that with gcc vector intrinsics, this violates >> strict aliasing. gcc does allow you to use a slightly more elaborate >> workaround with a union, though. > > Hum what's your take on this then: > > /* The Intel API is flexible enough

[LLVMdev] Win64 bugs

2009 Aug 05

Hello, Nicolas > Thanks a lot for the heads up. I hadn't run into any problems yet with my > hack because I haven't used other callee-saved registers so far. Anyway, I'm > looking forward to your fix! I've committed the first series of patches to ToT to unbreak win64, basically: 1. Honour register save area 2. Enable proper passing of __m128 and __m64 arguments 3. Minor

[LLVMdev] Win64 bugs

2009 Aug 06

Thanks! What revision is your commit? I'd like to have a closer look at your patch in an attempt to understand the issue better, and maybe try fixing the callee-saved problem. Cheers, Nicolas -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Anton Korobeynikov Sent: Wednesday, 5 August 2009 9:35 To: LLVM Developers Mailing

[LLVMdev] stack alignment restriction

2010 Dec 29

On Dec 28, 2010, at 4:02 PM, drizzle drizzle wrote: > Hi > Is there a way to enforce a different alignment on values on the stack > as compared to other basic types? Particularly, I would like > characters to be stored at a 2 byte boundary. > Check out examples in the lib/Target/* directories. For instance in X86CallingConv.td, we have things like this: def CC_X86_64_C :

[LLVMdev] stack alignment restriction

2010 Dec 29

Thanks for the answer. A followup question: is this already taken into consideration when generating address calculation offsets etc., or would this need to be specially taken care of? I am assuming all loads/stores would also need to be custom lowered. thanks dz On Wed, Dec 29, 2010 at 5:45 AM, Bill Wendling <wendling at apple.com> wrote: > On Dec 28, 2010, at 4:02 PM, drizzle drizzle

[LLVMdev] Making GEP into vector illegal?

2008 Oct 15

On Oct 14, 2008, at 5:41 PM, Chris Lattner wrote: > On Oct 14, 2008, at 4:30 PM, Mike Stump wrote: >> On Oct 14, 2008, at 1:54 PM, Eli Friedman wrote: >>> Maybe... although note that with gcc vector intrinsics, this >>> violates >>> strict aliasing. gcc does allow you to use a slightly more >>> elaborate >>> workaround with a union,

Proposal for replacing asm code with intrinsics

2009 Oct 13

Hi, I'm new to Theora and would like to propose several performance optimizations using advanced instructions in x86 CPUs (SSE2-SSE4.2). There are several source files in \x86 and \x86_vc which were developed using inline assembler. However, this causes several maintenance problems: 1) Need to sync gcc & msvc versions 2) Only 32bit environment is supported 3) No support for newer than MMX
