Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets"
2014 Dec 15
2
[LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets
AFAIK, there is no additional penalty for AMD processors.
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chandler Carruth
Sent: Monday, December 15, 2014 3:57 AM
To: Demikhovsky, Elena
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets
FWIW, this makes sense to me. I'd be interested to hear from
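A minimal sketch (illustrative IR, not taken from the thread) of where the alignment model shows up: a 256-bit load that is only known to be 16-byte aligned cannot use vmovaps, so the per-microarchitecture penalty for unaligned accesses is what matters here.

define <8 x float> @load_underaligned(<8 x float>* %p) {
  ; Only 16-byte alignment is known for a 32-byte vector, so with -mattr=+avx
  ; the backend must emit an unaligned access such as vmovups.
  %v = load <8 x float>, <8 x float>* %p, align 16
  ret <8 x float> %v
}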
2016 Nov 24
3
RFC: code size reduction in X86 by replacing EVEX with VEX encoding
> I would like a command line option to disable this optimization. That way tests can still verify that EVEX instructions came out of isel by using -show-mc-encoding.
I think that keeping test compatibility is not a reason for an additional “llc” flag. We check encodings in the test/MC/X86 dir.
Is there any option to report-out from llc in non-debug mode? It should be an option to control
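A hedged sketch (triple and function names are illustrative, not from the RFC) of the kind of lit test the quoted reply refers to: with -show-mc-encoding the emitted encoding bytes are printed, so FileCheck can tell a VEX-encoded instruction (0xc4/0xc5 prefix byte) from an EVEX-encoded one (0x62 prefix byte).

; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512vl -show-mc-encoding | FileCheck %s
define <4 x float> @add128(<4 x float> %a, <4 x float> %b) {
  ; CHECK: vaddps {{.*}} encoding:
  %r = fadd <4 x float> %a, %b
  ret <4 x float> %r
}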
2016 Apr 11
2
X86 TRUNCATE cost for AVX & AVX2 mode
Hi,
I was going through X86TTIImpl::getCastInstrCost and have a question about the cost
calculation for the TRUNCATE instruction in AVX mode.
In the AVX2ConversionTbl and AVXConversionTbl tables there is no cost defined for
TRUNCATE v16i32 to v16i8, so as a fallback the lookup goes to the SSE41ConversionTbl table, where
it finds a cost of 30 for this operation. A cost of 30 for this operation looks very high.
Wondering why
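A small reproducer (illustrative, not from the thread) for the cost in question; running the cost-model analysis over it, e.g. opt -cost-model -analyze -mattr=+avx2 on releases of that era, prints the number the vectorizers would see for this trunc.

define <16 x i8> @trunc_v16i32_to_v16i8(<16 x i32> %x) {
  ; The cost of this single instruction is what the conversion tables return.
  %t = trunc <16 x i32> %x to <16 x i8>
  ret <16 x i8> %t
}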
2016 Apr 12
2
X86 TRUNCATE cost for AVX & AVX2 mode
<Copied Cong>
Thanks Elena.
Mostly I was interested in why such a high cost of 30 is kept for TRUNCATE v16i32 to v16i8 in SSE41.
Looking at the code, it appears that TRUNCATE v16i32 to v16i8 in SSE41 is very expensive
compared to SSE2. I feel this number should be the same as, or close to, the cost listed for the same
operation in SSE2ConversionTbl.
The patch below from Cong Hou reduces the cost of the same operation in SSE2
2013 May 20
2
[LLVMdev] VCOMISS instruction in X86
Hi,
I'm looking at scalar and packed instructions in X86.
The instruction VCOMISS is scalar. May I remove SSEPackedSingle/SSEPackedDouble domain from it?
defm VUCOMISS : sse12_ord_cmp<0x2E, FR32, X86cmp, f32, f32mem, loadf32,
"ucomiss", SSEPackedSingle>, TB, VEX, VEX_LIG;
defm VUCOMISD : sse12_ord_cmp<0x2E, FR64, X86cmp, f64,
2016 Nov 23
2
RFC: code size reduction in X86 by replacing EVEX with VEX encoding
I would like a command line option to disable this optimization. That way
tests can still verify that EVEX instructions came out of isel by using
-show-mc-encoding.
On Wed, Nov 23, 2016 at 5:01 AM Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> ------------------------------
>
> *From: *"Gadi via llvm-dev Haber" <llvm-dev at lists.llvm.org>
>
2009 Apr 30
2
[LLVMdev] RFC: AVX Feature Specification
I've been working on adding AVX to LLVM and have run across a number of
questions. Here's the first one.
In some ways AVX is "just another" SSE level. Having AVX implies you have
SSE1-SSE4.2. However, AVX is very different from SSE, and there are a number
of sub-features that may or may not be available on various implementations.
So right now I've done this:
def
2012 Mar 01
3
[LLVMdev] Stack alignment on X86 AVX seems incorrect
Even if you explicitly specify -stack-alignment=16, the aligned movs are still generated.
It is not an issue related to ABI.
See my original mail:
./llc -mattr=+avx -stack-alignment=16 < basic.ll | grep movaps | grep ymm | grep rbp
vmovaps -176(%rbp), %ymm14
vmovaps -144(%rbp), %ymm11
vmovaps -240(%rbp), %ymm13
- Elena
From: Cameron McInally
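A minimal sketch (an assumed reproducer, not the basic.ll from the thread) of the pattern under discussion: a YMM value live across a call has to be spilled, and the spill/reload must either use an unaligned vmovups or get a 32-byte-aligned stack slot.

declare void @clobber_vectors()

define <8 x float> @spill_ymm(<8 x float> %a, <8 x float> %b) {
  %sum = fadd <8 x float> %a, %b
  ; %sum and %b stay live across the call; all YMM registers are caller-saved
  ; in the SysV x86-64 convention, so they are spilled to the stack here.
  call void @clobber_vectors()
  %r = fadd <8 x float> %sum, %b
  ret <8 x float> %r
}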
2012 Mar 01
2
[LLVMdev] Stack alignment on X86 AVX seems incorrect
On Thu, Mar 01, 2012 at 06:16:46PM +0000, Demikhovsky, Elena wrote:
> vmovaps should not access stack if it is not aligned to 32
I'm not completely sure I understand your problem. Are you saying that
the generated code assumes 256-bit alignment, your default stack
alignment is 128-bit, and LLVM doesn't adjust it automatically?
Joerg
2012 Mar 01
3
[LLVMdev] Stack alignment on X86 AVX seems incorrect
Hi Elena,
You're correct. LLVM does not align the stack to 32 bytes for AVX, so
unaligned moves should be used for YMM spills.
I wrote some code to align the stack to 32 bytes when AVX spills are
present; it does break the x86-64 ABI though. If upstream would be
interested in this code, I can arrange with my employer to send a patch to
the mailing list.
-Cameron
On Mar 1, 2012, at 4:09 PM,
2012 Mar 02
0
[LLVMdev] Stack alignment on X86 AVX seems incorrect
Hi Elena,
On Thu, Mar 1, 2012 at 8:28 PM, Demikhovsky, Elena
<elena.demikhovsky at intel.com> wrote:
> Even if you explicitly specify -stack-alignment=16, the aligned movs are
> still generated.
>
> It is not an issue related to ABI.
This looks like PR10841, explanation and the way to solve it:
http://llvm.org/bugs/show_bug.cgi?id=10841
Cheers,
--
Bruno Cardoso Lopes
2012 Mar 01
0
[LLVMdev] Stack alignment on X86 AVX seems incorrect
When the stack is unaligned, LLVM should generate vmovups instead of vmovaps.
- Elena
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Joerg Sonnenberger
Sent: Thursday, March 01, 2012 20:31
To: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Stack alignment on X86 AVX seems incorrect
On Thu, Mar 01, 2012 at 06:16:46PM +0000,
2012 Jan 09
3
[LLVMdev] Calling conventions for YMM registers on AVX
On Jan 9, 2012, at 10:00 AM, Jakob Stoklund Olesen wrote:
>
> On Jan 8, 2012, at 11:18 PM, Demikhovsky, Elena wrote:
>
>> I'll explain what we see in the code.
>> 1. The caller saves XMM registers across the call if needed (according to DEFS definition).
>> YMMs are not in the set, so caller does not take care.
>
> This is not how the register allocator
2012 Jan 09
2
[LLVMdev] Calling conventions for YMM registers on AVX
I'll explain what we see in the code.
1. The caller saves XMM registers across the call if needed (according to DEFS definition).
YMMs are not in the set, so the caller does not save them.
2. The callee preserves XMMs but works with YMMs and clobbers them.
3. So after the call, the upper part of YMM is gone.
- Elena
-----Original Message-----
From: Bruno Cardoso Lopes [mailto:bruno.cardoso at
2016 Feb 25
2
how to force llvm generate gather intrinsic
It seems that http://reviews.llvm.org/D15690 only implemented
gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to
enable gather for AVX/AVX2? Thanks.
Best,
Zhi
On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com>
wrote:
> I don't think gather has been enabled for AVX2 as of r261875.
> Masked load/store were enabled for AVX with:
>
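One direct way to get a gather (a sketch; the exact intrinsic name mangling differs across LLVM versions) is to call llvm.masked.gather yourself rather than waiting for the vectorizer to form it; whether it then lowers to vpgatherdd or to a scalar sequence depends on the subtarget.

declare <4 x i32> @llvm.masked.gather.v4i32(<4 x i32*>, i32, <4 x i1>, <4 x i32>)

define <4 x i32> @gather4(<4 x i32*> %ptrs, <4 x i1> %mask, <4 x i32> %passthru) {
  ; Arguments: pointers, alignment, mask, pass-through value for masked-off lanes.
  %g = call <4 x i32> @llvm.masked.gather.v4i32(<4 x i32*> %ptrs, i32 4, <4 x i1> %mask, <4 x i32> %passthru)
  ret <4 x i32> %g
}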
2016 Feb 26
2
how to force llvm generate gather intrinsic
If I'm understanding correctly, you're saying that vgather* is slow on all
of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will
not generate it for any of those machines.
Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() &&
!hasAVX512()". It could break for some hypothetical future processor that
manages to
2017 Jun 25
2
AVX Scheduling and Parallelism
Hi Ahmed,
From what can be seen in the code snippet you provided, the reuse of XMM0 and XMM1 across loop-unroll instances does not inhibit instruction-level parallelism.
Modern X86 processors use register renaming that can eliminate the dependencies in the instruction stream. In the example you provided, the processor should be able to identify the 2-vloads + vadd + vstore sequences as
2017 Jun 25
0
AVX Scheduling and Parallelism
Hi, Zvi,
I agree. In the context of targeting the KNL, however, I'm a bit
concerned about the addressing, and specifically, the size of the
resulting encoding:
> vmovdqu32 zmm0, zmmword ptr [rax + c+401280]    ; load b[401280] into zmm0
>
> vpaddd zmm1, zmm1, zmmword ptr [rax + b+401344] ; zmm1 <- zmm1 + b[401344]
The KNL can only
2016 Feb 26
0
how to force llvm generate gather intrinsic
That makes great sense. It would be great if we had a profitability mode to
see whether it is worthwhile to use gathers. It would also be good if there were a
compiler option that lets users make LLVM generate gather
instructions regardless of whether they are faster or slower.
Best,
Zhi
On Fri, Feb 26, 2016 at 12:49 PM, Sanjay Patel <spatel at rotateright.com>
wrote:
> If I'm understanding
2016 Feb 25
2
how to force llvm generate gather intrinsic
Yes, masked load/store/gather/scatter are completed.
- Elena
From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 01:20
To: Demikhovsky, Elena <elena.demikhovsky at intel.com>
Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to
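For completeness, a hedged sketch of the masked-load counterpart mentioned above (again, the exact intrinsic mangling varies between LLVM versions): only lanes whose mask bit is set are loaded, and the remaining lanes take the pass-through value.

declare <8 x float> @llvm.masked.load.v8f32(<8 x float>*, i32, <8 x i1>, <8 x float>)

define <8 x float> @masked_load(<8 x float>* %p, <8 x i1> %m, <8 x float> %pt) {
  ; Arguments: pointer, alignment, mask, pass-through vector.
  %v = call <8 x float> @llvm.masked.load.v8f32(<8 x float>* %p, i32 4, <8 x i1> %m, <8 x float> %pt)
  ret <8 x float> %v
}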