thr3ads.net - similar to: "Possible AVX512 codegen bug in LLVM 10.0.1?"

Displaying 20 results from an estimated 4000 matches similar to: "Possible AVX512 codegen bug in LLVM 10.0.1?"

2017 Jun 21

AVX 512 Assembly Code Generation issues

when i generate code with 72 loop iterations. the compiler generates code with using avx512 zmm operations 4 times (16x4=64) and remaining 8 iterations are handled by routine mov operations with EAX register. wouldn't it be better if it uses ymm for remaining 8 iterations as it does when iteration count is between 8 and 15. same for xmm and so on. please correct me if i am wrong. Thank

Loop vectorizer Queires

2016 Jun 13

Loop vectorizer Queires

Hello, I have a few issues in vectorizing loops using Clang 3.8. Will it be ok if I shoot some of my findings and queries here? Meanwhile, can I please know if LLVM support autovectorized MIC instructions for Xeon phi? If so, could you please tell me the flags to use? I am Jumana, a masters student in Embedded system working as a graduate research intern with Intel. For my thesis, I am working

r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set

2016 May 01

r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set

Hi, For now no. But I will add this three builtins to CGBuiltin.cpp. If you want, you can be a reviewer of this change. Regards Michael Zuckerman From: Craig Topper [mailto:craig.topper at gmail.com] Sent: Thursday, April 28, 2016 04:53 To: Zuckerman, Michael <michael.zuckerman at intel.com> Subject: Re: r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps

r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set

2016 May 15

r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set

Hi , In the future, we will address this issue. Regards Michael Zuckerman From: Eric Christopher [mailto:echristo at gmail.com] Sent: Sunday, May 01, 2016 19:54 To: Zuckerman, Michael <michael.zuckerman at intel.com>; Craig Topper <craig.topper at gmail.com> Cc: llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa

AVX512 instruction generated when JIT compiling for an avx2 architecture

2016 Jun 23

AVX512 instruction generated when JIT compiling for an avx2 architecture

On 06/23/2016 12:56 PM, Craig Topper wrote: > Can you check what value "getHostCPUName" returned? getHostCPUName() = skylake > > On Thu, Jun 23, 2016 at 9:53 AM, Frank Winter via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > With LLVM 3.8 the JIT compiler engine generates an AVX512 > instruction although I

[X86][AVX512] RFC: make i1 illegal in the Codegen

2017 Jan 24

[X86][AVX512] RFC: make i1 illegal in the Codegen

Hi All, AVX-512 introduced the K mask registers and masked operations which make a natural choice for legalizing vectors of i1's. For example, define <8 x i32> @foo(<8 x i32>%a, <8 x i32*> %p) { %r = call <8 x i32> @llvm.masked.gather.v8i32(<8 x i32*> %p, i32 4, <8 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>,

avx512 JIT backend generates wrong code on <4 x float>

2016 Jun 30

avx512 JIT backend generates wrong code on <4 x float>

Hi Hal! Thanks, but unfortunately it didn't help. The exact same assembler instructions are generated for both 3.8 (yesterday) and trunk (from today). So, this really looks like a bug. Best, Frank On 06/29/2016 03:48 PM, Hal Finkel wrote: > Hi Frank, > > I recommend trying trunk LLVM. AVX-512 development has been very active recently. > > -Hal > > ----- Original

AVX512 instruction generated when JIT compiling for an avx2 architecture

2016 Jun 23

AVX512 instruction generated when JIT compiling for an avx2 architecture

With LLVM 3.8 the JIT compiler engine generates an AVX512 instruction although I target an 'avx2' CPU (intel Core I7). I just downloaded the most recent 3.8 and still it happens. It happens with this input module: target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" define void @module_cFFEMJ(i64 %lo, i64 %hi, i64 %myId, i1 %ordered, i64 %start, i32* noalias align 32

avx512 JIT backend generates wrong code on <4 x float>

2016 Jun 29

avx512 JIT backend generates wrong code on <4 x float>

Hi Frank, I recommend trying trunk LLVM. AVX-512 development has been very active recently. -Hal ----- Original Message ----- > From: "Frank Winter via llvm-dev" <llvm-dev at lists.llvm.org> > To: "LLVM Dev" <llvm-dev at lists.llvm.org> > Sent: Wednesday, June 29, 2016 2:41:39 PM > Subject: [llvm-dev] avx512 JIT backend generates wrong code on <4

avx512 JIT backend generates wrong code on <4 x float>

2016 Jun 29

avx512 JIT backend generates wrong code on <4 x float>

Hi! When compiling the attached module with the JIT engine on an Intel KNL I see wrong code getting emitted. I attach a complete exploit program which shows the bug in LLVM 3.8. It loads and JIT compiles the module and prints the assembler. I stumbled on this since the result of an actual calculation was wrong. So, it's not only the text version of the assembler also the machine

Should shadow_lock be spin_lock_recursive?

2005 May 11

Should shadow_lock be spin_lock_recursive?

During our testing, we found this code path where xen attempts to grab the shadow_lock, while holding it - leading to a deadlock. >> free_dom_mem-> >> shadow_sync_and_drop_references-> >> shadow_lock -> ..................... first lock >> shadow_remove_all_access-> >> remove_all_access_in_page-> >> put_page-> >>

Vectorizers code ownership

2016 Nov 09

Vectorizers code ownership

Hi Quentin, Thank you for bringing this up. I planned to finish the discussion on the vectorizer before starting the discussion on the X86 backed code ownership, but now is a good time. Simon, Sanjay, Craig, Elena, Bruno, Michael, Andrea, Chandler have made significant contributions to the X86 backend in the last few years. I think that Craig Topper would be a great code owner, assuming he wants

[LLVMdev] LLVM loop vectorizer

2015 Jul 08

[LLVMdev] LLVM loop vectorizer

Hello. I am trying to vectorize a CSR SpMV (sparse matrix vector multiplication) procedure but the LLVM loop vectorizer is not able to handle such code. I am using cland and llvm version 3.4 (on Ubuntu 12.10). I use the -fvectorize option with clang and -loop-vectorize with opt-3.4 . The CSR SpMV function is inspired from

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

2017 Nov 13