thr3ads.net - search: "avx512"

Displaying 20 results from an estimated 171 matches for "avx512".

r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set

2016 May 15

...Michael Zuckerman From: Eric Christopher [mailto:echristo at gmail.com] Sent: Sunday, May 01, 2016 19:54 To: Zuckerman, Michael <michael.zuckerman at intel.com>; Craig Topper <craig.topper at gmail.com> Cc: llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set Why? On Sun, May 1, 2016, 6:04 AM Zuckerman, Michael via llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote: Hi, For now no. But I will add this three builtins to CGBuiltin.cpp. If you want, y...

r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set

2016 May 01

...uiltins to CGBuiltin.cpp. If you want, you can be a reviewer of this change. Regards Michael Zuckerman From: Craig Topper [mailto:craig.topper at gmail.com] Sent: Thursday, April 28, 2016 04:53 To: Zuckerman, Michael <michael.zuckerman at intel.com> Subject: Re: r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set Can we use native IR for the stores the way the 128-bit and 256-bit equivalents do? On Wed, Apr 27, 2016 at 3:44 AM, Michael Zuckerman via cfe-commits <cfe-commits at lists.llvm.org<mailto:cfe-commits at lists.llvm.org>>...

[LLVMdev] [AVX512] Inconsistent mask types for intrinsics?

2013 Oct 30

Hey guys, There seems to be an inconsistency between mask operand types for the AVX512 intrinsics. The mask instruction intrinsics expect a v16i1 for the mask operands: > def int_x86_kadd_v16i1 : GCCBuiltin<"__builtin_ia32_kaddw">, > Intrinsic<[llvm_v16i1_ty], [llvm_v16i1_ty, llvm_v16i1_ty], > [IntrNoMem]>; But o...

AVX512 instruction generated when JIT compiling for an avx2 architecture

2016 Jun 23

...what value "getHostCPUName" returned? getHostCPUName() = skylake > > On Thu, Jun 23, 2016 at 9:53 AM, Frank Winter via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > With LLVM 3.8 the JIT compiler engine generates an AVX512 > instruction although I target an 'avx2' CPU (intel Core I7). > I just downloaded the most recent 3.8 and still it happens. > > It happens with this input module: > > > target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" > >...

LLVM and Xeon Skylake v5

2017 May 08

Thank you. I'm letting it auto detect by setting the target using getProcessTarget. I disabled avx512 support by passing -avx512f (and the other variants) to setMAttrs on EngineBuilder. I can see refs to avx512 in X86.td. It's the exact same executable running on Kabylake. What does the Cannot select: specifically mean? Is there some table that doesn't have a definition for a key in it th...

X86 backend code ownership

2016 Nov 10

Thanks for the support Nadav, Zvi, Chandler, Renato, and anyone else I missed. Quentin, to maybe address your concerns. My focus lately has been fixing inconsistency in instruction selection behavior between the older AVX instruction encodings and the new AVX512 encodings. I've also been trying to fix cases where concepts haven't been extended to wider vectors yet. For instance, the instcombine handling of x86 shift intrinsics. I've also been trying to remove AVX512 intrinsics for things that can be represented with native IR or where we can us...

AVX512 instruction generated when JIT compiling for an avx2 architecture

2016 Jun 23

With LLVM 3.8 the JIT compiler engine generates an AVX512 instruction although I target an 'avx2' CPU (intel Core I7). I just downloaded the most recent 3.8 and still it happens. It happens with this input module: target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" define void @module_cFFEMJ(i64 %lo, i64 %hi, i64 %myId, i1 %...

avx512 JIT backend generates wrong code on <4 x float>

2016 Jun 30

...ently. > > -Hal > > ----- Original Message ----- >> From: "Frank Winter via llvm-dev" <llvm-dev at lists.llvm.org> >> To: "LLVM Dev" <llvm-dev at lists.llvm.org> >> Sent: Wednesday, June 29, 2016 2:41:39 PM >> Subject: [llvm-dev] avx512 JIT backend generates wrong code on <4 x float> >> >> Hi! >> >> When compiling the attached module with the JIT engine on an Intel >> KNL I >> see wrong code getting emitted. I attach a complete exploit program >> which shows the bug in LLVM 3.8. It l...

Possible AVX512 codegen bug in LLVM 10.0.1?

2020 Sep 05

Hey LLVMDev, Perhaps I'm missing something, but I think I've stumbled across a codegen bug in LLVM 10.0.1 related to AVX512. I've attached a small LLVM IR testcase and generated x86_64 assembly file that shows the bug. The test case is small, but not quite minimal, mostly because of driver code included in the test case so one can compile and run the program. The program does a simple vectorizable computation two...

avx512 JIT backend generates wrong code on <4 x float>

2016 Jun 29

...evelopment has been very active recently. -Hal ----- Original Message ----- > From: "Frank Winter via llvm-dev" <llvm-dev at lists.llvm.org> > To: "LLVM Dev" <llvm-dev at lists.llvm.org> > Sent: Wednesday, June 29, 2016 2:41:39 PM > Subject: [llvm-dev] avx512 JIT backend generates wrong code on <4 x float> > > Hi! > > When compiling the attached module with the JIT engine on an Intel > KNL I > see wrong code getting emitted. I attach a complete exploit program > which shows the bug in LLVM 3.8. It loads and JIT compiles the...

LLVM and Xeon Skylake v5

2017 May 08

...ts.com> wrote: > Correction: getProcessTriple not getProcessTarget. > > On 8 May 2017, at 17:55, Andy Schneider via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Thank you. I'm letting it auto detect by setting the target using > getProcessTarget. I disabled avx512 support by passing -avx512f (and the > other variants) to setMAttrs on EngineBuilder. I can see refs to avx512 in > X86.td. It's the exact same executable running on Kabylake. > > What does the Cannot select: specifically mean? Is there some table that > doesn't have a defini...

R 4.0.1-4.0.2 built with Intel Composer 19.0-19.1.1, error in "make check" on CentOS 7.7

2020 Jun 24

...o XE compiler suite, versions 19.0.x to 19.1.1. Build seems to go fine. I built it like this: module purge module load intel/19.1.1 module list export CC=icc export CXX=icpc export F77=ifort export FC=ifort export AR=xiar export LD=xild export CFLAGS="-O3 -ipo -qopenmp -axAVX,CORE-AVX2,CORE-AVX512" export F77FLAGS="-O3 -ipo -qopenmp -axAVX,CORE-AVX2,CORE-AVX512" export FFLAGS="-O3 -ipo -qopenmp -axAVX,CORE-AVX2,CORE-AVX512" export CXXFLAGS="-O3 -ipo -qopenmp -axAVX,CORE-AVX2,CORE-AVX512" export MKL="-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp...

X86 backend code ownership

2016 Nov 10

...Thanks for the support Nadav, Zvi, Chandler, Renato, and anyone else I >> missed. >> >> Quentin, to maybe address your concerns. My focus lately has been fixing >> inconsistency in instruction selection behavior between the older AVX >> instruction encodings and the new AVX512 encodings. I've also been trying >> to fix cases where concepts haven't been extended to wider vectors yet. For >> instance, the instcombine handling of x86 shift intrinsics. I've also been >> trying to remove AVX512 intrinsics for things that can be represented with...

LLVM and Xeon Skylake v5

2017 May 08

Hi, I have a JIT compiler using the legacy JIT on LLVM 3.5 that, when run on the Xeon v5 Skylakes produces "Cannot select: intrinsic %llvm.x86.sse41.round.sd". Note, this does not occur on i7 Kabylakes. To get this far I had to disable AVX512 code gen. Upgrading the system I am looking at from 3.5 to a later version is a big job that I'd prefer not to have on my critical path. Does anyone have any tips on where I would look to debug this sort of issue? I'm new to LLVM. Thanks Andy

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

2017 Nov 14

...>>> >>>> >>>> If skylake is that bad at AVX2 >>>> >>>> >>>> I don't think this says anything negative about AVX2, but AVX-512. >>>> >>> >> Right. I think we're at AVX/AVX2 is "bad" on Haswell/Broadwell and AVX512 >> is "bad" on Skylake. At least in the "random autovectorization spread out" >> aspect. >> >> >>> >>>> >>>> it belongs in -mcpu / -march IMO. >>>> >>>> >>>> No. We'd still want to e...

AVX 512 Assembly Code Generation issues

2017 Jun 21

When I generate code with 72 loop iterations, the compiler uses AVX512 zmm operations 4 times (16x4=64), and the remaining 8 iterations are handled by ordinary mov operations with the EAX register. Wouldn't it be better to use ymm for the remaining 8 iterations, as it does when the iteration count is between 8 and 15? The same goes for xmm, and so on. Please correct me if I am wrong....
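
The remainder handling this post asks for can be sketched as a greedy split of the trip count across register widths. This is only an illustration of the arithmetic in the question, not what LLVM's vectorizer does; the function name `split_trip_count` and the lane counts (16, 8, 4, 1 for 32-bit elements in zmm, ymm, xmm, and scalar code) are assumptions made for the example.

```python
def split_trip_count(n, widths=(16, 8, 4, 1)):
    """Greedily split n loop iterations into chunks per vector width.

    widths are lanes per operation, widest first. The defaults assume
    32-bit elements: zmm=16, ymm=8, xmm=4, scalar=1 (hypothetical choice
    matching the 16x4=64 figure in the post).
    """
    plan = []
    for lanes in widths:
        # How many full operations of this width fit, and what is left over.
        count, n = divmod(n, lanes)
        if count:
            plan.append((lanes, count))
    return plan


# 72 iterations: four zmm operations plus one ymm operation, no scalar tail.
print(split_trip_count(72))  # [(16, 4), (8, 1)]
```

Under these assumptions, a trip count of 72 needs no scalar iterations at all, which is the behavior the poster expected.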

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

2017 Nov 13

...says anything negative about AVX2, but AVX-512. > > it belongs in -mcpu / -march IMO. > > > No. We'd still want to enable the architectural features for vector > intrinsics and the like. > I took this to mean that the feature should be enabled by default for -march=skylake-avx512. > > > Based on the current performance data we're seeing, we think we need to > ultimately default skylake-avx512 to -mprefer-vector-width=256. > > > Craig, is this for both integer and floating-point code? > I believe so, but I'll try to get confirmation from t...

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

2017 Nov 13

...> On 11/11/2017 09:52 PM, UE US via llvm-dev wrote: >> If skylake is that bad at AVX2 > > I don't think this says anything negative about AVX2, but AVX-512. > > > Right. I think we're at AVX/AVX2 is "bad" on Haswell/Broadwell and > AVX512 is "bad" on Skylake. At least in the "random autovectorization > spread out" aspect. > > > >> it belongs in -mcpu / -march IMO. > > No. We'd still want to enable the architectural features for > vector intrinsics and the li...

[LLVMdev] broken LLVM-MC?

2013 Dec 13

Hi, It seems LLVM-MC is broken with Avx512? $ echo "vinserti32x4 \$1, %xmm21, %zmm5, %zmm17"|./Release+Asserts/bin/llvm-mc -assemble -arch=x86-64 -show-encoding -x86-asm-syntax=att .text vinserti32x4 $1, %xmm21, %zmm5, %zmm17 # encoding: [0x62,0xa3,0x55,0x48,0x38,0xcd,0x01] $ echo "0x62,0xa3,0x55,0x48,0x38,0xcd,...
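
To check by hand that the encoding bytes shown in this post match the assembly operands, the EVEX register fields can be unpacked as below. This is a rough sketch that assumes a register-to-register form (ModRM.mod = 3) with an 8-bit immediate; the helper name `decode_evex_regs` and the returned keys are made up for the example, and it is not a general disassembler.

```python
def decode_evex_regs(enc):
    """Unpack register numbers from a reg-reg EVEX encoding (sketch only)."""
    if enc[0] != 0x62:
        raise ValueError("not an EVEX prefix")
    p0, p1, p2, opcode, modrm, imm8 = enc[1:7]
    if modrm >> 6 != 3:
        raise ValueError("only register-register forms are handled")

    # R, X, B, R', vvvv and V' are stored inverted in the prefix.
    inv0, inv1, inv2 = p0 ^ 0xFF, p1 ^ 0xFF, p2 ^ 0xFF
    r, x, b, r_hi = (inv0 >> 7) & 1, (inv0 >> 6) & 1, (inv0 >> 5) & 1, (inv0 >> 4) & 1
    vvvv = (inv1 >> 3) & 0xF
    v_hi = (inv2 >> 3) & 1

    return {
        # ModRM.reg extended by R (bit 3) and R' (bit 4).
        "modrm_reg": (r_hi << 4) | (r << 3) | ((modrm >> 3) & 7),
        # vvvv extended by V' (bit 4).
        "vvvv": (v_hi << 4) | vvvv,
        # ModRM.rm extended by B (bit 3) and, for registers, X (bit 4).
        "modrm_rm": (x << 4) | (b << 3) | (modrm & 7),
        # L'L field: 0 = 128-bit, 1 = 256-bit, 2 = 512-bit vector length.
        "vector_length_code": (p2 >> 5) & 3,
        "opcode": opcode,
        "imm8": imm8,
    }


fields = decode_evex_regs([0x62, 0xA3, 0x55, 0x48, 0x38, 0xCD, 0x01])
print(fields)
```

For the bytes in the post this gives register 17 in ModRM.reg, 5 in vvvv, and 21 in ModRM.rm, with a 512-bit length code and immediate 1, which lines up with `vinserti32x4 $1, %xmm21, %zmm5, %zmm17`.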

R 4.0.1-4.0.2 built with Intel Composer 19.0-19.1.1, error in "make check" on CentOS 7.7

2020 Jun 25

...go fine. I built it like this: > > module purge > module load intel/19.1.1 > module list > > export CC=icc > export CXX=icpc > export F77=ifort > export FC=ifort > export AR=xiar > export LD=xild > > export CFLAGS="-O3 -ipo -qopenmp -axAVX,CORE-AVX2,CORE-AVX512" > export F77FLAGS="-O3 -ipo -qopenmp -axAVX,CORE-AVX2,CORE-AVX512" > export FFLAGS="-O3 -ipo -qopenmp -axAVX,CORE-AVX2,CORE-AVX512" > export CXXFLAGS="-O3 -ipo -qopenmp -axAVX,CORE-AVX2,CORE-AVX512" > export MKL="-lmkl_intel_lp64 -lmkl_intel_thre...
