thr3ads.net - search: "hasavx512"

Displaying 10 results from an estimated 10 matches for "hasavx512".

[LLVMdev] Possible bug in getCallPreservedMask for CallingConv::Intel_OCL_BI

2014 Mar 13

...eservedMask function. http://llvm.org/docs/doxygen/html/X86RegisterInfo_8cpp_source.html
  case CallingConv::Intel_OCL_BI <http://llvm.org/docs/doxygen/html/namespacellvm_1_1CallingConv.html#a4f861731fc6dbfdccc05af5968d98974ad47327c131a0990283111588b89587cb>: {
    if (IsWin64 && HasAVX512)
      return CSR_Win64_Intel_OCL_BI_AVX512_RegMask;
    if (Is64Bit && HasAVX512)
      return CSR_64_Intel_OCL_BI_AVX512_RegMask;
    if (IsWin64 && HasAVX)
      return CSR_Win64_Intel_OCL_BI_AVX_RegMask;
    if (Is64Bit && HasAVX)
      return CSR_64...
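The order of the tests in the quoted case can be modelled in isolation. What follows is a minimal standalone sketch, not LLVM's code: `Mask`, `pickMask` and `checkOrdering` are hypothetical names, the `Plain64AVX` enumerator stands in for a return value that is truncated in the snippet, and the sketch assumes that a Win64 target is always a 64-bit target. Under that assumption the Win64 tests have to come before the generic 64-bit ones, otherwise they could never be reached.

```cpp
#include <cassert>

// Hypothetical stand-ins for the register masks named in the snippet.
enum class Mask { Win64AVX512, Plain64AVX512, Win64AVX, Plain64AVX, Other };

// Mirrors the order of the four tests visible in the quoted case.
// Assumption: IsWin64 implies Is64Bit, so the Win64 tests must come first.
Mask pickMask(bool IsWin64, bool Is64Bit, bool HasAVX, bool HasAVX512) {
  if (IsWin64 && HasAVX512)
    return Mask::Win64AVX512;
  if (Is64Bit && HasAVX512)
    return Mask::Plain64AVX512;
  if (IsWin64 && HasAVX)
    return Mask::Win64AVX;
  if (Is64Bit && HasAVX)
    return Mask::Plain64AVX; // the real mask name is truncated in the source
  return Mask::Other;        // stands for the elided remainder of the case
}

// Small sanity check: a Win64 target with AVX-512 takes the first branch,
// even though it would also satisfy the second test.
void checkOrdering() {
  assert(pickMask(true, true, true, true) == Mask::Win64AVX512);
}
```

Swapping the first two tests in this model would make a Win64 AVX-512 target pick the generic 64-bit mask, which is the kind of mistake the ordering guards against.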

KNL Vectorization with larger vector width

2018 Jul 24

Thank you. To see the effect, I made the following changes:
  unsigned X86TTIImpl::getRegisterBitWidth(bool Vector) {
    if (Vector) {
      if (ST->hasAVX512())
        return 65536;
Here I changed 512 to 65536. Then, in LoopVectorize.cpp, I changed the following:
  assert(MaxVectorSize <= 2048 && "Did not expect to pack so many elements"
                                  " into one vector!");
Here I changed 64 to 2048. It runs fine. I can...

KNL Vectorization with larger vector width

2018 Jul 24

...ahmed2305 at gmail.com> wrote:
> Hello,
> Do i need to change following function;
>
> unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) {
>   if (Vector && !ST->hasSSE1())
>     return 0;
>
>   if (ST->is64Bit()) {
>     if (Vector && ST->hasAVX512())
>       return 32;
>     return 16;
>   }
>   return 8;
> }
>
> to
>
> if (ST->is2048Bit()) {
>   if (Vector && ST->hasAVX512())
>     return 1024;
>   return 512;
> }
> return 256;
>
>
> please help...
>
> On Tue,...
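The quoted getNumberOfRegisters returns a register count, and its is64Bit() test concerns the 64-bit execution mode rather than a vector width. A minimal standalone sketch of the same control flow makes that easier to see. `SubtargetModel`, `numberOfRegisters` and `checkCounts` are hypothetical names introduced here, and the plain boolean fields only stand in for the `ST->` queries in the quote; this is not the LLVM implementation.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget queries used in the quoted function.
struct SubtargetModel {
  bool HasSSE1;
  bool Is64Bit;
  bool HasAVX512;
};

// Same control flow as the quoted X86TTIImpl::getNumberOfRegisters, with the
// ST-> calls replaced by plain fields. The result is a count of registers,
// not a width in bits.
unsigned numberOfRegisters(const SubtargetModel &ST, bool Vector) {
  if (Vector && !ST.HasSSE1)
    return 0; // no vector registers without SSE1
  if (ST.Is64Bit) {
    if (Vector && ST.HasAVX512)
      return 32;
    return 16;
  }
  return 8;
}

// Sanity check for a 64-bit AVX-512 model such as the KNL case in the thread.
void checkCounts() {
  SubtargetModel Knl{true, true, true};
  assert(numberOfRegisters(Knl, /*Vector=*/true) == 32);
  assert(numberOfRegisters(Knl, /*Vector=*/false) == 16);
}
```

In this model the vector width never appears, which suggests that widening vectors is a question for the width query (getRegisterBitWidth in the thread) and not for this function.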

Vectorizer has trouble with vpmovmskb and store

2018 Nov 27

...ool X86TargetLowering::isLoadBitCastBeneficial(EVT LoadVT,
>                                                   EVT BitcastVT) const {
> +  if (!LoadVT.isVector() && BitcastVT.isVector() &&
> +      BitcastVT.getVectorElementType() == MVT::i1 &&
> +      !Subtarget.hasAVX512())
> +    return false;
> +
>    if (!Subtarget.hasDQI() && BitcastVT == MVT::v8i1)
>      return false;
>
>
> ~Craig
>
>
> On Mon, Nov 26, 2018 at 2:51 PM Johan Engelen via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi all,
>>...

Vectorizer has trouble with vpmovmskb and store

2018 Nov 26

Hi all, I've run into a case where the optimizer seems to be having trouble doing the "obvious" thing. Consider this code: ``` define i16 @foo(<8 x i16>* dereferenceable(16) %egress, <16 x i8> %a0) { %a1 = icmp slt <16 x i8> %a0, zeroinitializer %a2 = bitcast <16 x i1> %a1 to i16 %astore = getelementptr inbounds <8 x i16>, <8 x i16>*

how to force llvm generate gather intrinsic

2016 Feb 26

...'m understanding correctly, you're saying that vgather* is slow on all of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will not generate it for any of those machines. Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() && !hasAVX512()". It could break for some hypothetical future processor that manages to implement it properly. The AVX2 spec includes gather; whether it's slow or fast is an implementation detail. We need a feature bit / cost model entry somewhere to signify this, so we're not overloading the meanin...
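The difference between the two approaches in this message can be sketched with a small standalone model. Everything here is hypothetical: `CpuModel`, its `SlowGather` field, and the functions `gatherIsSlowDerived`, `gatherIsSlowFlag` and `checkFutureCpu` are illustrative names, not LLVM subtarget features or APIs. The sketch assumes a future processor that supports AVX2 but not AVX-512 and implements gather efficiently.

```cpp
#include <cassert>

// Hypothetical CPU description; the field names are illustrative only.
struct CpuModel {
  bool HasAVX2;
  bool HasAVX512;
  bool SlowGather; // dedicated tuning flag, set per CPU
};

// The definition the message argues against: slowness inferred from ISA level.
bool gatherIsSlowDerived(const CpuModel &C) {
  return C.HasAVX2 && !C.HasAVX512;
}

// The suggested alternative: consult a dedicated flag.
bool gatherIsSlowFlag(const CpuModel &C) { return C.SlowGather; }

// An imagined AVX2-only CPU with fast gather: the derived predicate gets it
// wrong, the dedicated flag does not.
void checkFutureCpu() {
  CpuModel Future{true, false, false};
  assert(gatherIsSlowDerived(Future));
  assert(!gatherIsSlowFlag(Future));
}
```

For a CPU where gather really is slow, both predicates agree; they diverge only once the ISA level stops predicting the implementation quality, which is the case the message is worried about.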

KNL Vectorization with larger vector width

2018 Jul 23

Thank You. I got it. Version issue. TTI.getRegisterBitWidth(true) How to put my target machine info in TTI? Please help.
On Mon, Jul 23, 2018 at 11:33 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
> On 7/23/2018 10:49 AM, hameeza ahmed via llvm-dev wrote:
> > Thank You.
> > But I cannot find your mentioned function

how to force llvm generate gather intrinsic

2016 Feb 26

...ctly, you're saying that vgather* is slow on all
> of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will
> not generate it for any of those machines.
>
> Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2()
> && !hasAVX512()". It could break for some hypothetical future processor
> that manages to implement it properly. The AVX2 spec includes gather;
> whether it's slow or fast is an implementation detail. We need a feature
> bit / cost model entry somewhere to signify this, so we're not overloa...

how to force llvm generate gather intrinsic

2016 Feb 26

No. Gather operation is slow on AVX2 processors.
- Elena
From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 20:48
To: Sanjay Patel <spatel at rotateright.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to force

how to force llvm generate gather intrinsic

2016 Feb 25

It seems that http://reviews.llvm.org/D15690 only implemented gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to enable gather for AVX/2? Thanks.
Best, Zhi
On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com> wrote:
> I don't think gather has been enabled for AVX2 as of r261875.
> Masked load/store were enabled for AVX with:
>
