Displaying 10 results from an estimated 10 matches for "hasavx512".
2014 Mar 13
3
[LLVMdev] Possible bug in getCallPreservedMask for CallingConv::Intel_OCL_BI
...eservedMask function.
http://llvm.org/docs/doxygen/html/X86RegisterInfo_8cpp_source.html
case CallingConv::Intel_OCL_BI: {
  if (IsWin64 && HasAVX512)
    return CSR_Win64_Intel_OCL_BI_AVX512_RegMask;
  if (Is64Bit && HasAVX512)
    return CSR_64_Intel_OCL_BI_AVX512_RegMask;
  if (IsWin64 && HasAVX)
    return CSR_Win64_Intel_OCL_BI_AVX_RegMask;
  if (Is64Bit && HasAVX)
    return CSR_64...
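The structure of the snippet above can be illustrated in isolation. This is a minimal stand-in, not the actual LLVM code: the mask names are strings here rather than real register masks, and the fallback value is hypothetical. The point it demonstrates is that these checks must run from most specific to least specific, since Win64 targets also satisfy Is64Bit and AVX-512 targets also satisfy HasAVX, so reordering the clauses silently selects the wrong callee-saved-register mask.

```cpp
#include <cassert>
#include <string>

// Illustrative stand-in for the mask selection in getCallPreservedMask.
// The cascade relies on ordering: IsWin64 implies Is64Bit, and
// HasAVX512 implies HasAVX, so more specific checks must come first.
std::string selectMask(bool IsWin64, bool Is64Bit, bool HasAVX512, bool HasAVX) {
  if (IsWin64 && HasAVX512)
    return "CSR_Win64_Intel_OCL_BI_AVX512";
  if (Is64Bit && HasAVX512)
    return "CSR_64_Intel_OCL_BI_AVX512";
  if (IsWin64 && HasAVX)
    return "CSR_Win64_Intel_OCL_BI_AVX";
  if (Is64Bit && HasAVX)
    return "CSR_64_Intel_OCL_BI_AVX";
  return "CSR_64_Intel_OCL_BI"; // hypothetical fallback, not from the source
}
```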
2018 Jul 24
2
KNL Vectorization with larger vector width
Thank You.
Right now, to see the effect, I made the following changes:
unsigned X86TTIImpl::getRegisterBitWidth(bool Vector) {
  if (Vector) {
    if (ST->hasAVX512())
      return 65536;
Here I changed 512 to 65536. Then in LoopVectorize.cpp I changed the following assert:
assert(MaxVectorSize <= 2048 && "Did not expect to pack so many elements"
       " into one vector!");
(I changed 64 to 2048.)
It runs fine. I can...
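The arithmetic connecting the two edits above can be sketched. This is illustrative only (the function name is not the LLVM API): the vectorizer's cap on packed elements is, roughly, the register bit width divided by the element bit width, which is why raising the register width from 512 requires raising the assert's limit in step. For example, 512-bit registers hold 64 i8 elements (the old limit of 64), while 65536-bit registers hold 2048 i32 elements (the poster's new limit of 2048).

```cpp
#include <cassert>

// Illustrative arithmetic, not the actual LLVM API: the maximum
// vectorization factor is bounded by how many elements of a given
// width fit in one vector register.
unsigned maxVectorElements(unsigned RegisterBitWidth, unsigned ElementBits) {
  return RegisterBitWidth / ElementBits;
}
```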
2018 Jul 24
2
KNL Vectorization with larger vector width
...ahmed2305 at gmail.com> wrote:
> Hello,
> Do I need to change the following function:
>
> unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) {
>   if (Vector && !ST->hasSSE1())
>     return 0;
>
>   if (ST->is64Bit()) {
>     if (Vector && ST->hasAVX512())
>       return 32;
>     return 16;
>   }
>   return 8;
> }
>
> to
>
>   if (ST->is2048Bit()) {
>     if (Vector && ST->hasAVX512())
>       return 1024;
>     return 512;
>   }
>   return 256;
>
>
> please help...
>
> On Tue,...
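One issue with the proposed change above is worth noting: `is2048Bit()` is not an existing X86 subtarget predicate, so it would have to be added before the modified function could compile. The sketch below mocks the subtarget queries with a small struct (names and defaults are illustrative, not the real LLVM classes) to show the shape of the original logic.

```cpp
#include <cassert>

// Fake subtarget standing in for the real X86Subtarget; is2048Bit()
// would be a new predicate like these if the proposed change were made.
struct FakeSubtarget {
  bool Is64Bit = true;
  bool AVX512 = true;
  bool is64Bit() const { return Is64Bit; }
  bool hasAVX512() const { return AVX512; }
  bool hasSSE1() const { return true; }
};

// Mirrors the logic of the quoted getNumberOfRegisters.
unsigned getNumberOfRegisters(const FakeSubtarget &ST, bool Vector) {
  if (Vector && !ST.hasSSE1())
    return 0;
  if (ST.is64Bit()) {
    if (Vector && ST.hasAVX512())
      return 32; // zmm0-zmm31
    return 16;
  }
  return 8;
}
```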
2018 Nov 27
2
Vectorizer has trouble with vpmovmskb and store
...ool X86TargetLowering::isLoadBitCastBeneficial(EVT LoadVT,
>                                                 EVT BitcastVT) const {
> +  if (!LoadVT.isVector() && BitcastVT.isVector() &&
> +      BitcastVT.getVectorElementType() == MVT::i1 &&
> +      !Subtarget.hasAVX512())
> +    return false;
> +
>   if (!Subtarget.hasDQI() && BitcastVT == MVT::v8i1)
>     return false;
>
>
> ~Craig
>
>
> On Mon, Nov 26, 2018 at 2:51 PM Johan Engelen via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi all,
>>...
2018 Nov 26
2
Vectorizer has trouble with vpmovmskb and store
Hi all,
I've run into a case where the optimizer seems to be having trouble doing
the "obvious" thing.
Consider this code:
```
define i16 @foo(<8 x i16>* dereferenceable(16) %egress, <16 x i8> %a0) {
  %a1 = icmp slt <16 x i8> %a0, zeroinitializer
  %a2 = bitcast <16 x i1> %a1 to i16
  %astore = getelementptr inbounds <8 x i16>, <8 x i16>*
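The `icmp slt ... zeroinitializer` plus `bitcast <16 x i1> to i16` pattern in the IR above is a byte sign-bit movemask: it is what `pmovmskb`/`vpmovmskb` (the SSE2 intrinsic `_mm_movemask_epi8`) computes in one instruction. A portable scalar sketch of that semantics, written without intrinsics so it runs anywhere:

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of pmovmskb: collect the sign bit of each of the 16
// bytes into a 16-bit mask, bit i taken from byte i. This is exactly
// the icmp-slt-zero + bitcast-<16 x i1>-to-i16 pattern in the IR.
uint16_t signMask(const int8_t bytes[16]) {
  uint16_t mask = 0;
  for (int i = 0; i < 16; ++i)
    if (bytes[i] < 0)
      mask |= static_cast<uint16_t>(1u << i);
  return mask;
}
```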
2016 Feb 26
2
how to force llvm generate gather intrinsic
...39;m understanding correctly, you're saying that vgather* is slow on all
of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will
not generate it for any of those machines.
Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() &&
!hasAVX512()". It could break for some hypothetical future processor that
manages to implement it properly. The AVX2 spec includes gather; whether
it's slow or fast is an implementation detail. We need a feature bit / cost
model entry somewhere to signify this, so we're not overloading the meanin...
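The point above can be made concrete with a small sketch. The struct and the `HasFastGather` bit are illustrative, not real LLVM feature flags: inferring "gather is slow" from the ISA bits breaks for a future AVX2 processor that implements gather well, whereas a dedicated feature bit lets that processor simply opt in.

```cpp
#include <cassert>

// Hypothetical feature bits; HasFastGather is the dedicated bit the
// post argues for, not an existing LLVM flag.
struct FeatureBits {
  bool HasAVX2 = false;
  bool HasAVX512 = false;
  bool HasFastGather = false;
};

// Overloads the meaning of the ISA bits: wrong for a future fast-gather
// AVX2 processor.
bool gatherIsSlowInferred(const FeatureBits &F) {
  return F.HasAVX2 && !F.HasAVX512;
}

// Asks the cost model / feature bit directly.
bool gatherIsSlowExplicit(const FeatureBits &F) {
  return !F.HasFastGather;
}
```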
2018 Jul 23
2
KNL Vectorization with larger vector width
Thank you, I got it. It was a version issue.
TTI.getRegisterBitWidth(true)
How to put my target machine info in TTI?
Please help.
On Mon, Jul 23, 2018 at 11:33 PM, Friedman, Eli <efriedma at codeaurora.org>
wrote:
> On 7/23/2018 10:49 AM, hameeza ahmed via llvm-dev wrote:
>
> Thank You.
>
> But I cannot find your mentioned function
2016 Feb 26
0
how to force llvm generate gather intrinsic
...ctly, you're saying that vgather* is slow on all
> of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will
> not generate it for any of those machines.
>
> Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2()
> && !hasAVX512()". It could break for some hypothetical future processor
> that manages to implement it properly. The AVX2 spec includes gather;
> whether it's slow or fast is an implementation detail. We need a feature
> bit / cost model entry somewhere to signify this, so we're not overloa...
2016 Feb 26
0
how to force llvm generate gather intrinsic
No. Gather operation is slow on AVX2 processors.
- Elena
From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 20:48
To: Sanjay Patel <spatel at rotateright.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to force
2016 Feb 25
2
how to force llvm generate gather intrinsic
It seems that http://reviews.llvm.org/D15690 only implemented
gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to
enable gather for AVX/AVX2? Thanks.
Best,
Zhi
On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com>
wrote:
> I don't think gather has been enabled for AVX2 as of r261875.
> Masked load/store were enabled for AVX with:
>