Displaying 20 results from an estimated 23 matches for "x86ttiimpl".
2018 Jul 24
2
KNL Vectorization with larger vector width
Thank You.
Right now, to see the effect, I made the following changes:
unsigned X86TTIImpl::getRegisterBitWidth(bool Vector) {
  if (Vector) {
    if (ST->hasAVX512())
      return 65536;
Here I changed 512 to 65536. Then in LoopVectorize.cpp I did the following:
assert(MaxVectorSize <= 2048 && "Did not expect to pack so many elements"...
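The link between the register width and the assertion limit can be sketched as simple arithmetic. This is a minimal model, assuming the loop vectorizer derives its maximum element count by dividing the register width reported by the target by the element width; `maxElementsPerVector` is a hypothetical helper, not an LLVM function.

```cpp
// Hypothetical helper, not an LLVM API: models how a vectorizer could derive
// the maximum number of elements per vector from the register width.
constexpr unsigned maxElementsPerVector(unsigned RegisterBits,
                                        unsigned ElementBits) {
  // Guard against division by zero; a zero-width element makes no sense.
  return ElementBits == 0 ? 0 : RegisterBits / ElementBits;
}

// Stock AVX-512 width: 512 bits of i32 elements gives 16 lanes.
static_assert(maxElementsPerVector(512, 32) == 16, "16 x i32");
// The experimental 65536-bit width with i32 elements gives 2048 lanes,
// which matches the raised assertion limit.
static_assert(maxElementsPerVector(65536, 32) == 2048, "2048 x i32");
```

Under the same assumption, 8-bit elements would give 8192 lanes at this width, so a limit of 2048 could still be exceeded for narrow element types.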
2018 Jul 24
2
KNL Vectorization with larger vector width
...32> vector instructions.
How to do this?
What adjustments are needed?
Please help.
I am trying this but am unable to solve it.
Thank You
On Tue, Jul 24, 2018 at 4:44 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
> Hello,
> Do I need to change the following function:
>
> unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) {
>   if (Vector && !ST->hasSSE1())
>     return 0;
>
>   if (ST->is64Bit()) {
>     if (Vector && ST->hasAVX512())
>       return 32;
>     return 16;
>   }
>   return 8;
> }
>
> to
>
> if (ST->...
2016 Jan 23
3
how to force llvm generate gather intrinsic
...d generate those instructions. I don't want to touch the source code.
Best,
Zhi
On Fri, Jan 22, 2016 at 4:54 PM, Sanjay Patel <spatel at rotateright.com>
wrote:
> I was just looking at the related masked load/store operations, and I
> think there are at least 2 bugs:
>
> 1. X86TTIImpl::isLegalMaskedLoad/Store() should be legal for FP types with
> AVX1 (not just AVX2).
> 2. X86TTIImpl::isLegalMaskedGather/Scatter() should be legal for 128/256
> bit vectors with AVX2 (not just AVX512).
>
> I looked at this for the first time today, so I may be missing something...
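The two relaxations proposed in the message above can be sketched as plain predicates. This is only a model of the suggestion, not the code in the X86 backend: the `Features` struct and both function names are hypothetical, and the real hooks take LLVM types rather than a flag or a bit width. The second predicate covers gather only, since AVX2 provides gather instructions but no scatter instructions.

```cpp
// Hypothetical model of subtarget feature flags (not LLVM's X86Subtarget).
// Flags are assumed cumulative: a target with AVX512 also sets AVX and AVX2.
struct Features {
  bool AVX;
  bool AVX2;
  bool AVX512;
};

// Proposal 1: masked loads/stores of FP elements are legal from AVX1 on,
// while integer elements still require AVX2.
constexpr bool isLegalMaskedLoadStore(bool IsFloatingPoint, Features F) {
  return IsFloatingPoint ? F.AVX : F.AVX2;
}

// Proposal 2: gathers on 128/256-bit vectors are legal with AVX2,
// while 512-bit vectors require AVX512.
constexpr bool isLegalMaskedGather(unsigned VectorBits, Features F) {
  if (VectorBits == 512)
    return F.AVX512;
  if (VectorBits == 128 || VectorBits == 256)
    return F.AVX2;
  return false; // other widths are not modelled here
}
```

For example, `isLegalMaskedGather(256, Features{true, true, false})` is true in this model, which is the AVX2-only case the message says should become legal.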
2016 Jan 23
2
how to force llvm generate gather intrinsic
Hi,
I used clang -O3 -c -emit-llvm on the following code to generate a bitcode file,
say a.bc. I read the .ll file and didn't see any gather intrinsic. Also, I
used opt -O3 -mcpu=core-avx2/-mcpu=skx, but there is still no gather
intrinsic generated.
int foo(int A[800], int B[800], int C[800]) {
  for (int i = 0; i < 800; i++) {
    A[B[i]] = i + 5;
  }
  for (int i = 0; i < 800;
2018 Jul 23
2
KNL Vectorization with larger vector width
Thank You. I got it. Version issue.
TTI.getRegisterBitWidth(true)
How do I put my target machine info into TTI?
Please help.
On Mon, Jul 23, 2018 at 11:33 PM, Friedman, Eli <efriedma at codeaurora.org>
wrote:
> On 7/23/2018 10:49 AM, hameeza ahmed via llvm-dev wrote:
>
> Thank You.
>
> But I cannot find your mentioned function
2016 Mar 17
2
generate vectorized code
...mini at apple.com> wrote:
> Hi Rail,
>
> Two hints to begin with:
>
> 1) Make sure your example is vectorized on X86, for example
> 2) Is your target correctly overriding the TTI (declaring the vector
> register size for example) so that the vectorizer can kick in (see
> X86TTIImpl::getRegisterBitWidth for instance). Alternatively you can test
> the SLP vectorizer by passing to clang: -mllvm -slp-max-reg-size -mllvm 512
> (I don't see an equivalent option for the loop vectorizer though).
>
> Well, it sort of worked. I added a getRegisterBitWidth(...) but then...
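The second hint, a target declaring its vector register size so the vectorizer can kick in, can be illustrated with a small stand-alone model. Everything here is hypothetical: `TTIBaseModel`, `GenericTTI`, and `MyTargetTTI` are not LLVM classes, the 32-bit default and the 128-bit target width are assumptions of the model, and the CRTP layout is only meant to resemble how a target-specific implementation overrides a base hook.

```cpp
// Hypothetical CRTP base: supplies a conservative default that derived
// targets can shadow with their own getRegisterBitWidth.
template <typename Derived>
struct TTIBaseModel {
  // Assumed default for this model: no vector registers wider than a scalar.
  constexpr unsigned getRegisterBitWidth(bool /*Vector*/) const { return 32; }

  // What a vectorizer-like client would compute from the target's answer.
  constexpr unsigned maxVectorElements(unsigned ElementBits) const {
    return static_cast<const Derived &>(*this).getRegisterBitWidth(true) /
           ElementBits;
  }
};

// A target that does not declare its vector width inherits the default.
struct GenericTTI : TTIBaseModel<GenericTTI> {};

// A target that declares 128-bit vector registers (assumed width).
struct MyTargetTTI : TTIBaseModel<MyTargetTTI> {
  constexpr unsigned getRegisterBitWidth(bool Vector) const {
    return Vector ? 128 : 32;
  }
};

// With the default, only one i32 fits, so there is nothing to vectorize.
static_assert(GenericTTI{}.maxVectorElements(32) == 1, "scalar only");
// With the override, four i32 lanes become available.
static_assert(MyTargetTTI{}.maxVectorElements(32) == 4, "4 x i32");
```

In this model the client sees a single lane until the target supplies its own width, which is one way to read why an example may vectorize on X86 but not on a new target.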
2016 Apr 11
2
X86 TRUNCATE cost for AVX & AVX2 mode
Hi,
I was going through X86TTIImpl::getCastInstrCost and have a question about the cost
calculation for the TRUNCATE instruction in AVX mode.
The AVX2ConversionTbl and AVXConversionTbl tables define no cost for
TRUNCATE v16i32 to v16i8, so the lookup falls back to SSE41ConversionTbl, where
it finds a cost of 30 for this operation....
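The fallback behaviour described above can be sketched with a simplified lookup chain. This is a model under stated assumptions, not the LLVM implementation: the `Type` enum, `ConvCost` struct, `lookup`, and `castCost` are hypothetical, and the only value taken from the message is the SSE4.1 cost of 30 for truncating v16i32 to v16i8. The other table entries and their costs are placeholders so that the wider tables are non-empty yet lack the truncate.

```cpp
#include <cstddef>

enum class Type { v16i8, v16i32 };

struct ConvCost {
  Type Dst;
  Type Src;
  int Cost;
};

// Placeholder entries (made-up extension costs) for the AVX2 and AVX tables.
constexpr ConvCost AVX2Table[] = {{Type::v16i32, Type::v16i8, 1}};
constexpr ConvCost AVXTable[] = {{Type::v16i32, Type::v16i8, 2}};
// The truncate cost of 30 is the value quoted in the message.
constexpr ConvCost SSE41Table[] = {{Type::v16i8, Type::v16i32, 30}};

// Returns the cost of a matching entry, or -1 if the table has none.
template <std::size_t N>
constexpr int lookup(const ConvCost (&Table)[N], Type Dst, Type Src) {
  for (std::size_t I = 0; I < N; ++I)
    if (Table[I].Dst == Dst && Table[I].Src == Src)
      return Table[I].Cost;
  return -1;
}

// Tries the most specific table first, then falls back to older ISA tables.
constexpr int castCost(Type Dst, Type Src, bool HasAVX2, bool HasAVX) {
  if (HasAVX2) {
    int Cost = lookup(AVX2Table, Dst, Src);
    if (Cost >= 0)
      return Cost;
  }
  if (HasAVX) {
    int Cost = lookup(AVXTable, Dst, Src);
    if (Cost >= 0)
      return Cost;
  }
  return lookup(SSE41Table, Dst, Src);
}
```

In this model, `castCost(Type::v16i8, Type::v16i32, true, true)` misses both AVX tables and returns the SSE4.1 value, so an AVX2 target is charged the cost of a much older instruction sequence.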
2017 Nov 01
5
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...8 as SubtargetFeatures in X86.td not
mapped to any CPU.
-Add mprefer-avx256 and mprefer-avx128 and the corresponding
-mno-prefer-avx128/256 options to clang's driver Options.td file. I believe
this will allow clang to pass these straight through to the -target-feature
attribute in IR.
-Modify X86TTIImpl::getRegisterBitWidth to only return 512 if AVX512 is
enabled and prefer-avx256 and prefer-avx128 is not set. Similarly return
256 if AVX is enabled and prefer-avx128 is not set.
There may be some other backend changes needed, but I plan to address those
as we find them.
At a later point, consi...
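The proposed change to X86TTIImpl::getRegisterBitWidth can be sketched as a small decision function. This is a stand-alone model of the rule stated in the RFC, not the patch itself: `SubtargetModel` and `vectorRegisterBitWidth` are hypothetical names, and the final fallback of 128 is an assumption, since the excerpt only specifies the 512 and 256 cases.

```cpp
// Hypothetical stand-in for the subtarget features mentioned in the RFC.
struct SubtargetModel {
  bool HasAVX512;
  bool HasAVX;
  bool PreferAVX256;
  bool PreferAVX128;
};

constexpr unsigned vectorRegisterBitWidth(SubtargetModel ST) {
  // 512 only if AVX512 is enabled and neither preference is set.
  if (ST.HasAVX512 && !ST.PreferAVX256 && !ST.PreferAVX128)
    return 512;
  // 256 if AVX is enabled and prefer-avx128 is not set.
  if (ST.HasAVX && !ST.PreferAVX128)
    return 256;
  // Assumption: everything else is treated as 128-bit SSE.
  return 128;
}
```

For instance, a model with AVX512 and AVX enabled plus prefer-avx256 yields 256, and the same model with prefer-avx128 yields 128.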
2016 Mar 17
2
generate vectorized code
...>> Hi Rail,
>>
>> Two hints to begin with:
>>
>> 1) Make sure your example is vectorized on X86, for example
>> 2) Is your target correctly overriding the TTI (declaring the vector
>> register size for example) so that the vectorizer can kick in (see
>> X86TTIImpl::getRegisterBitWidth for instance). Alternatively you can test
>> the SLP vectorizer by passing to clang: -mllvm -slp-max-reg-size -mllvm 512
>> (I don't see an equivalent option for the loop vectorizer though).
>>
>> Well, it sort of worked. I added a getRegisterBitWid...
2016 Mar 16
2
generate vectorized code
My question is:
How do I make clang generate assembly with vector instructions for my
target?
The back story is:
I've added a few vector instructions to my target and confirmed that they
are used by running my code on the test below, using the following
command:
opt i.esencia.ll -S -march=esencia -mcpu=esencia -loop-vectorize | llc
-mcpu=esencia -o i.esencia.s
target datalayout =
2016 Apr 12
2
X86 TRUNCATE cost for AVX & AVX2 mode
...vsky at intel.com>; Zuckerman, Michael <michael.zuckerman at intel.com>
Cc: llvm-dev <llvm-dev at lists.llvm.org>
Subject: X86 TRUNCATE cost for AVX & AVX2 mode
Hi,
I was going through X86TTIImpl::getCastInstrCost and have a question about the cost
calculation for the TRUNCATE instruction in AVX mode.
The AVX2ConversionTbl and AVXConversionTbl tables define no cost for
TRUNCATE v16i32 to v16i8, so the lookup falls back to SSE41ConversionTbl, where
it finds a cost of 30 for this operation....
2016 Mar 18
3
generate vectorized code
...nts to begin with:
>>>>
>>>> 1) Make sure your example is vectorized on X86, for example
>>>> 2) Is your target correctly overriding the TTI (declaring the vector
>>>> register size for example) so that the vectorizer can kick in (see
>>>> X86TTIImpl::getRegisterBitWidth for instance). Alternatively you can test
>>>> the SLP vectorizer by passing to clang: -mllvm -slp-max-reg-size -mllvm 512
>>>> (I don't see an equivalent option for the loop vectorizer though).
>>>>
>>>> Well, it sort of wor...
2017 Nov 03
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...>
>> -Add mprefer-avx256 and mprefer-avx128 and the corresponding
>> -mno-prefer-avx128/256 options to clang's driver Options.td file. I believe
>> this will allow clang to pass these straight through to the -target-feature
>> attribute in IR.
>>
>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if AVX512 is
>> enabled and prefer-avx256 and prefer-avx128 is not set. Similarly return
>> 256 if AVX is enabled and prefer-avx128 is not set.
>>
>
> Instead of multiple flags that have difficult to understand intersecting
> behavi...
2016 Jan 23
2
how to force llvm generate gather intrinsic
...tructions. I don't want to touch the source code.
Best,
Zhi
On Fri, Jan 22, 2016 at 4:54 PM, Sanjay Patel <spatel at rotateright.com> wrote:
I was just looking at the related masked load/store operations, and I think there are at least 2 bugs:
1. X86TTIImpl::isLegalMaskedLoad/Store() should be legal for FP types with AVX1 (not just AVX2).
2. X86TTIImpl::isLegalMaskedGather/Scatter() should be legal for 128/256 bit vectors with AVX2 (not just AVX512).
I looked at this for the first time today, so I may be missing something...
So for the moment, the an...
2017 Nov 07
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...avx128 and the corresponding
> >>> -mno-prefer-avx128/256 options to clang's driver Options.td file. I believe
> >>> this will allow clang to pass these straight through to the -target-feature
> >>> attribute in IR.
> >>>
> >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if AVX512 is
> >>> enabled and prefer-avx256 and prefer-avx128 is not set. Similarly return
> >>> 256 if AVX is enabled and prefer-avx128 is not set.
> >>>
> >>
> >> Instead of multiple flags that have dif...
2017 Nov 09
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...efer-avx128/256 options to clang's driver Options.td file. I
>> believe
>> > >>> this will allow clang to pass these straight through to the
>> -target-feature
>> > >>> attribute in IR.
>> > >>>
>> > >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if
>> AVX512 is
>> > >>> enabled and prefer-avx256 and prefer-avx128 is not set. Similarly
>> return
>> > >>> 256 if AVX is enabled and prefer-avx128 is not set.
>> > >>>
>> > >>...
2017 Nov 11
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...tions.td file.
>>>> I believe
>>>> > >>> this will allow clang to pass these straight through to the
>>>> -target-feature
>>>> > >>> attribute in IR.
>>>> > >>>
>>>> > >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if
>>>> AVX512 is
>>>> > >>> enabled and prefer-avx256 and prefer-avx128 is not set. Similarly
>>>> return
>>>> > >>> 256 if AVX is enabled and prefer-avx128 is not set.
>>>>...
2017 Nov 12
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...e
>>>>>> > >>> this will allow clang to pass these straight through to the
>>>>>> -target-feature
>>>>>> > >>> attribute in IR.
>>>>>> > >>>
>>>>>> > >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if
>>>>>> AVX512 is
>>>>>> > >>> enabled and prefer-avx256 and prefer-avx128 is not set.
>>>>>> Similarly return
>>>>>> > >>> 256 if AVX is enabled and prefer-avx1...
2017 Nov 13
3
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...>>> > >>> this will allow clang to pass these straight through to the
>>>>>>> -target-feature
>>>>>>> > >>> attribute in IR.
>>>>>>> > >>>
>>>>>>> > >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if
>>>>>>> AVX512 is
>>>>>>> > >>> enabled and prefer-avx256 and prefer-avx128 is not set.
>>>>>>> Similarly return
>>>>>>> > >>> 256 if AVX is enabled...
2017 Nov 13
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...>> -target-feature
>> > >>> attribute in IR.
>> > >>>
>> > >>> -Modify
>> X86TTIImpl::getRegisterBitWidth to
>> only return 512 if AVX512 is
>> > >>> enabled and prefer-avx256 and
>> prefer-avx128 is not set. Similarly
>>...