search for: x86ttiimpl

Displaying 20 results from an estimated 23 matches for "x86ttiimpl".

2018 Jul 24
2
KNL Vectorization with larger vector width
Thank you. Right now, to see the effect, I made the following changes: unsigned X86TTIImpl::getRegisterBitWidth(bool Vector) { if (Vector) { if (ST->hasAVX512()) return 65536; Here I changed 512 to 65536. Then in loopvectorize.cpp I did the following: assert(MaxVectorSize <= 2048 && "Did not expect to pack so many elements"...
2018 Jul 24
2
KNL Vectorization with larger vector width
...32> vector instructions. How do I do this? What adjustments are needed? Please help; I am trying this but am unable to solve it. Thank you. On Tue, Jul 24, 2018 at 4:44 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > Hello, > Do I need to change the following function: > > unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) { > if (Vector && !ST->hasSSE1()) > return 0; > > if (ST->is64Bit()) { > if (Vector && ST->hasAVX512()) > return 32; > return 16; > } > return 8; > } > > to > > if (ST->...
2016 Jan 23
3
how to force llvm generate gather intrinsic
...d generate those instructions. I don't want to touch the source code. Best, Zhi On Fri, Jan 22, 2016 at 4:54 PM, Sanjay Patel <spatel at rotateright.com> wrote: > I was just looking at the related masked load/store operations, and I > think there are at least 2 bugs: > > 1. X86TTIImpl::isLegalMaskedLoad/Store() should be legal for FP types with > AVX1 (not just AVX2). > 2. X86TTIImpl::isLegalMaskedGather/Scatter() should be legal for 128/256 > bit vectors with AVX2 (not just AVX512). > > I looked at this for the first time today, so I may be missing something... ...
2016 Jan 23
2
how to force llvm generate gather intrinsic
Hi, I used clang -O3 -c -emit-llvm on the following code to generate a bitcode file, say a.bc. I read the .ll file and didn't see any gather intrinsic. Also, I used opt -O3 -mcpu=core-avx2/-mcpu=skx, but there is still no gather intrinsic generated. int foo(int A[800], int B[800], int C[800]) { for (int i = 0; i < 800; i++) { A[B[i]] = i + 5; } for (int i = 0; i < 800;
2018 Jul 23
2
KNL Vectorization with larger vector width
Thank You. I got it. Version issue. TTI.getRegisterBitWidth(true) How to put my target machine info in TTI? Please help. On Mon, Jul 23, 2018 at 11:33 PM, Friedman, Eli <efriedma at codeaurora.org> wrote: > On 7/23/2018 10:49 AM, hameeza ahmed via llvm-dev wrote: > > Thank You. > > But I cannot find your mentioned function
2016 Mar 17
2
generate vectorized code
...mini at apple.com> wrote: > Hi Rail, > > Two hints to begin with: > > 1) Make sure your example is vectorized on X86 for example > 2) Is your target correctly overriding the TTI (declaring the vector > register size for example) so that the vectorizer can kick in (see > X86TTIImpl::getRegisterBitWidth for instance). Alternatively you can test > the SLP vectorizer by passing to clang: -mllvm -slp-max-reg-size -mllvm 512 > (I don't see an equivalent option for the loop vectorizer though). > > Well, it sort of worked. I added a getRegisterBitWidth(...) but then...
2016 Apr 11
2
X86 TRUNCATE cost for AVX & AVX2 mode
Hi, I was going through X86TTIImpl::getCastInstrCost and have a question about the cost calculation for the TRUNCATE instruction in AVX mode. The AVX2ConversionTbl and AVXConversionTbl tables define no cost for TRUNCATE v16i32 to v16i8, so as a fallback it goes to the SSE41ConversionTbl table, where it finds a cost of 30 for this operation....
2017 Nov 01
5
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...8 as SubtargetFeatures in X86.td not mapped to any CPU. -Add mprefer-avx256 and mprefer-avx128 and the corresponding -mno-prefer-avx128/256 options to clang's driver Options.td file. I believe this will allow clang to pass these straight through to the -target-feature attribute in IR. -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if AVX512 is enabled and prefer-avx256 and prefer-avx128 is not set. Similarly return 256 if AVX is enabled and prefer-avx128 is not set. There may be some other backend changes needed, but I plan to address those as we find them. At a later point, consi...
2016 Mar 17
2
generate vectorized code
...>> Hi Rail, >> >> Two hints to begin with: >> >> 1) Make sure your example is vectorized on X86 for example >> 2) Is your target correctly overriding the TTI (declaring the vector >> register size for example) so that the vectorizer can kick in (see >> X86TTIImpl::getRegisterBitWidth for instance). Alternatively you can test >> the SLP vectorizer by passing to clang: -mllvm -slp-max-reg-size -mllvm 512 >> (I don't see an equivalent option for the loop vectorizer though). >> >> Well, it sort of worked. I added a getRegisterBitWid...
2016 Mar 16
2
generate vectorized code
My question is: How do I make clang generate assembly with vector instructions for my target? The back story is: I've added a few vector instructions to my target and confirmed that they are used by running my code on the test below and using the following command: opt i.esencia.ll -S -march=esencia -mcpu=esencia -loop-vectorize | llc -mcpu=esencia -o i.esencia.s target datalayout =
2016 Apr 12
2
X86 TRUNCATE cost for AVX & AVX2 mode
...vsky at intel.com>>; Zuckerman, Michael <michael.zuckerman at intel.com> Cc: llvm-dev <llvm-dev at lists.llvm.org> Subject: X86 TRUNCATE cost for AVX & AVX2 mode Hi, I was going through X86TTIImpl::getCastInstrCost and have a question about the cost calculation for the TRUNCATE instruction in AVX mode. The AVX2ConversionTbl and AVXConversionTbl tables define no cost for TRUNCATE v16i32 to v16i8, so as a fallback it goes to the SSE41ConversionTbl table, where it finds a cost of 30 for this operation....
2016 Mar 18
3
generate vectorized code
...nts to begin with: >>>> >>>> 1) Make sure your example is vectorized on X86 for example >>>> 2) Is your target correctly overriding the TTI (declaring the vector >>>> register size for example) so that the vectorizer can kick in (see >>>> X86TTIImpl::getRegisterBitWidth for instance). Alternatively you can test >>>> the SLP vectorizer by passing to clang: -mllvm -slp-max-reg-size -mllvm 512 >>>> (I don't see an equivalent option for the loop vectorizer though). >>>> >>>> Well, it sort of wor...
2017 Nov 03
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...> >> -Add mprefer-avx256 and mprefer-avx128 and the corresponding >> -mno-prefer-avx128/256 options to clang's driver Options.td file. I believe >> this will allow clang to pass these straight through to the -target-feature >> attribute in IR. >> >> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if AVX512 is >> enabled and prefer-avx256 and prefer-avx128 is not set. Similarly return >> 256 if AVX is enabled and prefer-avx128 is not set. >> > > Instead of multiple flags that have difficult to understand intersecting > behavi...
2016 Jan 23
2
how to force llvm generate gather intrinsic
...tructions. I don't want to touch the source code. Best, Zhi On Fri, Jan 22, 2016 at 4:54 PM, Sanjay Patel <spatel at rotateright.com> wrote: I was just looking at the related masked load/store operations, and I think there are at least 2 bugs: 1. X86TTIImpl::isLegalMaskedLoad/Store() should be legal for FP types with AVX1 (not just AVX2). 2. X86TTIImpl::isLegalMaskedGather/Scatter() should be legal for 128/256 bit vectors with AVX2 (not just AVX512). I looked at this for the first time today, so I may be missing something... So for the moment, the an...
2017 Nov 07
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...avx128 and the corresponding > >>> -mno-prefer-avx128/256 options to clang's driver Options.td file. I believe > >>> this will allow clang to pass these straight through to the -target-feature > >>> attribute in IR. > >>> > >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if AVX512 is > >>> enabled and prefer-avx256 and prefer-avx128 is not set. Similarly return > >>> 256 if AVX is enabled and prefer-avx128 is not set. > >>> > >> > >> Instead of multiple flags that have dif...
2017 Nov 09
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...efer-avx128/256 options to clang's driver Options.td file. I >> believe >> > >>> this will allow clang to pass these straight through to the >> -target-feature >> > >>> attribute in IR. >> > >>> >> > >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if >> AVX512 is >> > >>> enabled and prefer-avx256 and prefer-avx128 is not set. Similarly >> return >> > >>> 256 if AVX is enabled and prefer-avx128 is not set. >> > >>> >> > >>...
2017 Nov 11
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...tions.td file. >>>> I believe >>>> > >>> this will allow clang to pass these straight through to the >>>> -target-feature >>>> > >>> attribute in IR. >>>> > >>> >>>> > >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if >>>> AVX512 is >>>> > >>> enabled and prefer-avx256 and prefer-avx128 is not set. Similarly >>>> return >>>> > >>> 256 if AVX is enabled and prefer-avx128 is not set. >>>>...
2017 Nov 12
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...e >>>>>> > >>> this will allow clang to pass these straight through to the >>>>>> -target-feature >>>>>> > >>> attribute in IR. >>>>>> > >>> >>>>>> > >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if >>>>>> AVX512 is >>>>>> > >>> enabled and prefer-avx256 and prefer-avx128 is not set. >>>>>> Similarly return >>>>>> > >>> 256 if AVX is enabled and prefer-avx1...
2017 Nov 13
3
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...>> > >>> this will allow clang to pass these straight through to the >>>>>>> -target-feature >>>>>>> > >>> attribute in IR. >>>>>>> > >>> >>>>>>> > >>> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if >>>>>>> AVX512 is >>>>>>> > >>> enabled and prefer-avx256 and prefer-avx128 is not set. >>>>>>> Similarly return >>>>>>> > >>> 256 if AVX is enabled...
2017 Nov 13
2
RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
...>> -target-feature >> > >>> attribute in IR. >> > >>> >> > >>> -Modify >> X86TTIImpl::getRegisterBitWidth to >> only return 512 if AVX512 is >> > >>> enabled and prefer-avx256 and >> prefer-avx128 is not set. Similarly >>...
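
Illustrative note on the RFC results above: the rule proposed for X86TTIImpl::getRegisterBitWidth (return 512 only if AVX512 is enabled and neither prefer-avx256 nor prefer-avx128 is set; return 256 if AVX is enabled and prefer-avx128 is not set) can be sketched as a small standalone function. This is a rough model, not LLVM code: SubtargetModel, its fields, and vectorRegisterBitWidth are hypothetical names made up for illustration, and the 128-bit SSE fallback and the 0 result are assumptions that the quoted RFC text does not state.

```cpp
// Hypothetical stand-in for the X86 subtarget queries; not LLVM's real class.
struct SubtargetModel {
  bool HasSSE1 = false;
  bool HasAVX = false;
  bool HasAVX512 = false;
  bool PreferAVX256 = false; // models the proposed prefer-avx256 feature
  bool PreferAVX128 = false; // models the proposed prefer-avx128 feature
};

// Vector register width in bits that the vectorizers would be told about.
unsigned vectorRegisterBitWidth(const SubtargetModel &ST) {
  // 512 only when AVX512 is enabled and no narrower preference is set.
  if (ST.HasAVX512 && !ST.PreferAVX256 && !ST.PreferAVX128)
    return 512;
  // 256 when AVX is enabled and prefer-avx128 is not set.
  if (ST.HasAVX && !ST.PreferAVX128)
    return 256;
  // Assumption: fall back to 128-bit SSE registers, else no vector registers.
  if (ST.HasSSE1)
    return 128;
  return 0;
}
```

Under this model, a subtarget with AVX512 enabled and prefer-avx256 set reports 256, so the vectorizers would plan for 256-bit vectors even though 512-bit registers exist.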