zhi chen via llvm-dev
2016-Jan-23 00:00 UTC
[llvm-dev] how to force llvm generate gather intrinsic
Hi,

I ran clang -O3 -c -emit-llvm on the following code to generate bitcode, say a.bc. I read the .ll file and did not see any gather intrinsic. I also ran opt -O3 with -mcpu=core-avx2 and with -mcpu=skx, but still no gather intrinsic was generated.

    int foo(int A[800], int B[800], int C[800]) {
      for (int i = 0; i < 800; i++) {
        A[B[i]] = i + 5;
      }

      for (int i = 0; i < 800; i++) {
        A[B[i]]++;
      }

      for (int i = 0; i < 800; i++) {
        A[i] = B[C[i]];
      }
      return 0;
    }

Could someone give me an example that will generate a gather intrinsic for AVX2? I tried to use the masked_gather intrinsic described in the language reference, but it seems to generate a gather intrinsic only for AVX-512, not for AVX2. I found that AVX2 has 16 gather intrinsic versions, depending on the data types. Do I have to check the data type before calling a specific one, or is there a generic way to use the AVX2 gather intrinsics?

Best,
Zhi
Sanjay Patel via llvm-dev
2016-Jan-23 00:54 UTC
[llvm-dev] how to force llvm generate gather intrinsic
I was just looking at the related masked load/store operations, and I think there are at least 2 bugs:

1. X86TTIImpl::isLegalMaskedLoad/Store() should be legal for FP types with AVX1 (not just AVX2).
2. X86TTIImpl::isLegalMaskedGather/Scatter() should be legal for 128/256 bit vectors with AVX2 (not just AVX512).

I looked at this for the first time today, so I may be missing something...

So for the moment, the answer to your question is 'no'; there's no generic way to produce these instructions. You should be able to use the _mm_* intrinsics in C though.
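As an illustration of the C-intrinsic route mentioned above, here is a minimal sketch of the third loop from the question (A[i] = B[C[i]]) written with the AVX2 gather intrinsic _mm256_i32gather_epi32. It is not taken from the thread: the function names gather_i32 and gather_i32_avx2 are hypothetical, and the sketch assumes GCC or Clang on x86, since it relies on the target("avx2") function attribute and __builtin_cpu_supports for a runtime check. On other targets it falls back to the scalar loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper: AVX2 version of dst[i] = src[idx[i]].
 * The target attribute (GCC/Clang) enables AVX2 for this function only,
 * so the file does not need to be built with -mavx2. */
__attribute__((target("avx2")))
static void gather_i32_avx2(int *dst, const int *src, const int *idx, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        /* Load eight 32-bit indices with an unaligned load. */
        __m256i vidx = _mm256_loadu_si256((const __m256i *)(idx + i));
        /* Gather src[idx[i..i+7]]; the scale 4 is sizeof(int) in bytes. */
        __m256i v = _mm256_i32gather_epi32(src, vidx, 4);
        _mm256_storeu_si256((__m256i *)(dst + i), v);
    }
    /* Scalar tail for the remaining n % 8 elements. */
    for (; i < n; i++)
        dst[i] = src[idx[i]];
}
#endif

/* Hypothetical entry point: uses the AVX2 path when the CPU reports
 * AVX2 support at run time, and a plain scalar loop otherwise. */
void gather_i32(int *dst, const int *src, const int *idx, size_t n)
{
    assert(dst && src && idx);
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        gather_i32_avx2(dst, src, idx, n);
        return;
    }
#endif
    for (size_t i = 0; i < n; i++)
        dst[i] = src[idx[i]];
}
```

With the arrays from the question, the third loop would become gather_i32(A, B, C, 800). Only that loop maps directly onto a gather: the first two loops store through an index (A[B[i]] = ...), which is a scatter, and AVX2 provides gather instructions but no scatter instructions. The other AVX2 gather intrinsics follow the same pattern with a different element or index type, so in C the variant still has to be chosen by data type.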
zhi chen via llvm-dev
2016-Jan-23 00:58 UTC
[llvm-dev] how to force llvm generate gather intrinsic
Thanks for your response, Sanjay. I know there are intrinsics available in C/C++. But the problem is that I want to instrument my code at the IR level and generate those instructions. I don't want to touch the source code.

Best,
Zhi