My question is:
How do I make clang generate assembly with vector instructions for my target?

The back story is:
I've added a few vector instructions to my target and confirmed that they are used by running my code on the test below with the following command:

opt i.esencia.ll -S -march=esencia -mcpu=esencia -loop-vectorize | llc -mcpu=esencia -o i.esencia.s

target datalayout = "E-m:e-p:32:32-i64:32-f64:32-v64:32-v128:32-a:0:32-n32"
target triple = "esencia"

; Function Attrs: nounwind uwtable
define i32 @main() {
entry:
  %z = alloca <4 x i32>
  %a = alloca <4 x i32>
  %b = alloca <4 x i32>
  %a.l = load <4 x i32>* %a
  %b.l = load <4 x i32>* %b
  %z.l = add <4 x i32> %a.l, %b.l
  store <4 x i32> %z.l, <4 x i32>* %z
  ret i32 0
}

Now I'm trying to run clang and vectorize the following test:

#define N 16

int main () {

  int a[N], b[N];
  int c[N];

  for (int i = 0; i < N; ++i)
    c[i] = a[i] + b[i];

  int sum=0;
  for (int i = 0; i < N; ++i)
    sum += c[i];

  return sum;
}

Here are the command lines I tried:

clang -S test.c --target=esencia -fvectorize -o test.esencia.s

clang -S test.c --target=esencia -fvectorize -fslp-vectorize-aggressive -o test.esencia.s -fslp-vectorize

clang -S test.c --target=esencia -fvectorize -fslp-vectorize-aggressive -o test.esencia.s -fslp-vectorize -fno-lax-vector-conversions

Unfortunately, none of them worked. Can someone help me out? I can't figure out why this is not working.

Any help is appreciated.

-- 
Rail Shafigulin
Software Engineer
Esencia Technologies
Hi Rail,

Two hints to begin with:

1) Make sure your example is vectorized on X86, for example.
2) Is your target correctly overriding the TTI (declaring the vector register size, for example) so that the vectorizer can kick in (see X86TTIImpl::getRegisterBitWidth for instance)? Alternatively, you can test the SLP vectorizer by passing to clang: -mllvm -slp-max-reg-size -mllvm 512 (I don't see an equivalent option for the loop vectorizer though).

-- 
Mehdi

> On Mar 16, 2016, at 11:31 AM, Rail Shafigulin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> [...]
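Hint 2 above can be made concrete with a small model. As far as I understand the loop vectorizer, it divides the register width that the target reports through TTI by the widest element type in the loop to get the largest vector factor it will consider. A target that does not override getRegisterBitWidth falls back to a default of 32 bits, so a loop over 32-bit ints ends up with a factor of 1 and stays scalar. The C sketch below is a toy model of that arithmetic only, not LLVM code: max_vector_factor is a hypothetical name, and the 128-bit figure is an assumption taken from the <4 x i32> type used in this thread.

```c
/* Toy model (not LLVM code): how many elements of a given width fit in
 * the widest register the target reports. The name max_vector_factor is
 * hypothetical and only illustrates the division described above. */
unsigned max_vector_factor(unsigned register_bits, unsigned element_bits)
{
    unsigned vf = register_bits / element_bits;

    /* A factor of 0 makes no sense, so treat it as "scalar only". */
    return vf == 0 ? 1 : vf;
}
```

With a reported width of 32 bits and 32-bit ints the function returns 1, which corresponds to no vectorization; with a reported width of 128 bits it returns 4, which matches the <4 x i32> operations the esencia backend was extended with.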
On Wed, Mar 16, 2016 at 11:48 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:

> Hi Rail,
>
> Two hints to begin with:
>
> 1) Make sure your example is vectorized on X86, for example.
> 2) Is your target correctly overriding the TTI (declaring the vector
> register size, for example) so that the vectorizer can kick in (see
> X86TTIImpl::getRegisterBitWidth for instance)? Alternatively, you can test
> the SLP vectorizer by passing to clang: -mllvm -slp-max-reg-size -mllvm 512
> (I don't see an equivalent option for the loop vectorizer though).
>

Well, it sort of worked. I added a getRegisterBitWidth(...) but then I got this error:

fatal error: error in backend: Cannot select: 0x5e949a8: v4i32 BUILD_VECTOR 0x5e91ae8, 0x5e91ae8, 0x5e91ae8, 0x5e91ae8 [ORD=16] [ID=16]
  0x5e91ae8: i32 = Constant<0> [ID=5]
  0x5e91ae8: i32 = Constant<0> [ID=5]
  0x5e91ae8: i32 = Constant<0> [ID=5]
  0x5e91ae8: i32 = Constant<0> [ID=5]

What am I missing? Any help is appreciated.

> --
> Mehdi
>

-- 
Rail Shafigulin
Software Engineer
Esencia Technologies
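The node in the error is a v4i32 BUILD_VECTOR whose four operands are all Constant<0>, that is, an all-zero vector, and "Cannot select" means instruction selection found no pattern for it. One plausible source, assuming the second loop of the test was vectorized as a reduction, is the starting value of the vector accumulator for sum. The C sketch below writes such a four-lane reduction by hand to show where the zero vector would come from. It is an illustration only, not the code the vectorizer emits; sum_scalar and sum_four_lanes are hypothetical names.

```c
#define N 16
#define LANES 4

/* Reference: the scalar reduction from the test program. */
int sum_scalar(const int *c)
{
    int sum = 0;
    for (int i = 0; i < N; ++i)
        sum += c[i];
    return sum;
}

/* Hand-written four-lane version of the same reduction. */
int sum_four_lanes(const int *c)
{
    /* The accumulator starts as an all-zero vector, one zero per lane.
     * This is the kind of value that BUILD_VECTOR 0, 0, 0, 0 builds. */
    int acc[LANES] = {0, 0, 0, 0};

    /* Each step adds four consecutive elements lane by lane,
     * which corresponds to a single vector add. */
    for (int i = 0; i < N; i += LANES)
        for (int lane = 0; lane < LANES; ++lane)
            acc[lane] += c[i + lane];

    /* Horizontal add of the lanes gives the final scalar sum. */
    return acc[0] + acc[1] + acc[2] + acc[3];
}
```

Backends usually deal with a node like this either by providing a selection pattern for it or by asking legalization to rewrite it (for example through setOperationAction for ISD::BUILD_VECTOR). Which of the two fits esencia is not settled in this thread.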