hameeza ahmed via llvm-dev
2018-Jan-29 20:43 UTC
[llvm-dev] Polly loop offloading to Accelerator
Thank you.

I used -polly-ast-detect-parallel, but no "coincident" information is generated. My C code is a simple vec-sum:

    #include <stdio.h>
    int a[2048], b[2048], c[2048];
    void foo () {
      int i;
      for (i=0; i<2048; i++) {
        a[i]=b[5] + c[i];
      }
    }

I executed the following commands:

    $ clang -S -emit-llvm vec-sum.cpp -march=native -O3 -mllvm -disable-llvm-optzns -o vec-sum.s
    $ opt -S -polly-canonicalize vec-sum.s > vecsum.preopt.ll
    $ opt -polly-ast -polly-ast-detect-parallel -analyze -q vecsum.preopt.ll -polly-process-unprofitable

The output is:

    :: isl ast :: foo :: %1---%8

    if (1)

        #pragma simd
        #pragma known-parallel
        for (int c0 = 0; c0 <= 2047; c0 += 1)
          Stmt0(c0);

    else
        {  /* original code */ }

There is no coincident information in it. Could you please explain what "coincident" means and how it relates to locality?

My work deals with identifying non-temporal accesses and, when such accesses are found, offloading them to my accelerator. This could be done in more than one way. In the first approach I implement my own pass and do all the dependence and locality analysis from scratch using built-in LLVM passes (I don't know whether they are efficient enough), so that in the end my pass detects whether the code has non-temporal behavior. The second approach would be to use Polly's passes to detect locality (non-temporal behavior) from within my new pass.

Based on the analysis results, I also need to transform my IR, for example by attaching metadata (my-accelerator) to the non-temporal accesses (loops).

Please help me.

Thank you.
Regards

On Mon, Jan 29, 2018 at 8:20 PM, Michael Kruse <llvmdev at meinersbur.de> wrote:
> Hi,
>
> you could use Polly to generate an AstInfo. With the option
> -polly-ast-detect-parallel it will mark loops in the generated Ast as
> "coincident", i.e. parallel and without reuse.
>
> If you know the vector width of your accelerator, you can use
> LoopVectorizationLegality::canVectorize to determine whether you can
> vectorize it. If your accelerator's computational model assumes no
> loop-carried dependencies, maybe LoopVectorizationLegality can be
> modified to accept 'infinite' vector width.
>
> polly-dev at googlegroups.com would be the mailing list for help
> specifically for Polly.
>
> Michael
>
> 2018-01-20 9:47 GMT-06:00 hameeza ahmed via llvm-dev <llvm-dev at lists.llvm.org>:
> > Hello,
> >
> > I have been working on an accelerator backend. The accelerator has
> > large vector/SIMD units.
> >
> > I want vectorized streaming (non-temporal) loops in the code to be
> > offloaded to the accelerator's SIMD units.
> >
> > I find Polly really suitable for this.
> >
> > My idea is to pass the generated IR to Polly and have it analyze each
> > loop to determine whether it possesses no reuse. If such a loop is
> > identified, accelerator instructions are emitted.
> >
> > Where should I begin in order to achieve these goals? Could you
> > please clarify?
> >
> > Thank you.
> > Regards
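As an illustration of what "no reuse" could mean for the vec-sum loop in this message, here is a minimal C sketch that replays the loop's access pattern and counts how often each array element is touched. The helper names (count_touches, max_touches, is_streaming) and the reading of "non-temporal" as "no element is touched more than once" are assumptions made for this example. This is not how Polly performs its analysis.

```c
#include <assert.h>
#include <stddef.h>

#define N 2048

/* Per-element touch counters for the three arrays of the vec-sum loop. */
unsigned touches_a[N], touches_b[N], touches_c[N];

/* Replay the access pattern of: a[i] = b[5] + c[i]; for i in [0, N). */
void count_touches(void) {
    for (size_t i = 0; i < N; i++) {
        touches_a[i] = touches_b[i] = touches_c[i] = 0;
    }
    for (size_t i = 0; i < N; i++) {
        touches_a[i]++;   /* write a[i] */
        touches_b[5]++;   /* read  b[5] */
        touches_c[i]++;   /* read  c[i] */
    }
}

/* Largest number of touches that any single element received. */
unsigned max_touches(const unsigned *t) {
    unsigned m = 0;
    for (size_t i = 0; i < N; i++) {
        if (t[i] > m) m = t[i];
    }
    return m;
}

/* Assumed definition: an array is "streaming" if no element is touched twice. */
int is_streaming(const unsigned *t) {
    return max_touches(t) <= 1;
}
```

After calling count_touches(), is_streaming() reports a and c as streaming under this definition, while b is not, because b[5] is read in every iteration.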
Michael Kruse via llvm-dev
2018-Jan-30 02:41 UTC
[llvm-dev] Polly loop offloading to Accelerator
2018-01-29 14:43 GMT-06:00 hameeza ahmed <hahmed2305 at gmail.com>:
> Thank you.
>
> I used -polly-ast-detect-parallel, but no "coincident" information is
> generated. My C code is a simple vec-sum:
>
>     #include <stdio.h>
>     int a[2048], b[2048], c[2048];
>     void foo () {
>       int i;
>       for (i=0; i<2048; i++) {
>         a[i]=b[5] + c[i];
>       }
>     }
>
> I executed the following commands:
>
>     $ clang -S -emit-llvm vec-sum.cpp -march=native -O3 -mllvm -disable-llvm-optzns -o vec-sum.s
>     $ opt -S -polly-canonicalize vec-sum.s > vecsum.preopt.ll
>     $ opt -polly-ast -polly-ast-detect-parallel -analyze -q vecsum.preopt.ll -polly-process-unprofitable
>
> The output is:
>
>     :: isl ast :: foo :: %1---%8
>
>     if (1)
>
>         #pragma simd
>         #pragma known-parallel
>         for (int c0 = 0; c0 <= 2047; c0 += 1)
>           Stmt0(c0);
>
>     else
>         {  /* original code */ }
>
> There is no coincident information in it. Could you please explain what
> "coincident" means and how it relates to locality?

"Coincidence" is isl's term for being executable in parallel. It is right there: #pragma known-parallel

> My work deals with identifying non-temporal accesses and, when such
> accesses are found, offloading them to my accelerator. This could be
> done in more than one way. In the first approach I implement my own
> pass and do all the dependence and locality analysis from scratch using
> built-in LLVM passes (I don't know whether they are efficient enough),
> so that in the end my pass detects whether the code has non-temporal
> behavior. The second approach would be to use Polly's passes to detect
> locality (non-temporal behavior) from within my new pass.
>
> Based on the analysis results, I also need to transform my IR, for
> example by attaching metadata (my-accelerator) to the non-temporal
> accesses (loops).

Could you please explain what you understand by non-temporal accesses/behavior and why it is important to your accelerator? Is it about the accelerator's cache behavior?

Michael
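To make "executable in parallel" more concrete, the following C sketch runs the loop from the thread in ascending and in descending iteration order and compares the results. Because no iteration reads a value written by another iteration, both orders agree. The contrast loop prefix_sum is hypothetical and not from the thread; it carries a dependence on a[i-1], so the two orders disagree. All function names here are made up for illustration, and agreement between two orders only illustrates the idea. It does not prove that a loop is parallel.

```c
#include <assert.h>
#include <string.h>

#define N 2048

int a[N], b[N], c[N];

/* Fill the arrays with deterministic values. */
void init(void) {
    for (int i = 0; i < N; i++) {
        a[i] = 0;
        b[i] = i;
        c[i] = 2 * i + 1;
    }
}

/* Loop from the thread; reverse != 0 runs the iterations in descending order. */
void vec_sum(int reverse) {
    for (int k = 0; k < N; k++) {
        int i = reverse ? N - 1 - k : k;
        a[i] = b[5] + c[i];
    }
}

/* Hypothetical contrast loop with a loop-carried dependence on a[i-1]. */
void prefix_sum(int reverse) {
    for (int k = 1; k < N; k++) {
        int i = reverse ? N - k : k;
        a[i] = a[i - 1] + c[i];
    }
}

/* Returns 1 if running `loop` forward and backward yields the same a[]. */
int order_independent(void (*loop)(int)) {
    static int forward[N];
    init();
    loop(0);
    memcpy(forward, a, sizeof a);
    init();
    loop(1);
    return memcmp(forward, a, sizeof a) == 0;
}
```

Calling order_independent(vec_sum) returns 1, while order_independent(prefix_sum) returns 0, which is the kind of difference the known-parallel annotation reflects.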