Francesco Petrogalli via llvm-dev
2016-Nov-30 15:13 UTC
[llvm-dev] Enable "#pragma omp declare simd" in the LoopVectorizer
I have sent out an equivalent RFC email for this functionality, as requested in the review https://reviews.llvm.org/D27250

Please use the new thread with “RFC” in it.

Thanks,

Francesco

On 30/11/2016 11:46, "llvm-dev on behalf of Francesco Petrogalli via llvm-dev" <llvm-dev-bounces at lists.llvm.org on behalf of llvm-dev at lists.llvm.org> wrote:

>Dear all,
>
>I have just created a couple of differential reviews to enable the
>vectorisation of loops that contain calls to routines marked with
>“#pragma omp declare simd”.
>
>They can be (re)viewed here:
>
>* https://reviews.llvm.org/D27249
>
>* https://reviews.llvm.org/D27250
>
>The current implementation allows the loop vectorizer to generate vector
>code for a source file such as:
>
>    #pragma omp declare simd
>    double f(double x);
>
>    void aaa(double *x, double *y, int N) {
>      for (int i = 0; i < N; ++i) {
>        x[i] = f(y[i]);
>      }
>    }
>
>
>by invoking clang with the arguments:
>
>    $> clang -fopenmp -c -O3 file.c […]
>
>
>Such functionality should provide a nice interface for vector library
>developers, who can use it to inform the loop vectorizer of the
>availability of an external library with vector implementations of the
>scalar functions in the loops. For this, all that is needed is to mark
>the function declaration in the header file of the library with
>“#pragma omp declare simd” and to generate the associated symbols in the
>object file of the library according to the naming scheme of the vector
>ABI (see notes below).
>
>I am interested in any feedback/suggestion/review the community might have
>regarding this behaviour.
>
>Below you find a description of the implementation and some notes.
>
>Thanks,
>
>Francesco
>
>-----------
>
>The functionality is implemented as follows:
>
>1. Clang CodeGen generates a set of global external variables for each of
>the function declarations marked with the OpenMP pragma. Each of these
>globals is named according to a mangling that is generated by
>llvm::TargetLibraryInfoImpl (TLII), and holds the vector signature of the
>associated vector function. (See the examples in the tests of the clang
>patch. Each scalar function can generate multiple vector functions,
>depending on the clauses of the declare simd directives.)
>2. When clang creates the TLII, it processes the llvm::Module and finds
>out which of the globals of the module have the correct mangling and type,
>so that they can be added to the TLII as a list of vector functions that
>can be associated with the original scalar one.
>3. The LoopVectorizer looks for the available vector functions through the
>TLII not by scalar name and vectorisation factor but by scalar name and
>vector function signature, thus enabling the vectorizer to distinguish a
>“vector vpow1(vector x, vector y)” from a “vector vpow2(vector x, scalar
>y)”. (The second one corresponds to a “declare simd uniform(y)” for a
>“scalar pow(scalar x, scalar y)” declaration.) (Notice that the changes in
>the loop vectorizer are minimal.)
>
>
>Notes:
>
>1. To enable SIMD only for OpenMP, leaving all the multithread/target
>behaviour behind, we should enable this also with a new option:
>-fopenmp-simd
>2. The AArch64 vector ABI in the code is essentially the same as the
>Intel one (apart from the prefix and the masking argument), and it is
>based on the clauses associated with “declare simd” in OpenMP 4.0. For
>OpenMP 4.5, the parameters section of the mangled name should be updated.
>This update will not change the vectorizer behaviour, as all the
>vectorizer needs in order to detect a vectorizable function is the
>original scalar name and a compatible vector function signature. Of
>course, any changes/updates in the ABI will have to be reflected in the
>symbols of the binary file of the library.
>3. Whilst this works only for function declarations, the same
>functionality can be used when (if) clang implements the declare simd
>OpenMP pragma for function definitions.
>4. I have enabled this for any loop that invokes the scalar function,
>not just for those annotated with “#pragma omp for simd”. I don’t have any
>preference here, but at the same time I don’t see any reason why this
>shouldn’t be enabled by default for non-annotated loops. Let me know if
>you disagree; I’d happily change the functionality if there are sound
>reasons behind that.
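
Step 3 of the implementation description (looking up a vector function by scalar name and vector function signature, rather than by scalar name and vectorisation factor) can be illustrated with a small standalone C sketch. This is not the TLII code from the patches: the struct, the table, the `find_variant` helper and the one-letter parameter encoding ('v' for a vector argument, 'u' for a uniform one) are all hypothetical, chosen only to show why the signature is needed to tell `vpow1` from `vpow2`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one vector variant of a scalar function.
 * `params` holds one letter per argument: 'v' = vector argument,
 * 'u' = uniform argument (one scalar shared by all lanes). */
struct vector_variant {
    const char *scalar_name;  /* name of the scalar function, e.g. "pow" */
    unsigned    vlen;         /* number of lanes */
    const char *params;       /* signature encoding, e.g. "vv" or "vu" */
    const char *vector_name;  /* symbol to call from vectorised code */
};

/* Illustrative table only; vpow1 and vpow2 are the names used in the text.
 * Both variants share the same scalar name and the same lane count. */
static const struct vector_variant variants[] = {
    { "pow", 2, "vv", "vpow1" },  /* vector vpow1(vector x, vector y) */
    { "pow", 2, "vu", "vpow2" },  /* vector vpow2(vector x, scalar y) */
};

/* Return the vector symbol whose scalar name, lane count and signature
 * all match, or NULL when no such variant is known. */
static const char *find_variant(const char *scalar_name, unsigned vlen,
                                const char *params)
{
    assert(scalar_name != NULL && params != NULL);
    for (size_t i = 0; i < sizeof variants / sizeof variants[0]; ++i) {
        const struct vector_variant *v = &variants[i];
        if (v->vlen == vlen &&
            strcmp(v->scalar_name, scalar_name) == 0 &&
            strcmp(v->params, params) == 0)
            return v->vector_name;
    }
    return NULL;
}
```

With this table, `find_variant("pow", 2, "vu")` selects `vpow2`, the variant matching “declare simd uniform(y)”, while `find_variant("pow", 2, "vv")` selects `vpow1`. A lookup keyed only on the scalar name and the vectorisation factor could not separate the two entries.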