Francesco Petrogalli via llvm-dev
2019-Jun-11 20:55 UTC
[llvm-dev] RFC: Interface user provided vector functions with the vectorizer.
Dear all,

I have rewritten the proposal for interfacing user-provided vector functions, originally posted to both the llvm-dev and cfe-dev mailing lists:

"[RFC] Expose user provided vector function for auto-vectorization."

The proposal looks quite different from the original submission, so I took the liberty of starting a new thread.

The original thread generated some good discussion. In particular, Simon Moll and Johannes Doerfert (CCed) provided good arguments for the following claims:

1. The Vector Function ABI name mangling scheme of a target is not enough to describe all use cases of function vectorization that the compiler might need to support in the future.
2. `declare variant` needs to be handled properly at IR level, to be able to give the compiler the full OpenMP context of the directive.

This proposal addresses those two concerns and other (I believe) minor concerns that were raised in the previous thread.

The proposal comes with examples and a self-assessment of its extendibility.

I have CCed everyone who has participated in the discussion so far; please let me know if you think I have missed anything that has been raised.

Kind regards,

Francesco

*** DRAFT OF THE PROPOSAL ***

# SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.

Because users care about portability (across compilers, libraries and systems), I believe we have to base our solution on a standard that describes the mapping from the scalar function to the vector function.

Because OpenMP is a widely used standard, we should base our solution on the mechanisms that the standard provides, via the directives `declare simd` and `declare variant`, the latter when used with the `simd` trait in the `construct` set.

Please notice that:

1. The scope of the proposal is not implementing full support for `#pragma omp declare variant`.
2. The scope of the proposal is not enabling the vectorizer to do new kinds of vectorization (e.g. the RV-like vectorization described by Simon).
3. The proposal aims to be extendible with respect to 1. and 2.
4. The IR attribute introduced in this proposal is equivalent to the one needed for the VecClone pass under development in https://reviews.llvm.org/D22792

# CLANG COMPONENTS

A C function attribute, `clang_declare_simd_variant`, to attach to the scalar version. The attribute gives the compiler enough information about the vector shape of the user-defined function. The vector shapes handled by the attribute are those handled by the OpenMP standard via `declare simd` (and no more than that).

1. The function attribute handling in clang is crafted with the requirement that it will be possible to re-use the same components for the information generated by `declare variant` when used with a `simd` trait in the `construct` set.
2. The attribute is orthogonal to the vectorization that is done via OpenMP: the user vector function is still exposed for vectorization when not using `-fopenmp[-simd]`, once the `declare simd` and `declare variant` directives of OpenMP are available in the front-end.

## C function attribute: `clang_declare_simd_variant`

The definition of this attribute has been crafted to match the semantics of `declare variant` for a `simd` construct described in OpenMP 5.0. I have added only the traits of the `device` set, `isa` and `arch`, which I believe are enough to cover the use case of this proposal. If that is not the case, please provide an example; extending the attribute will be easy even once the current one is implemented.

```
clang_declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>})

<variant-func-id>:= The name of a function variant that is a base language identifier, or,
                    for C++, a template-id.

<simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}

<simdlen> := simdlen(<positive number>) | simdlen("scalable")

<mask> := inbranch | notinbranch

<optional simd clauses> := <linear clause>
                         | <uniform clause>
                         | <align clause> | {,<optional simd clauses>}

<linear clause> := linear_ref(<var>,<step>)
                 | linear_var(<var>, <step>)
                 | linear_uval(<var>, <step>)
                 | linear(<var>, <step>)

<step> := <var> | <non zero number>

<uniform clause> := uniform(<var>)

<align clause> := align(<var>, <positive number>)

<var> := Name of a parameter in the scalar function declaration/definition

<non zero number> := ... | -2 | -1 | 1 | 2 | ...

<positive number> := 1 | 2 | 3 | ...

<context selector clauses> := {<isa>}{,} {<arch>}

<isa> := isa(target-specific-value)

<arch> := arch(target-specific-value)
```

# LLVM COMPONENTS:

## VectorFunctionShape class

The object `VectorFunctionShape` contains the information about the kind of vectorization available for an `llvm::CallInst`.

The object `VectorFunctionShape` must contain the following information:

1. Vectorization Factor (or number of concurrent lanes executed by the SIMD version of the function), encoded by an unsigned integer.
2. Whether the vector function is requested for scalable vectorization, encoded by a boolean.
3. Information about masking / no masking, encoded by a boolean.
4. Information about the parameters, encoded in a container that carries objects of type `ParamaterType`, to describe features like `linear` and `uniform`.
5. Function name redirection, if a user has specified a custom name to use instead of the Vector Function ABI ones.

Items 1. to 5. represent the information stored in the `vector-function-abi-variant` attribute (see next section).
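To make the five items above concrete, here is a minimal standalone sketch of what such a shape object could look like. It is not the proposed LLVM class: the names `ParamKind`, `ParamDesc` (a stand-in for the parameter descriptor of item 4) and `makeFixedWidthShape` are hypothetical, and the choice of an empty string to mean "no redirection" is an assumption made only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parameter kinds. The RFC only names `linear` and `uniform`
// as examples of what the parameter descriptor must be able to express.
enum class ParamKind { Vector, Uniform, Linear };

// Hypothetical stand-in for the per-parameter descriptor of item 4.
struct ParamDesc {
  ParamKind Kind;
  int LinearStep; // Only meaningful when Kind == ParamKind::Linear.
};

// One field per item in the list above.
struct VectorFunctionShape {
  unsigned VF = 0;               // 1. Vectorization factor.
  bool IsScalable = false;       // 2. Scalable vectorization requested.
  bool IsMasked = false;         // 3. Masked or unmasked.
  std::vector<ParamDesc> Params; // 4. Per-parameter information.
  std::string Redirect;          // 5. User-provided name (assumption: empty
                                 //    means "use the Vector Function ABI name").
};

// Builds a fixed-width (non-scalable) shape.
VectorFunctionShape makeFixedWidthShape(unsigned VF, bool IsMasked,
                                        std::vector<ParamDesc> Params,
                                        std::string Redirect) {
  assert(VF > 0 && "a fixed-width shape needs a non-zero vectorization factor");
  VectorFunctionShape S;
  S.VF = VF;
  S.IsMasked = IsMasked;
  S.Params = std::move(Params);
  S.Redirect = std::move(Redirect);
  return S;
}
```

For instance, `makeFixedWidthShape(2, false, {{ParamKind::Vector, 0}}, "vector_foo_01")` would describe a 2-lane, unmasked function with a single vector parameter that is redirected to a user-provided symbol.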
The object can be extended in the future to include new vectorization kinds (for example the RV-like vectorization of the Region Vectorizer), to add more context information that might come from other uses of OpenMP `declare variant`, or to add new Vector Function ABIs not based on OpenMP. Such information can be retrieved from attributes that will be added to describe the `Call` instance.

## IR Attribute

We define a `vector-function-abi-variant` attribute that lists the mangled names produced by the mangling function of the Vector Function ABI rules.

```
vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
```

1. Because we use only OpenMP `declare simd` vectorization, and because we require a Vector Function ABI, we make this explicit in the name of the attribute.
2. Because the Vector Function ABIs encode in the mangled names all the information needed to know the vectorization shape of the vector function, we provide the mangled names via the attribute.
3. Function name redirection is specified by enclosing the name of the redirection in parentheses, as in `abi_mangled_name_02(user_redirection)`.

## Vector ABI Demangler

The "Vector ABI demangler" is the component that demangles the data in the `vector-function-abi-variant` attribute and provides the instances of the class `VectorFunctionShape` that can be derived from the mangled names listed in the attribute.

## Query interface: Search Vector Function System (SVFS)

An interface that LLVM components can query to find out whether or not a scalar function can be vectorized, and that retrieves the vector function to be used if such a vector shape is available.

1. This component is going to be unrelated to OpenMP.
2. This component will internally use the demangler defined in the previous section, but it will not expose any aspect of the Vector Function ABI via its interface.

The interface provides two methods.

```
std::vector<VectorFunctionShape> SVFS::isFunctionVectorizable(llvm::CallInst * Call);

llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call, VectorFunctionShape Info);
```

The first method lists all the vector shapes that are available and attached to a scalar function. An empty result means that no vector versions are available.

The second method retrieves the information needed to build a call to a vector function with a specific `VectorFunctionShape` info.

# (SELF) ASSESSMENT ON EXTENDIBILITY

1. Extending the C function attribute `clang_declare_simd_variant` to new Vector Function ABIs that use OpenMP will be straightforward because the attribute is tied to such ABIs and to OpenMP.
2. The C attribute `clang_declare_simd_variant` and the `declare variant` directive used for the `simd` trait will share their internals in clang, so adding the OpenMP functionality for `simd` traits will mostly consist of handling the directive in the OpenMP parser. How this should be done is described in https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attribute
3. The IR attribute `vector-function-abi-variant` is not meant to be extended to represent kinds of vectorization other than those handled by `declare simd` through a Vector Function ABI.
4. The IR attribute `vector-function-abi-variant` is not defined to be extended to represent the information of `declare variant` in its totality.
5. The IR attribute will not need to change when we introduce vectorization that is not based on a Vector Function ABI (RV-like, reductions...) or when we decide to fully support `declare variant`. The information it carries will not need to be invalidated, just extended with new attributes that will need to be handled by the `VectorFunctionShape` class, in a similar way to what `llvm::FPMathOperator` does with `llvm::FastMathFlags`, which operates on individual attributes to describe an overall functionality.
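As a concrete reading of the attribute format given in the IR Attribute section, the following standalone sketch splits a `vector-function-abi-variant` string into its entries and separates the optional redirection from each mangled name. It is only an illustration under stated assumptions: `VariantEntry`, `trimSpaces` and `parseVariantList` are hypothetical names, the code uses plain `std::string` rather than LLVM data structures, and it assumes that neither mangled names nor redirection names contain commas or parentheses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One entry of the attribute: "mangled" or "mangled(redirect)".
struct VariantEntry {
  std::string MangledName;
  std::string Redirect; // Empty when the entry carries no redirection.
};

// Strips leading and trailing spaces and tabs.
static std::string trimSpaces(const std::string &S) {
  const char *WS = " \t";
  std::size_t Begin = S.find_first_not_of(WS);
  if (Begin == std::string::npos)
    return "";
  std::size_t End = S.find_last_not_of(WS);
  return S.substr(Begin, End - Begin + 1);
}

// Splits the comma-separated attribute value into entries.
std::vector<VariantEntry> parseVariantList(const std::string &Attr) {
  std::vector<VariantEntry> Entries;
  std::size_t Pos = 0;
  while (Pos <= Attr.size()) {
    std::size_t Comma = Attr.find(',', Pos);
    if (Comma == std::string::npos)
      Comma = Attr.size();
    std::string Item = trimSpaces(Attr.substr(Pos, Comma - Pos));
    Pos = Comma + 1;
    if (Item.empty())
      continue; // Tolerate empty items such as a trailing comma.

    VariantEntry E;
    std::size_t Open = Item.find('(');
    if (Open != std::string::npos && Item.back() == ')') {
      // "name(redirect)": split at the opening parenthesis.
      E.MangledName = Item.substr(0, Open);
      E.Redirect = Item.substr(Open + 1, Item.size() - Open - 2);
    } else {
      E.MangledName = Item;
    }
    Entries.push_back(E);
  }
  return Entries;
}
```

A demangler along the lines described above would then turn each `MangledName` into a `VectorFunctionShape`, while the query interface would hide both the string format and the mangling rules from its callers.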
# Examples

## Example 1

Exposing an Advanced SIMD vector function when targeting Advanced SIMD in AArch64.

```
double foo_01(double Input) __attribute__((clang_declare_simd_variant("vector_foo_01", simdlen(2), notinbranch, isa("simd"))));

// Advanced SIMD version
float64x2_t vector_foo_01(float64x2_t VectorInput);
```

The resulting IR attribute is:

```
attributes #0 = { "vector-function-abi-variant"="_ZGVnN2v_foo_01(vector_foo_01)" }
```

## Example 2

Exposing an Advanced SIMD vector function when targeting Advanced SIMD in AArch64, but with the wrong signature. The user specifies a masked version of the function in the clauses of the attribute, and the compiler throws an error suggesting the signature expected for `vector_foo_02`.

```
double foo_02(double Input) __attribute__((clang_declare_simd_variant("vector_foo_02", simdlen(2), inbranch, isa("simd"))));

// Advanced SIMD version
float64x2_t vector_foo_02(float64x2_t VectorInput);
// (suggested) compiler error ->                  ^ Missing mask parameter of type `uint64x2_t`.
```

## Example 3

Targeting `sincos`-like signatures.

```
void foo_03(double Input, double * Output) __attribute__((clang_declare_simd_variant("vector_foo_03", simdlen(2), notinbranch, linear(Output, 1), isa("simd"))));

// Advanced SIMD version
void vector_foo_03(float64x2_t VectorInput, double * Output);
```

The resulting IR attribute is:

```
attributes #0 = { "vector-function-abi-variant"="_ZGVnN2vl8_foo_03(vector_foo_03)" }
```

## Example 4

Scalable vectorization targeting SVE.

```
double foo_04(double Input) __attribute__((clang_declare_simd_variant("vector_foo_04", simdlen("scalable"), notinbranch, isa("sve"))));

// SVE version
svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
```

The resulting IR attribute is:

```
attributes #0 = { "vector-function-abi-variant"="_ZGVsMxv_foo_04(vector_foo_04)" }
```

## Example 5

Fixed-length vectorization targeting SVE.

```
double foo_05(double Input) __attribute__((clang_declare_simd_variant("vector_foo_05", simdlen(4), inbranch, isa("sve"))));

// Fixed-length SVE version
svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
```

The resulting IR attribute is:

```
attributes #0 = { "vector-function-abi-variant"="_ZGVsM4v_foo_05(vector_foo_05)" }
```

## Example 6

This is an x86 example, equivalent to the one provided by Andrei Elovikov in http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html. Godbolt rendering with ICC at https://godbolt.org/z/Of1NxZ

```
float MyAdd(float* a, int b) __attribute__((clang_declare_simd_variant("MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx"))))
{
  return *a + b;
}

__m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2);
```

The resulting IR attribute is:

```
attributes #0 = { "vector-function-abi-variant"="_ZGVbN8l4v_MyAdd(MyAddVec)" }
```

## Example showing interaction with `declare simd`

```
#pragma omp declare simd linear(a) notinbranch
float foo_06(float *a, int x) __attribute__((clang_declare_simd_variant("vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")))) {
  return *a + x;
}

// Advanced SIMD version
float32x4_t vector_foo_06(float *a, int32x4_t vx) {
  // Custom implementation.
}
```

The resulting IR attribute is made of three symbols:

1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the versions the compiler builds by auto-vectorizing `foo_06` according to the rules defined in the Vector Function ABI specifications for AArch64.
2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the user-defined redirection of the 4-lane version of `foo_06` to the custom implementation provided by the user when targeting Advanced SIMD for version 8.2 of the A64 instruction set.

```
attributes #0 = { "vector-function-abi-variant"="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_ZGVnN4l4v_foo_06(vector_foo_06)" }
```
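The mangled names in the examples follow the general `_ZGV<isa><mask><vlen><parameters>_<scalar name>` layout of the Vector Function ABIs. The sketch below decodes a small subset of that layout, enough to read names such as `_ZGVnN2v_foo_01` and `_ZGVnN2vl8_foo_03` from the examples. It is an illustration rather than the proposed demangler: `DecodedName`, `DecodedParam` and `decodeVectorName` are hypothetical names, the ISA letter is kept as a raw character, only the `v`, `u` and `l<step>` parameter tokens are handled, and treating `x` as the scalable length and a bare `l` as a step of 1 are assumptions about the ABI encoding.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct DecodedParam {
  char Kind; // 'v' (vector), 'u' (uniform) or 'l' (linear).
  int Step;  // Linear step; 0 for non-linear parameters.
};

struct DecodedName {
  char ISA = 0;          // Target-specific ISA letter, kept as-is.
  bool Masked = false;   // 'M' (masked) versus 'N' (unmasked).
  bool Scalable = false; // Assumption: a length of 'x' means scalable.
  unsigned VF = 0;       // Number of lanes when not scalable.
  std::vector<DecodedParam> Params;
  std::string ScalarName;
};

// Returns true on success. Only a subset of the mangling is recognized.
bool decodeVectorName(const std::string &Mangled, DecodedName &Out) {
  Out = DecodedName();
  const std::string Prefix = "_ZGV";
  if (Mangled.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  std::size_t I = Prefix.size();
  if (I + 2 > Mangled.size())
    return false; // Need at least the ISA and mask letters.
  Out.ISA = Mangled[I++];
  char Mask = Mangled[I++];
  if (Mask != 'N' && Mask != 'M')
    return false;
  Out.Masked = (Mask == 'M');

  // Vector length: 'x' or a decimal number.
  if (I < Mangled.size() && Mangled[I] == 'x') {
    Out.Scalable = true;
    ++I;
  } else {
    std::size_t Start = I;
    while (I < Mangled.size() &&
           std::isdigit(static_cast<unsigned char>(Mangled[I])))
      Out.VF = Out.VF * 10 + static_cast<unsigned>(Mangled[I++] - '0');
    if (I == Start)
      return false;
  }

  // Parameter tokens, up to the '_' that precedes the scalar name.
  while (I < Mangled.size() && Mangled[I] != '_') {
    DecodedParam P = {Mangled[I++], 0};
    if (P.Kind != 'v' && P.Kind != 'u' && P.Kind != 'l')
      return false; // Token outside the subset handled here.
    if (P.Kind == 'l') {
      std::size_t Start = I;
      while (I < Mangled.size() &&
             std::isdigit(static_cast<unsigned char>(Mangled[I])))
        P.Step = P.Step * 10 + (Mangled[I++] - '0');
      if (I == Start)
        P.Step = 1; // Assumption: an omitted step means 1.
    }
    Out.Params.push_back(P);
  }
  if (I >= Mangled.size())
    return false; // Missing "_<scalar name>".
  Out.ScalarName = Mangled.substr(I + 1);
  return !Out.ScalarName.empty();
}
```

Decoding `_ZGVnN2vl8_foo_03` with this sketch gives ISA letter `n`, an unmasked 2-lane shape, one vector parameter followed by one linear parameter with step 8, and the scalar name `foo_03`, which matches the reading of Example 3.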
Simon Moll via llvm-dev
2019-Jun-17 08:02 UTC
[llvm-dev] RFC: Interface user provided vector functions with the vectorizer.
Hi Francesco,

On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
> 1. The Vector Function ABI name mangling scheme of a target is not
>    enough to describe all use cases of function vectorization that
>    the compiler might end up needing to support in the future.

I think the new name of the attribute makes this point clear.

> [...]
>
> Kind regards,
>
> Francesco

LGTM. Please add me as a reviewer for this when you post patches.

Thanks!

Simon

--
Simon Moll
Researcher / PhD Student
Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31
Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de
Fax. +49 (0)681 302-3065 : http://compilers.cs.uni-saarland.de/people/moll
Doerfert, Johannes via llvm-dev
2019-Jun-17 12:05 UTC
[llvm-dev] RFC: Interface user provided vector functions with the vectorizer.
I agree with Simon. This looks good conceptually. I have minor implementation comments, but those can wait until the code reviews.

Sorry for the delay and thanks for working on this.
The C attribute `clang_declare_simd_variant` and the `declare > variant` directive used for the `simd` trait will be sharing the > internals in clang, so adding the OpenMP functionality for `simd` > traits will be mostly handling the directive in the OpenMP > parser. How this should be done is described in > https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attribute > 3. The IR attribute `vector-function-abi-variant` is not to be > extended to represent other kind of vectorization other than those > handled by `declare simd` and that are handled with a Vector > Function ABI. > 4. The IR attribute `vector-function-abi-variant` is not defined to be > extended to represent the information of `declare variant` in its > totality. > 5. The IR attribute will not need to change when we will introduce non > vector function ABI vectorization (RV-like, reductions...) or when > we will decide to fully support `declare variant`. The information > it carries will not need to be invalidated, but just extended with > new attributes that will need to be handled by the > `VectorFunctionShape` class, in a similar way the > `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which > operates on individual attributes to describe an overall > functionality. > > # Examples > > ## Example 1 > > Exposing an Advanced SIMD vector function when targeting Advanced SIMD > in AArch64. > > ``` > double foo_01(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_01", simdlen(2), notinbranch, isa("simd")); > > // Advanced SIMD version > float64x2_t vector_foo_01(float64x2_t VectorInput); > ``` > > The resulting IR attribute is: > > ``` > attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"} > ``` > > ## Example 2 > > Exposing an Advanced SIMD vector function when targeting Advanced SIMD > in AArch64, but with the wrong signature. 
The user specifies a masked > version of the function in the clauses of the attribute, the compiler > throws an error suggesting the signature expected for > ``vector_foo_02.`` > > ``` > double foo_02(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_02", simdlen(2), inbranch, isa("simd")); > > // Advanced SIMD version > float64x2_t vector_foo_02(float64x2_t VectorInput); > // (suggested) compiler error -> ^ Missing mask parameter of type `uint64x2_t`. > ``` > > ## Example 3 > > Targeting `sincos`-like signatures. > > ``` > void foo_03(double Input, double * Output) __attribute__(clang_declare_simd_variant(“vector_foo_03", simdlen(2), notinbranch, linear(Output, 1), isa("simd")); > > // Advanced SIMD version > void vector_foo_03(float64x2_t VectorInput, double * Output); > ``` > > The resulting IR attribute is: > > ``` > attribute #0 = {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"} > ``` > ## Example 4 > > Scalable vectorization targeting SVE > > ``` > double foo_04(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_04", simdlen("scalable"), notinbranch, isa("sve")); > > // SVE version > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask); > ``` > > The resulting IR attribute is: > > ``` > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"} > ``` > > ## Example 5 > > Fixed length vectorization targeting SVE > > ``` > double foo_05(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_05", simdlen(4), inbranch, isa("sve")); > > // Fixed-length SVE version > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask); > ``` > > The resulting IR attribute is: > > ``` > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"} > ``` > > ## Example 06 > > This is an x86 example, equivalent to the one provided by Andrei > Elovikow in > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html. 
Godbolt > rendering with ICC at https://godbolt.org/z/Of1NxZ > > ``` > float MyAdd(float* a, int b) __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) > { > return *a + b; > } > > > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); > ``` > > The resulting IR attribute is: > > ``` > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"} > ``` > > ## Example showing interaction with `declare simd` > > ``` > #pragma omp declare simd linear(a) notinbranch > float foo_06(float *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) { > return *a + x; > } > > // Advanced SIMD version > float32x4_t vector_foo_06(float *a, int32x4_t vx) { > // Custom implementation. > } > ``` > > The resulting IR attribute is made of three symbols: > > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the > ones the compiler builds by auto-vectorizing `foo_06` according to > the rule defined in the Vector Function ABI specifications for > AArch64. > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the > user-defined redirection of the 4-lane version of `foo_06` to the > custom implementation provided by the user when targeting Advanced > SIMD for version 8.2 of the A64 instruction set. > > ``` > attribute #0 = {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_ZGVnN4l4v_foo_06(vector_foo_06)"} > ``` >-- Simon Moll Researcher / PhD Student Compiler Design Lab (Prof. Hack) Saarland University, Computer Science Building E1.3, Room 4.31 Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de Fax. +49 (0)681 302-3065 : http://compilers.cs.uni-saarland.de/people/moll -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190617/5902a8f4/attachment-0001.html>
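
For readers following the thread, here is a rough sketch of what the "Vector ABI demangler" step could look like for the `vector-function-abi-variant` string format quoted above. It is only an illustration under stated assumptions, not the code proposed in the RFC: `VariantEntry`, `parseEntry` and `parseAttribute` are hypothetical names, the struct is a simplified stand-in for `VectorFunctionShape`, parameter tokens are kept as a raw string rather than decoded, and a length token of `x` is assumed to mean a scalable vector length as in the AArch64 Vector Function ABI.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the proposed VectorFunctionShape.
struct VariantEntry {
  std::string MangledName; // e.g. "_ZGVnN2v_foo_01"
  std::string Redirection; // e.g. "vector_foo_01"; empty if none was given
  char ISA = 0;            // target-specific ISA letter following "_ZGV"
  bool Masked = false;     // 'M' = masked, 'N' = unmasked
  bool Scalable = false;   // assumption: length token 'x' means scalable
  unsigned VF = 0;         // number of lanes; stays 0 when Scalable is set
  std::string Params;      // raw parameter tokens, e.g. "l4v" (not decoded)
  std::string ScalarName;  // e.g. "foo_01"
};

// Remove leading and trailing spaces.
static std::string trim(const std::string &S) {
  const std::size_t B = S.find_first_not_of(' ');
  if (B == std::string::npos)
    return "";
  const std::size_t E = S.find_last_not_of(' ');
  return S.substr(B, E - B + 1);
}

// Parse one list item: "<mangled name>" or "<mangled name>(<redirection>)".
std::optional<VariantEntry> parseEntry(const std::string &Raw) {
  VariantEntry E;
  std::string Text = trim(Raw);

  // Split off the optional "(redirection)" suffix.
  const std::size_t Open = Text.find('(');
  if (Open != std::string::npos) {
    if (Text.back() != ')')
      return std::nullopt;
    E.Redirection = Text.substr(Open + 1, Text.size() - Open - 2);
    Text = Text.substr(0, Open);
  }
  E.MangledName = Text;

  // Assumed layout: _ZGV <isa> <mask> <len> <params> _ <scalar name>
  if (Text.size() < 8 || Text.compare(0, 4, "_ZGV") != 0)
    return std::nullopt;
  E.ISA = Text[4];
  if (Text[5] != 'M' && Text[5] != 'N')
    return std::nullopt;
  E.Masked = Text[5] == 'M';

  std::size_t Pos = 6;
  if (Text[Pos] == 'x') {
    E.Scalable = true;
    ++Pos;
  } else {
    while (Pos < Text.size() &&
           std::isdigit(static_cast<unsigned char>(Text[Pos])))
      E.VF = E.VF * 10 + static_cast<unsigned>(Text[Pos++] - '0');
    if (E.VF == 0)
      return std::nullopt;
  }

  // The first '_' after the length separates parameters from the scalar name.
  const std::size_t Sep = Text.find('_', Pos);
  if (Sep == std::string::npos || Sep == Pos || Sep + 1 == Text.size())
    return std::nullopt;
  E.Params = Text.substr(Pos, Sep - Pos);
  E.ScalarName = Text.substr(Sep + 1);
  return E;
}

// Split the comma-separated attribute value; malformed items are skipped.
std::vector<VariantEntry> parseAttribute(const std::string &Attr) {
  std::vector<VariantEntry> Out;
  std::size_t Start = 0;
  while (Start <= Attr.size()) {
    std::size_t End = Attr.find(',', Start);
    if (End == std::string::npos)
      End = Attr.size();
    if (auto E = parseEntry(Attr.substr(Start, End - Start)))
      Out.push_back(*E);
    Start = End + 1;
  }
  return Out;
}
```

Calling `parseAttribute` on the three-symbol string from the last example would yield three entries, with only the third carrying a redirection to `vector_foo_06`. A query interface in the spirit of the SVFS could then filter these entries by lane count and masking without exposing the mangled names to its callers.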