search for: simdlen

Displaying 20 results from an estimated 27 matches for "simdlen".

2019 Jun 11
2
RFC: Interface user provided vector functions with the vectorizer.
...implemented. ``` clang_declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>}) <variant-func-id>:= The name of a function variant that is a base language identifier, or, for C++, a template-id. <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>} <simdlen> := simdlen(<positive number>) | simdlen("scalable") <mask> := inbranch | notinbranch <optional simd clauses> := <linear clause> | <uniform clause>...
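The two mandatory clauses in the grammar quoted above can be checked mechanically. The following is a minimal sketch, not part of the RFC: the function name `parse_simd_clauses` is hypothetical, only the `<simdlen>, <mask>` pair is handled, and it assumes a "positive number" is a decimal integer without leading zeros.

```python
import re

# Hypothetical checker for the two mandatory clauses of the quoted grammar:
#   <simdlen> := simdlen(<positive number>) | simdlen("scalable")
#   <mask>    := inbranch | notinbranch
# Assumption: a positive number is decimal, with no leading zeros.
_SIMD_CLAUSES = re.compile(
    r'\s*simdlen\(\s*(?P<len>[1-9][0-9]*|"scalable")\s*\)\s*,'
    r'\s*(?P<mask>inbranch|notinbranch)\s*'
)


def parse_simd_clauses(text):
    """Return (simdlen, mask); simdlen is an int or the string 'scalable'."""
    m = _SIMD_CLAUSES.fullmatch(text)
    if m is None:
        raise ValueError(f"not a valid <simdlen>, <mask> pair: {text!r}")
    raw = m.group("len")
    simdlen = "scalable" if raw == '"scalable"' else int(raw)
    return simdlen, m.group("mask")
```

For example, `parse_simd_clauses('simdlen(4), notinbranch')` yields `(4, 'notinbranch')`, while `simdlen(0)` is rejected because the grammar asks for a positive number.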
2019 Jun 17
3
RFC: Interface user provided vector functions with the vectorizer.
..._declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>}) > > <variant-func-id>:= The name of a function variant that is a base language identifier, or, > for C++, a template-id. > > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>} > > <simdlen> := simdlen(<positive number>) | simdlen("scalable") > > <mask> := inbranch | notinbranch > > <optional simd clauses> := <linear clause> > | <...
2019 Jun 24
2
RFC: Interface user provided vector functions with the vectorizer.
...ble to distinguish what C type generated the signature of > `foo`. > > I don’t know if this is going to be a problem for other architectures, > but this is definitely a problem on AArch64 where we need to be able > to generate the correct vector function signature for a specific > simdlen(N) attached to `foo`. When simdlen(2), for type 1 the vector > type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>. > > Therefore, I would like to propose a change to the RFC, which would > move the responsibility of generating the vector function sig...
2019 Jun 21
2
RFC: Interface user provided vector functions with the vectorizer.
...s i64, therefore it is not possible to distinguish what C type generated the signature of `foo`. I don’t know if this is going to be a problem for other architectures, but this is definitely a problem on AArch64 where we need to be able to generate the correct vector function signature for a specific simdlen(N) attached to `foo`. When simdlen(2), for type 1 the vector type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>. Therefore, I would like to propose a change to the RFC, which would move the responsibility of generating the vector function signature from LLVM to...
2019 Jun 24
4
RFC: Interface user provided vector functions with the vectorizer.
...s i64, therefore it is not possible to distinguish what C type generated the signature of `foo`. I don’t know if this is going to be a problem for other architectures, but this is definitely a problem on AArch64 where we need to be able to generate the correct vector function signature for a specific simdlen(N) attached to `foo`. When simdlen(2), for type 1 the vector type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>. Therefore, I would like to propose a change to the RFC, which would move the responsibility of generating the vector function signature from LLVM to...
2019 Jun 26
3
LAA behavior on Incorrect #pragma omp simd.
Hi All, I have a doubt regarding the behavior of LoopAccessAnalysis on incorrect #pragma omp simd with -fopenmp-simd flag. How should the compiler behave if the #pragma omp simd on a loop is incorrect and can be proved by Loop Access Analysis. Here is the sample code. #pragma omp simd for (dim_t p = 0; p < m; ++p) #pragma unroll for (dim_t i = 0; i < 6; ++i) { {
2019 Jun 04
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...e to accept anything a `declare variant` in a `simd` context could accept, but that would probably raise the same problems that `declare variant` has raised. > > In fact, on AArch64, a typical use of declare variant would be more complicated than just specifying `simd` in the `context`, with `simdlen` and `[not]inbranch`. It would also require specifying the `isa` trait from the `device` set, and using a `vendor` specific set to be able to list scalable (i.e. vector-length agnostic) functions for SVE. > > This would definitely look like implementing `declare variant`, as the attribute w...
2019 Jun 24
2
RFC: Interface user provided vector functions with the vectorizer.
>Thank you everybody for their input, and for your patience. This is proving harder than expected! :) Thank you for doing the hard part of the work. Hideki -----Original Message----- From: Francesco Petrogalli [mailto:Francesco.Petrogalli at arm.com] Sent: Monday, June 24, 2019 11:26 AM To: Saito, Hideki <hideki.saito at intel.com> Cc: Doerfert, Johannes <jdoerfert at anl.gov>;
2018 Jan 19
1
Does OpenMP hints bypass the vectorisation legality check in llvm
...SIMD code. Whether we run LVL.canVectorize() or not, we still have to record such information so that correct vector code will be generated. Also, unless programmer specifies all "cost model decisions", we still have to run cost model to fill the gap. There are many OpenMP SIMD loops w/o SIMDLEN clause ---- vectorizer needs to decide the optimal VF for the loop. Thanks, Hideki --------------------- Date: Fri, 19 Jan 2018 17:54:05 +0000 From: "Tian, Xinmin via llvm-dev" <llvm-dev at lists.llvm.org> To: Tom Sun <ps702 at cam.ac.uk>, "llvm-dev at lists.llvm.org"...
2018 Nov 30
2
[RFC] Re-implementing -fveclib with OpenMP
Hi all, I am submitting the following RFC [1] to re-implement -fveclib via OpenMP constructs. The RFC was discussed during a round table at the last LLVM developer meeting, and presented during the BoF [2]. The proposal is published on Phabricator, for the purpose of keeping track of the comments, and it is now ready for a review from a wider audience after being polished by Hal Finkel and Hideki
2019 Jun 03
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
On Mon, 3 Jun 2019 at 20:00, Francesco Petrogalli via cfe-dev < cfe-dev at lists.llvm.org> wrote: > Hi All, > > The original intent of this thread is to "Expose user provided vector > function for auto-vectorization." > > I originally proposed to use OpenMP `declare variant` for the sake of > using something that is defined by a standard. The RFC itself is not
2019 May 31
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...be `<2 x double>(<2 x double>)`. > > The problem with this choice is the number of vector versions available for a target is not unique. > > In particular, the following declaration generates multiple vector versions, depending on the target: > > #pragma omp declare simd simdlen(2) notinbranch > double foo(double) {…}; > > On x86, this generates at least 4 symbols (one for SSE, one for AVX, one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) > > On aarch64, the same declaration generates a unique symbol, as specified in the Vector Function ABI....
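To make the "4 symbols on x86, one on aarch64" point concrete, here is a minimal sketch of how the `_ZGV` names for `double foo(double)` with `simdlen(2) notinbranch` are assembled. The helper names (`mangle`, `variants`) are hypothetical. The ISA letters are taken from the x86 and AArch64 Vector Function ABI documents as I understand them, and only plain vector (`v`) parameters and Advanced SIMD on AArch64 are modeled.

```python
# ISA letters as used in the x86 and AArch64 Vector Function ABIs
# (assumption: only these five targets are modeled here).
X86_ISA = {"SSE": "b", "AVX": "c", "AVX2": "d", "AVX512": "e"}
AARCH64_ISA = {"AdvSIMD": "n"}


def mangle(isa_letter, simdlen, masked, params, name):
    """Build _ZGV<isa><mask><vlen><params>_<name> for one vector variant."""
    mask = "M" if masked else "N"  # inbranch -> M, notinbranch -> N
    return f"_ZGV{isa_letter}{mask}{simdlen}{params}_{name}"


def variants(isa_table, simdlen, masked, params, name):
    """Return one mangled symbol per ISA in the table."""
    return {
        isa: mangle(letter, simdlen, masked, params, name)
        for isa, letter in isa_table.items()
    }


# double foo(double) with simdlen(2) notinbranch: one 'v' (vector) parameter.
x86_symbols = variants(X86_ISA, 2, False, "v", "foo")
aarch64_symbols = variants(AARCH64_ISA, 2, False, "v", "foo")
```

Under these assumptions `x86_symbols` holds four distinct names that differ only in the ISA letter, while `aarch64_symbols` holds a single name, which is the asymmetry the message describes.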
2019 May 31
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...>> >>> The problem with this choice is the number of vector versions available for a target is not unique. >>> >>> In particular, the following declaration generates multiple vector versions, depending on the target: >>> >>> #pragma omp declare simd simdlen(2) notinbranch >>> double foo(double) {…}; >>> >>> On x86, this generates at least 4 symbols (one for SSE, one for AVX, one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) >>> >>> On aarch64, the same declaration generates a unique symbol,...
2019 May 30
5
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...>>>>>> corresponding `#pragma omp declare variant` defined in OpenMP 5.0, as in >>>>>> the following example. >>>>>> >>>>>> #pragma clang declare variant(vector_sinf) \ >>>>>> match(construct=simd(simdlen(4),notinbranch), device={isa("simd")}) >>>>>> extern float sinf(float); >>>>>> >>>>>> float32x4_t vector_sinf(float32x4_t x); >>>>>> >>>>>> The `construct` set in the directive, together...
2019 May 31
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...with this choice is the number of vector versions available for a target is not unique. >>>>> >>>>> In particular, the following declaration generates multiple vector versions, depending on the target: >>>>> >>>>> #pragma omp declare simd simdlen(2) notinbranch >>>>> double foo(double) {…}; >>>>> >>>>> On x86, this generates at least 4 symbols (one for SSE, one for AVX, one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) >>>>> >>>>> On aarch64, the same...
2019 May 31
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...function definition with that _ZGV... name, with the function body of the scalar code surrounded by a constant trip loop (trip count is part of the mangled function name) and then massage the function body. Once the IR for "#pragma omp simd" is well defined, we'd add #pragma omp simd simdlen() to that constant trip loop. The end result is a function with a widened interface that can be called from the vectorized caller. The work of VecClone ends here. Later, LoopVectorize should process the #pragma omp simd simdlen() and that's the real end of the callee side handling of #pragma om...
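The "scalar body surrounded by a constant trip loop" idea described above can be pictured with a small model. This is only an illustration of the shape of the transformation, not LLVM's VecClone pass: `vec_clone` and `scalar_foo` are hypothetical names, Python lists stand in for vector registers, and no IR or pragma handling is modeled.

```python
def vec_clone(scalar_fn, simdlen):
    """Wrap a scalar function in a loop whose trip count is fixed at simdlen.

    The returned function has the 'widened' interface: it takes simdlen
    lanes and returns simdlen results. In the scheme described above, a
    later vectorizer pass would turn this loop into real vector code.
    """
    def widened(lanes):
        if len(lanes) != simdlen:
            raise ValueError(f"expected {simdlen} lanes, got {len(lanes)}")
        out = [None] * simdlen
        for i in range(simdlen):  # constant trip count = simdlen
            out[i] = scalar_fn(lanes[i])
        return out

    return widened


def scalar_foo(x):
    # Stand-in for the user's scalar function body.
    return 2.0 * x + 1.0


foo_simdlen2 = vec_clone(scalar_foo, 2)
```

Calling `foo_simdlen2([1.0, 2.0])` applies the scalar body once per lane, which is the widened interface that a vectorized caller would target.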
2019 May 31
5
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...day not be different from declaring these N variants explicitly with the respective declare variant match clause. > >> In particular, the following declaration generates multiple vector > >> versions, depending on the target: > >> > >> #pragma omp declare simd simdlen(2) notinbranch > >> double foo(double) {…}; > >> > >> On x86, this generates at least 4 symbols (one for SSE, one for AVX, > >> one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) > >> > >> On aarch64, the same declaration generates...
2019 May 31
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...f vector versions available for a target is not unique. >>>>>>> >>>>>>> In particular, the following declaration generates multiple vector versions, depending on the target: >>>>>>> >>>>>>> #pragma omp declare simd simdlen(2) notinbranch >>>>>>> double foo(double) {…}; >>>>>>> >>>>>>> On x86, this generates at least 4 symbols (one for SSE, one for AVX, one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) >>>>>>> >>>...
2019 May 31
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...choice is the number of vector versions available > for a target is not unique. For me, this simply means this mangling scheme is not sufficient. > In particular, the following declaration generates multiple vector > versions, depending on the target: > > #pragma omp declare simd simdlen(2) notinbranch > double foo(double) {…}; > > On x86, this generates at least 4 symbols (one for SSE, one for AVX, > one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) > > On aarch64, the same declaration generates a unique symbol, as > specified in the Vector Fun...
2019 May 29
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
...the >> `#pragma clang declare variant` according to the rule defined by the >> corresponding `#pragma omp declare variant` defined in OpenMP 5.0, as in >> the following example. >> >> #pragma clang declare variant(vector_sinf) \ >> match(construct=simd(simdlen(4),notinbranch), device={isa("simd")}) >> extern float sinf(float); >> >> float32x4_t vector_sinf(float32x4_t x); >> >> The `construct` set in the directive, together with the `device` set, is >> used to generate the vector mangled name to be...