thr3ads.net - search: "simdlen"

Displaying 20 results from an estimated 27 matches for "simdlen".

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 11

RFC: Interface user provided vector functions with the vectorizer.

...implemented. ``` clang_declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>}) <variant-func-id>:= The name of a function variant that is a base language identifier, or, for C++, a template-id. <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>} <simdlen> := simdlen(<positive number>) | simdlen("scalable") <mask> := inbranch | notinbranch <optional simd clauses> := <linear clause> | <uniform clause>...

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 17

RFC: Interface user provided vector functions with the vectorizer.

..._declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>}) > > <variant-func-id>:= The name of a function variant that is a base language identifier, or, > for C++, a template-id. > > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>} > > <simdlen> := simdlen(<positive number>) | simdlen("scalable") > > <mask> := inbranch | notinbranch > > <optional simd clauses> := <linear clause> > | <...

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 24

RFC: Interface user provided vector functions with the vectorizer.

...ble to distinguish what C type generated the signature of > `foo`. > > I don’t know if this is going to be a problem for other architectures, > but this is definitely a problem on AArch64 where we need to be able > to generate the correct vector function signature for a specific > simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector > type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>. > > Therefore, I would like to propose a change to the RFC, which would > move the responsibility of generating the vector function sig...

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 21

RFC: Interface user provided vector functions with the vectorizer.

...s i64, therefore it is not possible to distinguish what C type generated the signature of `foo`. I don’t know if this is going to be a problem for other architectures, but this is definitely a problem on AArch64 where we need to be able to generate the correct vector function signature for a specific simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>. Therefore, I would like to propose a change to the RFC, which would move the responsibility of generating the vector function signature from LLVM to...

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 24

RFC: Interface user provided vector functions with the vectorizer.

LAA behavior on Incorrect #pragma omp simd.

2019 Jun 26

LAA behavior on Incorrect #pragma omp simd.

Hi All, I have a doubt regarding the behavior of LoopAccessAnalysis on incorrect #pragma omp simd with -fopenmp-simd flag. How should the compiler behave if the #pragma omp simd on a loop is incorrect and can be proved by Loop Access Analysis. Here is the sample code. #pragma omp simd for (dim_t p = 0; p < m; ++p) #pragma unroll for (dim_t i = 0; i < 6; ++i) { {

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 04

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...e to accept anything a `declare variant` in a `simd` context could accept, but that would probably raise the same problems that `declare variant` has raised. > > In fact, on AArch64, a typical use of declare variant would be more complicated than just specifying `simd` in the `context`, with `simdlen` and `[not]inbranch`. It would also require specifying the `isa` trait from the `device` set, and using a `vendor` specific set to be able to list scalable (i.e. vector-length agnostic) functions for SVE. > > This would definitely look like implementing `declare variant`, as the attribute w...

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 24

RFC: Interface user provided vector functions with the vectorizer.

>Thank you everybody for their input, and for your patience. This is proving harder than expected! :) Thank you for doing the hard part of the work. Hideki -----Original Message----- From: Francesco Petrogalli [mailto:Francesco.Petrogalli at arm.com] Sent: Monday, June 24, 2019 11:26 AM To: Saito, Hideki <hideki.saito at intel.com> Cc: Doerfert, Johannes <jdoerfert at anl.gov>;

Does OpenMP hints bypass the vectorisation legality check in llvm

2018 Jan 19

Does OpenMP hints bypass the vectorisation legality check in llvm

...SIMD code. Whether we run LVL.canVectorize() or not, we still have to record such information so that correct vector code will be generated. Also, unless programmer specifies all "cost model decisions", we still have to run cost model to fill the gap. There are many OpenMP SIMD loops w/o SIMDLEN clause ---- vectorizer needs to decide the optimal VF for the loop. Thanks, Hideki --------------------- Date: Fri, 19 Jan 2018 17:54:05 +0000 From: "Tian, Xinmin via llvm-dev" <llvm-dev at lists.llvm.org> To: Tom Sun <ps702 at cam.ac.uk>, "llvm-dev at lists.llvm.org"...

[RFC] Re-implementing -fveclib with OpenMP

2018 Nov 30

[RFC] Re-implementing -fveclib with OpenMP

Hi all, I am submitting the following RFC [1] to re-implement -fveclib via OpenMP constructs. The RFC was discussed during a round table at the last LLVM developer meeting, and presented during the BoF [2]. The proposal is published on Phabricator, for the purpose of keeping track of the comments, and it is now ready for a review from a wider audience after being polished by Hal Finkel and Hideki

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 03

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

On Mon, 3 Jun 2019 at 20:00, Francesco Petrogalli via cfe-dev < cfe-dev at lists.llvm.org> wrote: > Hi All, > > The original intent of this thread is to "Expose user provided vector > function for auto-vectorization." > > I originally proposed to use OpenMP `declare variant` for the sake of > using something that is defined by a standard. The RFC itself is not

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 May 31

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...be `<2 x double>(<2 x double>)`. > > The problem with this choice is the number of vector version available for a target is not unique. > > In particular, the following declaration generates multiple vector versions, depending on the target: > > #pragma omp declare simd simdlen(2) notinbranch > double foo(double) {…}; > > On x86, this generates at least 4 symbols (one for SSE, one for AVX, one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) > > On aarch64, the same declaration generates a unique symbol, as specified in the Vector Function ABI....
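The declaration quoted in this snippet can be tried out with a small, self-contained sketch. The body of `foo` is elided in the original message, so the arithmetic below is a placeholder, and the helper `apply_foo` is a hypothetical caller added only for illustration. When built with an OpenMP-aware compiler (for example with `-fopenmp-simd`), the `declare simd` pragma asks for vector variants of `foo`; without that flag the pragmas are ignored and the code runs as plain scalar C.

```c
#include <stddef.h>

/* Request vector variants of foo that handle 2 doubles per call
   (simdlen(2)) and are never called under a mask (notinbranch).
   The body is a placeholder; the original message elides it. */
#pragma omp declare simd simdlen(2) notinbranch
double foo(double x) {
    return 2.0 * x + 1.0;
}

/* Hypothetical caller: a loop the vectorizer may rewrite to call
   a vector variant of foo instead of the scalar function. */
void apply_foo(const double *in, double *out, size_t n) {
#pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        out[i] = foo(in[i]);
    }
}
```

Inspecting the emitted symbols on different targets (as the godbolt link in the message does) is a direct way to see the difference the thread describes: several variants on x86, one on AArch64.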

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 May 31

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...> >>> The problem with this choice is the number of vector version available for a target is not unique. >>> >>> In particular, the following declaration generates multiple vector versions, depending on the target: >>> >>> #pragma omp declare simd simdlen(2) notinbranch >>> double foo(double) {…}; >>> >>> On x86, this generates at least 4 symbols (one for SSE, one for AVX, one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) >>> >>> On aarch64, the same declaration generates a unique symbol,...

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 May 30

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...>>>>>> correspondent `#pragma omp declare variant` defined in OpenMP 5.0, as in >>>>>> the following example. >>>>>> >>>>>> #pragma clang declare variant(vector_sinf) \ >>>>>> match(construct=simd(simdlen(4),notinbranch), device={isa("simd")}) >>>>>> extern float sinf(float); >>>>>> >>>>>> float32x4_t vector_sinf(float32x4_t x); >>>>>> >>>>>> The `construct` set in the directive, together...

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 May 31

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...with this choice is the number of vector version available for a target is not unique. >>>>> >>>>> In particular, the following declaration generates multiple vector versions, depending on the target: >>>>> >>>>> #pragma omp declare simd simdlen(2) notinbranch >>>>> double foo(double) {…}; >>>>> >>>>> On x86, this generates at least 4 symbols (one for SSE, one for AVX, one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) >>>>> >>>>> On aarch64, the same...

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 May 31

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...function definition with that _ZGV... name, with the function body of the scalar code surrounded by a constant trip loop (trip count is part of the mangled function name) and then massage the function body. Once the IR for "#pragma omp simd" is well defined, we'd add #pragma omp simd simdlen() to that constant trip loop. The end result is a function with a widened interface that can be called from the vectorized caller. The work of VecClone ends here. Later, LoopVectorize should process the #pragma omp simd simdlen() and that's the real end of the callee side handling of #pragma om...
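The callee-side transformation described in this snippet can be pictured at the source level. The following is only a sketch of the idea, not the VecClone pass itself: the names `scalar_f` and `f_widened` are hypothetical, a simdlen of 4 is assumed, and plain arrays stand in for the vector arguments that the real widened function (with its `_ZGV...` mangled name) would take.

```c
/* Original scalar function (hypothetical example body). */
float scalar_f(float x) {
    return x * x + 1.0f;
}

/* Sketch of the widened clone: the scalar body is wrapped in a loop
   whose constant trip count (4 here) matches the assumed simdlen.
   Arrays stand in for vector-typed arguments and return value. */
void f_widened(const float in[4], float out[4]) {
#pragma omp simd simdlen(4)
    for (int lane = 0; lane < 4; ++lane) {
        /* Same computation as scalar_f, applied once per lane. */
        out[lane] = in[lane] * in[lane] + 1.0f;
    }
}
```

In this picture, the loop vectorizer later turns the constant-trip loop into straight-line vector code, which is the step the message attributes to LoopVectorize.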

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 May 31

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...day not be different from declaring these N variants explicitly with the respective declare variant match clause. > >> In particular, the following declaration generates multiple vector > >> versions, depending on the target: > >> > >> #pragma omp declare simd simdlen(2) notinbranch > >> double foo(double) {…}; > >> > >> On x86, this generates at least 4 symbols (one for SSE, one for AVX, > >> one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) > >> > >> On aarch64, the same declaration generate...

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 May 31

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...f vector version available for a target is not unique. >>>>>>> >>>>>>> In particular, the following declaration generates multiple vector versions, depending on the target: >>>>>>> >>>>>>> #pragma omp declare simd simdlen(2) notinbranch >>>>>>> double foo(double) {…}; >>>>>>> >>>>>>> On x86, this generates at least 4 symbols (one for SSE, one for AVX, one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) >>>>>>> >>>...

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 May 31

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...choice is the number of vector version available > for a target is not unique. For me, this simply means this mangling scheme is not sufficient. > In particular, the following declaration generates multiple vector > versions, depending on the target: > > #pragma omp declare simd simdlen(2) notinbranch > double foo(double) {…}; > > On x86, this generates at least 4 symbols (one for SSE, one for AVX, > one for AVX2, and one for AVX512: https://godbolt.org/z/TLYXPi) > > On aarch64, the same declaration generates a unique symbol, as > specified in the Vector Fun...

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 May 29

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

...the >> `#pragma clang declare variant` according to the rule defined by the >> correspondent `#pragma omp declare variant` defined in OpenMP 5.0, as in >> the following example. >> >> #pragma clang declare variant(vector_sinf) \ >> match(construct=simd(simdlen(4),notinbranch), device={isa("simd")}) >> extern float sinf(float); >> >> float32x4_t vector_sinf(float32x4_t x); >> >> The `construct` set in the directive, together with the `device` set, is >> used to generate the vector mangled name to be...
