thr3ads.net - search: "simd_variant"

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 03

2

...> mangling scheme of the Vector Function ABI provides all the information
> about the shape and properties of the vector function, I propose the
> approach exemplified in the following code:
>
> ```
> // AArch64 Advanced SIMD compilation
> double foo(double) __attribute__(simd_variant("nN2v","neon_foo"));
> float64x2_t neon_foo(float64x2_t x) {…}
>
> // x86 SSE compilation
> double foo(double) __attribute__(simd_variant("aN2v","sse_foo"));
> __m128 sse_foo(__m128 x) {…}
> ```
>
> The attribute would use the "core" tokens of the mangled names (without
>...

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 03

6

...ld use a custom attribute. Because the mangling scheme of the Vector Function ABI provides all the information about the shape and properties of the vector function, I propose the approach exemplified in the following code:

```
// AArch64 Advanced SIMD compilation
double foo(double) __attribute__(simd_variant("nN2v","neon_foo"));
float64x2_t neon_foo(float64x2_t x) {…}

// x86 SSE compilation
double foo(double) __attribute__(simd_variant("aN2v","sse_foo"));
__m128 sse_foo(__m128 x) {…}
```

The attribute would use the "core" tokens of the mangled names (without _ZGV prefix and the scalar function name p...
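To make the idea of "core" tokens concrete, here is a minimal sketch in C of how a string such as `nN2v` could be split into its parts. It assumes the layout used by the Vector Function ABI mangling: one ISA letter, then `N` (unmasked) or `M` (masked), then a lane count or `x` for a scalable length, then one letter per parameter. The names `simd_variant_shape` and `parse_core_tokens` are hypothetical and are not part of the proposal; the parameter handling is deliberately simplified to plain `v` (vector), `u` (uniform), and `l` (linear) without modifiers.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARAMS 16

/* Hypothetical decoded form of a core token string such as "nN2v". */
typedef struct {
    char isa;                    /* ISA letter, kept verbatim (e.g. 'n') */
    bool masked;                 /* true for 'M', false for 'N' */
    bool scalable;               /* true when the length token is 'x' */
    unsigned vlen;               /* number of lanes; 0 when scalable */
    char params[MAX_PARAMS + 1]; /* one kind letter per parameter */
} simd_variant_shape;

/* Returns true and fills *out on success, false on malformed input.
 * Simplification (an assumption of this sketch): only the plain parameter
 * kinds 'v', 'u' and 'l' are accepted, with no strides or alignment. */
bool parse_core_tokens(const char *core, simd_variant_shape *out)
{
    assert(core != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    const char *p = core;

    /* 1. ISA letter: stored as-is, not interpreted here. */
    if (!isalpha((unsigned char)*p))
        return false;
    out->isa = *p++;

    /* 2. Mask token. */
    if (*p == 'M')
        out->masked = true;
    else if (*p != 'N')
        return false;
    p++;

    /* 3. Vector length: 'x' for scalable, otherwise a positive integer. */
    if (*p == 'x') {
        out->scalable = true;
        p++;
    } else {
        if (!isdigit((unsigned char)*p))
            return false;
        while (isdigit((unsigned char)*p))
            out->vlen = out->vlen * 10 + (unsigned)(*p++ - '0');
        if (out->vlen == 0)
            return false;
    }

    /* 4. One kind letter per parameter; at least one is required. */
    size_t n = 0;
    for (; *p != '\0'; p++) {
        if (*p != 'v' && *p != 'u' && *p != 'l')
            return false;
        if (n == MAX_PARAMS)
            return false;
        out->params[n++] = *p;
    }
    return n > 0;
}
```

Calling `parse_core_tokens("nN2v", &shape)` would yield an unmasked, two-lane shape with a single vector parameter, which matches the `float64x2_t neon_foo(float64x2_t)` signature in the snippet.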

[RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 07

2

...ld use a custom attribute. Because the mangling scheme of the Vector Function ABI provides all the information about the shape and properties of the vector function, I propose the approach exemplified in the following code:

```
// AArch64 Advanced SIMD compilation
double foo(double) __attribute__(simd_variant("nN2v","neon_foo"));
float64x2_t neon_foo(float64x2_t x) {…}

// x86 SSE compilation
double foo(double) __attribute__(simd_variant("aN2v","sse_foo"));
__m128 sse_foo(__m128 x) {…}
```

The attribute would use the "core" tokens of the mangled names (without _ZGV prefix and the scalar function name p...

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 04

2

...set, and to use a `vendor` specific set to be able to list scalable (i.e. vector-length agnostic) functions for SVE.
>
> This would definitely look like implementing `declare variant`, as the attribute would need to accept something like the following:
>
> ```
> . . . __attribute__(simd_variant("vector_version","context={simd(simdlen(2),notinbranch"},device={isa("simd")}))
> ```
>
> Using the sub-string from the mangled name to me has the following advantages:
>
> 1. No need to deal with `declare variant`.
> 2. [...]
> 3. In terms of usability, there is no nee...

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 01

2

Page 22 of the OpenMP 5.0 specification (lines 13-14): "When any thread encounters a simd construct, the iterations of the loop associated with the construct may be executed concurrently using the SIMD lanes that are available to the thread." This is the Execution Model. The word here is "may", i.e., not "must". Declare simd is not explicitly mentioned here, but requiring
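As an illustration of the "may, not must" reading, the following C sketch marks a reduction loop with `#pragma omp simd`. The function name `sum_of_squares` is made up for this example. A compiler is free to vectorize the loop, to leave it scalar, or (when OpenMP support is not enabled) to ignore the pragma entirely; the loop is written so that the result is valid in every case.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the squares of n doubles.
 * The pragma permits SIMD execution of the iterations but does not
 * require it; without OpenMP support the loop simply runs as scalar code. */
double sum_of_squares(const double *x, size_t n)
{
    assert(x != NULL || n == 0);

    double sum = 0.0;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; i++) {
        sum += x[i] * x[i];
    }
    return sum;
}
```

Because a SIMD reduction may reorder the additions, floating-point results can differ in the last bits between a vectorized and a scalar build for general inputs; small integer-valued inputs sum exactly either way.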

search for: simd_variant