search for: __m256

Displaying 12 results from an estimated 12 matches for "__m256".

2016 Nov 30
2
RFC: Adding Support For Vectorcall Calling Convention
...ion of HVA Types -------------------------------------- A Homogeneous Vector Aggregate (HVA) type is a composite type of up to four data members that have identical vector types. An HVA type has the same alignment requirement as the vector type of its members. For example: typedef struct { __m256 x; __m256 y; __m256 z; } hva3; // HVA type with 3 __m256 elements Vectorcall Extension ---------------------------- Vectorcall extends the standard x64 calling convention while adding support for HVA and vector types. There are four main differences: - Floating-point types are consid...
2016 May 31
2
[PATCH 1/2] Modify autoconf tests for intrinsics to stop clang from optimizing them away.
...= x"1"], @@ -576,10 +585,13 @@ AS_IF([test x"$enable_intrinsics" = x"yes"],[ [OPUS_X86_MAY_HAVE_AVX], [OPUS_X86_PRESUME_AVX], [[#include <immintrin.h> + #include <time.h> ]], [[ - static __m256 mtest; - mtest = _mm256_setzero_ps(); + __m256 mtest; + mtest = _mm256_set1_ps((float)time(NULL)); + mtest = _mm256_addsub_ps(mtest, mtest); + return _mm_cvtss_si32(_mm256_extractf128_ps(mtest, 0)); ]] ) AS_IF([test...
2019 Jun 10
2
[RFC] Expose user provided vector function for auto-vectorization.
...t the vectorizer should care about? For the case mentioned earlier: float MyAdd(float* a, int b) { return *a + b; } __declspec(vector_variant(implements(MyAdd(float *a, int b)), linear(a), vectorlength(8), nomask, processor(core_2nd_gen_avx))) __m256 __regcall MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2) If FE emitted ;; Alwaysinline define <8 x float> @MyAddVec.abi_wrapper(float* %v_a, <8 x i32> %v_b) { ;; Not sure about the exact values in the mask parameter. %v_b1 = shufflevector <8 x i32> %v_b, <8 x i32> un...
2019 Jun 10
2
[RFC] Expose user provided vector function for auto-vectorization.
...eloper-guide-and-reference-vector-variant: > > float MyAdd(float* a, int b) { return *a + b; } > __declspec(vector_variant(implements(MyAdd(float *a, int b)), > linear(a), vectorlength(8), > nomask, processor(core_2nd_gen_avx))) > __m256 __regcall MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2) > > We need somehow communicate which lanes of widened "b" would map for the b1 parameter and which would go to the b2. If we only care about single ABI (like the one mandated by the OMP) than such things could be put to TTI...
2014 Dec 15
2
[LLVMdev] ABI incompatability when passing vector parameters on 32-bit x86
...clang and GCC in the way vector parameters are passed on 32-bit x86. (This is documented in PR21510.) Specifically, GCC uses XMM0-XMM2 to pass the first 3 __m128 parameters, and the rest are passed on the stack. Clang passes an additional parameter by register, using XMM0-XMM3. The same applies to __m256 with YMM0-2 vs. YMM0-3. In theory, it would apply to __m512 as well, but currently clang doesn't support passing __m512 in x86 mode at all. ICC has the same behavior as GCC, and it seems that MSVC in 32-bit mode only *allows* up to 3 vector parameters per function (when not using __vectorcall),...
2019 Jun 07
2
[RFC] Expose user provided vector function for auto-vectorization.
...en-us/cpp-compiler-developer-guide-and-reference-vector-variant: float MyAdd(float* a, int b) { return *a + b; } __declspec(vector_variant(implements(MyAdd(float *a, int b)), linear(a), vectorlength(8), nomask, processor(core_2nd_gen_avx))) __m256 __regcall MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2) We need somehow communicate which lanes of widened "b" would map for the b1 parameter and which would go to the b2. If we only care about single ABI (like the one mandated by the OMP) than such things could be put to TTI, but wha...
2019 Jun 11
2
RFC: Interface user provided vector functions with the vectorizer.
...http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html. Godbolt rendering with ICC at https://godbolt.org/z/Of1NxZ ``` float MyAdd(float* a, int b) __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) { return *a + b; } __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); ``` The resulting IR attribute is: ``` attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"} ``` ## Example showing interaction with `declare simd` ``` #pragma omp declare simd linear(a) notinbranch float foo_06(float *a, in...
2019 Jun 17
3
RFC: Interface user provided vector functions with the vectorizer.
...une/132885.html. Godbolt > rendering with ICC at https://godbolt.org/z/Of1NxZ > > ``` > float MyAdd(float* a, int b) __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) > { > return *a + b; > } > > > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); > ``` > > The resulting IR attribute is: > > ``` > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"} > ``` > > ## Example showing interaction with `declare simd` > > ``` > #pragma omp decl...
2019 Jun 24
2
RFC: Interface user provided vector functions with the vectorizer.
...> >> > ``` >> > float MyAdd(float* a, int b) > __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), > notinbranch, arch("core_2nd_gen_avx")) >> > { >> > return *a + b; >> > } >> > >> > >> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); >> > ``` >> > >> > The resulting IR attribute is: >> > >> > ``` >> > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"} >> > ``` >> > >> > ## Ex...
2019 Jun 21
2
RFC: Interface user provided vector functions with the vectorizer.
...CC at https://godbolt.org/z/Of1NxZ > > > > ``` > > float MyAdd(float* a, int b) > > __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) { > > return *a + b; > > } > > > > > > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); ``` > > > > The resulting IR attribute is: > > > > ``` > > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"} > > ``` > > > > ## Example showing interaction with `declare simd`...
2019 Jun 24
4
RFC: Interface user provided vector functions with the vectorizer.
...ICC at https://godbolt.org/z/Of1NxZ > > > > ``` > > float MyAdd(float* a, int b) > > __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) { > > return *a + b; > > } > > > > > > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); ``` > > > > The resulting IR attribute is: > > > > ``` > > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"} > > ``` > > > > ## Example showing interaction with `declare simd`...
2019 Jun 24
2
RFC: Interface user provided vector functions with the vectorizer.
>Thank you everybody for their input, and for your patience. This is proving harder than expected! :) Thank you for doing the hard part of the work. Hideki -----Original Message----- From: Francesco Petrogalli [mailto:Francesco.Petrogalli at arm.com] Sent: Monday, June 24, 2019 11:26 AM To: Saito, Hideki <hideki.saito at intel.com> Cc: Doerfert, Johannes <jdoerfert at anl.gov>;