search for: _mm_hadd_ps

Displaying 2 results from an estimated 2 matches for "_mm_hadd_ps".

2020 Aug 19
2
Question about llvm vectors
...tor of [4 x float] containing one of the source operands.
///    The horizontal sums of the values are stored in the upper bits of the
///    destination.
/// \returns A 128-bit vector of [4 x float] containing the horizontal sums of
///    both operands.
static __inline__ __m128 __DEFAULT_FN_ATTRS
_mm_hadd_ps(__m128 __a, __m128 __b)
{
  return __builtin_ia32_haddps((__v4sf)__a, (__v4sf)__b);
}

Here clang will translate _mm_hadd_ps to a CPU specific feature. Why not create __builtin_vector_hadd(a, b) which would select the CPU specific instruction or a fallback generic implementation?

Many thanks,
Alex...
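As a rough illustration of what the question is asking for, here is a minimal sketch of a horizontal add that uses the SSE3 instruction when it is available and a plain scalar fallback otherwise. The names float4 and hadd4_generic are hypothetical and chosen only for this example; __builtin_vector_hadd itself is the poster's proposal, not an existing clang builtin. The sketch assumes the compiler defines __SSE3__ when SSE3 is enabled (as GCC and clang do with -msse3).

```c
#if defined(__SSE3__)
#include <pmmintrin.h>  /* SSE3: _mm_hadd_ps */
#endif

/* Hypothetical portable 4-float vector type, used only in this sketch. */
typedef struct { float v[4]; } float4;

/* Hypothetical helper: same lane layout as _mm_hadd_ps, that is
 * { a0+a1, a2+a3, b0+b1, b2+b3 }. */
static float4 hadd4_generic(float4 a, float4 b)
{
    float4 r;
#if defined(__SSE3__)
    /* CPU-specific path: a single haddps instruction. */
    __m128 va = _mm_loadu_ps(a.v);
    __m128 vb = _mm_loadu_ps(b.v);
    _mm_storeu_ps(r.v, _mm_hadd_ps(va, vb));
#else
    /* Generic fallback: pairwise sums written out by hand. */
    r.v[0] = a.v[0] + a.v[1];
    r.v[1] = a.v[2] + a.v[3];
    r.v[2] = b.v[0] + b.v[1];
    r.v[3] = b.v[2] + b.v[3];
#endif
    return r;
}
```

Here the choice between the two paths is made at compile time by the preprocessor; a real builtin would presumably leave that choice to the backend during instruction selection.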
2020 Aug 20
2
Question about llvm vectors
...The horizontal sums of the values are stored in the upper bits of
>> the
>> /// destination.
>> /// \returns A 128-bit vector of [4 x float] containing the horizontal
>> sums of
>> /// both operands.
>> static __inline__ __m128 __DEFAULT_FN_ATTRS
>> _mm_hadd_ps(__m128 __a, __m128 __b)
>> {
>> return __builtin_ia32_haddps((__v4sf)__a, (__v4sf)__b);
>> }
>>
>> Here clang will translate _mm_hadd_ps to a CPU specific feature.
>> Why not create __builtin_vector_hadd(a, b) which would select the CPU
>> specific instru...