thr3ads.net - search: "__builtin_vector

Displaying 2 results from an estimated 2 matches for "__builtin_vector_hadd".

2020 Aug 19

Question about llvm vectors

...ector of [4 x float] containing the horizontal sums of
/// both operands.
static __inline__ __m128 __DEFAULT_FN_ATTRS
_mm_hadd_ps(__m128 __a, __m128 __b)
{
  return __builtin_ia32_haddps((__v4sf)__a, (__v4sf)__b);
}

Here clang will translate _mm_hadd_ps to a CPU-specific instruction. Why not create __builtin_vector_hadd(a, b), which would select the CPU-specific instruction or a generic fallback implementation?

Many thanks,
Alex
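A minimal sketch of what the question seems to be asking for, written as an ordinary C function rather than a compiler builtin. The name vector_hadd and the v4sf typedef are hypothetical, chosen for illustration; no such builtin exists in Clang as far as the thread indicates. The sketch assumes the x86 HADDPS lane order and uses the SSE3 intrinsic only when the compiler defines __SSE3__, falling back to plain element-wise code otherwise.

```c
#include <assert.h>   /* static_assert (C11) */

#if defined(__SSE3__)
#include <pmmintrin.h>  /* _mm_hadd_ps */
#endif

/* Four packed floats, using the GCC/Clang vector extension. */
typedef float v4sf __attribute__((vector_size(16)));

static_assert(sizeof(v4sf) == 4 * sizeof(float), "v4sf must hold four floats");

/* Hypothetical portable wrapper; not an existing Clang builtin.
 * Assumes the x86 HADDPS lane order:
 *   { a[0]+a[1], a[2]+a[3], b[0]+b[1], b[2]+b[3] } */
static inline v4sf vector_hadd(v4sf a, v4sf b)
{
#if defined(__SSE3__)
    /* Target-specific path: maps to the HADDPS instruction. */
    return (v4sf)_mm_hadd_ps((__m128)a, (__m128)b);
#else
    /* Generic fallback: scalar adds on vector lanes. */
    v4sf r = { a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
    return r;
#endif
}
```

Both paths are meant to produce the same lanes, so callers would not need to know which one was compiled in. Choosing the path with a preprocessor check is only one option; a real builtin would presumably leave that choice to the backend.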


2020 Aug 20

Question about llvm vectors

...tor is not referenced there https://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors

Best regards,
Alexandre Bique

On Wed, Aug 19, 2020 at 8:34 PM Craig Topper <craig.topper at gmail.com> wrote:
> I'm not sure everyone would agree that the behavior of a
> __builtin_vector_hadd should do what the X86 instruction does. It takes two
> vectors and produces a result with elements from both vectors. Someone
> might argue that a horizontal add should just take one source and produce a
> vector with half the number of elements. Someone else might argue that a
> horiz...
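The first two interpretations mentioned in the quoted reply can be put side by side in a short sketch. The function and type names below (hadd_two_sources, hadd_one_source, f32x4, f32x2) are hypothetical and exist only for this illustration; the third interpretation is cut off in the snippet, so it is not shown.

```c
#include <assert.h>   /* static_assert (C11) */

/* GCC/Clang vector extension types: four and two packed floats. */
typedef float f32x4 __attribute__((vector_size(16)));
typedef float f32x2 __attribute__((vector_size(8)));

static_assert(sizeof(f32x4) == 2 * sizeof(f32x2),
              "f32x2 must be half the width of f32x4");

/* Interpretation 1 (what the x86 instruction does): two sources,
 * result has the same width and mixes pair sums from both inputs. */
static inline f32x4 hadd_two_sources(f32x4 a, f32x4 b)
{
    f32x4 r = { a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
    return r;
}

/* Interpretation 2: one source, result has half as many elements. */
static inline f32x2 hadd_one_source(f32x4 a)
{
    f32x2 r = { a[0] + a[1], a[2] + a[3] };
    return r;
}
```

With a = {1, 2, 3, 4}, the one-source form gives {3, 7}, which matches the low half of the two-source result. The two forms differ in arity and result type, which is the kind of disagreement the reply raises about a generic builtin.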
