Displaying 2 results from an estimated 2 matches for "__builtin_vector_hadd".
2020 Aug 19
Question about llvm vectors
...ector of [4 x float] containing the horizontal sums of
/// both operands.
static __inline__ __m128 __DEFAULT_FN_ATTRS
_mm_hadd_ps(__m128 __a, __m128 __b)
{
  return __builtin_ia32_haddps((__v4sf)__a, (__v4sf)__b);
}
Here clang translates _mm_hadd_ps into a CPU-specific builtin.
Why not create __builtin_vector_hadd(a, b), which would select the
CPU-specific instruction or fall back to a generic implementation?
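
To make the idea concrete, here is a minimal sketch of what a generic fallback could look like, assuming it followed the x86 HADDPS lane order. The names v4f, vector_hadd_generic and hadd_demo are hypothetical and only for illustration; __builtin_vector_hadd does not exist in clang, and a real builtin might define different semantics.

```c
#include <assert.h>

/* 4 x float vector via the GCC/Clang vector_size extension. */
typedef float v4f __attribute__((vector_size(16)));

/* Hypothetical generic fallback (not a real clang builtin).
   Assumes the x86 HADDPS lane order:
   { a0 + a1, a2 + a3, b0 + b1, b2 + b3 } */
static inline v4f vector_hadd_generic(v4f a, v4f b)
{
    v4f r = { a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
    return r;
}

/* Small usage check with values that are exact in float arithmetic. */
static void hadd_demo(void)
{
    v4f a = { 1.0f, 2.0f, 3.0f, 4.0f };
    v4f b = { 10.0f, 20.0f, 30.0f, 40.0f };
    v4f r = vector_hadd_generic(a, b);

    assert(r[0] == 3.0f);   /* a0 + a1 */
    assert(r[1] == 7.0f);   /* a2 + a3 */
    assert(r[2] == 30.0f);  /* b0 + b1 */
    assert(r[3] == 70.0f);  /* b2 + b3 */
}
```

On a target with SSE3 the compiler would be free to pick haddps for this pattern, and on other targets it would emit ordinary shuffles and adds; whether it actually does so depends on the optimizer.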
Many thanks,
Alex
2020 Aug 20
Question about llvm vectors
...tor
is not referenced there
https://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors
Best regards,
Alexandre Bique
On Wed, Aug 19, 2020 at 8:34 PM Craig Topper <craig.topper at gmail.com> wrote:
> I'm not sure everyone would agree that the behavior of a
> __builtin_vector_hadd should do what the X86 instruction does. It takes two
> vectors and produces a result with elements from both vectors. Someone
> might argue that a horizontal add should just take one source and produce a
> vector with half the number of elements. Someone else might argue that a
> horiz...
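
The first two readings described in the quoted reply can be put side by side in a short sketch. This is only an illustration under assumed names (v4f, v2f, hadd_two_source and hadd_one_source are hypothetical); neither function is an existing clang builtin, and the third reading is cut off in the excerpt, so it is not shown.

```c
#include <assert.h>

/* Vector types via the GCC/Clang vector_size extension. */
typedef float v4f __attribute__((vector_size(16)));  /* 4 x float */
typedef float v2f __attribute__((vector_size(8)));   /* 2 x float */

/* Reading 1 (what the x86 instruction does): two sources, result has
   the same width and mixes pairwise sums from both vectors. */
static inline v4f hadd_two_source(v4f a, v4f b)
{
    v4f r = { a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
    return r;
}

/* Reading 2: one source, result has half the number of elements. */
static inline v2f hadd_one_source(v4f a)
{
    v2f r = { a[0] + a[1], a[2] + a[3] };
    return r;
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, the first function yields {3, 7, 30, 70} and the second yields {3, 7} for a alone, which shows why a single generic name would have to commit to one definition.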