2014 Jul 31
[LLVMdev] FPOpFusion = Fast and Multiply-and-add combines
Hi Tim,
Thanks for the thorough explanation. It makes perfect sense.
I was not aware that fast-math is supposed to prevent the use of more
precision than the standard specifies.
I came across this issue while looking into the output of different
compilers. The XL and Microsoft compilers seem
to have that turned on by default. But I assume that clang follows what gcc
does, and has that turned off.
2014 Aug 07
[LLVMdev] FPOpFusion = Fast and Multiply-and-add combines
...>
> $ cat fma.c
> float foo(float x, float y, float z) { return x * y + z; }
>
> $ ./clang -march=core-avx2 -O2 -S fma.c -o - | grep ss
> vmulss %xmm1, %xmm0, %xmm0
> vaddss %xmm2, %xmm0, %xmm0
>
> $ ./gcc -march=core-avx2 -O2 -S fma.c -o - | grep ss
> vfmadd132ss %xmm1, %xmm2, %xmm0
>
> ----------------------------------------------------------------------
> This was brought up in Dec 2013 on this list:
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-December/068868.html
>
> I don't see an answer as to whether this is a bug for a...