thr3ads.net - search: "vfmadd132ss"

[LLVMdev] FPOpFusion = Fast and Multiply-and-add combines

2014 Jul 31

Hi Tim, Thanks for the thorough explanation. It makes perfect sense. I was not aware that fast-math is supposed to prevent more precision being used than what is in the standard. I came across this issue while looking into the output of different compilers. The XL and Microsoft compilers seem to have that turned on by default. But I assume that clang follows what gcc does, and has that turned off.

[LLVMdev] FPOpFusion = Fast and Multiply-and-add combines

2014 Aug 07

...
>
> $ cat fma.c
> float foo(float x, float y, float z) { return x * y + z; }
>
> $ ./clang -march=core-avx2 -O2 -S fma.c -o - | grep ss
> vmulss %xmm1, %xmm0, %xmm0
> vaddss %xmm2, %xmm0, %xmm0
>
> $ ./gcc -march=core-avx2 -O2 -S fma.c -o - | grep ss
> vfmadd132ss %xmm1, %xmm2, %xmm0
>
> ----------------------------------------------------------------------
> This was brought up in Dec 2013 on this list:
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-December/068868.html
>
> I don't see an answer as to whether this is a bug for a...
