search for: vrcpps

Displaying 2 results from an estimated 2 matches for "vrcpps".

2020 Aug 31
Should llvm optimize 1.0 / x ?
...:
   0: c4 e2 79 18 0d 00 00    vbroadcastss 0x0(%rip),%xmm1    # 9 <_Z4fct1Dv4_f+0x9>
   7: 00 00
   9: c5 f0 5e c0             vdivps %xmm0,%xmm1,%xmm0
   d: c3                      retq
   e: 66 90                   xchg %ax,%ax

0000000000000010 <_Z4fct2Dv4_f>:
  10: c5 f8 53 c0             vrcpps %xmm0,%xmm0
  14: c3                      retq

As you can see, 1.0 / x is not turned into vrcpps. Is it because of precision or a missing optimization?

Regards,
-- Alexandre Bique
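On the precision side of that question: Intel documents the rcpps estimate as having a relative error of at most 1.5 * 2^-12, so it is not a drop-in replacement for a correctly rounded division. The sketch below is only an illustration of that gap and of how one Newton-Raphson step narrows it. The helpers approx_rcp, refine_rcp and rel_error are hypothetical names, and approx_rcp merely emulates a roughly 12-bit estimate by truncating the mantissa; it does not reproduce what vrcpps actually returns.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware reciprocal estimate: computes 1/x and
// clears the low 12 mantissa bits, leaving about 12 significant bits.
// Assumption: this only mimics the accuracy class, not the real vrcpps output.
float approx_rcp(float x) {
    assert(x > 0.0f && std::isfinite(x));
    float y = 1.0f / x;
    std::uint32_t bits;
    std::memcpy(&bits, &y, sizeof bits);   // reinterpret the float as raw bits
    bits &= 0xFFFFF000u;                   // drop the low 12 mantissa bits
    std::memcpy(&y, &bits, sizeof y);
    return y;
}

// One Newton-Raphson step for f(y) = 1/y - x, i.e. y1 = y0 * (2 - x * y0).
// Each step roughly squares the relative error of the estimate.
float refine_rcp(float x, float y0) {
    return y0 * (2.0f - x * y0);
}

// Relative error of an estimate y of 1/x, measured in double precision.
double rel_error(float x, float y) {
    const double exact = 1.0 / static_cast<double>(x);
    return std::fabs(static_cast<double>(y) - exact) / exact;
}
```

Calling rel_error(x, approx_rcp(x)) and then rel_error(x, refine_rcp(x, approx_rcp(x))) for a value such as x = 3.0f shows the coarse estimate sitting several orders of magnitude further from 1/x than the refined one, which is the kind of difference a compiler cannot introduce silently under default floating-point semantics.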
2020 Sep 01
Should llvm optimize 1.0 / x ?
Hi Quentin,

You are correct. I managed to get clang to use vrcpps, but not in a satisfying way:

clang++ -O3 -march=native -mtune=native \
    -Rpass=loop-vectorize -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize \
    -ffast-math -ffp-model=fast -ffp-exception-behavior=ignore -ffp-contract=fast \
    -c -o vec.o vec.cc

0000000000000140 <_Z4fct4Dv4_f>:
 14...