search for: fct5

Displaying 2 results from an estimated 2 matches for "fct5".

2020 Sep 01
2
Should llvm optimize 1.0 / x ?
...x0(%rip),%xmm2 # 14d <_Z4fct4Dv4_f+0xd>
 14b: 00 00
 14d: c4 e2 71 ac c2 vfnmadd213ps %xmm2,%xmm1,%xmm0
 152: c4 e2 71 98 c1 vfmadd132ps %xmm1,%xmm1,%xmm0
 157: c3 retq
 158: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
 15f: 00

0000000000000160 <_Z4fct5Dv4_f>:
 160: c5 f8 53 c0 vrcpps %xmm0,%xmm0
 164: c3 retq

As you can see, fct4 is not equivalent to fct5.

Regards,
Alexandre Bique

On Tue, Sep 1, 2020 at 12:59 AM Quentin Colombet <qcolombet at apple.com> wrote:
>
> Hi Alexandre,
>
> Have you trie...
2020 Aug 31
2
Should llvm optimize 1.0 / x ?
Hi,

Here is a small C++ program:

vec.cc:

#include <cmath>

using v4f32 = float __attribute__((__vector_size__(16)));

v4f32 fct1(v4f32 x) { return 1.0 / x; }

v4f32 fct2(v4f32 x) { return __builtin_ia32_rcpps(x); }

Which is compiled to:

vec.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <_Z4fct1Dv4_f>:
 0: c4 e2 79 18 0d 00 00 vbroadcastss
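
A note on why fct4 and fct5 differ in the first result: vrcpps alone returns only an approximate reciprocal. The two FMA instructions at the end of fct4 have the shape of one Newton-Raphson refinement step. This assumes that xmm1 holds an initial reciprocal estimate and xmm2 a broadcast 1.0, neither of which is visible in the truncated snippet. Below is a minimal scalar sketch of such a step in portable C++. The name refine_reciprocal is hypothetical and does not come from the thread.

```cpp
#include <cmath>

// One Newton-Raphson step for y ~= 1/x, starting from the estimate y0.
// Hypothetical helper for illustration only; not code from the thread.
float refine_reciprocal(float x, float y0) {
    // e = 1 - x*y0: the residual error of the estimate
    // (same shape as vfnmadd213ps, assuming the addend is 1.0).
    float e = std::fma(-x, y0, 1.0f);
    // y1 = e*y0 + y0: the corrected estimate
    // (same shape as vfmadd132ps %xmm1,%xmm1,%xmm0).
    return std::fma(e, y0, y0);
}
```

Calling refine_reciprocal(3.0f, 0.333f) should give a value closer to 1/3 than the starting estimate 0.333f. A bare approximation such as the one in fct5 skips this correction, which would account for the different code.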