search for: _z4fct2dv4_f

Displaying 3 results from an estimated 3 matches for "_z4fct2dv4_f".

2020 Aug 31
2
Vectorization of math function failed?
...%rsp),%xmm1,%xmm1
  57: 20
  58: c4 e3 71 21 c0 30     vinsertps $0x30,%xmm0,%xmm1,%xmm0
  5e: 48 83 c4 48           add $0x48,%rsp
  62: c3                    retq
  63: 66 2e 0f 1f 84 00 00  nopw %cs:0x0(%rax,%rax,1)
  6a: 00 00 00
  6d: 0f 1f 00              nopl (%rax)

0000000000000070 <_Z4fct2Dv4_f>:
  70: 48 83 ec 48           sub $0x48,%rsp
  74: c5 f8 29 04 24        vmovaps %xmm0,(%rsp)
  79: e8 00 00 00 00        callq 7e <_Z4fct2Dv4_f+0xe>
  7e: c5 f8 29 44 24 30     vmovaps %xmm0,0x30(%rsp)
  84: c5 fa 16 04 24        vmovshdup (%rsp),%xmm0
  89: e8 00 00 00 00        callq...
2020 Aug 31
2
Should llvm optimize 1.0 / x ?
....text:

0000000000000000 <_Z4fct1Dv4_f>:
   0: c4 e2 79 18 0d 00 00  vbroadcastss 0x0(%rip),%xmm1 # 9 <_Z4fct1Dv4_f+0x9>
   7: 00 00
   9: c5 f0 5e c0           vdivps %xmm0,%xmm1,%xmm0
   d: c3                    retq
   e: 66 90                 xchg %ax,%ax

0000000000000010 <_Z4fct2Dv4_f>:
  10: c5 f8 53 c0           vrcpps %xmm0,%xmm0
  14: c3                    retq

As you can see, 1.0 / x is not turned into vrcpps. Is it because of precision or a missing optimization?

Regards,
--
Alexandre Bique
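The precision side of this question can be illustrated with a short sketch. It assumes an x86 target with SSE and uses the standard `<xmmintrin.h>` intrinsics; the helper names (`recip_exact`, `recip_estimate`, `recip_refined`, `max_rel_error`) are hypothetical and are not the source behind `fct1` and `fct2`, which the snippet does not show. The reciprocal estimate instruction is documented to have a relative error of up to about 1.5 * 2^-12, whereas a division is correctly rounded, so a compiler would normally be expected to make this substitution only when relaxed floating-point options permit it.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: _mm_div_ps, _mm_rcp_ps, ...
#include <cmath>        // std::fabs, std::fmax

// Correctly rounded reciprocal: a real division (divps / vdivps).
__m128 recip_exact(__m128 x) {
    return _mm_div_ps(_mm_set1_ps(1.0f), x);
}

// Hardware estimate (rcpps / vrcpps): only about 12 bits are accurate.
__m128 recip_estimate(__m128 x) {
    return _mm_rcp_ps(x);
}

// Estimate plus one Newton-Raphson step, y' = y * (2 - x * y),
// which roughly doubles the number of accurate bits.
__m128 recip_refined(__m128 x) {
    __m128 y = _mm_rcp_ps(x);
    __m128 two_minus_xy = _mm_sub_ps(_mm_set1_ps(2.0f), _mm_mul_ps(x, y));
    return _mm_mul_ps(y, two_minus_xy);
}

// Largest relative error of `approx` against `exact` over the four lanes.
float max_rel_error(__m128 approx, __m128 exact) {
    float a[4], e[4];
    _mm_storeu_ps(a, approx);
    _mm_storeu_ps(e, exact);
    float worst = 0.0f;
    for (int i = 0; i < 4; ++i) {
        worst = std::fmax(worst, std::fabs(a[i] - e[i]) / std::fabs(e[i]));
    }
    return worst;
}
```

Comparing `max_rel_error(recip_estimate(x), recip_exact(x))` with the same measurement for `recip_refined` on a few inputs shows the gap: the raw estimate is visibly less accurate than the division, and the refined version closes most of that gap at the cost of extra multiplies.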
2020 Sep 01
2
Should llvm optimize 1.0 / x ?
...0d 00 00 vbroadcastss 0x0(%rip),%xmm1 # 9
> > <_Z4fct1Dv4_f+0x9>
> > 7: 00 00
> > 9: c5 f0 5e c0 vdivps %xmm0,%xmm1,%xmm0
> > d: c3 retq
> > e: 66 90 xchg %ax,%ax
> >
> > 0000000000000010 <_Z4fct2Dv4_f>:
> > 10: c5 f8 53 c0 vrcpps %xmm0,%xmm0
> > 14: c3 retq
> >
> >
> > As you can see, 1.0 / x is not turned into vrcpps. Is it because of
> > precision or a missing optimization?
> >
> > Regards,
> > --
> >...