Displaying 3 results from an estimated 3 matches for "vdivp".
2020 Aug 31
2
Should llvm optimize 1.0 / x ?
...return __builtin_ia32_rcpps(x);
}
Which is compiled to:
vec.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z4fct1Dv4_f>:
0: c4 e2 79 18 0d 00 00 vbroadcastss 0x0(%rip),%xmm1 # 9 <_Z4fct1Dv4_f+0x9>
7: 00 00
9: c5 f0 5e c0 vdivps %xmm0,%xmm1,%xmm0
d: c3 retq
e: 66 90 xchg %ax,%ax
0000000000000010 <_Z4fct2Dv4_f>:
10: c5 f8 53 c0 vrcpps %xmm0,%xmm0
14: c3 retq
As you can see, 1.0 / x is not turned into vrcpps. Is it because of
precision or a m...
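The precision side of that question can be checked directly. The following is a minimal sketch, not code from the thread: the helper names (recip_exact, recip_estimate, recip_refined, max_rel_error, check_accuracy) are hypothetical, and it assumes an x86 target with SSE. It compares the vdivps-style reciprocal with the vrcpps-style estimate, which Intel documents as having a relative error of at most 1.5 * 2^-12, and with the estimate followed by one Newton-Raphson step.

```cpp
#include <cassert>
#include <cmath>
#include <xmmintrin.h>  // SSE intrinsics: _mm_div_ps, _mm_rcp_ps

// Correctly rounded 1.0f / x per lane; typically compiles to (v)divps.
inline __m128 recip_exact(__m128 x) {
    return _mm_div_ps(_mm_set1_ps(1.0f), x);
}

// Hardware reciprocal estimate; compiles to (v)rcpps.
inline __m128 recip_estimate(__m128 x) {
    return _mm_rcp_ps(x);
}

// Estimate plus one Newton-Raphson step: r1 = r0 * (2 - x * r0).
inline __m128 recip_refined(__m128 x) {
    const __m128 r0 = _mm_rcp_ps(x);
    const __m128 two = _mm_set1_ps(2.0f);
    return _mm_mul_ps(r0, _mm_sub_ps(two, _mm_mul_ps(x, r0)));
}

// Largest per-lane relative error of `approx` against `exact`.
inline float max_rel_error(__m128 approx, __m128 exact) {
    float a[4], e[4];
    _mm_storeu_ps(a, approx);
    _mm_storeu_ps(e, exact);
    float worst = 0.0f;
    for (int i = 0; i < 4; ++i) {
        worst = std::fmax(worst, std::fabs(a[i] - e[i]) / std::fabs(e[i]));
    }
    return worst;
}

// Sanity check for nonzero, normal-range inputs (thresholds are
// illustrative: slightly above 1.5 * 2^-12, and a few float ulps).
inline void check_accuracy(__m128 x) {
    const __m128 exact = recip_exact(x);
    assert(max_rel_error(recip_estimate(x), exact) < 4e-4f);
    assert(max_rel_error(recip_refined(x), exact) < 1e-6f);
}
```

The raw estimate carries only about 12 bits of precision, so substituting it for the division changes results. The refined version gets close to full float precision but is still not guaranteed to be bit-identical to the division, and it yields NaN for x == 0 (the estimate is infinity, and 0 * infinity is NaN), whereas the division yields infinity.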
2014 Oct 13
2
[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets
...-dC --no-show-raw-insn ./a.out
...
00000000004004f0 <main>:
4004f0: vmovdqa 0x2004c8(%rip),%xmm0 # 6009c0 <x>
4004f8: vpsrld $0x17,%xmm0,%xmm0
4004fd: vpaddd 0x17b(%rip),%xmm0,%xmm0 # 400680 <__dso_handle+0x8>
400505: vcvtdq2ps %xmm0,%xmm1
400509: vdivps 0x17f(%rip),%xmm1,%xmm1 # 400690 <__dso_handle+0x18>
400511: vcvttps2dq %xmm1,%xmm1
400515: vpmullw 0x183(%rip),%xmm1,%xmm1 # 4006a0 <__dso_handle+0x28>
40051d: vpsubd %xmm1,%xmm0,%xmm0
400521: vmovq %xmm0,%rax
400526: movslq %eax,%rcx
400529: sar...
2020 Sep 01
2
Should llvm optimize 1.0 / x ?
...elf64-x86-64
> >
> >
> > Disassembly of section .text:
> >
> > 0000000000000000 <_Z4fct1Dv4_f>:
> > 0: c4 e2 79 18 0d 00 00 vbroadcastss 0x0(%rip),%xmm1 # 9 <_Z4fct1Dv4_f+0x9>
> > 7: 00 00
> > 9: c5 f0 5e c0 vdivps %xmm0,%xmm1,%xmm0
> > d: c3 retq
> > e: 66 90 xchg %ax,%ax
> >
> > 0000000000000010 <_Z4fct2Dv4_f>:
> > 10: c5 f8 53 c0 vrcpps %xmm0,%xmm0
> > 14: c3 retq
> >
> >
> > A...