Jacob Lifshay via llvm-dev
2021-Jul-01 23:33 UTC
[llvm-dev] [PATCH] Add optional _Float16 support
On Thu, Jul 1, 2021, 15:28 H.J. Lu via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers <joseph at codesourcery.com> wrote:
> >
> > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
> >
> > > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> >
> > That restricts use of _Float16 to processors with SSE.  Is that what we
> > want in the ABI, or should _Float16 be available with base 32-bit x86
> > architecture features only, much like _Float128 and the decimal FP types
>
> Yes, _Float16 requires XMM registers.
>
> > are?  (If it is restricted to SSE, we can of course ensure relevant libgcc
> > functions are built with SSE enabled, and likewise in glibc if that gains
> > _Float16 functions, though maybe with some extra complications to get
> > relevant testcases to run whenever possible.)
>
> _Float16 functions in libgcc should be compiled with SSE enabled.
>
> BTW, _Float16 software emulation may require more than just SSE
> since we need to do _Float16 load and store with XMM registers.
> There is no 16bit load/store for XMM registers without AVX512FP16.

Umm, if you just need to load/store 16-bit scalars in XMM registers you can
use pextrw and pinsrw which don't require AVX. f16x8 can use any of the
standard full-register load/stores.

https://gcc.godbolt.org/z/ncznr9TM1

Jacob
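For illustration, the pinsrw/pextrw approach described above might look roughly like the following in C. This is only a sketch, not the code behind the godbolt link: the function names load_f16_bits and store_f16_bits are hypothetical, the value is handled as raw 16-bit storage rather than as _Float16, and the intrinsics used come from <emmintrin.h>, so the sketch assumes SSE2 is enabled. The compiler remains free to choose the exact instruction sequence.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_insert_epi16, _mm_extract_epi16 */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: place the 16 raw bits of a half-precision value
 * into lane 0 of an XMM register (maps to pinsrw); other lanes are zero. */
static inline __m128i load_f16_bits(const void *p)
{
    uint16_t bits;
    assert(p != NULL);
    memcpy(&bits, p, sizeof bits);  /* alignment-safe 16-bit read */
    return _mm_insert_epi16(_mm_setzero_si128(), bits, 0);
}

/* Hypothetical helper: write lane 0 of an XMM register back to memory
 * as 16 raw bits (maps to pextrw). */
static inline void store_f16_bits(void *p, __m128i v)
{
    uint16_t bits = (uint16_t)_mm_extract_epi16(v, 0);
    assert(p != NULL);
    memcpy(p, &bits, sizeof bits);
}
```

A caller would round-trip a value with store_f16_bits(&dst, load_f16_bits(&src)); the eight-lane f16x8 case needs no such helpers, since an ordinary full-register load such as _mm_loadu_si128 already covers it.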
Richard Biener via llvm-dev
2021-Jul-02 07:45 UTC
[llvm-dev] [PATCH] Add optional _Float16 support
On Fri, Jul 2, 2021 at 1:34 AM Jacob Lifshay via Gcc-patches <gcc-patches at gcc.gnu.org> wrote:
>
> On Thu, Jul 1, 2021, 15:28 H.J. Lu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> > On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers <joseph at codesourcery.com> wrote:
> > >
> > > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
> > >
> > > > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> > >
> > > That restricts use of _Float16 to processors with SSE.  Is that what we
> > > want in the ABI, or should _Float16 be available with base 32-bit x86
> > > architecture features only, much like _Float128 and the decimal FP types
> >
> > Yes, _Float16 requires XMM registers.
> >
> > > are?  (If it is restricted to SSE, we can of course ensure relevant libgcc
> > > functions are built with SSE enabled, and likewise in glibc if that gains
> > > _Float16 functions, though maybe with some extra complications to get
> > > relevant testcases to run whenever possible.)
> >
> > _Float16 functions in libgcc should be compiled with SSE enabled.
> >
> > BTW, _Float16 software emulation may require more than just SSE
> > since we need to do _Float16 load and store with XMM registers.
> > There is no 16bit load/store for XMM registers without AVX512FP16.
>
> Umm, if you just need to load/store 16-bit scalars in XMM registers you can
> use pextrw and pinsrw which don't require AVX. f16x8 can use any of the
> standard full-register load/stores.

It looks like that requires SSE2; with SSE only, inserts/extracts to/from
MMX regs are supported. But of course GPR half-word loads and GPR->XMM
moves of full size would work.

> https://gcc.godbolt.org/z/ncznr9TM1
>
> Jacob
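For illustration, the alternative mentioned above (a half-word load into a general-purpose register followed by a full-size GPR->XMM move) could be sketched as follows. The function names load_f16_via_gpr and store_f16_via_gpr are hypothetical and not from the thread. Note that the intrinsics used here, _mm_cvtsi32_si128 and _mm_cvtsi128_si32, are declared in <emmintrin.h>, so this particular sketch still assumes SSE2.

```c
#include <assert.h>
#include <emmintrin.h>  /* _mm_cvtsi32_si128, _mm_cvtsi128_si32 */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 16 raw bits through a general-purpose
 * register, then move the zero-extended 32-bit value into an XMM
 * register (a movd-style transfer). */
static inline __m128i load_f16_via_gpr(const void *p)
{
    uint16_t bits;
    assert(p != NULL);
    memcpy(&bits, p, sizeof bits);        /* half-word load into a GPR */
    return _mm_cvtsi32_si128((int)bits);  /* 32-bit GPR -> XMM move */
}

/* Hypothetical helper: move the low 32 bits of the XMM register back to
 * a GPR and store only the low half-word. */
static inline void store_f16_via_gpr(void *p, __m128i v)
{
    uint16_t bits = (uint16_t)_mm_cvtsi128_si32(v);
    assert(p != NULL);
    memcpy(p, &bits, sizeof bits);
}
```

The upper bits of the register are zero after the load, so only the low 16 bits carry the half-precision payload.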