H.J. Lu via llvm-dev
2021-Jul-01 21:05 UTC
[llvm-dev] [PATCH] Add optional _Float16 support
1. Pass _Float16 and _Complex _Float16 values on stack. 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers. --- low-level-sys-info.tex | 57 +++++++++++++++++++++++++++++------------- 1 file changed, 40 insertions(+), 17 deletions(-) diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex index acaf30e..82956e3 100644 --- a/low-level-sys-info.tex +++ b/low-level-sys-info.tex @@ -30,7 +30,8 @@ object, and the term \emph{\textindex{\sixteenbyte{}}} refers to a \subsubsection{Fundamental Types} Table~\ref{basic-types} shows the correspondence between ISO C -scalar types and the processor scalar types. \code{__float80}, +scalar types and the processor scalar types. \code{_Float16}, +\code{__float80}, \code{__float128}, \code{__m64}, \code{__m128}, \code{__m256} and \code{__m512} types are optional. @@ -79,22 +80,25 @@ scalar types and the processor scalar types. \code{__float80}, & \texttt{\textit{any-type} *} & 4 & 4 & unsigned \fourbyte \\ & \texttt{\textit{any-type} (*)()} & & \\ \hline - Floating-& \texttt{float} & 4 & 4 & single (IEEE-754) \\ \cline{2-5} - point & \texttt{double} & 8 & 4 & double (IEEE-754) \\ - & \texttt{long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\ + & \texttt{_Float16}$^{\dagger\dagger\dagger\dagger\dagger\dagger}$ & 2 & 2 & 16-bit (IEEE-754) \\ \cline{2-5} - & \texttt{__float80}$^{\dagger\dagger}$ & 12 & 4 & 80-bit extended (IEEE-754) \\ - & \texttt{long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\ + & \texttt{float} & 4 & 4 & single (IEEE-754) \\ + \cline{2-5} + Floating- & \texttt{double} & 8 + & 8$^{\dagger\dagger\dagger\dagger}$ & double (IEEE-754) \\ + \cline{2-5} + point & \texttt{__float80}$^{\dagger\dagger}$ & 16 & 16 & 80-bit extended (IEEE-754) \\ + & \texttt{long double}$^{\dagger\dagger\dagger\dagger\dagger}$ & 16 & 16 & 80-bit extended (IEEE-754) \\ \cline{2-5} & \texttt{__float128}$^{\dagger\dagger}$ & 16 & 16 & 128-bit extended (IEEE-754) \\ \hline - Complex& \texttt{_Complex float} & 8 & 4 & complex single (IEEE-754) \\ + & \texttt{_Complex float} & 8 & 4 & complex single (IEEE-754) \\ \cline{2-5} - Floating-& \texttt{_Complex double} & 16 & 4 & complex double (IEEE-754) \\ - point & \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\ + Complex& \texttt{_Complex double} & 16 & 4 & complex double (IEEE-754) \\ + Floating-& \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\ \cline{2-5} - & \texttt{_Complex __float80}$^{\dagger\dagger}$ & 24 & 4 & complex 80-bit extended (IEEE-754) \\ + point & \texttt{_Complex __float80}$^{\dagger\dagger}$ & 24 & 4 & complex 80-bit extended (IEEE-754) \\ & \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\ \cline{2-5} & \texttt{_Complex __float128}$^{\dagger\dagger}$ & 32 & 16 & complex 128-bit extended (IEEE-754) \\ @@ -125,6 +129,8 @@ The \texttt{long double} type is 64-bit, the same as the \texttt{double} type, on the Android{\texttrademark} platform. More information on the Android{\texttrademark} platform is available from \url{http://www.android.com/}.}\\ +\multicolumn{5}{p{13cm}}{\myfontsize $^{\dagger\dagger\dagger\dagger\dagger\dagger}$ +The \texttt{_Float16} type, from ISO/IEC TS 18661-3:2015, is optional.}\\ \end{tabular} } \end{table} @@ -323,6 +329,7 @@ at the time of the call. \begin{table} \Hrule \caption{Register Usage} + \myfontsize \label{fig-reg-usage} \begin{center} \begin{tabular}{l|p{8.35cm}|l} @@ -346,13 +353,29 @@ of some 64bit return types & No \\ \EBP & callee-saved register; optionally used as frame pointer & Yes \\ \ESI & callee-saved register & yes \\ \EDI & callee-saved register & yes \\ -\reg{xmm0}, \reg{ymm0} & scratch registers; also used to pass and return -\code{__m128}, \code{__m256} parameters & No\\ -\reg{xmm1}--\reg{xmm2},& scratch registers; also used to pass -\code{__m128}, & No \\ -\reg{ymm1}--\reg{ymm2} & \code{__m256} parameters & \\ -\reg{xmm3}--\reg{xmm7},& scratch registers & No \\ -\reg{ymm3}--\reg{ymm7} & & \\ +\reg{xmm0} & scratch register; also used to pass the first \code{__m128} + parameter and return \code{__m128}, \code{_Float16}, + the real part of \code{_Complex _Float16} & No \\ +\reg{ymm0} & scratch register; also used to pass the first \code{__m256} + parameter and return \code{__m256} & No \\ +\reg{zmm0} & scratch register; also used to pass the first \code{__m512} + parameter and return \code{__m512} & No \\ +\reg{xmm1} & scratch register; also used to pass the second \code{__m128} + parameter and return the imaginary part of + \code{_Complex _Float16} & No \\ +\reg{ymm1} & scratch register; also used to pass the second \code{__m256} + parameters & No \\ +\reg{zmm1} & scratch register; also used to pass the second \code{__m512} + parameters & No \\ +\reg{xmm2} & scratch register; also used to pass the third \code{__m128} + parameters & No \\ +\reg{ymm2} & scratch register; also used to pass the third \code{__m256} + parameters & No \\ +\reg{zmm2} & scratch register; also used to pass the third \code{__m512} + parameters & No \\ +\reg{xmm3}--\reg{xmm7} & scratch registers & No \\ +\reg{ymm3}--\reg{ymm7} & scratch registers & No \\ +\reg{zmm3}--\reg{zmm7} & scratch registers & No \\ \reg{mm0} & scratch register; also used to pass and return \code{__m64} parameter & No\\ \reg{mm1}--\reg{mm2} & used to pass \code{__m64} parameters & No\\ -- 2.31.1
Joseph Myers via llvm-dev
2021-Jul-01 22:10 UTC
[llvm-dev] [PATCH] Add optional _Float16 support
On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:> 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.That restricts use of _Float16 to processors with SSE. Is that what we want in the ABI, or should _Float16 be available with base 32-bit x86 architecture features only, much like _Float128 and the decimal FP types are? (If it is restricted to SSE, we can of course ensure relevant libgcc functions are built with SSE enabled, and likewise in glibc if that gains _Float16 functions, though maybe with some extra complications to get relevant testcases to run whenever possible.) -- Joseph S. Myers joseph at codesourcery.com