H.J. Lu via llvm-dev

2021-Jul-01 21:05 UTC

[llvm-dev] [PATCH] Add optional _Float16 support

1. Pass _Float16 and _Complex _Float16 values on stack.
2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
---
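For readers skimming the two rules above, here is a minimal C sketch that models them as a lookup. It is illustration only: the names `f16_kind`, `f16_where`, `f16_location` and `f16_classify` are hypothetical and come from neither the patch nor any compiler. The stack and register assignments are the ones stated in the commit message.

```c
/* Hypothetical model of the two rules; not compiler or ABI source. */
enum f16_kind  { F16_SCALAR, F16_COMPLEX };
enum f16_where { F16_NONE, F16_STACK, F16_XMM0, F16_XMM1 };

struct f16_location {
    enum f16_where arg;       /* where an argument of this type is passed */
    enum f16_where ret_real;  /* return value, or its real part */
    enum f16_where ret_imag;  /* imaginary part; F16_NONE for a scalar */
};

struct f16_location f16_classify(enum f16_kind kind)
{
    /* Rule 1: arguments go on the stack.  Rule 2: results use %xmm0. */
    struct f16_location loc = { F16_STACK, F16_XMM0, F16_NONE };

    /* Rule 2, complex case: the imaginary part comes back in %xmm1. */
    if (kind == F16_COMPLEX)
        loc.ret_imag = F16_XMM1;
    return loc;
}
```

Calling `f16_classify(F16_COMPLEX)` yields a stack-passed argument with the real part returned in `%xmm0` and the imaginary part in `%xmm1`; the scalar case leaves `ret_imag` as `F16_NONE`.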
 low-level-sys-info.tex | 57 +++++++++++++++++++++++++++++-------------
 1 file changed, 40 insertions(+), 17 deletions(-)

diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex
index acaf30e..82956e3 100644
--- a/low-level-sys-info.tex
+++ b/low-level-sys-info.tex
@@ -30,7 +30,8 @@ object, and the term \emph{\textindex{\sixteenbyte{}}} refers to a
 \subsubsection{Fundamental Types}
 
 Table~\ref{basic-types} shows the correspondence between ISO C
-scalar types and the processor scalar types.  \code{__float80},
+scalar types and the processor scalar types.  \code{_Float16},
+\code{__float80},
 \code{__float128}, \code{__m64}, \code{__m128}, \code{__m256} and
 \code{__m512} types are optional.
 
@@ -79,22 +80,25 @@ scalar types and the processor scalar types.  \code{__float80},
     & \texttt{\textit{any-type} *} & 4 & 4 & unsigned \fourbyte \\
     & \texttt{\textit{any-type} (*)()} & & \\
     \hline
-    Floating-& \texttt{float} & 4 & 4 & single (IEEE-754) \\
     \cline{2-5}
-    point & \texttt{double} & 8 & 4 & double (IEEE-754) \\
-    & \texttt{long double}$^{\dagger\dagger\dagger\dagger}$  & & & \\
+    & \texttt{_Float16}$^{\dagger\dagger\dagger\dagger\dagger\dagger}$ & 2 & 2 & 16-bit (IEEE-754) \\
     \cline{2-5}
-    & \texttt{__float80}$^{\dagger\dagger}$  & 12 & 4 & 80-bit extended (IEEE-754) \\
-    & \texttt{long double}$^{\dagger\dagger\dagger\dagger}$  & & & \\
+    & \texttt{float} & 4 & 4 & single (IEEE-754) \\
+    \cline{2-5}
+    Floating- & \texttt{double} & 8
+	& 8$^{\dagger\dagger\dagger\dagger}$ & double (IEEE-754) \\
+    \cline{2-5}
+    point & \texttt{__float80}$^{\dagger\dagger}$  & 16 & 16 & 80-bit extended (IEEE-754) \\
+    & \texttt{long double}$^{\dagger\dagger\dagger\dagger\dagger}$  & 16 & 16 & 80-bit extended (IEEE-754) \\
     \cline{2-5}
     & \texttt{__float128}$^{\dagger\dagger}$ & 16 & 16 & 128-bit extended (IEEE-754) \\
     \hline
-    Complex& \texttt{_Complex float} & 8 & 4 & complex single (IEEE-754) \\
+    & \texttt{_Complex float} & 8 & 4 & complex single (IEEE-754) \\
     \cline{2-5}
-    Floating-& \texttt{_Complex double} & 16 & 4 & complex double (IEEE-754) \\
-    point & \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\
+    Complex& \texttt{_Complex double} & 16 & 4 & complex double (IEEE-754) \\
+    Floating-& \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\
     \cline{2-5}
-    & \texttt{_Complex __float80}$^{\dagger\dagger}$  & 24 & 4 & complex 80-bit extended (IEEE-754) \\
+    point & \texttt{_Complex __float80}$^{\dagger\dagger}$  & 24 & 4 & complex 80-bit extended (IEEE-754) \\
     & \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$  & & & \\
     \cline{2-5}
     & \texttt{_Complex __float128}$^{\dagger\dagger}$ & 32 & 16 & complex 128-bit extended (IEEE-754) \\
@@ -125,6 +129,8 @@ The \texttt{long double} type is 64-bit, the same as the \texttt{double}
 type, on the Android{\texttrademark} platform.  More information on the
 Android{\texttrademark} platform is available from
 \url{http://www.android.com/}.}\\
+\multicolumn{5}{p{13cm}}{\myfontsize $^{\dagger\dagger\dagger\dagger\dagger\dagger}$
+The \texttt{_Float16} type, from ISO/IEC TS 18661-3:2015, is optional.}\\
   \end{tabular}
 }
 \end{table}
@@ -323,6 +329,7 @@ at the time of the call.
 \begin{table}
 \Hrule
   \caption{Register Usage}
+  \myfontsize
   \label{fig-reg-usage}
   \begin{center}
     \begin{tabular}{l|p{8.35cm}|l}
@@ -346,13 +353,29 @@ of some 64bit return types & No \\
 \EBP & callee-saved register; optionally used as frame pointer & Yes \\
 \ESI & callee-saved register & yes \\
 \EDI & callee-saved register & yes \\
-\reg{xmm0}, \reg{ymm0} & scratch registers; also used to pass and return
-\code{__m128}, \code{__m256} parameters & No\\
-\reg{xmm1}--\reg{xmm2},& scratch registers; also used to pass
-\code{__m128}, & No \\
-\reg{ymm1}--\reg{ymm2} & \code{__m256} parameters & \\
-\reg{xmm3}--\reg{xmm7},& scratch registers & No \\
-\reg{ymm3}--\reg{ymm7} & & \\
+\reg{xmm0} & scratch register; also used to pass the first \code{__m128}
+             parameter and return \code{__m128}, \code{_Float16},
+	     the real part of \code{_Complex _Float16} & No \\
+\reg{ymm0} & scratch register; also used to pass the first \code{__m256}
+             parameter and return \code{__m256} & No \\
+\reg{zmm0} & scratch register; also used to pass the first \code{__m512}
+             parameter and return \code{__m512} & No \\
+\reg{xmm1} & scratch register; also used to pass the second \code{__m128}
+             parameter and return the imaginary part of
+	     \code{_Complex _Float16} & No \\
+\reg{ymm1} & scratch register; also used to pass the second \code{__m256}
+             parameter & No \\
+\reg{zmm1} & scratch register; also used to pass the second \code{__m512}
+             parameter & No \\
+\reg{xmm2} & scratch register; also used to pass the third \code{__m128}
+             parameter & No \\
+\reg{ymm2} & scratch register; also used to pass the third \code{__m256}
+             parameter & No \\
+\reg{zmm2} & scratch register; also used to pass the third \code{__m512}
+             parameter & No \\
+\reg{xmm3}--\reg{xmm7} & scratch registers & No \\
+\reg{ymm3}--\reg{ymm7} & scratch registers & No \\
+\reg{zmm3}--\reg{zmm7} & scratch registers & No \\
 \reg{mm0} & scratch register; also used to pass and return
 \code{__m64} parameter & No\\
 \reg{mm1}--\reg{mm2} & used to pass \code{__m64} parameters & No\\
-- 
2.31.1

Joseph Myers via llvm-dev

2021-Jul-01 22:10 UTC

[llvm-dev] [PATCH] Add optional _Float16 support

On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
> 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
That restricts use of _Float16 to processors with SSE.  Is that what we 
want in the ABI, or should _Float16 be available with base 32-bit x86 
architecture features only, much like _Float128 and the decimal FP types 
are?  (If it is restricted to SSE, we can of course ensure relevant libgcc 
functions are built with SSE enabled, and likewise in glibc if that gains 
_Float16 functions, though maybe with some extra complications to get 
relevant testcases to run whenever possible.)
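
The SSE dependency raised above can be made visible at compile time. The following is a minimal sketch, assuming GCC- or Clang-style predefined macros: `__SSE__` is defined when SSE code generation is enabled, and `__FLT16_MANT_DIG__` is taken here as the signal that the compiler offers `_Float16` for the target (GCC defines it in that case; other compilers may differ). The function names are hypothetical.

```c
/* Hypothetical helpers; the macros are compiler-predefined, not part of the ABI. */

/* 1 if this translation unit is compiled with SSE enabled, else 0. */
int built_with_sse(void)
{
#if defined(__SSE__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if the compiler advertises _Float16 for this target, else 0. */
int compiler_has_float16(void)
{
#if defined(__FLT16_MANT_DIG__)
    return 1;
#else
    return 0;
#endif
}

/* Under the proposed rule a _Float16 result lives in %xmm0, so both are needed. */
int float16_return_possible(void)
{
    return built_with_sse() && compiler_has_float16();
}
```

When built for a 32-bit x86 target with SSE disabled (for example with `-mno-sse`), `__SSE__` is not defined and `float16_return_possible()` returns 0, which is the configuration the question is about.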

-- 
Joseph S. Myers
joseph at codesourcery.com
