John McCall via llvm-dev

2021-Aug-24 05:55 UTC

[llvm-dev] [PATCH] Add optional _Float16 support

On Thu, Jul 29, 2021 at 9:40 AM H.J. Lu <hjl.tools at gmail.com> wrote:
> On Tue, Jul 13, 2021 at 9:24 AM H.J. Lu <hjl.tools at gmail.com> wrote:
> >
> > On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers <joseph at codesourcery.com> wrote:
> > >
> > > On Tue, 13 Jul 2021, H.J. Lu wrote:
> > >
> > > > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.wang at intel.com> wrote:
> > > > >
> > > > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> > > > >
> > > > > Can you please explain the behavior here? Is there difference between _Float16 and _Complex _Float16 when return? I.e.,
> > > > > 1, In which case will _Float16 values return in both %xmm0 and %xmm1?
> > > > > 2, For a single _Float16 value, are both real part and imaginary part returned in %xmm0? Or returned in %xmm0 and %xmm1 respectively?
> > > >
> > > > Here is the v2 patch to add the missing _Float16 bits.  The PDF file is at
> > > >
> > > > https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
> > >
> > > This PDF shows _Complex _Float16 as having a size of 2 bytes (should be
> > > 4-byte size, 2-byte alignment).
> > >
> > > It also seems to change double from 4-byte to 8-byte alignment, which is
> > > wrong.  And it's inconsistent about whether it covers the long double =
> > > double (Android) case - it shows that case for _Complex long double but
> > > not for long double itself.
> >
> > Here is the v3 patch with the fixes.  I also updated the PDF file.
>
> Here is the final patch I checked in.  _Complex _Float16 is changed to
> return in XMM0 register.  The new PDF file is at
>
> https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI

This should be explicit that the real part is returned in bits 0..15 and
the imaginary part is returned in bits 16..31, or however we conventionally
designate subcomponents of a vector.

John.

H.J. Lu via llvm-dev

2021-Aug-25 12:35 UTC

[llvm-dev] [PATCH] Add optional _Float16 support

On Mon, Aug 23, 2021 at 10:55 PM John McCall <rjmccall at gmail.com> wrote:
>
> On Thu, Jul 29, 2021 at 9:40 AM H.J. Lu <hjl.tools at gmail.com> wrote:
>>
>> On Tue, Jul 13, 2021 at 9:24 AM H.J. Lu <hjl.tools at gmail.com> wrote:
>> >
>> > On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers <joseph at codesourcery.com> wrote:
>> > >
>> > > On Tue, 13 Jul 2021, H.J. Lu wrote:
>> > >
>> > > > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.wang at intel.com> wrote:
>> > > > >
>> > > > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
>> > > > >
>> > > > > Can you please explain the behavior here? Is there difference between _Float16 and _Complex _Float16 when return? I.e.,
>> > > > > 1, In which case will _Float16 values return in both %xmm0 and %xmm1?
>> > > > > 2, For a single _Float16 value, are both real part and imaginary part returned in %xmm0? Or returned in %xmm0 and %xmm1 respectively?
>> > > >
>> > > > Here is the v2 patch to add the missing _Float16 bits.  The PDF file is at
>> > > >
>> > > > https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
>> > >
>> > > This PDF shows _Complex _Float16 as having a size of 2 bytes (should be
>> > > 4-byte size, 2-byte alignment).
>> > >
>> > > It also seems to change double from 4-byte to 8-byte alignment, which is
>> > > wrong.  And it's inconsistent about whether it covers the long double =
>> > > double (Android) case - it shows that case for _Complex long double but
>> > > not for long double itself.
>> >
>> > Here is the v3 patch with the fixes.  I also updated the PDF file.
>>
>> Here is the final patch I checked in.  _Complex _Float16 is changed to
>> return in XMM0 register.  The new PDF file is at
>>
>> https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
>
>
> This should be explicit that the real part is returned in bits 0..15 and
> the imaginary part is returned in bits 16..31, or however we conventionally
> designate subcomponents of a vector.
>
> John.

How about this?

diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex
index 860ff66..8f527c1 100644
--- a/low-level-sys-info.tex
+++ b/low-level-sys-info.tex
@@ -457,6 +457,9 @@ and \texttt{unions}) are always returned in memory.
     & \texttt{__float128} & memory \\
     \hline
     & \texttt{_Complex _Float16} & \reg{xmm0} \\
+    & & The real part is returned in bits 0..15. The imaginary part is
+        returned \\
+    & & in bits 16..31.\\
     \cline{2-3}
     Complex & \texttt{_Complex float} & \EDX:\EAX \\
     floating- & & The real part is returned in \EAX. The imaginary part is

https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/uploads/89eb3e52c7e5eadd58f7597508e13f34/intel386-psABI-2021-08-25.pdf

-- 
H.J.
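
For readers following the thread, the layout proposed in the patch above can
be sketched in plain C. This is only an illustration under stated assumptions,
not text from the psABI: cf16_bits, cf16_pack and cf16_unpack are hypothetical
names, and the two halves are carried as raw IEEE binary16 bit patterns in
uint16_t so that the sketch does not depend on compiler support for _Float16.
It models only the low 32 bits of %xmm0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for _Complex _Float16: two raw binary16 bit patterns.
   Not a type defined by the psABI or by any compiler. */
typedef struct {
    uint16_t re; /* real part, as raw bits */
    uint16_t im; /* imaginary part, as raw bits */
} cf16_bits;

/* Matches the size and alignment Joseph Myers asked for in the thread:
   4-byte size, 2-byte alignment. */
static_assert(sizeof(cf16_bits) == 4, "expected 4-byte size");
static_assert(_Alignof(cf16_bits) == 2, "expected 2-byte alignment");

/* Build the low 32 bits of the return register:
   real part in bits 0..15, imaginary part in bits 16..31. */
static uint32_t cf16_pack(cf16_bits z)
{
    return (uint32_t)z.re | ((uint32_t)z.im << 16);
}

/* Split the low 32 bits of the return register back into the two halves. */
static cf16_bits cf16_unpack(uint32_t lane)
{
    cf16_bits z;
    z.re = (uint16_t)(lane & 0xFFFFu);
    z.im = (uint16_t)(lane >> 16);
    return z;
}
```

For example, with the real part 1.0 (binary16 bits 0x3C00) and the imaginary
part -2.0 (binary16 bits 0xC000), cf16_pack yields 0xC0003C00, which on
little-endian x86 is also the in-memory byte order of the two halves.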
