H.J. Lu via llvm-dev
2021-Jul-13 16:24 UTC
[llvm-dev] [PATCH] Add optional _Float16 support
On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers <joseph at codesourcery.com> wrote:
>
> On Tue, 13 Jul 2021, H.J. Lu wrote:
>
> > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.wang at intel.com> wrote:
> > >
> > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> > >
> > > Can you please explain the behavior here? Is there difference between _Float16 and _Complex _Float16 when return? I.e.,
> > > 1, In which case will _Float16 values return in both %xmm0 and %xmm1?
> > > 2, For a single _Float16 value, are both real part and imaginary part returned in %xmm0? Or returned in %xmm0 and %xmm1 respectively?
> >
> > Here is the v2 patch to add the missing _Float16 bits. The PDF file is at
> >
> > https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
>
> This PDF shows _Complex _Float16 as having a size of 2 bytes (should be
> 4-byte size, 2-byte alignment).
>
> It also seems to change double from 4-byte to 8-byte alignment, which is
> wrong. And it's inconsistent about whether it covers the long double
> double (Android) case - it shows that case for _Complex long double but
> not for long double itself.

Here is the v3 patch with the fixes. I also updated the PDF file.

> --
> Joseph S. Myers
> joseph at codesourcery.com
>

--
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: v3-0001-Add-optional-_Float16-support.patch
Type: text/x-patch
Size: 7346 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210713/54dfaddd/attachment.bin>
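[Editorial sketch] The size and alignment that Joseph Myers asks for in his review (4-byte size, 2-byte alignment for _Complex _Float16) can be checked at compile time. The snippet below is only a model, not the psABI text or the patch: it assumes a toolchain without _Float16 and uses a hypothetical struct, complex_f16_model, made of two 16-bit bit patterns as a stand-in. The helper names are also hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for _Complex _Float16: two 16-bit halves.
   Real code would use the compiler's _Float16 type where available. */
typedef struct {
    uint16_t re; /* bit pattern of the real part */
    uint16_t im; /* bit pattern of the imaginary part */
} complex_f16_model;

/* The values the review says the psABI table should list. */
_Static_assert(sizeof(complex_f16_model) == 4, "expected 4-byte size");
_Static_assert(_Alignof(complex_f16_model) == 2, "expected 2-byte alignment");

/* Expose the model's size and alignment so callers can inspect them. */
static size_t complex_f16_model_size(void)
{
    return sizeof(complex_f16_model);
}

static size_t complex_f16_model_align(void)
{
    return _Alignof(complex_f16_model);
}
```

On a compiler that does support _Float16, the same two static assertions could presumably be written against _Complex _Float16 directly; that variant is not shown because support varies by target and compiler version.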
H.J. Lu via llvm-dev
2021-Jul-29 13:39 UTC
[llvm-dev] [PATCH] Add optional _Float16 support
On Tue, Jul 13, 2021 at 9:24 AM H.J. Lu <hjl.tools at gmail.com> wrote:
>
> On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers <joseph at codesourcery.com> wrote:
> >
> > On Tue, 13 Jul 2021, H.J. Lu wrote:
> >
> > > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.wang at intel.com> wrote:
> > > >
> > > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> > > >
> > > > Can you please explain the behavior here? Is there difference between _Float16 and _Complex _Float16 when return? I.e.,
> > > > 1, In which case will _Float16 values return in both %xmm0 and %xmm1?
> > > > 2, For a single _Float16 value, are both real part and imaginary part returned in %xmm0? Or returned in %xmm0 and %xmm1 respectively?
> > >
> > > Here is the v2 patch to add the missing _Float16 bits. The PDF file is at
> > >
> > > https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
> >
> > This PDF shows _Complex _Float16 as having a size of 2 bytes (should be
> > 4-byte size, 2-byte alignment).
> >
> > It also seems to change double from 4-byte to 8-byte alignment, which is
> > wrong. And it's inconsistent about whether it covers the long double
> > double (Android) case - it shows that case for _Complex long double but
> > not for long double itself.
>
> Here is the v3 patch with the fixes. I also updated the PDF file.

Here is the final patch I checked in. _Complex _Float16 is now returned
in the XMM0 register. The new PDF file is at

https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI

--
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-optional-_Float16-support.patch
Type: text/x-patch
Size: 6998 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210729/7352025d/attachment.bin>
John McCall via llvm-dev
2021-Aug-24 05:55 UTC
[llvm-dev] [PATCH] Add optional _Float16 support
On Thu, Jul 29, 2021 at 9:40 AM H.J. Lu <hjl.tools at gmail.com> wrote:
> On Tue, Jul 13, 2021 at 9:24 AM H.J. Lu <hjl.tools at gmail.com> wrote:
> >
> > On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers <joseph at codesourcery.com> wrote:
> > >
> > > On Tue, 13 Jul 2021, H.J. Lu wrote:
> > >
> > > > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.wang at intel.com> wrote:
> > > > >
> > > > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> > > > >
> > > > > Can you please explain the behavior here? Is there difference between _Float16 and _Complex _Float16 when return? I.e.,
> > > > > 1, In which case will _Float16 values return in both %xmm0 and %xmm1?
> > > > > 2, For a single _Float16 value, are both real part and imaginary part returned in %xmm0? Or returned in %xmm0 and %xmm1 respectively?
> > > >
> > > > Here is the v2 patch to add the missing _Float16 bits. The PDF file is at
> > > >
> > > > https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
> > >
> > > This PDF shows _Complex _Float16 as having a size of 2 bytes (should be
> > > 4-byte size, 2-byte alignment).
> > >
> > > It also seems to change double from 4-byte to 8-byte alignment, which is
> > > wrong. And it's inconsistent about whether it covers the long double
> > > double (Android) case - it shows that case for _Complex long double but
> > > not for long double itself.
> >
> > Here is the v3 patch with the fixes. I also updated the PDF file.
>
> Here is the final patch I checked in. _Complex _Float16 is changed to return
> in XMM0 register. The new PDF file is at
>
> https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI

This should be explicit that the real part is returned in bits 0..15 and
the imaginary part is returned in bits 16..31, or however we conventionally
designate subcomponents of a vector.

John.
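[Editorial sketch] The layout John McCall asks the document to spell out (real part in bits 0..15, imaginary part in bits 16..31 of XMM0) can be illustrated with plain integer arithmetic. This is a model under that assumption only; the thread does not confirm what wording the psABI finally adopted. The function names pack_complex_f16, real_bits and imag_bits are hypothetical, and the inputs are IEEE binary16 bit patterns rather than _Float16 values, so the snippet does not depend on compiler support for _Float16.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the low 32 bits of the return register from
   the binary16 bit patterns of the real and imaginary parts, with the
   real part in bits 0..15 and the imaginary part in bits 16..31. */
static uint32_t pack_complex_f16(uint16_t re_bits, uint16_t im_bits)
{
    return (uint32_t)re_bits | ((uint32_t)im_bits << 16);
}

/* Extract the real part's bit pattern (bits 0..15). */
static uint16_t real_bits(uint32_t lane)
{
    return (uint16_t)(lane & 0xFFFFu);
}

/* Extract the imaginary part's bit pattern (bits 16..31). */
static uint16_t imag_bits(uint32_t lane)
{
    return (uint16_t)(lane >> 16);
}
```

For example, 1.0 and 2.0 have the binary16 bit patterns 0x3C00 and 0x4000, so under this layout the value 1.0 + 2.0i would occupy the low 32 bits as 0x40003C00.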