thr3ads.net - llvm dev - [llvm-dev] Do we need intrinsics for floating-point classification functions? [Sep 2021]

If this information is useful, please help other people find it:
Share via:

Serge Pavlov via llvm-dev

2021-Sep-02 12:33 UTC

[llvm-dev] Do we need intrinsics for floating-point classification functions?

Hi all,

Some time ago a new intrinsic `llvm.isnan` was introduced, which was
intended to represent IEEE-754 operation `isNaN` as well as a family of C
library functions `isnan*`. Then a concern was raised (see
https://reviews.llvm.org/D104854) that this functionality should be
removed. Discussion in the subsequent RFC (
https://lists.llvm.org/pipermail/llvm-dev/2021-August/152257.html) came to
consensus that such intrinsic is necessary. Nevertheless the patches
related to the new intrinsic were reverted. I have to restart the
discussion in hope to convince the community that this intrinsic and other
classification functions are necessary.

There are two main reasons why this intrinsic is necessary:
1. It allows correct implementation of `isnan` if strict floating point
semantics is in effect,
2. It allows preserving the check in -ffast-math compilation.

To facilitate the discussion let's concentrate on the first problem.

Previously the frontend intrinsic `__builtin_isnan` was converted into `cmp
uno` during IR generation in clang codegen. This solution is not suitable
if FP exceptions are not ignored, because compare instructions raise
exceptions if its argument is signaling NaN. Both IEEE-754 (5.7.2) an C
standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf,
F.3p6) demand that this function does not raise floating point exceptions.
There was no target-independent IR construct that could represent `isnan`.

This drawback was significant enough and some attempts to alleviate it were
undertaken. In https://reviews.llvm.org/D95948 `isnan` was implemented
using integer operations in strictfp functions. It however is not suitable
for targets where a more efficient way exists, like dedicated instruction.
Another solution was implemented in https://reviews.llvm.org/D96568, where
a hook `clang::TargetCodeGenInfo::testFPKind` was introduced, which injects
target specific code into IR. Such a solution makes IR more
target-dependent and prevents some IR-level optimizations.

To have a solution suitable for all cases, a new intrinsic function
`llvm.isnan` was introduced (https://reviews.llvm.org/D104854). It protects
the check from undesirable optimizations and preserves it till selector,
where it can be lowered in optimal for a particular target way.

Other classification functions also need their own intrinsics. In strictfp
mode even a check for zero (`iszero`) cannot be made by comparing a value
against zero, - if the value is signaling NaN, FP exceptions would be
raised. James Y Knight in the previous discussion (
https://lists.llvm.org/pipermail/llvm-dev/2021-August/152282.html) listed
such "non-computational" functions, which should not signal if
provided
with an sNAN argument.

It looks like new intrinsic is the only consistent and in target-agnostic
way to implement these checks in all environments including the case when
FP exceptions are not ignored.

Any feedback is welcome.

Thanks,
--Serge
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210902/a38fae3f/attachment-0001.html>

Sanjay Patel via llvm-dev

2021-Sep-02 17:58 UTC

head link

[llvm-dev] Do we need intrinsics for floating-point classification functions?

Thank you for restarting this proposal. It's a bit more work, but let's
make sure we have a good design, so we don't miss any details.

What optimizations/analysis does this intrinsic enable **in IR** in
strict-FP mode?
If it is only constant folding, we don't need to add an intrinsic. We fold
lots of math library calls with no corresponding LLVM intrinsic under
llvm::ConstantFoldCall().
We also do transforms based on operands/uses of library calls in
SimplifyLibCalls.

So I'm really asking: why do we translate isnan() in clang into something
else in the first place? The backend can lower that call into whatever
suits the target without involving LLVM IR.

On Thu, Sep 2, 2021 at 8:33 AM Serge Pavlov via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi all,
>
> Some time ago a new intrinsic `llvm.isnan` was introduced, which was
> intended to represent IEEE-754 operation `isNaN` as well as a family of C
> library functions `isnan*`. Then a concern was raised (see
> https://reviews.llvm.org/D104854) that this functionality should be
> removed. Discussion in the subsequent RFC (
> https://lists.llvm.org/pipermail/llvm-dev/2021-August/152257.html) came
> to consensus that such intrinsic is necessary. Nevertheless the patches
> related to the new intrinsic were reverted. I have to restart the
> discussion in hope to convince the community that this intrinsic and other
> classification functions are necessary.
>
> There are two main reasons why this intrinsic is necessary:
> 1. It allows correct implementation of `isnan` if strict floating point
> semantics is in effect,
> 2. It allows preserving the check in -ffast-math compilation.
>
> To facilitate the discussion let's concentrate on the first problem.
>
> Previously the frontend intrinsic `__builtin_isnan` was converted into
> `cmp uno` during IR generation in clang codegen. This solution is not
> suitable if FP exceptions are not ignored, because compare instructions
> raise exceptions if its argument is signaling NaN. Both IEEE-754 (5.7.2) an
> C standard  (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf,
> F.3p6) demand that this function does not raise floating point exceptions.
> There was no target-independent IR construct that could represent `isnan`.
>
> This drawback was significant enough and some attempts to alleviate it
> were undertaken. In https://reviews.llvm.org/D95948 `isnan` was
> implemented using integer operations in strictfp functions. It however is
> not suitable for targets where a more efficient way exists, like dedicated
> instruction. Another solution was implemented in
> https://reviews.llvm.org/D96568, where a hook
> `clang::TargetCodeGenInfo::testFPKind` was introduced, which injects target
> specific code into IR. Such a solution makes IR more target-dependent and
> prevents some IR-level optimizations.
>
> To have a solution suitable for all cases, a new intrinsic function
> `llvm.isnan` was introduced (https://reviews.llvm.org/D104854). It
> protects the check from undesirable optimizations and preserves it till
> selector, where it can be lowered in optimal for a particular target way.
>
> Other classification functions also need their own intrinsics. In strictfp
> mode even a check for zero (`iszero`) cannot be made by comparing a value
> against zero, - if the value is signaling NaN, FP exceptions would be
> raised. James Y Knight in the previous discussion (
> https://lists.llvm.org/pipermail/llvm-dev/2021-August/152282.html) listed
> such "non-computational" functions, which should not signal if
provided
> with an sNAN argument.
>
> It looks like new intrinsic is the only consistent and in target-agnostic
> way to implement these checks in all environments including the case when
> FP exceptions are not ignored.
>
> Any feedback is welcome.
>
> Thanks,
> --Serge
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210902/75f4208e/attachment.html>

James Y Knight via llvm-dev

2021-Sep-03 05:22 UTC

head link

[llvm-dev] Do we need intrinsics for floating-point classification functions?

As you say, we don't strictly need a new intrinsic -- we can emit the code
to do the correct integer-based bit-checking in the frontend -- as was done
before. Yet, that is not ideal. The rationale to have an intrinsic instead
of the frontend generating integer-bit-manipulation seem good, IMO.

So the question that arises: *which* new intrinsic(s) do we want to add? I
list 3 options below. I think Option 1 should be discarded from
consideration, and I'm not sure which of 2 or 3 is best. I'm leaning
towards 3 -- it seems like it may be simpler -- although I'm not certain.

*Option 1*: we could theoretically address the problem with an
llvm.experimental.constrained.superquiet_fcmp, which would be the same as
fcmp, except it would not raise an fp exception even for an sNAN. This
would be a straightforward substitution for fcmp in Clang's existing
codegen, and it should work.

However, I don't think this would be a good idea, because it's a poor
match
for hardware. The "superquiet_fcmp" operation doesn't map to CPU
functionality in any ISA i'm aware of. Generally, the implementation would
have to be to filter out sNANs prior to using the hardware fp-compare
instruction. And how you you detect an sNAN without raising an fp
exceptoin? Fall down to integer ops. But by doing that, you've done work
almost equivalent to what you needed for the actual classification function
that you were trying to implement in the first place. Not real useful.

*Option 2*: We could add a whole family of classification intrinsics. I
think that would be:

llvm.isnan
llvm.issignaling
llvm.isinf
llvm.isfinite
llvm.isnormal
llvm.issubnormal
llvm.iszero
(Note, some of these are missing corresponding __builtin_is* in Clang at
the moment -- we have no __builtin_issignaling, __builtin_issubnormal, or
__builtin_iszero. Probably ought to.)

We don't necessarily need an intrinsic for fpclassify if we have the above
intrinsics, since it can be built with llvm.isnan, llvm.isinf, llvm.iszero,
and llvm.isnormal.

*Option 3*: Add only an fpclassify intrinsic.

That is, something like:
   i32 llvm.fpclassify.i32.f32(i32 %if_snan, i32 %if_qnan, i32
%if_infinite, i32 %if_normal, i32 %if_subnormal, i32 %if_zero, float %value)
which classifies the given value, returning the value of the argument
corresponding to its categorization. We can say the %if_* args are required
to be constant integers, if we like, for simplicity of implementation.

Thus, Clang would codegen __builtin_isnan/etc like:
  %isnan = call i1 llvm.fpclassify.i1.f32(i1 1, i1 1, i1 0, i1 0, i1 0, i1
0, float %value)
And for fpclassify, we might generate something like:
  %ret = call i32 llvm.fpclassify.i32.f32(i32 0, i32 0, i32 1, i32 4, i32
3, i32 2, float %value)

On most architectures, we'd expand this intrinsic into appropriate integer
operations (skipping the parts of classification which are irrelevant for
the given constant arguments), since there's no corresponding hardware
instructions available for float classification.  Or, for non-strictfp
functions, we could continue to expand into an fcmp-based set of
tests...although looking at the asm we currently generate, the integer
versions may well be faster, other than isnan.

On SystemZ/s390x, this intrinsic would translate almost directly into the
"test data class" instruction -- so long as the %if_* arguments are
all
0/1. That's kinda nice. ("Test data class" takes a fp value and a
bitmask,
and returns true if a bit is set in the position corresponding to the
classification of the fp value.)

*Separately*, we have the signbit operation. I think that's the only other
operation that needs to be addressed related to this RFC. Currently, Clang
always generates integer bit-ops for __builtin_signbit in the
frontend. This is arguably OK as is. Yet, completing the set of IR fp
classification intrinsics seems like it'd be a good idea. So, we could also
(along with any of the above options) add:
  i1 llvm.signbit.f32(float %value)

On Thu, Sep 2, 2021 at 8:33 AM Serge Pavlov via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi all,
>
> Some time ago a new intrinsic `llvm.isnan` was introduced, which was
> intended to represent IEEE-754 operation `isNaN` as well as a family of C
> library functions `isnan*`. Then a concern was raised (see
> https://reviews.llvm.org/D104854) that this functionality should be
> removed. Discussion in the subsequent RFC (
> https://lists.llvm.org/pipermail/llvm-dev/2021-August/152257.html) came
> to consensus that such intrinsic is necessary. Nevertheless the patches
> related to the new intrinsic were reverted. I have to restart the
> discussion in hope to convince the community that this intrinsic and other
> classification functions are necessary.
>
> There are two main reasons why this intrinsic is necessary:
> 1. It allows correct implementation of `isnan` if strict floating point
> semantics is in effect,
> 2. It allows preserving the check in -ffast-math compilation.
>
> To facilitate the discussion let's concentrate on the first problem.
>
> Previously the frontend intrinsic `__builtin_isnan` was converted into
> `cmp uno` during IR generation in clang codegen. This solution is not
> suitable if FP exceptions are not ignored, because compare instructions
> raise exceptions if its argument is signaling NaN. Both IEEE-754 (5.7.2) an
> C standard  (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf,
> F.3p6) demand that this function does not raise floating point exceptions.
> There was no target-independent IR construct that could represent `isnan`.
>
> This drawback was significant enough and some attempts to alleviate it
> were undertaken. In https://reviews.llvm.org/D95948 `isnan` was
> implemented using integer operations in strictfp functions. It however is
> not suitable for targets where a more efficient way exists, like dedicated
> instruction. Another solution was implemented in
> https://reviews.llvm.org/D96568, where a hook
> `clang::TargetCodeGenInfo::testFPKind` was introduced, which injects target
> specific code into IR. Such a solution makes IR more target-dependent and
> prevents some IR-level optimizations.
>
> To have a solution suitable for all cases, a new intrinsic function
> `llvm.isnan` was introduced (https://reviews.llvm.org/D104854). It
> protects the check from undesirable optimizations and preserves it till
> selector, where it can be lowered in optimal for a particular target way.
>
> Other classification functions also need their own intrinsics. In strictfp
> mode even a check for zero (`iszero`) cannot be made by comparing a value
> against zero, - if the value is signaling NaN, FP exceptions would be
> raised. James Y Knight in the previous discussion (
> https://lists.llvm.org/pipermail/llvm-dev/2021-August/152282.html) listed
> such "non-computational" functions, which should not signal if
provided
> with an sNAN argument.
>
> It looks like new intrinsic is the only consistent and in target-agnostic
> way to implement these checks in all environments including the case when
> FP exceptions are not ignored.
>
> Any feedback is welcome.
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210903/71fa065c/attachment.html>

llvm dev - Sep 2021 - Do we need intrinsics for floating-point classification functions?

[llvm-dev] Do we need intrinsics for floating-point classification functions?

[llvm-dev] Do we need intrinsics for floating-point classification functions?

[llvm-dev] Do we need intrinsics for floating-point classification functions?