thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode? [Sep 2021]


Richard Smith via llvm-dev

2021-Sep-10 00:58 UTC

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

On Thu, 9 Sept 2021 at 13:55, Chris Tetreault via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> The point I was trying to make regarding the C++ standard is that
> fast-math is a non-standard language extension.
>
-ffinite-math-only does not need to be a non-standard language extension.
Neither C nor C++ requires that floating-point types can represent infinity
or NaN, and we could define this flag as meaning that there are
(notionally) simply no such values in the relevant types. Of course, that's
not actually consistent with what we currently do, nor with what GCC does.

Would it be reasonable to treat operations on Inf and NaN values as UB in
this mode only if the same operation on a signaling NaN might signal?
(Approximately, that'd mean we imagine these non-finite value encodings all
encode sNaNs that are UB if they would signal.) That means the operations
that ISO 60559 defines as non-computational or quiet-computational would be
permitted to receive NaN and Inf as input and produce them as output, but
that other computational operations would not.

Per ISO 60559, the quiet-computational operations that I think are relevant
to us are: copy, negate, abs, copySign, and conversions between encoding
(eg, bitcast). The non-computational operations that I think are relevant
to us are classification functions (including isNaN).
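
As a rough illustration of that split, the sketch below applies only the quiet-computational and non-computational operations listed above to a possibly-NaN input, next to one general-computational operation. It is compiled without any fast-math flag, so it shows ordinary IEEE behaviour and not anything a compiler promises under -ffinite-math-only today. The function names are made up for the example.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: conversion between encodings (a bitcast).
// memcpy is the portable way to reinterpret the bits of a float.
inline std::uint32_t bits_of(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Uses only quiet-computational operations (copy, negate, abs, copySign)
// and a non-computational one (isNaN). Under the proposal, none of these
// would be UB when x is NaN or Inf.
inline bool quiet_ops_keep_nan(float x) {
    float copy = x;                        // copy
    float neg  = -x;                       // negate
    float mag  = std::fabs(x);             // abs
    float sgn  = std::copysign(x, -1.0f);  // copySign
    return std::isnan(copy) && std::isnan(neg) &&
           std::isnan(mag) && std::isnan(sgn);
}

// A general-computational operation: under the proposal, this is where a
// NaN or Inf operand would become undefined behaviour.
inline float general_op(float a, float b) { return a + b; }
```

Under the proposed reading, calling `quiet_ops_keep_nan` with a NaN would stay well defined, while `general_op` with a NaN or Inf operand would be the UB case.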

> If you enable it, you should expect the compiler to diverge from the
> language standard. I’m sure there’s precedent for this. If I write #pragma
> once at the top of my header, and include it twice back to back, the
> preprocessor won’t paste my header twice. Should #pragma once be removed
> because it breaks #include?
>
>
>
> Now, you have a real-world example that uses NaN as a sentinel value. In
> your case, it would be nice if the compiler worked as you suggest. Now,
> suppose I have a “safe matrix multiply”:
>
>
>
> ```
>
> std::optional<MyMatrixT> safeMul(const MyMatrixT & lhs, const MyMatrixT & rhs) {
>
>   for (int i = 0; i < lhs.rows; ++i) {
>
>     for (int j = 0; j < lhs.cols; ++j) {
>
>       if (isnan(lhs[i][j])) {
>
>         return {};
>
>       }
>
>     }
>
>   }
>
>   for (int i = 0; i < rhs.rows; ++i) {
>
>     for (int j = 0; j < rhs.cols; ++j) {
>
>       if (isnan(rhs[i][j])) {
>
>         return {};
>
>       }
>
>     }
>
>  }
>
>
>
>   // do the multiply
>
> }
>
> ```
>
>
>
> In this case, if isnan(x) can be constant folded to false with fast-math
> enabled, then these two loops can be completely eliminated since they are
> empty and do nothing. If MyMatrixT is a 100 x 100 matrix, and/or safeMul is
> called in a hot loop, this could be huge. What should I do instead here?
>
>
>
> Really, it would be much more consistent if we apply the clang
> documentation for fast-math “Operands to floating-point operations are not
> equal to NaN and Inf” literally, and not actually implement “Operands to
> floating-point operations are not equal to NaN and Inf, except in the case
> of isnan(), but only if the argument to isnan() is a value stored in a
> variable and not an expression”. As far as using isnan from the standard
> library compiled without fast-math vs a compiler builtin, I don’t think
> this is an issue. Really, enabling fast-math is basically telling the
> compiler “My code has no NaNs. I won’t try to do anything with them, and
> you should optimize assuming they aren’t there”. If a developer does their
> part, why should it matter to them that isnan() might work?
>
>
>
> Thanks,
>
>    Chris Tetreault
>
>
>
>
>
>
>
> *From:* Serge Pavlov <sepavloff at gmail.com>
> *Sent:* Thursday, September 9, 2021 11:27 AM
> *To:* Chris Tetreault <ctetreau at quicinc.com>
> *Cc:* Krzysztof Parzyszek <kparzysz at quicinc.com>; cfe-dev at lists.llvm.org;
> llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
> *WARNING:* This email originated from outside of Qualcomm. Please be wary
> of any links or attachments, and do not enable macros.
>
> On Fri, Sep 10, 2021 at 1:03 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> In this case, I think it’s perfectly reasonable to reinterpret_cast the
> floats to uint32_t, and then inspect the bit pattern. Since NaN is being
> used as a sentinel value, I assume it’s a known bit pattern, and not just
> any old NaN.
>
>
>
> C standard defines a function to determine if a value is NaN. The fact
> that it does not work in this case demonstrates that the optimization is
> incorrect. Again, if isnan comes from libc implementation, it will work,
> but if it is provided by the compiler, it does not. Users expect consistent
> behavior.
>
>
>
> If NaNs are not prohibited at all in -ffinite-math-only mode, isnan must
> work as specified in the standard.
>
>
>
>
>
> I think it’s fine that fast-math renders isnan useless. As far as I know,
> the C++ standard wasn’t written to account for compilers providing
> fast-math flags. fast-math is itself a workaround for “IEEE floats do not
> behave like actual real numbers”, so working around a workaround seems
> reasonable to me.
>
>
>
> I feel you are right with fast-math as a workaround, but the compiler is a
> practical tool and it must be convenient and suitable for a wide set of
> tasks. The situation when a user has to invent workarounds because some
> optimization changes semantics of a standard function is not good.
>
>
>
> As for ffinite-math-only, it is actually more or less a safe mode. When we
> use integer division, we know that the divisor must not be zero. The case
> of ffinite-math-only is similar.
>
>
>
>
>
> *From:* Serge Pavlov <sepavloff at gmail.com>
> *Sent:* Thursday, September 9, 2021 10:34 AM
> *To:* Chris Tetreault <ctetreau at quicinc.com>
> *Cc:* Krzysztof Parzyszek <kparzysz at quicinc.com>; cfe-dev at lists.llvm.org;
> llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> Let me describe a real life example.
>
>
>
> There is a realtime program that processes float values from a huge array.
> Calculations do not produce NaNs and do not expect them. Using
> -ffinite-math-only substantially speeds up the program, so it is highly
> desirable to use it. The problem is that the array contains NaNs, they mark
> elements that should not be processed.
>
>
>
> An obvious solution is to check an element for NaN, and if it is not,
> process it. Now there is no clean way to do so. Only workarounds, like
> using integer arithmetic. The function 'isnan' became useless. And there
> are many cases when users complain of this optimization.
>
>
>
> Thanks,
>
> --Serge
>
>
>
>
>
> On Fri, Sep 10, 2021 at 12:09 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> If the issue is that users want their asserts to fire, then they should be
> encouraged to only enable fast math in release builds.
>
>
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of* Serge
> Pavlov via llvm-dev
> *Sent:* Thursday, September 9, 2021 9:53 AM
> *To:* Krzysztof Parzyszek <kparzysz at quicinc.com>
> *Cc:* LLVM Developers <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Thu, Sep 9, 2021 at 11:29 PM Krzysztof Parzyszek <kparzysz at quicinc.com>
> wrote:
>
> This goes back to what these options actually imply.  The interpretation
> that I favor is “this code will never see a NaN”, or “the program can
> assume that no floating point expression will evaluate to a NaN”.  The
> benefit of that is that it’s intuitively clear.  In that case “isnan(x)” is
> false, because x cannot be a NaN.  There is no distinction between
> “isnan(x+x)” and “isnan(x)”.  If the user wants to preserve “isnan(x)”,
> they can apply some pragma (which clang may actually have already).
>
>
>
> The simplicity is only apparent. As the discussion on the GCC mailing list
> demonstrated (https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html),
> this is actually an unpromising approach. From a practical viewpoint it is also a
> bad solution as users cannot even check the assertions.
>
>
>
>
>
> To be honest, I’m not sure that I understand your argument.  Are you
> saying that under your interpretation we could optimize “isnan(x+x) ->
> false”, but not “isnan(x) -> false”?
>
>
>
> The argument of `isnan(x+x)` is the result of an arithmetic operation. According
> to the meaning of -ffinite-math-only it cannot be NaN, so this call can
> be optimized out. In the general case the argument of `isnan(x)` may be, say,
> loaded from memory. A load is not an arithmetic operation, so nothing prevents it
> from yielding NaN. Optimizing the call out is dangerous in this case.
>
>
>
>
>
>
>
> --
>
> Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development
>
>
>
> *From:* Serge Pavlov <sepavloff at gmail.com>
> *Sent:* Thursday, September 9, 2021 11:10 AM
> *To:* Krzysztof Parzyszek <kparzysz at quicinc.com>
> *Cc:* Chris Lattner <clattner at nondot.org>; James Y Knight <
> jyknight at google.com>; LLVM Developers <llvm-dev at lists.llvm.org>;
> cfe-dev at lists.llvm.org
> *Subject:* Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Thu, Sep 9, 2021 at 8:30 PM Krzysztof Parzyszek via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> If we say that the fast-math flags are “enabling optimizations that the
> presence of nans otherwise prohibits”, then there is no reason for clang to
> keep calls to “isnan” around, or to keep checks like “fpclassify(x) ==
> it’s_a_nan” unfolded.  These are exactly the types of optimizations that
> the presence of NaNs would prohibit.
>
>
>
> Transformation 'x * 0 -> 0' is an optimization allowed in the absence of
> nans as arguments, because it produces a program that behaves identically
> under the given restrictions. Replacement of `isnan(x + x)` is also an
> optimization under the same restrictions. Replacement of `isnan(x)` in
> general case is not, because we cannot assume that x cannot be a NaN.
>
>
>
>
>
> I understand the need for having some NaN-handling preserved in an
> otherwise finite-math code.  We already have fast-math-related attributes
> attached to each function in the LLVM IR, so we could introduce a
> source-level attribute for enabling/disabling these flags per function.
>
>
>
> GCC allows using `#pragma GCC optimize ("finite-math-only")` or `#pragma
> GCC optimize ("no-finite-math-only")` to enable/disable the optimization on a
> per-function basis. Clang could support this pragma, or maybe `#pragma clang fp`
> can be extended to support similar functionality.
>
>
>
>
>
>
>
> --
>
> Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development
>
>
>
> *From:* cfe-dev <cfe-dev-bounces at lists.llvm.org> *On Behalf Of* Chris
> Lattner via cfe-dev
> *Sent:* Wednesday, September 8, 2021 5:51 PM
> *To:* James Y Knight <jyknight at google.com>
> *Cc:* LLVM Developers <llvm-dev at lists.llvm.org>; Clang Dev <
> cfe-dev at lists.llvm.org>
> *Subject:* Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Sep 8, 2021, at 3:27 PM, James Y Knight via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
> I expressed my strong support for this on the previous thread, but I'll
> just repost the most important piece...
>
>
>
> I believe the proposed semantics from the Clang level ought to be:
>
>   The -ffinite-math-only and -fno-signed-zeros options do not impact the
> ability to accurately load, store, copy, or pass or return such values from
> general function calls. They also do not impact any of the
> "non-computational" and "quiet-computational" IEEE-754 operations, which
> includes classification functions (fpclassify, signbit, isinf/isnan/etc),
> sign-modification (copysign, fabs, and negation `-(x)`), as well as
> the totalorder and totalordermag functions. Those correctly handle NaN,
> Inf, and signed zeros even when the flags are in effect. These flags *do* affect
> the behavior of other expressions and math standard-library calls, as well
> as comparison operations.
>
>
>
> FWIW, I completely agree - these flags are about enabling optimizations
> that the presence of nans otherwise prohibits.  We shouldn’t take a literal
> interpretation of an old GCC manual, as that would not be useful.
>
>
>
> If we converge on this definition, I think it should be documented.  This
> is a source of confusion that comes up periodically.
>
>
>
> -Chris
>
>
>
>
>
>
>
> I would not expect this to have an actual negative impact on the
> performance benefit of those flags, since the optimization benefits mainly
> arise from comparisons and the general computation instructions which are
> unchanged.
>
>
>
> In further support of this position, I note that the previous thread
> uncovered at least one vendor -- Apple (
> https://opensource.apple.com/source/Libm/Libm-2026/Source/Intel/math.h.auto.html)
> -- going out of their way to cause isnan and friends to function properly
> with -ffast-math enabled.
>
>
>
>
>
>
>
> On Wed, Sep 8, 2021 at 1:02 PM Serge Pavlov via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> Hi all,
>
>
>
> One of the purposes of `llvm::isnan` was to help preserve the check made
> by `isnan` if fast-math mode is
>
> specified (https://reviews.llvm.org/D104854). I'd like to describe the reason
> for that and propose to use the behavior
>
> implemented in that patch.
>
>
>
> The option `-ffast-math` is often used when performance is important, as
> it allows a compiler to generate faster code.
>
> This option itself is a collection of different optimization techniques,
> each having its own option. For this topic only the
>
> option `-ffinite-math-only` is of interest. With it the compiler treats
> floating point numbers as mathematical real numbers,
>
> so transformations like `0 * x -> 0` become valid.
>
>
>
> In clang documentation (
> https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffast-math) this
> option is described as:
>
>     "Allow floating-point optimizations that assume arguments and results
>     are not NaNs or +-Inf."
>
> GCC documentation (
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html) is a bit more
> concrete:
>
>     "Allow optimizations for floating-point arithmetic that assume that
>     arguments and results are not NaNs or +-Infs."
>
>
>
> **What is the issue?**
>
> C standard defines a macro `isnan`, which can be mapped to an intrinsic
> function provided by the compiler. For both
>
> clang and gcc it is `__builtin_isnan`. How should this function behave if
> `-ffinite-math-only` is specified? Should it make a
>
> real check or the compiler can assume that it always returns false?
>
> GCC optimizes out `isnan`. It follows from the viewpoint that (
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50724#c1):
>
>     "With -ffinite-math-only you are telling that there are no NaNs and
>     thus GCC optimizes isnan (x) to 0."
>
>
>
> Such treatment of `-ffinite-math-only` has significant drawbacks. In
> particular it makes it impossible to check validity of
>
> data: a user cannot write
>
>
>
> assert(!isnan(x));
>
>
>
> because the compiler replaces the actual function call with its expected
> value. There are many complaints in GCC bug
>
> tracker (for instance https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84949
> or https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50724)
>
> as well as in forums (
> https://stackoverflow.com/questions/47703436/isnan-does-not-work-correctly-with-ofast-flags
> or
> https://stackoverflow.com/questions/22931147/stdisinf-does-not-work-with-ffast-math-how-to-check-for-infinity).
> Proposed
>
> solutions are using integer operations to make the check, to turn off
> `-ffinite-math-only` in some parts of the code or to
>
> ensure that libc function is called. It clearly demonstrates that `isnan`
> in this case is useless, but users need its functionality
>
> and do not have a proper tool to make required checks. The similar
> direction was criticized in llvm as well (
> https://reviews.llvm.org/D18513#387418).
>
>
>
> **Why is imposing restrictions on floating types bad?**
>
> If `-ffinite-math-only` modifies properties of `double` type, several
> issues arise, for instance:
> - What should `std::numeric_limits<double>::has_quiet_NaN()` return?
> - What body should this function have if it is used in a program where
> some functions are compiled with `fast-math` and some without?
> - Should the inliner be prohibited from inlining a function compiled with
> `fast-math` into a function compiled without it?
> - Should `std::isnan(std::numeric_limits<float>::quiet_NaN())` be true?
>
> If the type `double` cannot have NaN value, it means that `double` and
> `double` under `-ffinite-math-only` are different types
>
> (https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html). Such
> a way can solve these problems but it is so expensive
>
> that it hardly has a chance to be realized.
>
>
>
> **The solution**
>
> Instead of modifying properties of floating point types, the effect of
> `-ffinite-math-only` can be expressed as a restriction on
>
> operation usage.  Actually clang and gcc documentation already follows
> this way. Fast-math flags in llvm IR also are attributes
>
> of instructions. The only question is whether `isnan` and similar
> functions are floating-point arithmetic.
>
> From a practical viewpoint, treating non-computational functions as
> arithmetic does not add any advantage. If a code extensively
>
> uses `isnan` (so could profit by their removal), it is likely it is not
> suitable for -ffinite-math-only. This interpretation however creates
>
> the problems described above. So it is profitable to consider `isnan` and
> similar functions as non-arithmetical.
>
>
>
> **Why is it safe to leave `isnan`?**
>
> The probable concern of this solution is deviation from gcc behavior.
> There are several reasons why this is not an issue.
>
> 1. -ffinite-math-only is an optimization option. A correct program
> compiled with -ffinite-math-only and without it should behave
>
>    identically, if conditions for using -ffinite-math-only are fulfilled.
> So making the check cannot break functionality.
> 2. `isnan` is implemented by libc, which can map it to a compiler builtin
> or use its own implementation, depending on
>
>    configuration options. `isnan` implemented in libc obviously always
> does the real check.
> 3. ICC and MSVC preserve `isnan` in fast-math mode.
>
>
>
> The proposal is to not consider `isnan` and other such functions as
> arithmetic operations and do not optimize them out
>
> just because -ffinite-math-only is specified. Of course, there are cases
> when `isnan` may be optimized out, for instance,
>
> `isnan(a + b)` may be optimized if -ffinite-math-only is in effect due to
> the assumption (result of arithmetic operation is not NaN).
>
> What are your opinions?
>
> Thanks,
> --Serge
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

James Y Knight via llvm-dev

2021-Sep-10 14:29 UTC


[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

On Thu, Sep 9, 2021, 8:59 PM Richard Smith via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Would it be reasonable to treat operations on Inf and NaN values as UB in
> this mode only if the same operation on a signaling NaN might signal?
> (Approximately, that'd mean we imagine these non-finite value encodings all
> encode sNaNs that are UB if they would signal.) That means the operations
> that ISO 60559 defines as non-computational or quiet-computational would be
> permitted to receive NaN and Inf as input and produce them as output, but
> that other computational operations would not.
>
> Per ISO 60559, the quiet-computational operations that I think are
> relevant to us are: copy, negate, abs, copySign, and conversions between
> encoding (eg, bitcast). The non-computational operations that I think are
> relevant to us are classification functions (including isNaN).
>
I'm in favor. (Perhaps unsurprisingly, as this is precisely the proposal I
made earlier, worded slightly differently. :)

Chris Tetreault via llvm-dev

2021-Sep-10 16:28 UTC


[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

I’m not super knowledgeable on the actual implementation of floating point math
in clang, but on the surface this seems fine. My position is that we should
provide no guarantees as to the behavior of code with NaN or infinity if
fast-math is enabled. We can go with this behavior, but we shouldn’t tell users
that they can rely on this behavior. Clang should have maximal freedom to
optimize floating point math with fast-math, and any constraint we place
potentially results in missed opportunities. Similarly we should feel free to
change this implementation in the future, the goal not being stability for users
who chose to rely on our implementation details. If users value reproducibility,
they should not be using fast math.

The only thing I think we should guarantee is that casts work. I should be able
to load some bytes from disk, cast the char array to a float array, and any NaNs
that I loaded from disk should not be clobbered. After that, I should be able
to cast an element of my float array back to another type and inspect the bit
pattern (assuming I did not transform that element in the array in any other way
after casting it from char) to support use cases like Serge’s. Any other
operation should be fair game.
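
A minimal sketch of the round trip described above, assuming a 32-bit IEEE-754 `float` and a made-up sentinel encoding (`kSentinelBits`); the helper names are hypothetical. Only `memcpy` and integer comparisons touch the values, so no floating-point operation ever sees the NaN.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical sentinel encoding: a quiet NaN with a payload of 1.
constexpr std::uint32_t kSentinelBits = 0x7fc00001u;

// Reinterpret raw bytes (e.g. read from disk) as floats. memcpy copies
// the encodings unchanged; no floating-point arithmetic is performed.
inline std::vector<float> floats_from_bytes(const unsigned char* data,
                                            std::size_t byte_count) {
    std::vector<float> out(byte_count / sizeof(float));
    if (!out.empty())
        std::memcpy(out.data(), data, out.size() * sizeof(float));
    return out;
}

// Cast an element back to an integer type and compare its bit pattern,
// using integer operations only.
inline bool is_sentinel(const float& f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u == kSentinelBits;
}
```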

Thanks,
   Chris Tetreault

From: Richard Smith <richard at metafoo.co.uk>
Sent: Thursday, September 9, 2021 5:59 PM
To: Chris Tetreault <ctetreau at quicinc.com>
Cc: Serge Pavlov <sepavloff at gmail.com>; llvm-dev at lists.llvm.org;
cfe-dev at lists.llvm.org
Subject: Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math
mode?


On Thu, 9 Sept 2021 at 13:55, Chris Tetreault via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
The point I was trying to make regarding the C++ standard is that fast-math is a
non-standard language extension.

-ffinite-math-only does not need to be a non-standard language extension.
Neither C nor C++ requires that floating-point types can represent infinity or
NaN, and we could define this flag as meaning that there are (notionally) simply
no such values in the relevant types. Of course, that's not actually
consistent with what we currently do, nor with what GCC does.

Would it be reasonable to treat operations on Inf and NaN values as UB in this
mode only if the same operation on a signaling NaN might signal? (Approximately,
that'd mean we imagine these non-finite value encodings all encode sNaNs
that are UB if they would signal.) That means the operations that ISO 60559
defines as non-computational or quiet-computational would be permitted to
receive NaN and Inf as input and produce them as output, but that other
computational operations would not.

Per ISO 60559, the quiet-computational operations that I think are relevant to
us are: copy, negate, abs, copySign, and conversions between encoding (eg,
bitcast). The non-computational operations that I think are relevant to us are
classification functions (including isNaN).

If you enable it, you should expect the compiler to diverge from the language
standard. I’m sure there’s precedent for this. If I write #pragma once at the
top of my header, and include it twice back to back, the preprocessor won’t
paste my header twice. Should #pragma once be removed because it breaks
#include?

Now, you have a real-world example that uses NaN as a sentinel value. In your
case, it would be nice if the compiler worked as you suggest. Now, suppose I
have a “safe matrix multiply”:

```
std::optional<MyMatrixT> safeMul(const MyMatrixT & lhs, const
MyMatrixT & rhs) {
  for (int i = 0; i < lhs.rows; ++i) {
    for (int j = 0; j < lhs.cols; ++j) {
      if (isnan(lhs[i][j])) {
        return {};
      }
    }
  }
  for (int i = 0; i < rhs.rows; ++i) {
    for (int j = 0; j < rhs.cols; ++j) {
      if (isnan(rhs[i][j])) {
        return {};
      }
    }
 }

  // do the multiply
}
```

In this case, if isnan(x) can be constant folded to false with fast-math
enabled, then these two loops can be completely eliminated since they are empty
and do nothing. If MyMatrixT is a 100 x 100 matrix, and/or safeMul is called in
a hot loop, this could be huge. What should I do instead here?

Really, it would be much more consistent if we apply the clang documentation for
fast-math “Operands to floating-point operations are not equal to NaN and Inf”
literally, and not actually implement “Operands to floating-point operations are
not equal to NaN and Inf, except in the case of isnan(), but only if the
argument to isnan() is a value stored in a variable and not an expression”. As
far as using isnan from the standard library compiled without fast-math vs a
compiler builtin, I don’t think this is an issue. Really, enabling fast-math is
basically telling the compiler “My code has no NaNs. I won’t try to do anything
with them, and you should optimize assuming they aren’t there”. If a developer
does their part, why should it matter to them that isnan() might work?

Thanks,
   Chris Tetreault



From: Serge Pavlov <sepavloff at gmail.com>
Sent: Thursday, September 9, 2021 11:27 AM
To: Chris Tetreault <ctetreau at quicinc.com>
Cc: Krzysztof Parzyszek <kparzysz at quicinc.com>; cfe-dev at lists.llvm.org;
llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math
mode?


On Fri, Sep 10, 2021 at 1:03 AM Chris Tetreault <ctetreau at
quicinc.com> wrote:
In this case, I think it’s perfectly reasonable to reinterpret_cast the floats
to uint32_t, and then inspect the bit pattern. Since NaN is being used as a
sentinel value, I assume it’s a known bit pattern, and not just any old NaN.

C standard defines a function to determine if a value is NaN. The fact that it
does not work in this case demonstrates that the optimization is incorrect.
Again, if isnan comes from libc implementation, it will work, but if it is
provided by the compiler, it does not. Users expect consistent behavior.

If NaNs are not prohibited at all in -ffinite-math-only mode, isnan must work as
specified in the standard.


I think it’s fine that fast-math renders isnan useless. As far as I know, the
C++ standard wasn’t written to account for compilers providing fast-math flags.
fast-math is itself a workaround for “IEEE floats do not behave like actual real
numbers”, so working around a workaround seems reasonable to me.

I feel you are right with fast-math as a workaround, but the compiler is a
practical tool and it must be convenient and suitable for a wide set of tasks.
The situation when a user has to invent workarounds because some optimization
changes semantics of a standard function is not good.

As for ffinite-math-only, it is actually more or less a safe mode. When we use
integer division, we know that the divisor must not be zero. The case of
ffinite-math-only is similar.


From: Serge Pavlov <sepavloff at gmail.com>
Sent: Thursday, September 9, 2021 10:34 AM
To: Chris Tetreault <ctetreau at quicinc.com>
Cc: Krzysztof Parzyszek <kparzysz at quicinc.com>; cfe-dev at lists.llvm.org;
llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math
mode?


Let me describe a real life example.

There is a realtime program that processes float values from a huge array.
Calculations do not produce NaNs and do not expect them. Using
-ffinite-math-only substantially speeds up the program, so it is highly
desirable to use it. The problem is that the array contains NaNs, they mark
elements that should not be processed.

An obvious solution is to check an element for NaN, and if it is not, process
it. Now there is no clean way to do so. Only workarounds, like using integer
arithmetic. The function 'isnan' became useless. And there are many
cases when users complain of this optimization.
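
For reference, the kind of integer-based workaround mentioned above might look like the following sketch. It assumes `float` is a 32-bit IEEE-754 binary32 type; `is_nan_bits` and `sum_unmarked` are hypothetical names, and the summation merely stands in for the real processing.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Integer-only NaN test for IEEE-754 binary32: exponent bits all ones
// and a non-zero mantissa. Assumes float is a 32-bit IEEE type.
inline bool is_nan_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "expects a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return (u & 0x7f800000u) == 0x7f800000u && (u & 0x007fffffu) != 0;
}

// Hypothetical processing step: sum every element not marked with NaN.
inline float sum_unmarked(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        if (is_nan_bits(data[i]))
            continue;  // marked element: skip it
        total += data[i];
    }
    return total;
}
```

The check never passes the NaN to a floating-point comparison, so it does not depend on how `isnan` is treated; the cost is that it hard-codes the encoding.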

Thanks,
--Serge


On Fri, Sep 10, 2021 at 12:09 AM Chris Tetreault <ctetreau at
quicinc.com> wrote:
If the issue is that users want their asserts to fire, then they should be
encouraged to only enable fast math in release builds.

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Serge Pavlov
via llvm-dev
Sent: Thursday, September 9, 2021 9:53 AM
To: Krzysztof Parzyszek <kparzysz at quicinc.com>
Cc: LLVM Developers <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org
Subject: Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math
mode?


On Thu, Sep 9, 2021 at 11:29 PM Krzysztof Parzyszek <kparzysz at
quicinc.com> wrote:
This goes back to what these options actually imply.  The interpretation that I
favor is “this code will never see a NaN”, or “the program can assume that no
floating point expression will evaluate to a NaN”.  The benefit of that is that
it’s intuitively clear.  In that case “isnan(x)” is false, because x cannot be a
NaN.  There is no distinction between “isnan(x+x)” and “isnan(x)”.  If the user
wants to preserve “isnan(x)”, they can apply some pragma (which clang may
actually have already).

That simplicity is only apparent. As the discussion on the GCC mailing list
demonstrated
(https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html), this is
actually an unpromising approach. From a practical viewpoint it is also a bad
solution, as users cannot even check their assertions.


To be honest, I’m not sure that I understand your argument.  Are you saying that
under your interpretation we could optimize “isnan(x+x) -> false”, but not
“isnan(x) -> false”?

The argument of `isnan(x+x)` is the result of an arithmetic operation.
According to the meaning of -ffinite-math-only, it cannot produce NaN, so this
call can be optimized out. In the general case the argument of `isnan(x)` may
be, say, loaded from memory. A load is not an arithmetic operation, so nothing
prevents it from yielding NaN. Optimizing the call out is dangerous in this
case.
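
The distinction can be sketched in code. The function names are hypothetical,
and the comments describe the interpretation proposed in this thread, not what
any particular compiler release is guaranteed to do.

```cpp
#include <cmath>

// The argument is the result of an arithmetic operation. Under the proposed
// reading of -ffinite-math-only the compiler may assume it is not NaN and
// fold the whole check to false.
inline bool sum_is_nan(float x) {
    return std::isnan(x + x);
}

// The argument is only loaded from memory; no arithmetic produced it, so
// under the proposed reading the check has to be performed for real.
inline bool loaded_is_nan(const float* p) {
    return std::isnan(*p);
}
```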



--
Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development

From: Serge Pavlov <sepavloff at gmail.com>
Sent: Thursday, September 9, 2021 11:10 AM
To: Krzysztof Parzyszek <kparzysz at quicinc.com>
Cc: Chris Lattner <clattner at nondot.org>; James Y Knight
<jyknight at google.com>; LLVM Developers <llvm-dev at lists.llvm.org>;
cfe-dev at lists.llvm.org
Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in fast-math
mode?


On Thu, Sep 9, 2021 at 8:30 PM Krzysztof Parzyszek via cfe-dev
<cfe-dev at lists.llvm.org> wrote:
If we say that the fast-math flags are “enabling optimizations that the presence
of nans otherwise prohibits”, then there is no reason for clang to keep calls to
“isnan” around, or to keep checks like “fpclassify(x) == it’s_a_nan” unfolded. 
These are exactly the types of optimizations that the presence of NaNs would
prohibit.

The transformation 'x * 0 -> 0' is an optimization allowed in the absence
of NaNs as arguments, because it produces a program that behaves identically
under the given restrictions. Replacing `isnan(x + x)` is also an
optimization under the same restrictions. Replacing `isnan(x)` in the general
case is not, because we cannot assume that x is not a NaN.
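
To make the restriction concrete, here is a small sketch (hypothetical helper
names, compiled without any fast-math flags) showing the inputs for which
folding 'x * 0 -> 0' would change the result under ordinary IEEE-754
semantics:

```cpp
#include <cmath>

// Computes x * 0 with ordinary IEEE-754 semantics (no fast-math flags assumed).
inline float times_zero(float x) {
    return x * 0.0f;
}

// Hypothetical helper: true when x * 0 is NaN, i.e. when folding
// `x * 0 -> 0` would change the observable result.
inline bool folding_would_change_result(float x) {
    return std::isnan(times_zero(x));
}
```

For finite inputs the product is a zero, so the fold is harmless there (sign of
zero aside); for NaN and infinite inputs the product is NaN, which is why the
fold needs the "no NaNs or Infs as arguments" assumption.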


I understand the need for having some NaN-handling preserved in an otherwise
finite-math code.  We already have fast-math-related attributes attached to each
function in the LLVM IR, so we could introduce a source-level attribute for
enabling/disabling these flags per function.

GCC allows using `#pragma GCC optimize ("finite-math-only")` or
`#pragma GCC optimize ("no-finite-math-only")` to enable/disable the
optimization on a per-function basis. Clang could support this pragma, or
maybe `#pragma clang fp` could be extended to provide similar functionality.
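
A sketch of how the GCC pragma might be applied to a single checking function.
The name `checked_isnan` is hypothetical, the non-GCC branch is just a
fallback, and whether the check actually survives should be verified on the
compiler version in use.

```cpp
#include <cmath>

#if defined(__GNUC__) && !defined(__clang__)
// GCC only: switch finite-math-only off for the functions defined between
// push_options and pop_options, even if the file is built with
// -ffinite-math-only.
#pragma GCC push_options
#pragma GCC optimize ("no-finite-math-only")
bool checked_isnan(double x) {
    return __builtin_isnan(x);
}
#pragma GCC pop_options
#else
// Other compilers: plain check, subject to whatever flags are in effect.
bool checked_isnan(double x) {
    return std::isnan(x);
}
#endif
```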



--
Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development

From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Chris Lattner
via cfe-dev
Sent: Wednesday, September 8, 2021 5:51 PM
To: James Y Knight <jyknight at google.com>
Cc: LLVM Developers <llvm-dev at lists.llvm.org>; Clang Dev
<cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in fast-math
mode?


On Sep 8, 2021, at 3:27 PM, James Y Knight via llvm-dev
<llvm-dev at lists.llvm.org> wrote:

I expressed my strong support for this on the previous thread, but I'll just
repost the most important piece...

I believe the proposed semantics from the Clang level ought to be:
  The -ffinite-math-only and -fno-signed-zeros options do not impact the ability
to accurately load, store, copy, or pass or return such values from general
function calls. They also do not impact any of the "non-computational"
and "quiet-computational" IEEE-754 operations, which includes
classification functions (fpclassify, signbit, isinf/isnan/etc),
sign-modification (copysign, fabs, and negation `-(x)`), as well as the
totalorder and totalordermag functions. Those correctly handle NaN, Inf, and
signed zeros even when the flags are in effect. These flags do affect the
behavior of other expressions and math standard-library calls, as well as
comparison operations.
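
As an illustration of the operations listed above, the following sketch uses
only classification and sign-modification functions on non-finite values. The
type `FpClass` and the helpers are hypothetical; under the proposed semantics
these would keep working with the flags in effect, while the expectations
stated here were only reasoned about for a build without fast-math flags.

```cpp
#include <cmath>

// Hypothetical summary of the classification results for one value.
struct FpClass {
    bool is_nan;
    bool is_inf;
    bool negative;
};

// Classification only: no arithmetic is performed on v.
inline FpClass classify(double v) {
    return FpClass{std::isnan(v), std::isinf(v), std::signbit(v)};
}

// Sign modification: changes only the sign bit, so a NaN stays a NaN.
inline double flip_sign(double v) {
    return std::copysign(v, std::signbit(v) ? 1.0 : -1.0);
}
```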

FWIW, I completely agree - these flags are about enabling optimizations that the
presence of nans otherwise prohibits.  We shouldn’t take a literal
interpretation of an old GCC manual, as that would not be useful.

If we converge on this definition, I think it should be documented.  This is a
source of confusion that comes up periodically.

-Chris



I would not expect this to have an actual negative impact on the performance
benefit of those flags, since the optimization benefits mainly arise from
comparisons and the general computation instructions which are unchanged.

In further support of this position, I note that the previous thread uncovered
at least one vendor -- Apple
(https://opensource.apple.com/source/Libm/Libm-2026/Source/Intel/math.h.auto.html)
-- going out of their way to cause isnan and friends to function properly with
-ffast-math enabled.



On Wed, Sep 8, 2021 at 1:02 PM Serge Pavlov via cfe-dev
<cfe-dev at lists.llvm.org> wrote:
Hi all,

One of the purposes of `llvm::isnan` was to help preserve the check made by
`isnan` if fast-math mode is specified (https://reviews.llvm.org/D104854).
I'd like to describe the reason for that and propose using the behavior
implemented in that patch.

The option `-ffast-math` is often used when performance is important, as it
allows a compiler to generate faster code.
This option itself is a collection of different optimization techniques, each
having its own option. For this topic only the
option `-ffinite-math-only` is of interest. With it the compiler treats floating
point numbers as mathematical real numbers,
so transformations like `0 * x -> 0` become valid.

In clang documentation
(https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffast-math) this option
is described as:

    "Allow floating-point optimizations that assume arguments and results
are not NaNs or +-Inf."

GCC documentation (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html) is
a bit more concrete:

    "Allow optimizations for floating-point arithmetic that assume that
arguments and results are not NaNs or +-Infs."

**What is the issue?**

The C standard defines a macro `isnan`, which can be mapped to an intrinsic
function provided by the compiler. For both clang and gcc it is
`__builtin_isnan`. How should this function behave if `-ffinite-math-only` is
specified? Should it make a real check, or can the compiler assume that it
always returns false?

GCC optimizes out `isnan`. It follows from the viewpoint that
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50724#c1):

    "With -ffinite-math-only you are telling that there are no NaNs and
thus GCC optimizes isnan (x) to 0."

Such treatment of `-ffinite-math-only` has significant drawbacks. In
particular, it makes it impossible to check the validity of data: a user
cannot write

assert(!isnan(x));

because the compiler replaces the actual function call with its expected value.
There are many complaints in the GCC bug tracker (for instance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84949 or
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50724) as well as in forums
(https://stackoverflow.com/questions/47703436/isnan-does-not-work-correctly-with-ofast-flags
or
https://stackoverflow.com/questions/22931147/stdisinf-does-not-work-with-ffast-math-how-to-check-for-infinity).
The proposed solutions are to use integer operations to make the check, to
turn off `-ffinite-math-only` in some parts of the code, or to ensure that the
libc function is called. This clearly demonstrates that `isnan` is useless in
this case, yet users need its functionality and do not have a proper tool to
make the required checks. A similar direction was criticized in LLVM as well
(https://reviews.llvm.org/D18513#387418).

**Why is imposing restrictions on floating types bad?**

If `-ffinite-math-only` modifies the properties of the `double` type, several
issues arise, for instance:
- What should the value of `std::numeric_limits<double>::has_quiet_NaN` be?
- How should it be defined if it is used in a program where some functions
are compiled with `fast-math` and some without?
- Should the inliner be prohibited from inlining a function compiled with
`fast-math` into a function compiled without it?
- Should `std::isnan(std::numeric_limits<float>::quiet_NaN())` be true?
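
The first and last questions can be written down directly. This is only a
sketch: the names are hypothetical, and the stated results assume an IEEE-754
target built without fast-math flags; what they should be with the flag is
exactly what is being asked.

```cpp
#include <cmath>
#include <limits>

// has_quiet_NaN is a compile-time constant attached to the type, not to an
// individual operation.
constexpr bool kDoubleHasQuietNaN = std::numeric_limits<double>::has_quiet_NaN;

// The last question above, as code. Without fast-math flags this is true.
inline bool quiet_nan_is_nan() {
    return std::isnan(std::numeric_limits<float>::quiet_NaN());
}
```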

If the type `double` cannot have a NaN value, it means that `double` and
`double` under `-ffinite-math-only` are different types
(https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html). Such an
approach can solve these problems, but it is so expensive that it hardly has a
chance of being implemented.

**The solution**

Instead of modifying properties of floating point types, the effect of
`-ffinite-math-only` can be expressed as a restriction on operation usage.
Actually, the clang and gcc documentation already takes this approach.
Fast-math flags in LLVM IR are also attributes of instructions. The only
question is whether `isnan` and similar functions are floating-point
arithmetic.

From a practical viewpoint, treating non-computational functions as arithmetic
does not add any advantage. If code uses `isnan` extensively (and so could
profit from its removal), it is likely not suitable for -ffinite-math-only.
This interpretation, however, creates the problems described above. So it is
beneficial to consider `isnan` and similar functions as non-arithmetic.

**Why is it safe to leave `isnan`?**

A probable concern with this solution is deviation from gcc behavior. There are
several reasons why this is not an issue.

1. -ffinite-math-only is an optimization option. A correct program compiled
   with -ffinite-math-only and without it should behave identically, if the
   conditions for using -ffinite-math-only are fulfilled. So making the check
   cannot break functionality.
2. `isnan` is implemented by libc, which can map it to a compiler builtin or
   use its own implementation, depending on configuration options. `isnan`
   implemented in libc obviously always does the real check.
3. ICC and MSVC preserve `isnan` in fast-math mode.

The proposal is not to consider `isnan` and other such functions as arithmetic
operations, and not to optimize them out just because -ffinite-math-only is
specified. Of course, there are cases when `isnan` may be optimized out; for
instance, `isnan(a + b)` may be optimized if -ffinite-math-only is in effect,
due to the assumption that the result of an arithmetic operation is not NaN.

What are your opinions?
Thanks,
--Serge
_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


Serge Pavlov via llvm-dev

2021-Sep-10 17:41 UTC

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

If clang does not remove `__builtin_isnan` in `-ffinite-math-only` mode and
a user wants calls to `isnan` to be optimized out, they can do it in
literally a couple of lines:

#undef isnan
#define isnan(x) false

If clang optimizes out `__builtin_isnan` and a user wants to check whether some
float is NaN, they have no appropriate way to do that, only hacks and kludges.

The approach under which -ffinite-math-only means "there are no NaNs" is too
rigid: it prevents several coding techniques, does not provide additional
optimization possibilities and provokes user complaints.

Thanks,
--Serge


On Fri, Sep 10, 2021 at 11:28 PM Chris Tetreault <ctetreau at quicinc.com>
wrote:
> I’m not super knowledgeable on the actual implementation of floating point
> math in clang, but on the surface this seems fine. My position is that we
> should provide no guarantees as to the behavior of code with NaN or
> infinity if fast-math is enabled. We can go with this behavior, but we
> shouldn’t tell users that they can rely on this behavior. Clang should have
> maximal freedom to optimize floating point math with fast-math, and any
> constraint we place potentially results in missed opportunities. Similarly
> we should feel free to change this implementation in the future, the goal
> not being stability for users who chose to rely on our implementation
> details. If users value reproducibility, they should not be using fast
> math.
>
>
>
> The only thing I think we should guarantee is that casts work. I should be
> able to load some bytes from disk, cast the char array to a float array,
> and any NaNs that I loaded from disk should not be clobbered. After that,
> I should be able to cast an element of my float array back to another
> type and inspect the bit pattern (assuming I did not transform that element
> in the array in any other way after casting it from char) to support use
> cases like Serge’s. Any other operation should be fair game.
>
>
>
> Thanks,
>
>    Chris Tetreault
>
>
>
> *From:* Richard Smith <richard at metafoo.co.uk>
> *Sent:* Thursday, September 9, 2021 5:59 PM
> *To:* Chris Tetreault <ctetreau at quicinc.com>
> *Cc:* Serge Pavlov <sepavloff at gmail.com>; llvm-dev at lists.llvm.org;
> cfe-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Thu, 9 Sept 2021 at 13:55, Chris Tetreault via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> The point I was trying to make regarding the C++ standard is that
> fast-math is a non-standard language extension.
>
>
>
> -ffinite-math-only does not need to be a non-standard language extension.
> Neither C nor C++ requires that floating-point types can represent infinity
> or NaN, and we could define this flag as meaning that there are
> (notionally) simply no such values in the relevant types. Of course, that's
> not actually consistent with what we currently do, nor with what GCC does.
>
>
>
> Would it be reasonable to treat operations on Inf and NaN values as UB in
> this mode only if the same operation on a signaling NaN might signal?
> (Approximately, that'd mean we imagine these non-finite value encodings all
> encode sNaNs that are UB if they would signal.) That means the operations
> that ISO 60559 defines as non-computational or quiet-computational would be
> permitted to receive NaN and Inf as input and produce them as output, but
> that other computational operations would not.
>
>
>
> Per ISO 60559, the quiet-computational operations that I think are
> relevant to us are: copy, negate, abs, copySign, and conversions between
> encoding (eg, bitcast). The non-computational operations that I think are
> relevant to us are classification functions (including isNaN).
>
>
>
> If you enable it, you should expect the compiler to diverge from the
> language standard. I’m sure there’s precedent for this. If I write #pragma
> once at the top of my header, and include it twice back to back, the
> preprocessor won’t paste my header twice. Should #pragma once be removed
> because it breaks #include?
>
>
>
> Now, you have a real-world example that uses NaN as a sentinel value. In
> your case, it would be nice if the compiler worked as you suggest. Now,
> suppose I have a “safe matrix multiply”:
>
>
>
> ```
> std::optional<MyMatrixT> safeMul(const MyMatrixT &lhs, const MyMatrixT &rhs) {
>   for (int i = 0; i < lhs.rows; ++i) {
>     for (int j = 0; j < lhs.cols; ++j) {
>       if (isnan(lhs[i][j])) {
>         return {};
>       }
>     }
>   }
>   for (int i = 0; i < rhs.rows; ++i) {
>     for (int j = 0; j < rhs.cols; ++j) {
>       if (isnan(rhs[i][j])) {
>         return {};
>       }
>     }
>   }
>
>   // do the multiply
> }
> ```
>
>
>
> In this case, if isnan(x) can be constant folded to false with fast-math
> enabled, then these two loops can be completely eliminated since they are
> empty and do nothing. If MyMatrixT is a 100 x 100 matrix, and/or safeMul is
> called in a hot loop, this could be huge. What should I do instead here?
>
>
>
> Really, it would be much more consistent if we apply the clang
> documentation for fast-math “Operands to floating-point operations are not
> equal to NaN and Inf” literally, and not actually implement “Operands to
> floating-point operations are not equal to NaN and Inf, except in the case
> of isnan(), but only if the argument to isnan() is a value stored in a
> variable and not an expression”. As far as using isnan from the standard
> library compiled without fast-math vs a compiler builtin, I don’t think
> this is an issue. Really, enabling fast-math is basically telling the
> compiler “My code has no NaNs. I won’t try to do anything with them, and
> you should optimize assuming they aren’t there”. If a developer does their
> part, why should it matter to them that isnan() might work?
>
>
>
> Thanks,
>
>    Chris Tetreault
>
>
>
>
>
>
>
> *From:* Serge Pavlov <sepavloff at gmail.com>
> *Sent:* Thursday, September 9, 2021 11:27 AM
> *To:* Chris Tetreault <ctetreau at quicinc.com>
> *Cc:* Krzysztof Parzyszek <kparzysz at quicinc.com>;
> cfe-dev at lists.llvm.org; llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Fri, Sep 10, 2021 at 1:03 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> In this case, I think it’s perfectly reasonable to reinterpret_cast the
> floats to uint32_t, and then inspect the bit pattern. Since NaN is being
> used as a sentinel value, I assume it’s a known bit pattern, and not just
> any old NaN.
>
>
>
> C standard defines a function to determine if a value is NaN. The fact
> that it does not work in this case demonstrates that the optimization is
> incorrect. Again, if isnan comes from libc implementation, it will work,
> but if it is provided by the compiler, it does not. Users expect consistent
> behavior.
>
>
>
> If NaNs are not prohibited at all in -ffinite-math-only mode, isnan must
> work as specified in the standard.
>
>
>
>
>
> I think it’s fine that fast-math renders isnan useless. As far as I know,
> the C++ standard wasn’t written to account for compilers providing
> fast-math flags. fast-math is itself a workaround for “IEEE floats do not
> behave like actual real numbers”, so working around a workaround seems
> reasonable to me.
>
>
>
> I feel you are right with fast-math as a workaround, but the compiler is a
> practical tool and it must be convenient and suitable for a wide set of
> tasks. The situation when a user has to invent workarounds because some
> optimization changes semantics of a standard function is not good.
>
>
>
> As for ffinite-math-only, it is actually more or less a safe mode. When we
> use integer division, we know that the divisor must not be zero. The case
> of ffinite-math-only is similar.
>
>
>
>
>
> *From:* Serge Pavlov <sepavloff at gmail.com>
> *Sent:* Thursday, September 9, 2021 10:34 AM
> *To:* Chris Tetreault <ctetreau at quicinc.com>
> *Cc:* Krzysztof Parzyszek <kparzysz at quicinc.com>;
> cfe-dev at lists.llvm.org; llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> Let me describe a real life example.
>
>
>
> There is a realtime program that processes float values from a huge array.
> Calculations do not produce NaNs and do not expect them. Using
> -ffinite-math-only substantially speeds up the program, so it is highly
> desirable to use it. The problem is that the array contains NaNs, which mark
> elements that should not be processed.
>
>
>
> An obvious solution is to check each element for NaN and process it only if
> it is not one. Currently there is no clean way to do so, only workarounds
> such as integer arithmetic. The function 'isnan' has become useless, and
> there are many cases of users complaining about this optimization.
>
>
>
> Thanks,
>
> --Serge
>
>
>
>
>
> On Fri, Sep 10, 2021 at 12:09 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> If the issue is that users want their asserts to fire, then they should be
> encouraged to only enable fast math in release builds.
>
>
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of* Serge
> Pavlov via llvm-dev
> *Sent:* Thursday, September 9, 2021 9:53 AM
> *To:* Krzysztof Parzyszek <kparzysz at quicinc.com>
> *Cc:* LLVM Developers <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Thu, Sep 9, 2021 at 11:29 PM Krzysztof Parzyszek
> <kparzysz at quicinc.com> wrote:
>
> This goes back to what these options actually imply.  The interpretation
> that I favor is “this code will never see a NaN”, or “the program can
> assume that no floating point expression will evaluate to a NaN”.  The
> benefit of that is that it’s intuitively clear.  In that case “isnan(x)” is
> false, because x cannot be a NaN.  There is no distinction between
> “isnan(x+x)” and “isnan(x)”.  If the user wants to preserve “isnan(x)”,
> they can apply some pragma (which clang may actually have already).
>
>
>
> That simplicity is only apparent. As the discussion on the GCC mailing list
> demonstrated
> (https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html), this
> is actually an unpromising approach. From a practical viewpoint it is also a
> bad solution, as users cannot even check their assertions.
>
>
>
>
>
> To be honest, I’m not sure that I understand your argument.  Are you
> saying that under your interpretation we could optimize “isnan(x+x) ->
> false”, but not “isnan(x) -> false”?
>
>
>
> The argument of `isnan(x+x)` is the result of an arithmetic operation.
> According to the meaning of -ffinite-math-only, it cannot produce NaN, so
> this call can be optimized out. In the general case the argument of
> `isnan(x)` may be, say, loaded from memory. A load is not an arithmetic
> operation, so nothing prevents it from yielding NaN. Optimizing the call out
> is dangerous in this case.
>
>
>
>
>
>
>
> --
>
> Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development
>
>
>
> *From:* Serge Pavlov <sepavloff at gmail.com>
> *Sent:* Thursday, September 9, 2021 11:10 AM
> *To:* Krzysztof Parzyszek <kparzysz at quicinc.com>
> *Cc:* Chris Lattner <clattner at nondot.org>; James Y Knight
> <jyknight at google.com>; LLVM Developers <llvm-dev at lists.llvm.org>;
> cfe-dev at lists.llvm.org
> *Subject:* Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Thu, Sep 9, 2021 at 8:30 PM Krzysztof Parzyszek via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> If we say that the fast-math flags are “enabling optimizations that the
> presence of nans otherwise prohibits”, then there is no reason for clang to
> keep calls to “isnan” around, or to keep checks like “fpclassify(x) ==
> it’s_a_nan” unfolded.  These are exactly the types of optimizations that
> the presence of NaNs would prohibit.
>
>
>
> The transformation 'x * 0 -> 0' is an optimization allowed in the absence of
> NaNs as arguments, because it produces a program that behaves identically
> under the given restrictions. Replacing `isnan(x + x)` is also an
> optimization under the same restrictions. Replacing `isnan(x)` in the
> general case is not, because we cannot assume that x is not a NaN.
>
>
>
>
>
> I understand the need for having some NaN-handling preserved in an
> otherwise finite-math code.  We already have fast-math-related attributes
> attached to each function in the LLVM IR, so we could introduce a
> source-level attribute for enabling/disabling these flags per function.
>
>
>
> GCC allows using `#pragma GCC optimize ("finite-math-only")` or
> `#pragma GCC optimize ("no-finite-math-only")` to enable/disable the
> optimization on a per-function basis. Clang could support this pragma, or
> maybe `#pragma clang fp` could be extended to provide similar functionality.
>
>
>
>
>
>
>
> --
>
> Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development
>
>
>
> *From:* cfe-dev <cfe-dev-bounces at lists.llvm.org> *On Behalf Of* Chris
> Lattner via cfe-dev
> *Sent:* Wednesday, September 8, 2021 5:51 PM
> *To:* James Y Knight <jyknight at google.com>
> *Cc:* LLVM Developers <llvm-dev at lists.llvm.org>; Clang Dev <
> cfe-dev at lists.llvm.org>
> *Subject:* Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Sep 8, 2021, at 3:27 PM, James Y Knight via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
> I expressed my strong support for this on the previous thread, but I'll
> just repost the most important piece...
>
>
>
> I believe the proposed semantics from the Clang level ought to be:
>
>   The -ffinite-math-only and -fno-signed-zeros options do not impact the
> ability to accurately load, store, copy, or pass or return such values from
> general function calls. They also do not impact any of the
> "non-computational" and "quiet-computational" IEEE-754 operations, which
> includes classification functions (fpclassify, signbit, isinf/isnan/etc),
> sign-modification (copysign, fabs, and negation `-(x)`), as well as
> the totalorder and totalordermag functions. Those correctly handle NaN,
> Inf, and signed zeros even when the flags are in effect. These flags *do*
> affect the behavior of other expressions and math standard-library calls, as
> well as comparison operations.
>
>
>
> FWIW, I completely agree - these flags are about enabling optimizations
> that the presence of nans otherwise prohibits.  We shouldn’t take a literal
> interpretation of an old GCC manual, as that would not be useful.
>
>
>
> If we converge on this definition, I think it should be documented.  This
> is a source of confusion that comes up periodically.
>
>
>
> -Chris
>
>
>
>
>
>
>
> I would not expect this to have an actual negative impact on the
> performance benefit of those flags, since the optimization benefits mainly
> arise from comparisons and the general computation instructions which are
> unchanged.
>
>
>
> In further support of this position, I note that the previous thread
> uncovered at least one vendor -- Apple
> (https://opensource.apple.com/source/Libm/Libm-2026/Source/Intel/math.h.auto.html)
> -- going out of their way to cause isnan and friends to function properly
> with -ffast-math enabled.
>
>
>
>
>
>
>
> On Wed, Sep 8, 2021 at 1:02 PM Serge Pavlov via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> Hi all,
>
>
>
> One of the purposes of `llvm::isnan` was to help preserve the check made
> by `isnan` if fast-math mode is specified
> (https://reviews.llvm.org/D104854). I'd like to describe the reason for that
> and propose using the behavior implemented in that patch.
>
>
>
> The option `-ffast-math` is often used when performance is important, as
> it allows a compiler to generate faster code.
>
> This option itself is a collection of different optimization techniques,
> each having its own option. For this topic only the
>
> option `-ffinite-math-only` is of interest. With it the compiler treats
> floating point numbers as mathematical real numbers,
>
> so transformations like `0 * x -> 0` become valid.
>
>
>
> In clang documentation (
> https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffast-math) this
> option is described as:
>
>     "Allow floating-point optimizations that assume arguments and results
> are not NaNs or +-Inf."
>
> GCC documentation (
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html) is a bit more
> concrete:
>
>     "Allow optimizations for floating-point arithmetic that assume that
> arguments and results are not NaNs or +-Infs."
>
>
>
> **What is the issue?**
>
> The C standard defines a macro `isnan`, which can be mapped to an intrinsic
> function provided by the compiler. For both clang and gcc it is
> `__builtin_isnan`. How should this function behave if `-ffinite-math-only`
> is specified? Should it make a real check, or can the compiler assume that
> it always returns false?
>
> GCC optimizes out `isnan`. It follows from the viewpoint that (
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50724#c1):
>
>     "With -ffinite-math-only you are telling that there are no NaNs and
> thus GCC optimizes isnan (x) to 0."
>
>
>
> Such treatment of `-ffinite-math-only` has significant drawbacks. In
> particular, it makes it impossible to check the validity of data: a user
> cannot write
>
>
>
> assert(!isnan(x));
>
>
>
> because the compiler replaces the actual function call with its expected
> value. There are many complaints in the GCC bug tracker (for instance
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84949 or
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50724) as well as in forums (
> https://stackoverflow.com/questions/47703436/isnan-does-not-work-correctly-with-ofast-flags
> or
> https://stackoverflow.com/questions/22931147/stdisinf-does-not-work-with-ffast-math-how-to-check-for-infinity).
> The proposed solutions are to use integer operations to make the check, to
> turn off `-ffinite-math-only` in some parts of the code, or to ensure that
> the libc function is called. This clearly demonstrates that `isnan` is
> useless in this case, yet users need its functionality and do not have a
> proper tool to make the required checks. A similar direction was criticized
> in LLVM as well (https://reviews.llvm.org/D18513#387418).
>
>
>
> **Why is imposing restrictions on floating types bad?**
>
> If `-ffinite-math-only` modifies the properties of the `double` type,
> several issues arise, for instance:
> - What should the value of `std::numeric_limits<double>::has_quiet_NaN` be?
> - How should it be defined if it is used in a program where some functions
> are compiled with `fast-math` and some without?
> - Should the inliner be prohibited from inlining a function compiled with
> `fast-math` into a function compiled without it?
> - Should `std::isnan(std::numeric_limits<float>::quiet_NaN())` be true?
>
> If the type `double` cannot have a NaN value, it means that `double` and
> `double` under `-ffinite-math-only` are different types
> (https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html). Such
> an approach can solve these problems, but it is so expensive that it hardly
> has a chance of being implemented.
>
>
>
> **The solution**
>
> Instead of modifying the properties of floating-point types, the effect
> of `-ffinite-math-only` can be expressed as a restriction on operation
> usage. In fact, the clang and gcc documentation already follows this
> approach. Fast-math flags in LLVM IR are also attributes of instructions.
> The only question is whether `isnan` and similar functions are
> floating-point arithmetic.
>
> From a practical viewpoint, treating non-computational functions as
> arithmetic does not bring any advantage. If code uses `isnan` extensively
> (and so could profit from its removal), it is likely not suitable for
> `-ffinite-math-only`. This interpretation, however, creates the problems
> described above. So it is beneficial to consider `isnan` and similar
> functions as non-arithmetic.
>
>
>
> **Why is it safe to leave `isnan`?**
>
> The probable concern with this solution is the deviation from gcc
> behavior. There are several reasons why this is not an issue.
>
> 1. `-ffinite-math-only` is an optimization option. A correct program
>    compiled with and without `-ffinite-math-only` should behave
>    identically if the conditions for using `-ffinite-math-only` are
>    fulfilled. So performing the check cannot break functionality.
> 2. `isnan` is implemented by libc, which can map it to a compiler builtin
>    or use its own implementation, depending on configuration options.
>    `isnan` implemented in libc obviously always does the real check.
> 3. ICC and MSVC preserve `isnan` in fast-math mode.
>
>
>
> The proposal is to not consider `isnan` and other such functions as
> arithmetic operations and to not optimize them out just because
> `-ffinite-math-only` is specified. Of course, there are cases when
> `isnan` may be optimized out; for instance, `isnan(a + b)` may be
> optimized if `-ffinite-math-only` is in effect, due to the assumption
> that the result of an arithmetic operation is not NaN.
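
The distinction drawn in the proposal can be illustrated with two small functions. This is only a sketch of the intended semantics as described above, with hypothetical function names; it does not describe what any particular compiler version actually does.

```cpp
#include <cmath>

// Hypothetical example: the argument arrives from outside (memory, a file,
// another translation unit). Under the proposal, this classification call
// would be kept even when -ffinite-math-only is in effect.
inline bool input_is_nan(double x) {
    return std::isnan(x);
}

// Hypothetical example: the argument is the result of an arithmetic
// operation. Under -ffinite-math-only the compiler may assume that
// a + b is not NaN, so this call could still be folded to false.
inline bool sum_is_nan(double a, double b) {
    return std::isnan(a + b);
}
```

Compiled without the flag, both functions perform a real check. The proposal only changes the first case, where the value being tested is not produced by an operation that carries the no-NaN assumption.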
>
> What are your opinions?
>
> Thanks,
> --Serge
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev