thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode? [Sep 2021]

Serge Pavlov via llvm-dev

2021-Sep-14 14:22 UTC

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

On Tue, Sep 14, 2021 at 8:21 PM Krzysztof Parzyszek <kparzysz at quicinc.com> wrote:
> If `has_nan` returns "true", it means that the explanation "there are no
> NaNs" does not work anymore and something more complex is needed to explain
> the effect of the option. In this case it is difficult to say that this
> approach is "intuitively clear".
>
>
>
> If your program has “x = *p”, it means that at this point p is never a
> null pointer.  Does this imply that the type of p can no longer represent a
> null pointer?
>
Good example! If you use integer division `r = a / b`, you promise that `b`
is not zero. However, that does not mean that a preceding check `b == 0` may
be optimized to `false`.

The statement "there are no NaNs" means that the properties of type `float`
are modified so that NaN is no longer an allowed value of it. In that case it
is allowed to optimize out `isnan`. If the guarantee is only that NaN
cannot be an argument of an arithmetic operation, NaN is still a valid
value of `float` and `isnan` cannot be replaced with `false`.
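
The analogy can be sketched in code. This is only an illustration, not code from the thread, and `checked_div` and `checked_scale` are hypothetical names. In the first function the zero check guards the division, so the division is reached only when `b` is nonzero. The second function has the same shape for NaN; whether a compiler running with `-ffinite-math-only` may fold its `isnan` call to `false` is the point under discussion.

```cpp
#include <cmath>

// Hypothetical helper: the zero check precedes the division, and the
// division is only reached when b != 0.
int checked_div(int a, int b, int fallback) {
    if (b == 0)
        return fallback;
    return a / b;
}

// Hypothetical floating-point analogue: the NaN check precedes the
// arithmetic. Under -ffinite-math-only a compiler may treat the check
// as always false; built without that option it behaves as written.
float checked_scale(float x, float k, float fallback) {
    if (std::isnan(x))
        return fallback;
    return x * k;
}
```

Built without fast-math options, both functions return the fallback for the "bad" input and the plain result otherwise.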

>
> --
>
> Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development
>
>
>
> *From:* cfe-dev <cfe-dev-bounces at lists.llvm.org> *On Behalf Of* Serge
> Pavlov via cfe-dev
> *Sent:* Tuesday, September 14, 2021 7:04 AM
> *To:* Chris Tetreault <ctetreau at quicinc.com>
> *Cc:* llvm-dev at lists.llvm.org; cfe-dev at lists.llvm.org
> *Subject:* Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Mon, Sep 13, 2021 at 9:03 PM Krzysztof Parzyszek <kparzysz at quicinc.com>
> wrote:
>
> If the compiler provides “isnan”, the user can’t redefine it.
> Redefining/undefining any function or a macro provided by a compiler is UB.
>
>
>
> Actually it does not matter. This is needed only to emulate the "old"
> behavior, which itself breaks the standard.
>
>
>
> The “old” behavior can be tuned with #pragmas to restore the functionality
> of NaNs where needed.
>
>
>
> Did you mean `#pragma GCC optimize("ffinite-math-only")`? Clang does not
> support it.
>
>
>
> The “old” behavior doesn’t have a problem with “has_nan”---it returns
> “true”.  What other issues are there?
>
>
>
> If `has_nan` returns "true", it means that the explanation "there are no
> NaNs" does not work anymore and something more complex is needed to explain
> the effect of the option. In this case it is difficult to say that this
> approach is "intuitively clear".
>
>
>
> On Mon, Sep 13, 2021 at 10:28 PM Arthur O'Dwyer <arthur.j.odwyer at gmail.com>
> wrote:
>
>
>
> Btw, I don't think this thread has paid enough attention to Richard
> Smith's suggestion:
>
>
>
> I can only subscribe to James Y Knight's opinion. Indeed, it can be a good
> criterion of which operations should work in finite-math-only mode and
> which can not work. The only thing which I worry about is the possibility
> of checking the operation result for infinity (and nan for symmetry). But
> the suggested criterion is formulated in terms of arguments, not results,
> so it must allow such checks.
>
>
>
> Thanks,
> --Serge
>
>
>
>
>
> On Tue, Sep 14, 2021 at 12:50 AM Serge Pavlov <sepavloff at gmail.com> wrote:
>
> What I'd like to emphasize is that this option was introduced not for
> logical consistency, but for practical needs. It allows users to get faster
> code and this is why it is an important option. We are discussing two ways,
> which are not equivalent. If `isnan` is unconditionally optimized out,
> users that need it have to use workarounds, which leads to loss of
> portability and performance. If `isnan` is preserved, no workarounds are
> required, simple redefinition results in the "old" behavior. It seems to me
> that implementation of this option should pursue practical needs and should
> enable most use cases. The current implementation does not fit user needs,
> as it follows from the complaints in gcc bug tracker and forums. We could
> make clang more user-friendly if this option would be implemented slightly
> differently than now.
>
>
>
> On Mon, Sep 13, 2021 at 11:46 PM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> … is guaranteed to work, and I read that fast-math enables the compiler to
> reason about constructs like `x + 0` being equal to `x`, then I’m going to
> be very confused when:
>
>
>
> You are right, this was a bad idea. Compiler may optimize out `isnan` but
> only when it deduces that the value cannot be NaN, but not due to the
> user's promise. It is especially important for `isinf`. Addition of two
> finite values may produce infinity and there is no universal way to predict
> it. It is probably not an issue for types like float or double, but ML
> cores use halfs or even minifloats, where overflow is much more probable.
> If in the code:
>
> ```
> float r = a + b;
> if (isinf(r)) {...
> ```
>
> `isinf` were optimized out just because -ffinite-math-only is in effect,
> the user could not check whether overflow occurred. This contrasts with the
> definition of `ninf` in LLVM IR:
>
>
>
> "No Infs - Allow optimizations to assume the arguments and result are not
> +/-Inf."
>
>
>
> It is possible to ensure that arguments are not Infs but for the result it
> is much more difficult to guarantee.
>
>
> Thanks,
> --Serge
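
The overflow check quoted above can be written out as a small function. This is a minimal sketch under the assumption that the code is built without `-ffinite-math-only`; `add_overflows` is a hypothetical name, not something from the thread. It reports the case described there: both inputs are finite, yet the sum is infinite.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: adds two floats and reports whether the sum
// overflowed to infinity although both inputs were finite.
// With -ffinite-math-only a compiler may assume the isinf test is false.
bool add_overflows(float a, float b, float &sum) {
    sum = a + b;
    return std::isfinite(a) && std::isfinite(b) && std::isinf(sum);
}
```

For example, passing `std::numeric_limits<float>::max()` for both arguments overflows in `float`, while narrower formats such as half precision reach that point with much smaller values.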
>
>
>
>
>
> On Mon, Sep 13, 2021 at 11:46 PM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> Honestly, we can do this until the end of time. I think we both agree
> that, for either scheme, there exist workarounds. The question is which
> workarounds are more palatable, which is a matter of opinion. I think we’ve
> come to an impasse, so let me just state that my opinion on the question
> “Should isnan be optimized out in fast-math mode?” is “Yes”, which is what
> you asked to get in your original message. I think that the implementation
> of fast-math will be cleaner if we don’t special case a bunch of random
> constructs in order to do what the user meant instead of what they said. I
> think fast-math is a notorious footgun, and any attempts to mitigate this
> will only reduce the effectiveness of the tool, while not really improving
> the user experience.
>
>
>
> As a user, if I read that:
>
>
>
> ```
> if (isnan(x)) {
> ```
>
>
>
> … is guaranteed to work, and I read that fast-math enables the compiler to
> reason about constructs like `x + 0` being equal to `x`, then I’m going to
> be very confused when:
>
>
>
> ```
> if (isnan(x + 0)) {
> ```
>
>
>
> … does not also work. I’m going to open a bug and complain, and the slide
> down the slippery slope will continue. You and I understand the difference,
> and the technical reason why `isnan(x)` is supported but `isnan(x + 0)`
> isn’t, but Joe Coder is just trying to figure out why he’s got NaN in his
> matrices despite his careful NaN handling code. Joe is not a compiler
> expert, and on the face of it, it seems like a silly limitation. This will
> never end until fast-math is gutted.
>
>
>
> Thanks,
>
>    Chris Tetreault
>
>
>
> *From:* Serge Pavlov <sepavloff at gmail.com>
> *Sent:* Friday, September 10, 2021 9:21 PM
> *To:* Chris Tetreault <ctetreau at quicinc.com>
> *Cc:* Richard Smith <richard at metafoo.co.uk>; llvm-dev at lists.llvm.org;
> cfe-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
>
> On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> The problem is that math code is often templated, so `template <typename
> T>  MyMatrixT<T> safeMul(const MyMatrixT<T> & lhs …` is going to be in a
> header.
>
>
>
> No problem, the user can write:
>
> ```
> #ifdef __FAST_MATH__
> #undef isnan
> #define isnan(x) false
> #endif
> ```
> and put it somewhere in the headers.
>
>
>
> On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> Regardless, my position isn’t “there is no NaN”. My position is “you
> cannot count on operations on NaN working”.
>
>
>
> Exactly. Attempts to express the condition of -ffast-math as restrictions
> on types are not fruitful. I think it is the reason why GCC documentation
> does not use simple and clear "there is no NaN" but prefers more
> complicated wording about arithmetic.
>
>
>
> On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> I think working around these sorts of issues is something that C and C++
> developers are used to. This sort of behavior that is “inconsistent” between
> compilers is something we accept because we know it comes with improved
> performance. In this case, the fix is easy, so I don’t think this corner
> case is worth supporting. Especially when the fix is also just one line:
> ```
> #define myIsNan(x) (reinterpret_cast<uint32_t>(x) == THE_BIT_PATTERN_OF_MY_SENTINEL_NAN)
> ```
>
>
>
> It won't work in this way. If `x == 5.0`, then
> `reinterpret_cast<uint32_t>(x) == 5`. What you need there is a bitcast.
> Standard C does not have one. To emulate it, a reinterpret_cast of memory
> can be used: `*reinterpret_cast<int *>(&x)`. Another way is to use a
> union. Both these solutions require operations with memory, which is not
> good for performance, especially on GPU and ML cores. Of course, a smart
> compiler can eliminate memory operation, but it does not have to do it
> always, as it is only optimization. Moving a value between float and
> integer pipelines also may incur a performance penalty. At the same time
> this check often may be done with a single instruction.
>
>
>
> Thanks,
> --Serge
>
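
The difference between converting a value and reinterpreting its bits, discussed at the end of the quoted text, can be sketched as follows. This is an illustration rather than anything proposed in the thread: it uses `memcpy`, a common alternative to the pointer cast and the union, it assumes `float` is a 32-bit IEEE 754 value, and the names `float_bits`, `float_from_bits` and `bits_is_nan` are hypothetical. Whether the compiler turns the `memcpy` into a plain register move is, as noted above, an optimization and not a guarantee.

```cpp
#include <cstdint>
#include <cstring>

// Copies the object representation of a float into a 32-bit integer.
// Assumption: float is a 32-bit IEEE 754 binary32 value.
std::uint32_t float_bits(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Inverse of float_bits: builds a float from a 32-bit pattern.
float float_from_bits(std::uint32_t bits) {
    float x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// A binary32 value is NaN when all exponent bits are set and the
// fraction field is nonzero. No floating-point comparison is used.
bool bits_is_nan(float x) {
    const std::uint32_t bits = float_bits(x);
    return (bits & 0x7F800000u) == 0x7F800000u &&
           (bits & 0x007FFFFFu) != 0;
}
```

With these helpers, `float_bits(5.0f)` yields the bit pattern `0x40A00000`, whereas a value conversion such as `static_cast<std::uint32_t>(5.0f)` yields `5`.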

Krzysztof Parzyszek via llvm-dev

2021-Sep-14 15:01 UTC

> Good example! If you use integer division `r = a / b`, you promise that `b` is
> not zero. It however does not mean that preceding check `b == 0` may be
> optimized to `false`.
>
> The statement "there are no NaNs" means that properties of type `float` are
> modified so that NaN is no longer an allowed value of it. In this case it is
> allowed to optimize out `isnan`. If the guarantee is only that NaN cannot be
> an argument of an arithmetic operation, NaN is still a valid value of
> `float` and `isnan` cannot be replaced with `false`.

Granted, the statement “there are no NaNs” is somewhat ambiguous, but taken to
mean “NaNs will not happen at runtime” it would allow you to remove the NaN
equivalent of “b == 0” without changing the meaning of “float”.  This is the
interpretation I’m arguing for.

--
Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development


Arthur O'Dwyer via llvm-dev

2021-Sep-14 15:15 UTC

On Tue, Sep 14, 2021 at 9:22 AM Serge Pavlov via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
> On Tue, Sep 14, 2021 at 8:21 PM Krzysztof Parzyszek <kparzysz at quicinc.com>
> wrote:
>
>> If `has_nan` returns "true", it means that the explanation "there are no
>> NaNs" does not work anymore and something more complex is needed to explain
>> the effect of the option. In this case it is difficult to say that this
>> approach is "intuitively clear".
>>
>>
>>
>> If your program has “x = *p”, it means that at this point p is never a
>> null pointer.  Does this imply that the type of p can no longer represent a
>> null pointer?
>>
>
> Good example! If you use integer division `r = a / b`, you promise that
> `b` is not zero. It however does not mean  that preceding check `b == 0`
> may be optimized to `false`.
>
In C and C++, it actually *does* mean that, although of the compilers I
just tested on Godbolt, only MSVC seems to take advantage of that
permission.
https://godbolt.org/z/11ss5T7e8
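
A minimal sketch of the pattern in question follows. It is not the code behind the Godbolt link, and `div_and_flag` is a hypothetical name. The zero check comes first, but the division is executed unconditionally afterwards, so a compiler is permitted to reason that `b == 0` would make the program undefined and to fold the check to `false`.

```cpp
// Hypothetical function: the zero check is evaluated before the division,
// but the division is executed on every path afterwards.
int div_and_flag(int a, int b, bool &b_was_zero) {
    b_was_zero = (b == 0);  // a compiler may fold this to false ...
    return a / b;           // ... because b == 0 here is undefined behavior
}
```

Calling it with a nonzero `b` is well defined and behaves the same with or without the fold; only the `b == 0` case is affected.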

The question of whether it is acceptable to treat as equivalent the
statements "p is known to be dereferenced in all successors of B" and "p is
known to be non-null in B," was discussed extensively about 20 years ago,
and then again 12 years ago when it bit someone in the Linux kernel:
https://www.gnu.org/software/gcc/news/null.html
https://lwn.net/Articles/342330/
https://lwn.net/Articles/342420/
https://qinsb.blogspot.com/2018/03/ub-will-delete-your-null-checks.html
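
The shape of code those discussions are about can be sketched as below. This is an illustration with a hypothetical function name, not a quotation from any of the linked articles. The pointer is dereferenced before it is tested, so a compiler may conclude that it is non-null at the test and delete the check.

```cpp
// Hypothetical function: dereference first, null check second.
int load_then_check(const int *p) {
    int value = *p;      // p is dereferenced on every path
    if (p == nullptr)    // may be removed: p "cannot" be null here
        return -1;
    return value;
}
```

Moving the check above the dereference gives the intended behavior, because the dereference is then reached only when `p` is non-null.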

>> On Mon, Sep 13, 2021 at 10:28 PM Arthur O'Dwyer <arthur.j.odwyer at gmail.com>
>> wrote:
>>
>> Btw, I don't think this thread has paid enough attention to Richard
>> Smith's suggestion:
>>
>> I can only subscribe to James Y Knight's opinion. Indeed, it can be a
>> good criterion of which operations should work in finite-math-only mode and
>> which can not work. The only thing which I worry about is the possibility
>> of checking the operation result for infinity (and nan for symmetry). But
>> the suggested criterion is formulated in terms of arguments, not results,
>> so it must allow such checks.
>>

*What* is the opinion to which you subscribe?

Anyway, Richard's "quiet is signaling and signals are unspecified values"
is really the only way out of the difficulty, as far as compiler people are
concerned. You two (Serge and Krzysztof) can keep talking past each other
at the application level, but the compiler people are going to have to do
*something* in the code eventually, and that *something* is going to have
to be expressed in terms similar to what Richard and I have been saying,
because these are the terms that the compiler understands.

Thanks,
Arthur