thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode? [Sep 2021]

Chris Tetreault via llvm-dev

2021-Sep-10 19:39 UTC

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

The problem is that math code is often templated, so `template <typename
T>  MyMatrixT<T> safeMul(const MyMatrixT<T> & lhs …` is going
to be in a header.

Regardless, my position isn’t “there is no NaN”. My position is “you cannot
count on operations on NaN working”. Just like sometimes you can dereference a
pointer after it is freed, but you should not count on this working. If the
compiler I’m using emits a call to a library function instead of providing a
macro, and this results in isnan actually computing whether x is NaN, then so
be it. But if the compiler provides a macro that evaluates to false under
fast-math, then the two loops in safeMul can be optimized. Either way, as a
developer, I know that I turned on fast-math, and I write code accordingly.

I think working around these sorts of issues is something that C and C++
developers are used to. Behaviors that are “inconsistent” between compilers
are something we accept because we know they come with improved
performance. In this case, the fix is easy, so I don’t think this corner case is
worth supporting. Especially when the fix is also just one line:

```
#define myIsNan(x) (reinterpret_cast<uint32_t>(x) == THE_BIT_PATTERN_OF_MY_SENTINEL_NAN)
```

I would probably call the macro something else like `shouldProcessElement`.

Thanks,
   Chris Tetreault

From: Serge Pavlov <sepavloff at gmail.com>
Sent: Friday, September 10, 2021 11:26 AM
To: Chris Tetreault <ctetreau at quicinc.com>
Cc: Richard Smith <richard at metafoo.co.uk>; llvm-dev at lists.llvm.org;
cfe-dev at lists.llvm.org
Subject: Re: [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math
mode?


It should not be done in headers, of course. Redefining this macro in a
source file that is compiled with -ffinite-math-only is free from the described
drawbacks. Besides, the macro `isnan` is defined by libc, not by the compiler,
and IIRC it is defined as a macro to allow such manipulations.

The influence of libc on the behavior of `isnan` under -ffinite-math-only is
also an argument against "there are no NaNs". It causes inconsistent
behavior. Libc can provide its own implementation that does not rely on the
compiler's `__builtin_isnan`, and user code that uses `isnan` would then work.
But at some point a configuration script changes, or libc changes the macro,
and your code stops working correctly, as happened after commit 767eadd78 in
the LLVM libcxx project. Keeping `isnan` would make changes in libc less harmful.
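
To illustrate what "its own implementation, which does not rely on compiler `__builtin_isnan`" might look like, here is a minimal sketch. The name `bitwiseIsNan` is hypothetical and not taken from any libc; the code assumes `float` is IEEE 754 binary32. It uses only integer operations on the object representation, so it has no floating-point comparison for the compiler to fold away under -ffinite-math-only.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of any libc): detects NaN with integer
// operations only. Assumes float is IEEE 754 binary32.
inline bool bitwiseIsNan(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // copy the object representation
    // Drop the sign bit. A NaN has all exponent bits set and a nonzero
    // mantissa, so its magnitude bits are strictly above those of infinity.
    return (bits & 0x7FFFFFFFu) > 0x7F800000u;
}
```

Whether such a check is actually preserved through all optimization passes is the compiler's decision; the sketch only shows that the test can be written without any floating-point arithmetic.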

Thanks,
--Serge


Serge Pavlov via llvm-dev

2021-Sep-11 04:20 UTC

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
wrote:
> The problem is that math code is often templated, so `template <typename
> T>  MyMatrixT<T> safeMul(const MyMatrixT<T> & lhs …` is going to be in a
> header.
>
No problem, the user can write:
```
#ifdef __FAST_MATH__
#undef isnan
#define isnan(x) false
#endif
```
and put it somewhere in the headers.

On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
wrote:
> Regardless, my position isn’t “there is no NaN”. My position is “you
> cannot count on operations on NaN working”.

Exactly. Attempts to express the conditions of -ffast-math as restrictions
on types are not fruitful. I think that is why the GCC documentation
does not use the simple and clear "there is no NaN" but prefers more
complicated wording about arithmetic.

On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
wrote:
> I think working around these sorts of issues is something that C and C++
> developers are used to. Behaviors that are “inconsistent” between compilers
> are something we accept because we know they come with improved
> performance. In this case, the fix is easy, so I don’t think this corner
> case is worth supporting. Especially when the fix is also just one line:
> ```
> #define myIsNan(x) (reinterpret_cast<uint32_t>(x) == THE_BIT_PATTERN_OF_MY_SENTINEL_NAN)
> ```

It won't work this way. If `x == 5.0`, a value conversion to `uint32_t`
gives 5 rather than the bit pattern (and C++ rejects
`reinterpret_cast<uint32_t>(x)` on a floating-point value outright). What you
need there is a bitcast. Standard C does not have one. To emulate it, a
reinterpret_cast of memory can be used: `*reinterpret_cast<int *>(&x)`.
Another way is to use a union.
Both of these solutions require memory operations, which is not good for
performance, especially on GPU and ML cores. Of course, a smart compiler
can eliminate the memory operation, but it is not obliged to, as this is
only an optimization. Moving a value between the float and integer
pipelines may also incur a performance penalty. At the same time, this check
can often be done with a single instruction.
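
The bitcast described above can be sketched with `std::memcpy`, which sidesteps the aliasing questions raised by the pointer cast. This is only an illustration: `floatBits`, `makeSentinelNan`, `isSentinelNan`, and the payload `0x7FC00001` are hypothetical names and values chosen here, not anything from the thread, and the code assumes a 32-bit IEEE 754 `float`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical sentinel: one particular quiet-NaN bit pattern picked by the
// application (an assumption for this sketch).
constexpr std::uint32_t kSentinelNanBits = 0x7FC00001u;

// Returns the object representation of a float as an integer (a bitcast).
inline std::uint32_t floatBits(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Builds the sentinel value from its bit pattern.
inline float makeSentinelNan() {
    float x;
    std::memcpy(&x, &kSentinelNanBits, sizeof x);
    return x;
}

// Integer comparison only: no floating-point comparison is involved.
inline bool isSentinelNan(float x) {
    return floatBits(x) == kSentinelNanBits;
}
```

For example, `floatBits(5.0f)` is `0x40A00000`, not 5. In C++20, `std::bit_cast<std::uint32_t>(x)` from `<bit>` expresses the same operation directly. Whether the copy is lowered to a register move or a round trip through memory remains up to the compiler, which is the performance concern raised above.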

Thanks,
--Serge
