thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode? [Sep 2021]

If this information is useful, please help other people find it:
Share via:

Serge Pavlov via llvm-dev

2021-Sep-13 14:31 UTC

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

`isnan` does not begin with an underscore, so it is not a reserved
identifier. Why is its redefinition an UB?

Thanks,
--Serge


On Mon, Sep 13, 2021 at 9:03 PM Krzysztof Parzyszek <kparzysz at
quicinc.com>
wrote:
> If the compiler provides “isnan”, the user can’t redefine it.
> Redefining/undefining any function or a macro provided by a compiler is UB.
>
>
>
> The “old” behavior can be tuned with #pragmas to restore the functionality
> of NaNs where needed.
>
> The “old” behavior doesn’t have a problem with “has_nan”---it returns
> “true”.  What other issues are there?
>
>
>
> --
>
> Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development
>
>
>
> *From:* cfe-dev <cfe-dev-bounces at lists.llvm.org> *On Behalf Of
*Serge
> Pavlov via cfe-dev
> *Sent:* Monday, September 13, 2021 8:50 AM
> *To:* James Y Knight <jyknight at google.com>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>; Clang Dev <
> cfe-dev at lists.llvm.org>
> *Subject:* Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
> *WARNING:* This email originated from outside of Qualcomm. Please be wary
> of any links or attachments, and do not enable macros.
>
> Let's weigh the alternatives.
>
> We are discussing two approaches for handling `isnan` and similar
> functions in -ffinite-math-only mode:
> 1. "Old" behavior: "with -ffinite-math-only you are telling
that there are
> no NaNs", so `isnan` may be optimized to `false`.
> 2. "New" behavior: with -ffinite-math-only you are telling that
the
> operands of arithmetic operations are not NaNs but otherwise NaN may be
> used. As `isnan` is not an arithmetic operation, it should be preserved.
>
> Advantages of the "old" behavior are:
> - " it’s intuitively clear".
> - It is close to the GCC current behavior.
>
> Advantages of the "new" behavior are:
> - `isnan` is still available to the user, which allows, for instance,
> validation of working data or selection between fast and slow path.
> - NaN is available and may be used, for instance, as sentinel.
> - Consistency between compiler and library implementations, both would
> behave similarly.
> - In most real cases the "old" behavior can be easily obtained by
> redefinition of `isnan`.
> - It is free from issues like "what returns
> numeric_limits<float>::has_quite_NaN()?".
>
> It is unlikely that "old" behavior gives noticeable performance
gain.
> Anyway, `isnan` may be redefined to `false` if it actually does.
>
> Intuitive clarity of the "old" way is questionable for users,
because it
> is not clear why functions like `isnan` silently disappeared or what body
> should have specializations of `numeric_limit` methods.
>
> There are cases when checking for NaN is needed even in -ffinite-math-only
> mode. To make it, users have to use workarounds like doing integer
> arithmetic on float values, which reduce clarity of code, make it
> unportable and slower.
>
> Are there any other advantages/disadvantages of these approaches?
>
>
>
> Thanks,
> --Serge
>
>
>
>
>
> On Mon, Sep 13, 2021 at 7:00 PM James Y Knight <jyknight at
google.com>
> wrote:
>
> On Mon, Sep 13, 2021, 2:02 AM Serge Pavlov via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> The working construct is `reinterpret_cast<uint32_t&>(x)`. It
however
> possesses the same drawback, it requires `x` be in memory.
>
>
>
> We're getting rather far afield of the thread topic here, but .. that
is
> UB, don't do that.
>
>
>
> Instead, always memcpy, e.g.
>
> uint32_t y;
>
> memcpy(&y, &flo, sizeof(uint32_t));
>
>
>
> Or use a wrapper like std::bit_cast or absl::bit_cast (
>
https://github.com/abseil/abseil-cpp/blob/cfbf5bf948a2656bda7ddab59d3bcb29595c144c/absl/base/casts.h#L106
> ).
>
>
>
> This has effectively no runtime overhead, the compiler is extremely good
> at deleting calls to memcpy when it has a constant smallish size. And
> remember that *every* local variable started out in memory. Only through
> optimizations does the memory location and the loads/stores for every
> access get eliminated.
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210913/01d3cae5/attachment.html>

Krzysztof Parzyszek via llvm-dev

2021-Sep-13 15:09 UTC

head link

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

The standard says so, but I can’t find the corresponding passage in the draft...

--
Krzysztof Parzyszek  kparzysz at quicinc.com<mailto:kparzysz at
quicinc.com>   AI tools development

From: Serge Pavlov <sepavloff at gmail.com>
Sent: Monday, September 13, 2021 9:31 AM
To: Krzysztof Parzyszek <kparzysz at quicinc.com>
Cc: James Y Knight <jyknight at google.com>; llvm-dev <llvm-dev at
lists.llvm.org>; cfe-dev at lists.llvm.org
Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in fast-math
mode?

WARNING: This email originated from outside of Qualcomm. Please be wary of any
links or attachments, and do not enable macros.
`isnan` does not begin with an underscore, so it is not a reserved identifier.
Why is its redefinition an UB?

Thanks,
--Serge

On Mon, Sep 13, 2021 at 9:03 PM Krzysztof Parzyszek <kparzysz at
quicinc.com<mailto:kparzysz at quicinc.com>> wrote:
If the compiler provides “isnan”, the user can’t redefine it. 
Redefining/undefining any function or a macro provided by a compiler is UB.

The “old” behavior can be tuned with #pragmas to restore the functionality of
NaNs where needed.
The “old” behavior doesn’t have a problem with “has_nan”---it returns “true”. 
What other issues are there?

--
Krzysztof Parzyszek  kparzysz at quicinc.com<mailto:kparzysz at
quicinc.com>   AI tools development

From: cfe-dev <cfe-dev-bounces at lists.llvm.org<mailto:cfe-dev-bounces at
lists.llvm.org>> On Behalf Of Serge Pavlov via cfe-dev
Sent: Monday, September 13, 2021 8:50 AM
To: James Y Knight <jyknight at google.com<mailto:jyknight at
google.com>>
Cc: llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at
lists.llvm.org>>; Clang Dev <cfe-dev at
lists.llvm.org<mailto:cfe-dev at lists.llvm.org>>
Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in fast-math
mode?

WARNING: This email originated from outside of Qualcomm. Please be wary of any
links or attachments, and do not enable macros.
Let's weigh the alternatives.

We are discussing two approaches for handling `isnan` and similar functions in
-ffinite-math-only mode:
1. "Old" behavior: "with -ffinite-math-only you are telling that
there are no NaNs", so `isnan` may be optimized to `false`.
2. "New" behavior: with -ffinite-math-only you are telling that the
operands of arithmetic operations are not NaNs but otherwise NaN may be used. As
`isnan` is not an arithmetic operation, it should be preserved.

Advantages of the "old" behavior are:
- " it’s intuitively clear".
- It is close to the GCC current behavior.

Advantages of the "new" behavior are:
- `isnan` is still available to the user, which allows, for instance, validation
of working data or selection between fast and slow path.
- NaN is available and may be used, for instance, as sentinel.
- Consistency between compiler and library implementations, both would behave
similarly.
- In most real cases the "old" behavior can be easily obtained by
redefinition of `isnan`.
- It is free from issues like "what returns
numeric_limits<float>::has_quite_NaN()?".

It is unlikely that "old" behavior gives noticeable performance gain.
Anyway, `isnan` may be redefined to `false` if it actually does.

Intuitive clarity of the "old" way is questionable for users, because
it is not clear why functions like `isnan` silently disappeared or what body
should have specializations of `numeric_limit` methods.

There are cases when checking for NaN is needed even in -ffinite-math-only mode.
To make it, users have to use workarounds like doing integer arithmetic on float
values, which reduce clarity of code, make it unportable and slower.

Are there any other advantages/disadvantages of these approaches?

Thanks,
--Serge

On Mon, Sep 13, 2021 at 7:00 PM James Y Knight <jyknight at
google.com<mailto:jyknight at google.com>> wrote:
On Mon, Sep 13, 2021, 2:02 AM Serge Pavlov via cfe-dev <cfe-dev at
lists.llvm.org<mailto:cfe-dev at lists.llvm.org>> wrote:
The working construct is `reinterpret_cast<uint32_t&>(x)`. It however
possesses the same drawback, it requires `x` be in memory.

We're getting rather far afield of the thread topic here, but .. that is UB,
don't do that.

Instead, always memcpy, e.g.
uint32_t y;
memcpy(&y, &flo, sizeof(uint32_t));

Or use a wrapper like std::bit_cast or absl::bit_cast
(https://github.com/abseil/abseil-cpp/blob/cfbf5bf948a2656bda7ddab59d3bcb29595c144c/absl/base/casts.h#L106).

This has effectively no runtime overhead, the compiler is extremely good at
deleting calls to memcpy when it has a constant smallish size. And remember that
every local variable started out in memory. Only through optimizations does the
memory location and the loads/stores for every access get eliminated.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210913/3cd2a574/attachment.html>

llvm dev - Sep 2021 - [cfe-dev] Should isnan be optimized out in fast-math mode?

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?