Serge Pavlov via llvm-dev
2021-Sep-13 13:49 UTC
[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?
Let's weigh the alternatives. We are discussing two approaches for handling `isnan` and similar functions in -ffinite-math-only mode: 1. "Old" behavior: "with -ffinite-math-only you are telling that there are no NaNs", so `isnan` may be optimized to `false`. 2. "New" behavior: with -ffinite-math-only you are telling that the operands of arithmetic operations are not NaNs but otherwise NaN may be used. As `isnan` is not an arithmetic operation, it should be preserved. Advantages of the "old" behavior are: - " it’s intuitively clear". - It is close to the GCC current behavior. Advantages of the "new" behavior are: - `isnan` is still available to the user, which allows, for instance, validation of working data or selection between fast and slow path. - NaN is available and may be used, for instance, as sentinel. - Consistency between compiler and library implementations, both would behave similarly. - In most real cases the "old" behavior can be easily obtained by redefinition of `isnan`. - It is free from issues like "what returns numeric_limits<float>::has_quite_NaN()?". It is unlikely that "old" behavior gives noticeable performance gain. Anyway, `isnan` may be redefined to `false` if it actually does. Intuitive clarity of the "old" way is questionable for users, because it is not clear why functions like `isnan` silently disappeared or what body should have specializations of `numeric_limit` methods. There are cases when checking for NaN is needed even in -ffinite-math-only mode. To make it, users have to use workarounds like doing integer arithmetic on float values, which reduce clarity of code, make it unportable and slower. Are there any other advantages/disadvantages of these approaches? Thanks, --Serge On Mon, Sep 13, 2021 at 7:00 PM James Y Knight <jyknight at google.com> wrote:> On Mon, Sep 13, 2021, 2:02 AM Serge Pavlov via cfe-dev < > cfe-dev at lists.llvm.org> wrote: > >> The working construct is `reinterpret_cast<uint32_t&>(x)`. It however >> possesses the same drawback, it requires `x` be in memory. >> > > We're getting rather far afield of the thread topic here, but .. that is > UB, don't do that. > > Instead, always memcpy, e.g. > uint32_t y; > memcpy(&y, &flo, sizeof(uint32_t)); > > Or use a wrapper like std::bit_cast or absl::bit_cast ( > https://github.com/abseil/abseil-cpp/blob/cfbf5bf948a2656bda7ddab59d3bcb29595c144c/absl/base/casts.h#L106 > ). > > This has effectively no runtime overhead, the compiler is extremely good > at deleting calls to memcpy when it has a constant smallish size. And > remember that *every* local variable started out in memory. Only through > optimizations does the memory location and the loads/stores for every > access get eliminated. > >>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210913/5fe23580/attachment.html>
Krzysztof Parzyszek via llvm-dev
2021-Sep-13 14:03 UTC
[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?
If the compiler provides “isnan”, the user can’t redefine it. Redefining/undefining any function or a macro provided by a compiler is UB. The “old” behavior can be tuned with #pragmas to restore the functionality of NaNs where needed. The “old” behavior doesn’t have a problem with “has_nan”---it returns “true”. What other issues are there? -- Krzysztof Parzyszek kparzysz at quicinc.com<mailto:kparzysz at quicinc.com> AI tools development From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Serge Pavlov via cfe-dev Sent: Monday, September 13, 2021 8:50 AM To: James Y Knight <jyknight at google.com> Cc: llvm-dev <llvm-dev at lists.llvm.org>; Clang Dev <cfe-dev at lists.llvm.org> Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in fast-math mode? WARNING: This email originated from outside of Qualcomm. Please be wary of any links or attachments, and do not enable macros. Let's weigh the alternatives. We are discussing two approaches for handling `isnan` and similar functions in -ffinite-math-only mode: 1. "Old" behavior: "with -ffinite-math-only you are telling that there are no NaNs", so `isnan` may be optimized to `false`. 2. "New" behavior: with -ffinite-math-only you are telling that the operands of arithmetic operations are not NaNs but otherwise NaN may be used. As `isnan` is not an arithmetic operation, it should be preserved. Advantages of the "old" behavior are: - " it’s intuitively clear". - It is close to the GCC current behavior. Advantages of the "new" behavior are: - `isnan` is still available to the user, which allows, for instance, validation of working data or selection between fast and slow path. - NaN is available and may be used, for instance, as sentinel. - Consistency between compiler and library implementations, both would behave similarly. - In most real cases the "old" behavior can be easily obtained by redefinition of `isnan`. - It is free from issues like "what returns numeric_limits<float>::has_quite_NaN()?". It is unlikely that "old" behavior gives noticeable performance gain. Anyway, `isnan` may be redefined to `false` if it actually does. Intuitive clarity of the "old" way is questionable for users, because it is not clear why functions like `isnan` silently disappeared or what body should have specializations of `numeric_limit` methods. There are cases when checking for NaN is needed even in -ffinite-math-only mode. To make it, users have to use workarounds like doing integer arithmetic on float values, which reduce clarity of code, make it unportable and slower. Are there any other advantages/disadvantages of these approaches? Thanks, --Serge On Mon, Sep 13, 2021 at 7:00 PM James Y Knight <jyknight at google.com<mailto:jyknight at google.com>> wrote: On Mon, Sep 13, 2021, 2:02 AM Serge Pavlov via cfe-dev <cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>> wrote: The working construct is `reinterpret_cast<uint32_t&>(x)`. It however possesses the same drawback, it requires `x` be in memory. We're getting rather far afield of the thread topic here, but .. that is UB, don't do that. Instead, always memcpy, e.g. uint32_t y; memcpy(&y, &flo, sizeof(uint32_t)); Or use a wrapper like std::bit_cast or absl::bit_cast (https://github.com/abseil/abseil-cpp/blob/cfbf5bf948a2656bda7ddab59d3bcb29595c144c/absl/base/casts.h#L106). This has effectively no runtime overhead, the compiler is extremely good at deleting calls to memcpy when it has a constant smallish size. And remember that every local variable started out in memory. Only through optimizations does the memory location and the loads/stores for every access get eliminated. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210913/7af39540/attachment.html>