thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode? [Sep 2021]

If this information is useful, please help other people find it:
Share via:

Serge Pavlov via llvm-dev

2021-Sep-17 06:18 UTC

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

On Fri, Sep 17, 2021 at 10:53 AM Mehdi AMINI <joker.eph at gmail.com>
wrote:
>
>
> On Thu, Sep 16, 2021 at 8:23 PM Serge Pavlov via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> On Fri, Sep 17, 2021 at 3:11 AM Chris Tetreault <ctetreau at
quicinc.com>
>> wrote:
>>
>>> The difference there is that doing pointer arithmetic on null
pointers
>>> doesn't *usually* work, unless you turn on -ffast-pointers.
>>>
>>> It seems to me that  most confusion related to -ffast-math is
likely
>>> caused by people who are transitioning to using it. I have some
codebase,
>>> and I turn on fast math, and then a few months down the road I
notice a
>>> strangeness that I did not catch during the initial transition
period. If
>>> you're writing new code with fast-math, you don't do things
like try to use
>>> NaN as a sentinel value in a TU with fast math turned on. This is
the sort
>>> of thing you catch when you try to transition an existing codebase.
Forgive
>>> me for the uncharitable interpretation, but it's much easier to
ask the
>>> compiler to change to accommodate your use case than it is to
refactor your
>>> code.
>>>
>>
>> It is a common way to explain problems with -ffinite-math-only by user
>> ignorance. However user misunderstandings and complaints may indicate a
>> flaw in compiler implementation, which I believe we have in this case.
>>
>> Using NaN as sentinels is a natural way when you cannot spend extra
>> memory for keeping flags for each item, spend extra cycles to read that
>> flag and do not want to pollute cache. It does not depend on reading
>> documentation or writing the code from scratch. It is simply the best
>> solution for storing data. If performance of the data processing is
>> critical, -ffast-math is a good solution. This is a fairly legitimate
use
>> case. The fact that the compiler does not allow it is a compiler
drawback.
>>
>>
>>> To me, I think Mehdi had the best solution: The algorithm that is
the
>>> bottleneck, and experiences the huge speedup using fast-math should
be
>>> separated into its own source file. This source file, and only this
source
>>> file should be compiled with fast-math. The outer driver loop
should not be
>>> compiled with fast math. This solution is clean, (probably) easy,
and
>>> doesn't require a change in the compiler.
>>
>>
>> It is a workaround, it works in some cases but does not in others. ML
>> kernel often is a single translation unit, there may be no such thing
as
>> linker for that processor. At the same time it is computation intensive
and
>> using fast-math in it may be very profitable.
>>
>
> Switching mode in a single TU seems valuable, but could this be handled
> with pragmas or function attributes instead?
>
GCC allows it by using `#pragma GCC optimize()`, but clang does not support
it. No suitable function attribute exists for that.

>
>
>>
>>
>>> Changing the compiler is hard, affects everybody who uses the
compiler,
>>> and creates inconsistency in behavior between clang and gcc (and
msvc with
>>> /fp:fast), and clang and old versions of clang.
>>
>>
>> ICC and MSVC do not remove `isnan` in fast math mode. If `isnan` is
>> implemented in libc, it is also a real check. Actually removing `isnan`
>> creates inconsistency.
>>
>>
>>> The behavior of fast-math with respect to NaN is consistent across
the
>>> mainstream c/c++ compilers: no promises are made, users should not
assume
>>> that they can use it for anything. Changing it now would create a
major
>>> portability issue for user codebases, which in and of itself is a
very
>>> strong reason to not make this change.
>>>
>>
>> Removing `isnan` is only an optimization, it does not intend to change
>> semantics. So it cannot create portability issues. Quite the contrary,
it
>> helps portability by making behavior consistent between compilers and
libc
>> implementations. The only possible issue is performance loss, this is
>> discussed above, it is an unlikely case. Anyway, if such loss exists
and it
>> is absolutely intolerable for a user, a hack with redefinition of
`isnan`
>> restores the previous code generation.
>>
>>
>>> If the behavior is confusing to users, that's because it's
poorly
>>> explained.
>>
>>
>> What is confusing? That the explicitly written call to a function is
not
>> removed? According to user feedback it is the silent removal of `isnan`
>> that confuses users.
>>
>> Honestly, I think the docs are pretty clear, but "It's clear,
you just
>>> need to learn to read" is never an acceptable answer so it
could certainly
>>> be improved. This is the only thing that needs to be fixed in my
opinion.
>>>
>>
>> The documentation says about -ffinite-math-only:
>>
>>  "Allow optimizations for floating-point arithmetic that assume
that
>> arguments and results are not NaNs or +-Infs."
>>
>> Is it clear whether `isnan` is arithmetic or not?
>>
>
> If the result of a floating point arithmetic is fed directly to `isnan()`,
> are we allowed to eliminate the computation and fold the check to none?
> (seems like it according to the sentence you're quoting).
> Are we back to `isnan(x+0.0)` can be folded but not `isnan(x)`?
>
Initially there was such intention, but during the discussion it became
clear that it is not profitable. For `isinf` there is a realistic use case
when removing the call `isinf(a+b)` is an issue. Arthur O'Dwyer
demonstrated an example for `isnan`. It is harder to provide guarantees
about the result, than about the arguments.

> --
> Mehdi
>
>
>
>
>>
>>
>>> Thanks,
>>>    Chris Tetreault
>>>
>>> -----Original Message-----
>>> From: Michael Kruse <cfe-dev at meinersbur.de>
>>> Sent: Thursday, September 16, 2021 12:29 PM
>>> To: Chris Tetreault <ctetreau at quicinc.com>
>>> Cc: Serge Pavlov <sepavloff at gmail.com>; llvm-dev at
lists.llvm.org;
>>> cfe-dev at lists.llvm.org
>>> Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
>>> fast-math mode?
>>>
>>> WARNING: This email originated from outside of Qualcomm. Please be
wary
>>> of any links or attachments, and do not enable macros.
>>>
>>> Am Mo., 13. Sept. 2021 um 11:46 Uhr schrieb Chris Tetreault via
cfe-dev <
>>> cfe-dev at lists.llvm.org>:
>>> > As a user, if I read that:
>>> >
>>> >
>>> >
>>> > ```
>>> >
>>> > if (isnan(x)) {
>>> >
>>> > ```
>>> >
>>> >
>>> >
>>> > … is guaranteed to work, and I read that fast-math enables the
>>> compiler to reason about constructs like `x + 0` being equal to
`x`, then
>>> I’m going to be very confused when:
>>> >
>>> >
>>> >
>>> > ```
>>> >
>>> > if (isnan(x + 0)) {
>>> >
>>> > ```
>>> >
>>> >
>>> >
>>> > … does not also work. I’m going to open a bug and complain,
and the
>>> slide down the slippery slope will continue. You and I understand
the
>>> difference, and the technical reason why `isnan(x)` is supported
but
>>> `isnan(x + 0)` isn’t, but Joe Coder just trying to figure out why
he’s got
>>> NaN in his matrices despite his careful NaN handling code. Joe is
not a
>>> compiler expert, and on the face of it, it seems like a silly
limitation.
>>> This will never end until fast-math is gutted.
>>>
>>> C/C++ already has cases like this. Pointer arithmetic on null
pointers
>>> is undefined behaviour, even if adding[1,2]/subtracting[3] zero. I
don't
>>> think it is too far fetched to expect from users to know that an
operation
>>> is undefined behaviour even if one of the operands is zero.
>>>
>>> Michael
>>>
>>> [1]
>>>
https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-addition.c
>>> [2]
>>>
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-constants.cpp
>>> [3]
>>>
https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-subtraction.c
>>>
>>>
>>> Michael
>>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210917/ae003b54/attachment.html>

Mehdi AMINI via llvm-dev

2021-Sep-17 16:16 UTC

head link

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

On Thu, Sep 16, 2021 at 11:19 PM Serge Pavlov <sepavloff at gmail.com>
wrote:
> On Fri, Sep 17, 2021 at 10:53 AM Mehdi AMINI <joker.eph at gmail.com>
wrote:
>
>>
>>
>> On Thu, Sep 16, 2021 at 8:23 PM Serge Pavlov via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>> On Fri, Sep 17, 2021 at 3:11 AM Chris Tetreault <ctetreau at
quicinc.com>
>>> wrote:
>>>
>>>> The difference there is that doing pointer arithmetic on null
pointers
>>>> doesn't *usually* work, unless you turn on -ffast-pointers.
>>>>
>>>> It seems to me that  most confusion related to -ffast-math is
likely
>>>> caused by people who are transitioning to using it. I have some
codebase,
>>>> and I turn on fast math, and then a few months down the road I
notice a
>>>> strangeness that I did not catch during the initial transition
period. If
>>>> you're writing new code with fast-math, you don't do
things like try to use
>>>> NaN as a sentinel value in a TU with fast math turned on. This
is the sort
>>>> of thing you catch when you try to transition an existing
codebase. Forgive
>>>> me for the uncharitable interpretation, but it's much
easier to ask the
>>>> compiler to change to accommodate your use case than it is to
refactor your
>>>> code.
>>>>
>>>
>>> It is a common way to explain problems with -ffinite-math-only by
user
>>> ignorance. However user misunderstandings and complaints may
indicate a
>>> flaw in compiler implementation, which I believe we have in this
case.
>>>
>>> Using NaN as sentinels is a natural way when you cannot spend extra
>>> memory for keeping flags for each item, spend extra cycles to read
that
>>> flag and do not want to pollute cache. It does not depend on
reading
>>> documentation or writing the code from scratch. It is simply the
best
>>> solution for storing data. If performance of the data processing is
>>> critical, -ffast-math is a good solution. This is a fairly
legitimate use
>>> case. The fact that the compiler does not allow it is a compiler
drawback.
>>>
>>>
>>>> To me, I think Mehdi had the best solution: The algorithm that
is the
>>>> bottleneck, and experiences the huge speedup using fast-math
should be
>>>> separated into its own source file. This source file, and only
this source
>>>> file should be compiled with fast-math. The outer driver loop
should not be
>>>> compiled with fast math. This solution is clean, (probably)
easy, and
>>>> doesn't require a change in the compiler.
>>>
>>>
>>> It is a workaround, it works in some cases but does not in others.
ML
>>> kernel often is a single translation unit, there may be no such
thing as
>>> linker for that processor. At the same time it is computation
intensive and
>>> using fast-math in it may be very profitable.
>>>
>>
>> Switching mode in a single TU seems valuable, but could this be handled
>> with pragmas or function attributes instead?
>>
>
> GCC allows it by using `#pragma GCC optimize()`, but clang does not
> support it. No suitable function attribute exists for that.
>
Right, I know that clang does not support it, but it could :)
So since we're looking at what provides the best user-experience: isn't
that it? Shouldn't we look into providing this level of granularity?
(whether function-level or finer grain)


>
>>
>>
>>>
>>>
>>>> Changing the compiler is hard, affects everybody who uses the
compiler,
>>>> and creates inconsistency in behavior between clang and gcc
(and msvc with
>>>> /fp:fast), and clang and old versions of clang.
>>>
>>>
>>> ICC and MSVC do not remove `isnan` in fast math mode. If `isnan` is
>>> implemented in libc, it is also a real check. Actually removing
`isnan`
>>> creates inconsistency.
>>>
>>>
>>>> The behavior of fast-math with respect to NaN is consistent
across the
>>>> mainstream c/c++ compilers: no promises are made, users should
not assume
>>>> that they can use it for anything. Changing it now would create
a major
>>>> portability issue for user codebases, which in and of itself is
a very
>>>> strong reason to not make this change.
>>>>
>>>
>>> Removing `isnan` is only an optimization, it does not intend to
change
>>> semantics. So it cannot create portability issues. Quite the
contrary, it
>>> helps portability by making behavior consistent between compilers
and libc
>>> implementations. The only possible issue is performance loss, this
is
>>> discussed above, it is an unlikely case. Anyway, if such loss
exists and it
>>> is absolutely intolerable for a user, a hack with redefinition of
`isnan`
>>> restores the previous code generation.
>>>
>>>
>>>> If the behavior is confusing to users, that's because
it's poorly
>>>> explained.
>>>
>>>
>>> What is confusing? That the explicitly written call to a function
is not
>>> removed? According to user feedback it is the silent removal of
`isnan`
>>> that confuses users.
>>>
>>> Honestly, I think the docs are pretty clear, but "It's
clear, you just
>>>> need to learn to read" is never an acceptable answer so it
could certainly
>>>> be improved. This is the only thing that needs to be fixed in
my opinion.
>>>>
>>>
>>> The documentation says about -ffinite-math-only:
>>>
>>>  "Allow optimizations for floating-point arithmetic that
assume that
>>> arguments and results are not NaNs or +-Infs."
>>>
>>> Is it clear whether `isnan` is arithmetic or not?
>>>
>>
>> If the result of a floating point arithmetic is fed directly to
>> `isnan()`, are we allowed to eliminate the computation and fold the
check
>> to none? (seems like it according to the sentence you're quoting).
>> Are we back to `isnan(x+0.0)` can be folded but not `isnan(x)`?
>>
>
> Initially there was such intention, but during the discussion it became
> clear that it is not profitable. For `isinf` there is a realistic use case
> when removing the call `isinf(a+b)` is an issue. Arthur O'Dwyer
> demonstrated an example for `isnan`. It is harder to provide guarantees
> about the result, than about the arguments.
>
>
>> --
>> Mehdi
>>
>>
>>
>>
>>>
>>>
>>>> Thanks,
>>>>    Chris Tetreault
>>>>
>>>> -----Original Message-----
>>>> From: Michael Kruse <cfe-dev at meinersbur.de>
>>>> Sent: Thursday, September 16, 2021 12:29 PM
>>>> To: Chris Tetreault <ctetreau at quicinc.com>
>>>> Cc: Serge Pavlov <sepavloff at gmail.com>; llvm-dev at
lists.llvm.org;
>>>> cfe-dev at lists.llvm.org
>>>> Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out
in
>>>> fast-math mode?
>>>>
>>>> WARNING: This email originated from outside of Qualcomm. Please
be wary
>>>> of any links or attachments, and do not enable macros.
>>>>
>>>> Am Mo., 13. Sept. 2021 um 11:46 Uhr schrieb Chris Tetreault via
cfe-dev
>>>> <cfe-dev at lists.llvm.org>:
>>>> > As a user, if I read that:
>>>> >
>>>> >
>>>> >
>>>> > ```
>>>> >
>>>> > if (isnan(x)) {
>>>> >
>>>> > ```
>>>> >
>>>> >
>>>> >
>>>> > … is guaranteed to work, and I read that fast-math enables
the
>>>> compiler to reason about constructs like `x + 0` being equal to
`x`, then
>>>> I’m going to be very confused when:
>>>> >
>>>> >
>>>> >
>>>> > ```
>>>> >
>>>> > if (isnan(x + 0)) {
>>>> >
>>>> > ```
>>>> >
>>>> >
>>>> >
>>>> > … does not also work. I’m going to open a bug and
complain, and the
>>>> slide down the slippery slope will continue. You and I
understand the
>>>> difference, and the technical reason why `isnan(x)` is
supported but
>>>> `isnan(x + 0)` isn’t, but Joe Coder just trying to figure out
why he’s got
>>>> NaN in his matrices despite his careful NaN handling code. Joe
is not a
>>>> compiler expert, and on the face of it, it seems like a silly
limitation.
>>>> This will never end until fast-math is gutted.
>>>>
>>>> C/C++ already has cases like this. Pointer arithmetic on null
pointers
>>>> is undefined behaviour, even if adding[1,2]/subtracting[3]
zero. I don't
>>>> think it is too far fetched to expect from users to know that
an operation
>>>> is undefined behaviour even if one of the operands is zero.
>>>>
>>>> Michael
>>>>
>>>> [1]
>>>>
https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-addition.c
>>>> [2]
>>>>
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-constants.cpp
>>>> [3]
>>>>
https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-subtraction.c
>>>>
>>>>
>>>> Michael
>>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>
>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210917/afd6adcd/attachment.html>

llvm dev - Sep 2021 - [cfe-dev] Should isnan be optimized out in fast-math mode?

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?