thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode? [Sep 2021]

If this information is useful, please help other people find it:
Share via:

Mehdi AMINI via llvm-dev

2021-Sep-17 03:53 UTC

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

On Thu, Sep 16, 2021 at 8:23 PM Serge Pavlov via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
> On Fri, Sep 17, 2021 at 3:11 AM Chris Tetreault <ctetreau at
quicinc.com>
> wrote:
>
>> The difference there is that doing pointer arithmetic on null pointers
>> doesn't *usually* work, unless you turn on -ffast-pointers.
>>
>> It seems to me that  most confusion related to -ffast-math is likely
>> caused by people who are transitioning to using it. I have some
codebase,
>> and I turn on fast math, and then a few months down the road I notice a
>> strangeness that I did not catch during the initial transition period.
If
>> you're writing new code with fast-math, you don't do things
like try to use
>> NaN as a sentinel value in a TU with fast math turned on. This is the
sort
>> of thing you catch when you try to transition an existing codebase.
Forgive
>> me for the uncharitable interpretation, but it's much easier to ask
the
>> compiler to change to accommodate your use case than it is to refactor
your
>> code.
>>
>
> It is a common way to explain problems with -ffinite-math-only by user
> ignorance. However user misunderstandings and complaints may indicate a
> flaw in compiler implementation, which I believe we have in this case.
>
> Using NaN as sentinels is a natural way when you cannot spend extra memory
> for keeping flags for each item, spend extra cycles to read that flag and
> do not want to pollute cache. It does not depend on reading documentation
> or writing the code from scratch. It is simply the best solution for
> storing data. If performance of the data processing is critical,
> -ffast-math is a good solution. This is a fairly legitimate use case. The
> fact that the compiler does not allow it is a compiler drawback.
>
>
>> To me, I think Mehdi had the best solution: The algorithm that is the
>> bottleneck, and experiences the huge speedup using fast-math should be
>> separated into its own source file. This source file, and only this
source
>> file should be compiled with fast-math. The outer driver loop should
not be
>> compiled with fast math. This solution is clean, (probably) easy, and
>> doesn't require a change in the compiler.
>
>
> It is a workaround, it works in some cases but does not in others. ML
> kernel often is a single translation unit, there may be no such thing as
> linker for that processor. At the same time it is computation intensive and
> using fast-math in it may be very profitable.
>
Switching mode in a single TU seems valuable, but could this be handled
with pragmas or function attributes instead?

>
>
>> Changing the compiler is hard, affects everybody who uses the compiler,
>> and creates inconsistency in behavior between clang and gcc (and msvc
with
>> /fp:fast), and clang and old versions of clang.
>
>
> ICC and MSVC do not remove `isnan` in fast math mode. If `isnan` is
> implemented in libc, it is also a real check. Actually removing `isnan`
> creates inconsistency.
>
>
>> The behavior of fast-math with respect to NaN is consistent across the
>> mainstream c/c++ compilers: no promises are made, users should not
assume
>> that they can use it for anything. Changing it now would create a major
>> portability issue for user codebases, which in and of itself is a very
>> strong reason to not make this change.
>>
>
> Removing `isnan` is only an optimization, it does not intend to change
> semantics. So it cannot create portability issues. Quite the contrary, it
> helps portability by making behavior consistent between compilers and libc
> implementations. The only possible issue is performance loss, this is
> discussed above, it is an unlikely case. Anyway, if such loss exists and it
> is absolutely intolerable for a user, a hack with redefinition of `isnan`
> restores the previous code generation.
>
>
>> If the behavior is confusing to users, that's because it's
poorly
>> explained.
>
>
> What is confusing? That the explicitly written call to a function is not
> removed? According to user feedback it is the silent removal of `isnan`
> that confuses users.
>
> Honestly, I think the docs are pretty clear, but "It's clear, you
just
>> need to learn to read" is never an acceptable answer so it could
certainly
>> be improved. This is the only thing that needs to be fixed in my
opinion.
>>
>
> The documentation says about -ffinite-math-only:
>
>  "Allow optimizations for floating-point arithmetic that assume that
> arguments and results are not NaNs or +-Infs."
>
> Is it clear whether `isnan` is arithmetic or not?
>
If the result of a floating point arithmetic is fed directly to `isnan()`,
are we allowed to eliminate the computation and fold the check to none?
(seems like it according to the sentence you're quoting).
Are we back to `isnan(x+0.0)` can be folded but not `isnan(x)`?

-- 
Mehdi



>
>
>> Thanks,
>>    Chris Tetreault
>>
>> -----Original Message-----
>> From: Michael Kruse <cfe-dev at meinersbur.de>
>> Sent: Thursday, September 16, 2021 12:29 PM
>> To: Chris Tetreault <ctetreau at quicinc.com>
>> Cc: Serge Pavlov <sepavloff at gmail.com>; llvm-dev at
lists.llvm.org;
>> cfe-dev at lists.llvm.org
>> Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
>> fast-math mode?
>>
>> WARNING: This email originated from outside of Qualcomm. Please be wary
>> of any links or attachments, and do not enable macros.
>>
>> Am Mo., 13. Sept. 2021 um 11:46 Uhr schrieb Chris Tetreault via cfe-dev
<
>> cfe-dev at lists.llvm.org>:
>> > As a user, if I read that:
>> >
>> >
>> >
>> > ```
>> >
>> > if (isnan(x)) {
>> >
>> > ```
>> >
>> >
>> >
>> > … is guaranteed to work, and I read that fast-math enables the
compiler
>> to reason about constructs like `x + 0` being equal to `x`, then I’m
going
>> to be very confused when:
>> >
>> >
>> >
>> > ```
>> >
>> > if (isnan(x + 0)) {
>> >
>> > ```
>> >
>> >
>> >
>> > … does not also work. I’m going to open a bug and complain, and
the
>> slide down the slippery slope will continue. You and I understand the
>> difference, and the technical reason why `isnan(x)` is supported but
>> `isnan(x + 0)` isn’t, but Joe Coder just trying to figure out why he’s
got
>> NaN in his matrices despite his careful NaN handling code. Joe is not a
>> compiler expert, and on the face of it, it seems like a silly
limitation.
>> This will never end until fast-math is gutted.
>>
>> C/C++ already has cases like this. Pointer arithmetic on null pointers
is
>> undefined behaviour, even if adding[1,2]/subtracting[3] zero. I
don't think
>> it is too far fetched to expect from users to know that an operation is
>> undefined behaviour even if one of the operands is zero.
>>
>> Michael
>>
>> [1]
>>
https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-addition.c
>> [2]
>>
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-constants.cpp
>> [3]
>>
https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-subtraction.c
>>
>>
>> Michael
>>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210916/d77c0bc4/attachment.html>

Serge Pavlov via llvm-dev

2021-Sep-17 06:18 UTC

head link

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

On Fri, Sep 17, 2021 at 10:53 AM Mehdi AMINI <joker.eph at gmail.com>
wrote:
>
>
> On Thu, Sep 16, 2021 at 8:23 PM Serge Pavlov via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> On Fri, Sep 17, 2021 at 3:11 AM Chris Tetreault <ctetreau at
quicinc.com>
>> wrote:
>>
>>> The difference there is that doing pointer arithmetic on null
pointers
>>> doesn't *usually* work, unless you turn on -ffast-pointers.
>>>
>>> It seems to me that  most confusion related to -ffast-math is
likely
>>> caused by people who are transitioning to using it. I have some
codebase,
>>> and I turn on fast math, and then a few months down the road I
notice a
>>> strangeness that I did not catch during the initial transition
period. If
>>> you're writing new code with fast-math, you don't do things
like try to use
>>> NaN as a sentinel value in a TU with fast math turned on. This is
the sort
>>> of thing you catch when you try to transition an existing codebase.
Forgive
>>> me for the uncharitable interpretation, but it's much easier to
ask the
>>> compiler to change to accommodate your use case than it is to
refactor your
>>> code.
>>>
>>
>> It is a common way to explain problems with -ffinite-math-only by user
>> ignorance. However user misunderstandings and complaints may indicate a
>> flaw in compiler implementation, which I believe we have in this case.
>>
>> Using NaN as sentinels is a natural way when you cannot spend extra
>> memory for keeping flags for each item, spend extra cycles to read that
>> flag and do not want to pollute cache. It does not depend on reading
>> documentation or writing the code from scratch. It is simply the best
>> solution for storing data. If performance of the data processing is
>> critical, -ffast-math is a good solution. This is a fairly legitimate
use
>> case. The fact that the compiler does not allow it is a compiler
drawback.
>>
>>
>>> To me, I think Mehdi had the best solution: The algorithm that is
the
>>> bottleneck, and experiences the huge speedup using fast-math should
be
>>> separated into its own source file. This source file, and only this
source
>>> file should be compiled with fast-math. The outer driver loop
should not be
>>> compiled with fast math. This solution is clean, (probably) easy,
and
>>> doesn't require a change in the compiler.
>>
>>
>> It is a workaround, it works in some cases but does not in others. ML
>> kernel often is a single translation unit, there may be no such thing
as
>> linker for that processor. At the same time it is computation intensive
and
>> using fast-math in it may be very profitable.
>>
>
> Switching mode in a single TU seems valuable, but could this be handled
> with pragmas or function attributes instead?
>
GCC allows it by using `#pragma GCC optimize()`, but clang does not support
it. No suitable function attribute exists for that.

>
>
>>
>>
>>> Changing the compiler is hard, affects everybody who uses the
compiler,
>>> and creates inconsistency in behavior between clang and gcc (and
msvc with
>>> /fp:fast), and clang and old versions of clang.
>>
>>
>> ICC and MSVC do not remove `isnan` in fast math mode. If `isnan` is
>> implemented in libc, it is also a real check. Actually removing `isnan`
>> creates inconsistency.
>>
>>
>>> The behavior of fast-math with respect to NaN is consistent across
the
>>> mainstream c/c++ compilers: no promises are made, users should not
assume
>>> that they can use it for anything. Changing it now would create a
major
>>> portability issue for user codebases, which in and of itself is a
very
>>> strong reason to not make this change.
>>>
>>
>> Removing `isnan` is only an optimization, it does not intend to change
>> semantics. So it cannot create portability issues. Quite the contrary,
it
>> helps portability by making behavior consistent between compilers and
libc
>> implementations. The only possible issue is performance loss, this is
>> discussed above, it is an unlikely case. Anyway, if such loss exists
and it
>> is absolutely intolerable for a user, a hack with redefinition of
`isnan`
>> restores the previous code generation.
>>
>>
>>> If the behavior is confusing to users, that's because it's
poorly
>>> explained.
>>
>>
>> What is confusing? That the explicitly written call to a function is
not
>> removed? According to user feedback it is the silent removal of `isnan`
>> that confuses users.
>>
>> Honestly, I think the docs are pretty clear, but "It's clear,
you just
>>> need to learn to read" is never an acceptable answer so it
could certainly
>>> be improved. This is the only thing that needs to be fixed in my
opinion.
>>>
>>
>> The documentation says about -ffinite-math-only:
>>
>>  "Allow optimizations for floating-point arithmetic that assume
that
>> arguments and results are not NaNs or +-Infs."
>>
>> Is it clear whether `isnan` is arithmetic or not?
>>
>
> If the result of a floating point arithmetic is fed directly to `isnan()`,
> are we allowed to eliminate the computation and fold the check to none?
> (seems like it according to the sentence you're quoting).
> Are we back to `isnan(x+0.0)` can be folded but not `isnan(x)`?
>
Initially there was such intention, but during the discussion it became
clear that it is not profitable. For `isinf` there is a realistic use case
when removing the call `isinf(a+b)` is an issue. Arthur O'Dwyer
demonstrated an example for `isnan`. It is harder to provide guarantees
about the result, than about the arguments.

> --
> Mehdi
>
>
>
>
>>
>>
>>> Thanks,
>>>    Chris Tetreault
>>>
>>> -----Original Message-----
>>> From: Michael Kruse <cfe-dev at meinersbur.de>
>>> Sent: Thursday, September 16, 2021 12:29 PM
>>> To: Chris Tetreault <ctetreau at quicinc.com>
>>> Cc: Serge Pavlov <sepavloff at gmail.com>; llvm-dev at
lists.llvm.org;
>>> cfe-dev at lists.llvm.org
>>> Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
>>> fast-math mode?
>>>
>>> WARNING: This email originated from outside of Qualcomm. Please be
wary
>>> of any links or attachments, and do not enable macros.
>>>
>>> Am Mo., 13. Sept. 2021 um 11:46 Uhr schrieb Chris Tetreault via
cfe-dev <
>>> cfe-dev at lists.llvm.org>:
>>> > As a user, if I read that:
>>> >
>>> >
>>> >
>>> > ```
>>> >
>>> > if (isnan(x)) {
>>> >
>>> > ```
>>> >
>>> >
>>> >
>>> > … is guaranteed to work, and I read that fast-math enables the
>>> compiler to reason about constructs like `x + 0` being equal to
`x`, then
>>> I’m going to be very confused when:
>>> >
>>> >
>>> >
>>> > ```
>>> >
>>> > if (isnan(x + 0)) {
>>> >
>>> > ```
>>> >
>>> >
>>> >
>>> > … does not also work. I’m going to open a bug and complain,
and the
>>> slide down the slippery slope will continue. You and I understand
the
>>> difference, and the technical reason why `isnan(x)` is supported
but
>>> `isnan(x + 0)` isn’t, but Joe Coder just trying to figure out why
he’s got
>>> NaN in his matrices despite his careful NaN handling code. Joe is
not a
>>> compiler expert, and on the face of it, it seems like a silly
limitation.
>>> This will never end until fast-math is gutted.
>>>
>>> C/C++ already has cases like this. Pointer arithmetic on null
pointers
>>> is undefined behaviour, even if adding[1,2]/subtracting[3] zero. I
don't
>>> think it is too far fetched to expect from users to know that an
operation
>>> is undefined behaviour even if one of the operands is zero.
>>>
>>> Michael
>>>
>>> [1]
>>>
https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-addition.c
>>> [2]
>>>
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-constants.cpp
>>> [3]
>>>
https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-subtraction.c
>>>
>>>
>>> Michael
>>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210917/ae003b54/attachment.html>

llvm dev - Sep 2021 - [cfe-dev] Should isnan be optimized out in fast-math mode?

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?