Serge Pavlov via llvm-dev
2021-Sep-17 06:18 UTC
[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?
On Fri, Sep 17, 2021 at 10:53 AM Mehdi AMINI <joker.eph at gmail.com> wrote:> > > On Thu, Sep 16, 2021 at 8:23 PM Serge Pavlov via cfe-dev < > cfe-dev at lists.llvm.org> wrote: > >> On Fri, Sep 17, 2021 at 3:11 AM Chris Tetreault <ctetreau at quicinc.com> >> wrote: >> >>> The difference there is that doing pointer arithmetic on null pointers >>> doesn't *usually* work, unless you turn on -ffast-pointers. >>> >>> It seems to me that most confusion related to -ffast-math is likely >>> caused by people who are transitioning to using it. I have some codebase, >>> and I turn on fast math, and then a few months down the road I notice a >>> strangeness that I did not catch during the initial transition period. If >>> you're writing new code with fast-math, you don't do things like try to use >>> NaN as a sentinel value in a TU with fast math turned on. This is the sort >>> of thing you catch when you try to transition an existing codebase. Forgive >>> me for the uncharitable interpretation, but it's much easier to ask the >>> compiler to change to accommodate your use case than it is to refactor your >>> code. >>> >> >> It is a common way to explain problems with -ffinite-math-only by user >> ignorance. However user misunderstandings and complaints may indicate a >> flaw in compiler implementation, which I believe we have in this case. >> >> Using NaN as sentinels is a natural way when you cannot spend extra >> memory for keeping flags for each item, spend extra cycles to read that >> flag and do not want to pollute cache. It does not depend on reading >> documentation or writing the code from scratch. It is simply the best >> solution for storing data. If performance of the data processing is >> critical, -ffast-math is a good solution. This is a fairly legitimate use >> case. The fact that the compiler does not allow it is a compiler drawback. >> >> >>> To me, I think Mehdi had the best solution: The algorithm that is the >>> bottleneck, and experiences the huge speedup using fast-math should be >>> separated into its own source file. This source file, and only this source >>> file should be compiled with fast-math. The outer driver loop should not be >>> compiled with fast math. This solution is clean, (probably) easy, and >>> doesn't require a change in the compiler. >> >> >> It is a workaround, it works in some cases but does not in others. ML >> kernel often is a single translation unit, there may be no such thing as >> linker for that processor. At the same time it is computation intensive and >> using fast-math in it may be very profitable. >> > > Switching mode in a single TU seems valuable, but could this be handled > with pragmas or function attributes instead? >GCC allows it by using `#pragma GCC optimize()`, but clang does not support it. No suitable function attribute exists for that.> > >> >> >>> Changing the compiler is hard, affects everybody who uses the compiler, >>> and creates inconsistency in behavior between clang and gcc (and msvc with >>> /fp:fast), and clang and old versions of clang. >> >> >> ICC and MSVC do not remove `isnan` in fast math mode. If `isnan` is >> implemented in libc, it is also a real check. Actually removing `isnan` >> creates inconsistency. >> >> >>> The behavior of fast-math with respect to NaN is consistent across the >>> mainstream c/c++ compilers: no promises are made, users should not assume >>> that they can use it for anything. Changing it now would create a major >>> portability issue for user codebases, which in and of itself is a very >>> strong reason to not make this change. >>> >> >> Removing `isnan` is only an optimization, it does not intend to change >> semantics. So it cannot create portability issues. Quite the contrary, it >> helps portability by making behavior consistent between compilers and libc >> implementations. The only possible issue is performance loss, this is >> discussed above, it is an unlikely case. Anyway, if such loss exists and it >> is absolutely intolerable for a user, a hack with redefinition of `isnan` >> restores the previous code generation. >> >> >>> If the behavior is confusing to users, that's because it's poorly >>> explained. >> >> >> What is confusing? That the explicitly written call to a function is not >> removed? According to user feedback it is the silent removal of `isnan` >> that confuses users. >> >> Honestly, I think the docs are pretty clear, but "It's clear, you just >>> need to learn to read" is never an acceptable answer so it could certainly >>> be improved. This is the only thing that needs to be fixed in my opinion. >>> >> >> The documentation says about -ffinite-math-only: >> >> "Allow optimizations for floating-point arithmetic that assume that >> arguments and results are not NaNs or +-Infs." >> >> Is it clear whether `isnan` is arithmetic or not? >> > > If the result of a floating point arithmetic is fed directly to `isnan()`, > are we allowed to eliminate the computation and fold the check to none? > (seems like it according to the sentence you're quoting). > Are we back to `isnan(x+0.0)` can be folded but not `isnan(x)`? >Initially there was such intention, but during the discussion it became clear that it is not profitable. For `isinf` there is a realistic use case when removing the call `isinf(a+b)` is an issue. Arthur O'Dwyer demonstrated an example for `isnan`. It is harder to provide guarantees about the result, than about the arguments.> -- > Mehdi > > > > >> >> >>> Thanks, >>> Chris Tetreault >>> >>> -----Original Message----- >>> From: Michael Kruse <cfe-dev at meinersbur.de> >>> Sent: Thursday, September 16, 2021 12:29 PM >>> To: Chris Tetreault <ctetreau at quicinc.com> >>> Cc: Serge Pavlov <sepavloff at gmail.com>; llvm-dev at lists.llvm.org; >>> cfe-dev at lists.llvm.org >>> Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in >>> fast-math mode? >>> >>> WARNING: This email originated from outside of Qualcomm. Please be wary >>> of any links or attachments, and do not enable macros. >>> >>> Am Mo., 13. Sept. 2021 um 11:46 Uhr schrieb Chris Tetreault via cfe-dev < >>> cfe-dev at lists.llvm.org>: >>> > As a user, if I read that: >>> > >>> > >>> > >>> > ``` >>> > >>> > if (isnan(x)) { >>> > >>> > ``` >>> > >>> > >>> > >>> > … is guaranteed to work, and I read that fast-math enables the >>> compiler to reason about constructs like `x + 0` being equal to `x`, then >>> I’m going to be very confused when: >>> > >>> > >>> > >>> > ``` >>> > >>> > if (isnan(x + 0)) { >>> > >>> > ``` >>> > >>> > >>> > >>> > … does not also work. I’m going to open a bug and complain, and the >>> slide down the slippery slope will continue. You and I understand the >>> difference, and the technical reason why `isnan(x)` is supported but >>> `isnan(x + 0)` isn’t, but Joe Coder just trying to figure out why he’s got >>> NaN in his matrices despite his careful NaN handling code. Joe is not a >>> compiler expert, and on the face of it, it seems like a silly limitation. >>> This will never end until fast-math is gutted. >>> >>> C/C++ already has cases like this. Pointer arithmetic on null pointers >>> is undefined behaviour, even if adding[1,2]/subtracting[3] zero. I don't >>> think it is too far fetched to expect from users to know that an operation >>> is undefined behaviour even if one of the operands is zero. >>> >>> Michael >>> >>> [1] >>> https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-addition.c >>> [2] >>> https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-constants.cpp >>> [3] >>> https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-subtraction.c >>> >>> >>> Michael >>> >> _______________________________________________ >> cfe-dev mailing list >> cfe-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210917/ae003b54/attachment.html>
Mehdi AMINI via llvm-dev
2021-Sep-17 16:16 UTC
[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?
On Thu, Sep 16, 2021 at 11:19 PM Serge Pavlov <sepavloff at gmail.com> wrote:> On Fri, Sep 17, 2021 at 10:53 AM Mehdi AMINI <joker.eph at gmail.com> wrote: > >> >> >> On Thu, Sep 16, 2021 at 8:23 PM Serge Pavlov via cfe-dev < >> cfe-dev at lists.llvm.org> wrote: >> >>> On Fri, Sep 17, 2021 at 3:11 AM Chris Tetreault <ctetreau at quicinc.com> >>> wrote: >>> >>>> The difference there is that doing pointer arithmetic on null pointers >>>> doesn't *usually* work, unless you turn on -ffast-pointers. >>>> >>>> It seems to me that most confusion related to -ffast-math is likely >>>> caused by people who are transitioning to using it. I have some codebase, >>>> and I turn on fast math, and then a few months down the road I notice a >>>> strangeness that I did not catch during the initial transition period. If >>>> you're writing new code with fast-math, you don't do things like try to use >>>> NaN as a sentinel value in a TU with fast math turned on. This is the sort >>>> of thing you catch when you try to transition an existing codebase. Forgive >>>> me for the uncharitable interpretation, but it's much easier to ask the >>>> compiler to change to accommodate your use case than it is to refactor your >>>> code. >>>> >>> >>> It is a common way to explain problems with -ffinite-math-only by user >>> ignorance. However user misunderstandings and complaints may indicate a >>> flaw in compiler implementation, which I believe we have in this case. >>> >>> Using NaN as sentinels is a natural way when you cannot spend extra >>> memory for keeping flags for each item, spend extra cycles to read that >>> flag and do not want to pollute cache. It does not depend on reading >>> documentation or writing the code from scratch. It is simply the best >>> solution for storing data. If performance of the data processing is >>> critical, -ffast-math is a good solution. This is a fairly legitimate use >>> case. The fact that the compiler does not allow it is a compiler drawback. >>> >>> >>>> To me, I think Mehdi had the best solution: The algorithm that is the >>>> bottleneck, and experiences the huge speedup using fast-math should be >>>> separated into its own source file. This source file, and only this source >>>> file should be compiled with fast-math. The outer driver loop should not be >>>> compiled with fast math. This solution is clean, (probably) easy, and >>>> doesn't require a change in the compiler. >>> >>> >>> It is a workaround, it works in some cases but does not in others. ML >>> kernel often is a single translation unit, there may be no such thing as >>> linker for that processor. At the same time it is computation intensive and >>> using fast-math in it may be very profitable. >>> >> >> Switching mode in a single TU seems valuable, but could this be handled >> with pragmas or function attributes instead? >> > > GCC allows it by using `#pragma GCC optimize()`, but clang does not > support it. No suitable function attribute exists for that. >Right, I know that clang does not support it, but it could :) So since we're looking at what provides the best user-experience: isn't that it? Shouldn't we look into providing this level of granularity? (whether function-level or finer grain)> >> >> >>> >>> >>>> Changing the compiler is hard, affects everybody who uses the compiler, >>>> and creates inconsistency in behavior between clang and gcc (and msvc with >>>> /fp:fast), and clang and old versions of clang. >>> >>> >>> ICC and MSVC do not remove `isnan` in fast math mode. If `isnan` is >>> implemented in libc, it is also a real check. Actually removing `isnan` >>> creates inconsistency. >>> >>> >>>> The behavior of fast-math with respect to NaN is consistent across the >>>> mainstream c/c++ compilers: no promises are made, users should not assume >>>> that they can use it for anything. Changing it now would create a major >>>> portability issue for user codebases, which in and of itself is a very >>>> strong reason to not make this change. >>>> >>> >>> Removing `isnan` is only an optimization, it does not intend to change >>> semantics. So it cannot create portability issues. Quite the contrary, it >>> helps portability by making behavior consistent between compilers and libc >>> implementations. The only possible issue is performance loss, this is >>> discussed above, it is an unlikely case. Anyway, if such loss exists and it >>> is absolutely intolerable for a user, a hack with redefinition of `isnan` >>> restores the previous code generation. >>> >>> >>>> If the behavior is confusing to users, that's because it's poorly >>>> explained. >>> >>> >>> What is confusing? That the explicitly written call to a function is not >>> removed? According to user feedback it is the silent removal of `isnan` >>> that confuses users. >>> >>> Honestly, I think the docs are pretty clear, but "It's clear, you just >>>> need to learn to read" is never an acceptable answer so it could certainly >>>> be improved. This is the only thing that needs to be fixed in my opinion. >>>> >>> >>> The documentation says about -ffinite-math-only: >>> >>> "Allow optimizations for floating-point arithmetic that assume that >>> arguments and results are not NaNs or +-Infs." >>> >>> Is it clear whether `isnan` is arithmetic or not? >>> >> >> If the result of a floating point arithmetic is fed directly to >> `isnan()`, are we allowed to eliminate the computation and fold the check >> to none? (seems like it according to the sentence you're quoting). >> Are we back to `isnan(x+0.0)` can be folded but not `isnan(x)`? >> > > Initially there was such intention, but during the discussion it became > clear that it is not profitable. For `isinf` there is a realistic use case > when removing the call `isinf(a+b)` is an issue. Arthur O'Dwyer > demonstrated an example for `isnan`. It is harder to provide guarantees > about the result, than about the arguments. >> >> -- >> Mehdi >> >> >> >> >>> >>> >>>> Thanks, >>>> Chris Tetreault >>>> >>>> -----Original Message----- >>>> From: Michael Kruse <cfe-dev at meinersbur.de> >>>> Sent: Thursday, September 16, 2021 12:29 PM >>>> To: Chris Tetreault <ctetreau at quicinc.com> >>>> Cc: Serge Pavlov <sepavloff at gmail.com>; llvm-dev at lists.llvm.org; >>>> cfe-dev at lists.llvm.org >>>> Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in >>>> fast-math mode? >>>> >>>> WARNING: This email originated from outside of Qualcomm. Please be wary >>>> of any links or attachments, and do not enable macros. >>>> >>>> Am Mo., 13. Sept. 2021 um 11:46 Uhr schrieb Chris Tetreault via cfe-dev >>>> <cfe-dev at lists.llvm.org>: >>>> > As a user, if I read that: >>>> > >>>> > >>>> > >>>> > ``` >>>> > >>>> > if (isnan(x)) { >>>> > >>>> > ``` >>>> > >>>> > >>>> > >>>> > … is guaranteed to work, and I read that fast-math enables the >>>> compiler to reason about constructs like `x + 0` being equal to `x`, then >>>> I’m going to be very confused when: >>>> > >>>> > >>>> > >>>> > ``` >>>> > >>>> > if (isnan(x + 0)) { >>>> > >>>> > ``` >>>> > >>>> > >>>> > >>>> > … does not also work. I’m going to open a bug and complain, and the >>>> slide down the slippery slope will continue. You and I understand the >>>> difference, and the technical reason why `isnan(x)` is supported but >>>> `isnan(x + 0)` isn’t, but Joe Coder just trying to figure out why he’s got >>>> NaN in his matrices despite his careful NaN handling code. Joe is not a >>>> compiler expert, and on the face of it, it seems like a silly limitation. >>>> This will never end until fast-math is gutted. >>>> >>>> C/C++ already has cases like this. Pointer arithmetic on null pointers >>>> is undefined behaviour, even if adding[1,2]/subtracting[3] zero. I don't >>>> think it is too far fetched to expect from users to know that an operation >>>> is undefined behaviour even if one of the operands is zero. >>>> >>>> Michael >>>> >>>> [1] >>>> https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-addition.c >>>> [2] >>>> https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-constants.cpp >>>> [3] >>>> https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-subtraction.c >>>> >>>> >>>> Michael >>>> >>> _______________________________________________ >>> cfe-dev mailing list >>> cfe-dev at lists.llvm.org >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev >>> >>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210917/afd6adcd/attachment.html>