Mehdi AMINI via llvm-dev
2021-Sep-17 03:53 UTC
[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?
On Thu, Sep 16, 2021 at 8:23 PM Serge Pavlov via cfe-dev < cfe-dev at lists.llvm.org> wrote:> On Fri, Sep 17, 2021 at 3:11 AM Chris Tetreault <ctetreau at quicinc.com> > wrote: > >> The difference there is that doing pointer arithmetic on null pointers >> doesn't *usually* work, unless you turn on -ffast-pointers. >> >> It seems to me that most confusion related to -ffast-math is likely >> caused by people who are transitioning to using it. I have some codebase, >> and I turn on fast math, and then a few months down the road I notice a >> strangeness that I did not catch during the initial transition period. If >> you're writing new code with fast-math, you don't do things like try to use >> NaN as a sentinel value in a TU with fast math turned on. This is the sort >> of thing you catch when you try to transition an existing codebase. Forgive >> me for the uncharitable interpretation, but it's much easier to ask the >> compiler to change to accommodate your use case than it is to refactor your >> code. >> > > It is a common way to explain problems with -ffinite-math-only by user > ignorance. However user misunderstandings and complaints may indicate a > flaw in compiler implementation, which I believe we have in this case. > > Using NaN as sentinels is a natural way when you cannot spend extra memory > for keeping flags for each item, spend extra cycles to read that flag and > do not want to pollute cache. It does not depend on reading documentation > or writing the code from scratch. It is simply the best solution for > storing data. If performance of the data processing is critical, > -ffast-math is a good solution. This is a fairly legitimate use case. The > fact that the compiler does not allow it is a compiler drawback. > > >> To me, I think Mehdi had the best solution: The algorithm that is the >> bottleneck, and experiences the huge speedup using fast-math should be >> separated into its own source file. This source file, and only this source >> file should be compiled with fast-math. The outer driver loop should not be >> compiled with fast math. This solution is clean, (probably) easy, and >> doesn't require a change in the compiler. > > > It is a workaround, it works in some cases but does not in others. ML > kernel often is a single translation unit, there may be no such thing as > linker for that processor. At the same time it is computation intensive and > using fast-math in it may be very profitable. >Switching mode in a single TU seems valuable, but could this be handled with pragmas or function attributes instead?> > >> Changing the compiler is hard, affects everybody who uses the compiler, >> and creates inconsistency in behavior between clang and gcc (and msvc with >> /fp:fast), and clang and old versions of clang. > > > ICC and MSVC do not remove `isnan` in fast math mode. If `isnan` is > implemented in libc, it is also a real check. Actually removing `isnan` > creates inconsistency. > > >> The behavior of fast-math with respect to NaN is consistent across the >> mainstream c/c++ compilers: no promises are made, users should not assume >> that they can use it for anything. Changing it now would create a major >> portability issue for user codebases, which in and of itself is a very >> strong reason to not make this change. >> > > Removing `isnan` is only an optimization, it does not intend to change > semantics. So it cannot create portability issues. Quite the contrary, it > helps portability by making behavior consistent between compilers and libc > implementations. The only possible issue is performance loss, this is > discussed above, it is an unlikely case. Anyway, if such loss exists and it > is absolutely intolerable for a user, a hack with redefinition of `isnan` > restores the previous code generation. > > >> If the behavior is confusing to users, that's because it's poorly >> explained. > > > What is confusing? That the explicitly written call to a function is not > removed? According to user feedback it is the silent removal of `isnan` > that confuses users. > > Honestly, I think the docs are pretty clear, but "It's clear, you just >> need to learn to read" is never an acceptable answer so it could certainly >> be improved. This is the only thing that needs to be fixed in my opinion. >> > > The documentation says about -ffinite-math-only: > > "Allow optimizations for floating-point arithmetic that assume that > arguments and results are not NaNs or +-Infs." > > Is it clear whether `isnan` is arithmetic or not? >If the result of a floating point arithmetic is fed directly to `isnan()`, are we allowed to eliminate the computation and fold the check to none? (seems like it according to the sentence you're quoting). Are we back to `isnan(x+0.0)` can be folded but not `isnan(x)`? -- Mehdi> > >> Thanks, >> Chris Tetreault >> >> -----Original Message----- >> From: Michael Kruse <cfe-dev at meinersbur.de> >> Sent: Thursday, September 16, 2021 12:29 PM >> To: Chris Tetreault <ctetreau at quicinc.com> >> Cc: Serge Pavlov <sepavloff at gmail.com>; llvm-dev at lists.llvm.org; >> cfe-dev at lists.llvm.org >> Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in >> fast-math mode? >> >> WARNING: This email originated from outside of Qualcomm. Please be wary >> of any links or attachments, and do not enable macros. >> >> Am Mo., 13. Sept. 2021 um 11:46 Uhr schrieb Chris Tetreault via cfe-dev < >> cfe-dev at lists.llvm.org>: >> > As a user, if I read that: >> > >> > >> > >> > ``` >> > >> > if (isnan(x)) { >> > >> > ``` >> > >> > >> > >> > … is guaranteed to work, and I read that fast-math enables the compiler >> to reason about constructs like `x + 0` being equal to `x`, then I’m going >> to be very confused when: >> > >> > >> > >> > ``` >> > >> > if (isnan(x + 0)) { >> > >> > ``` >> > >> > >> > >> > … does not also work. I’m going to open a bug and complain, and the >> slide down the slippery slope will continue. You and I understand the >> difference, and the technical reason why `isnan(x)` is supported but >> `isnan(x + 0)` isn’t, but Joe Coder just trying to figure out why he’s got >> NaN in his matrices despite his careful NaN handling code. Joe is not a >> compiler expert, and on the face of it, it seems like a silly limitation. >> This will never end until fast-math is gutted. >> >> C/C++ already has cases like this. Pointer arithmetic on null pointers is >> undefined behaviour, even if adding[1,2]/subtracting[3] zero. I don't think >> it is too far fetched to expect from users to know that an operation is >> undefined behaviour even if one of the operands is zero. >> >> Michael >> >> [1] >> https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-addition.c >> [2] >> https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-constants.cpp >> [3] >> https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-subtraction.c >> >> >> Michael >> > _______________________________________________ > cfe-dev mailing list > cfe-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210916/d77c0bc4/attachment.html>
Serge Pavlov via llvm-dev
2021-Sep-17 06:18 UTC
[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?
On Fri, Sep 17, 2021 at 10:53 AM Mehdi AMINI <joker.eph at gmail.com> wrote:> > > On Thu, Sep 16, 2021 at 8:23 PM Serge Pavlov via cfe-dev < > cfe-dev at lists.llvm.org> wrote: > >> On Fri, Sep 17, 2021 at 3:11 AM Chris Tetreault <ctetreau at quicinc.com> >> wrote: >> >>> The difference there is that doing pointer arithmetic on null pointers >>> doesn't *usually* work, unless you turn on -ffast-pointers. >>> >>> It seems to me that most confusion related to -ffast-math is likely >>> caused by people who are transitioning to using it. I have some codebase, >>> and I turn on fast math, and then a few months down the road I notice a >>> strangeness that I did not catch during the initial transition period. If >>> you're writing new code with fast-math, you don't do things like try to use >>> NaN as a sentinel value in a TU with fast math turned on. This is the sort >>> of thing you catch when you try to transition an existing codebase. Forgive >>> me for the uncharitable interpretation, but it's much easier to ask the >>> compiler to change to accommodate your use case than it is to refactor your >>> code. >>> >> >> It is a common way to explain problems with -ffinite-math-only by user >> ignorance. However user misunderstandings and complaints may indicate a >> flaw in compiler implementation, which I believe we have in this case. >> >> Using NaN as sentinels is a natural way when you cannot spend extra >> memory for keeping flags for each item, spend extra cycles to read that >> flag and do not want to pollute cache. It does not depend on reading >> documentation or writing the code from scratch. It is simply the best >> solution for storing data. If performance of the data processing is >> critical, -ffast-math is a good solution. This is a fairly legitimate use >> case. The fact that the compiler does not allow it is a compiler drawback. >> >> >>> To me, I think Mehdi had the best solution: The algorithm that is the >>> bottleneck, and experiences the huge speedup using fast-math should be >>> separated into its own source file. This source file, and only this source >>> file should be compiled with fast-math. The outer driver loop should not be >>> compiled with fast math. This solution is clean, (probably) easy, and >>> doesn't require a change in the compiler. >> >> >> It is a workaround, it works in some cases but does not in others. ML >> kernel often is a single translation unit, there may be no such thing as >> linker for that processor. At the same time it is computation intensive and >> using fast-math in it may be very profitable. >> > > Switching mode in a single TU seems valuable, but could this be handled > with pragmas or function attributes instead? >GCC allows it by using `#pragma GCC optimize()`, but clang does not support it. No suitable function attribute exists for that.> > >> >> >>> Changing the compiler is hard, affects everybody who uses the compiler, >>> and creates inconsistency in behavior between clang and gcc (and msvc with >>> /fp:fast), and clang and old versions of clang. >> >> >> ICC and MSVC do not remove `isnan` in fast math mode. If `isnan` is >> implemented in libc, it is also a real check. Actually removing `isnan` >> creates inconsistency. >> >> >>> The behavior of fast-math with respect to NaN is consistent across the >>> mainstream c/c++ compilers: no promises are made, users should not assume >>> that they can use it for anything. Changing it now would create a major >>> portability issue for user codebases, which in and of itself is a very >>> strong reason to not make this change. >>> >> >> Removing `isnan` is only an optimization, it does not intend to change >> semantics. So it cannot create portability issues. Quite the contrary, it >> helps portability by making behavior consistent between compilers and libc >> implementations. The only possible issue is performance loss, this is >> discussed above, it is an unlikely case. Anyway, if such loss exists and it >> is absolutely intolerable for a user, a hack with redefinition of `isnan` >> restores the previous code generation. >> >> >>> If the behavior is confusing to users, that's because it's poorly >>> explained. >> >> >> What is confusing? That the explicitly written call to a function is not >> removed? According to user feedback it is the silent removal of `isnan` >> that confuses users. >> >> Honestly, I think the docs are pretty clear, but "It's clear, you just >>> need to learn to read" is never an acceptable answer so it could certainly >>> be improved. This is the only thing that needs to be fixed in my opinion. >>> >> >> The documentation says about -ffinite-math-only: >> >> "Allow optimizations for floating-point arithmetic that assume that >> arguments and results are not NaNs or +-Infs." >> >> Is it clear whether `isnan` is arithmetic or not? >> > > If the result of a floating point arithmetic is fed directly to `isnan()`, > are we allowed to eliminate the computation and fold the check to none? > (seems like it according to the sentence you're quoting). > Are we back to `isnan(x+0.0)` can be folded but not `isnan(x)`? >Initially there was such intention, but during the discussion it became clear that it is not profitable. For `isinf` there is a realistic use case when removing the call `isinf(a+b)` is an issue. Arthur O'Dwyer demonstrated an example for `isnan`. It is harder to provide guarantees about the result, than about the arguments.> -- > Mehdi > > > > >> >> >>> Thanks, >>> Chris Tetreault >>> >>> -----Original Message----- >>> From: Michael Kruse <cfe-dev at meinersbur.de> >>> Sent: Thursday, September 16, 2021 12:29 PM >>> To: Chris Tetreault <ctetreau at quicinc.com> >>> Cc: Serge Pavlov <sepavloff at gmail.com>; llvm-dev at lists.llvm.org; >>> cfe-dev at lists.llvm.org >>> Subject: Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in >>> fast-math mode? >>> >>> WARNING: This email originated from outside of Qualcomm. Please be wary >>> of any links or attachments, and do not enable macros. >>> >>> Am Mo., 13. Sept. 2021 um 11:46 Uhr schrieb Chris Tetreault via cfe-dev < >>> cfe-dev at lists.llvm.org>: >>> > As a user, if I read that: >>> > >>> > >>> > >>> > ``` >>> > >>> > if (isnan(x)) { >>> > >>> > ``` >>> > >>> > >>> > >>> > … is guaranteed to work, and I read that fast-math enables the >>> compiler to reason about constructs like `x + 0` being equal to `x`, then >>> I’m going to be very confused when: >>> > >>> > >>> > >>> > ``` >>> > >>> > if (isnan(x + 0)) { >>> > >>> > ``` >>> > >>> > >>> > >>> > … does not also work. I’m going to open a bug and complain, and the >>> slide down the slippery slope will continue. You and I understand the >>> difference, and the technical reason why `isnan(x)` is supported but >>> `isnan(x + 0)` isn’t, but Joe Coder just trying to figure out why he’s got >>> NaN in his matrices despite his careful NaN handling code. Joe is not a >>> compiler expert, and on the face of it, it seems like a silly limitation. >>> This will never end until fast-math is gutted. >>> >>> C/C++ already has cases like this. Pointer arithmetic on null pointers >>> is undefined behaviour, even if adding[1,2]/subtracting[3] zero. I don't >>> think it is too far fetched to expect from users to know that an operation >>> is undefined behaviour even if one of the operands is zero. >>> >>> Michael >>> >>> [1] >>> https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-addition.c >>> [2] >>> https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-constants.cpp >>> [3] >>> https://github.com/llvm/llvm-project/blob/main/clang/test/Sema/pointer-subtraction.c >>> >>> >>> Michael >>> >> _______________________________________________ >> cfe-dev mailing list >> cfe-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210917/ae003b54/attachment.html>