Folks,

I've been asking around about the state of FP contract, which seems to
be "on" but is not really behaving like it, at least not as I would
expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction, and then not doing
it, is not a good default.

I haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (fewer rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

In a recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge off GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
doing on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
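To illustrate the "fewer rounding steps" point in the rationale, here is a minimal sketch in C. It is not taken from the thread: the function names and the chosen inputs are assumptions made for illustration. The first function forces the product to be rounded to float before the addition (two roundings). The second computes the whole expression in double and rounds to float once, which matches a fused multiply-add only because the double arithmetic happens to be exact for the inputs suggested below; the standard way to request a real fused operation is fmaf() from <math.h>.

```c
/* Two roundings: the product is rounded to float before the addition.
   The volatile temporary forces that intermediate rounding, so the
   compiler cannot contract the multiply and add into one operation. */
float mul_add_two_roundings(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

/* One rounding (illustration only): the product of two floats is always
   exact in double, and for the inputs used below the addition is exact
   too, so the only rounding is the final conversion to float. This is
   NOT a general replacement for fmaf(). */
float mul_add_one_rounding(float a, float b, float c) {
    double exact = (double)a * (double)b + (double)c;
    return (float)exact;
}
```

With a = b = 1.000244140625f (1 + 2^-12) and c = -1.00048828125f (-(1 + 2^-11)), the exact product is 1 + 2^-11 + 2^-24. Rounding it to float drops the 2^-24 term, so the two-rounding version returns 0, while the single-rounding version returns 2^-24 (about 5.96e-8). The contracted form is closer to the mathematical result here, but the two forms are no longer bit-identical, which is where the test-suite miscompares come from.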
On 03/15/2017 08:27 AM, Renato Golin wrote:
> Folks,
>
> I've been asking around about the state of FP contract, which seems to
> be "on" but is not really behaving like it, at least not as I would
> expect:
>
> int foo(float a, float b, float c) { return a*b+c; }
>
> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
> (...)
> fmul s0, s0, s1
> fadd s0, s0, s2
> (...)

When you reverted r282259 in r282289, you also reverted the functional
fix to make the command-line option actually work. Right now it is
broken. Regardless of what else we do, we should fix this (we should
probably recommit r282259, with the default flipped, to pick up the
fixes). If you were to change your source file to:

#pragma STDC FP_CONTRACT ON
int foo(float a, float b, float c) { return a*b+c; }

Then running:

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmadd s0, s0, s1, s2
(...)

>
> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
> (...)
> fmadd s0, s0, s1, s2
> (...)
>
> I'm not sure this works in Fortran either, but defaulting to "on" when
> (I believe) the language should allow contraction, and then not doing
> it, is not a good default.
>
> I haven't worked out what would be necessary to make it work on a
> case-by-case basis (what kinds of fusions does C allow?) to make sure
> we don't do all or nothing, but if we don't want to start that
> conversation now, then I'd recommend we just turn it all the way to 11
> (like GCC) and let people turn it off if they really mean it.
>
> The rationale is that:
>
> * Contracted operations increase precision (fewer rounding steps)
> * It performs equal or faster on all architectures I know (true everywhere?)
> * Users already expect that (certainly, GCC users do)
> * Makes us look good on benchmarks :)
>
> In a recent SPEC2k6 comparison Linaro did for AArch64, enabling
> -ffp-contract=fast took the edge off GCC in a number of cases and in
> some of them made them comparable in performance. So, any reasons not
> to?
>
> If we go with it, we need to first finish the job that Sebastian was
> doing on the test-suite, then just turn it on by default. A second
> stage would be to add tests/benchmarks that explicitly test FP
> precision, so that we have some extra guarantee that we're doing the
> right thing.
>
> Opinions?

I'm certainly in favor of this plan. My users generally find our current
defaults confusing because they differ from those of all of our other
compilers (GCC and vendor compilers), and they give poor performance.

Thanks again,
Hal

>
> cheers,
> --renato

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
On 03/15/2017 09:58 AM, Hal Finkel wrote:
> On 03/15/2017 08:27 AM, Renato Golin wrote:
>
>> Folks,
>>
>> I've been asking around about the state of FP contract, which seems to
>> be "on" but is not really behaving like it, at least not as I would
>> expect:
>>
>> int foo(float a, float b, float c) { return a*b+c; }
>>
>> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
>> (...)
>> fmul s0, s0, s1
>> fadd s0, s0, s2
>> (...)
>
> When you reverted r282259 in r282289, you also reverted the functional
> fix to make the command-line option actually work. Right now it is
> broken. Regardless of what else we do, we should fix this (we should
> probably recommit r282259, with the default flipped, to pick up the
> fixes). If you were to change your source file to:
>
> #pragma STDC FP_CONTRACT ON
> int foo(float a, float b, float c) { return a*b+c; }
>

I would like to see https://reviews.llvm.org/rL282259 re-enabled, and
the few miscompares left in the aarch64 run of the test-suite fixed by
adding the FP_CONTRACT pragma in the source code.

The commit log of r282259 states the problem that it fixed:

> Clang has the default FP contraction setting of “-ffp-contract=on”, which
> doesn't really mean “on” in the conventional sense of the word, but rather
> really means “according to the per-statement effective value of the relevant
> pragma”.

Thanks,
Sebastian
On 15 March 2017 at 14:58, Hal Finkel <hfinkel at anl.gov> wrote:
> When you reverted r282259 in r282289, you also reverted the functional fix
> to make the command-line option actually work. Right now it is broken.
> Regardless of what else we do, we should fix this (we should probably
> recommit r282259, with the default flipped, to pick up the fixes).

We should run check-all and the test-suite with it on/fast before
flipping the switch, and make sure that the behaviour Sebastian
encountered is dealt with before this goes live, or we'd be breaking
too many test-suites and reverting and reapplying too often.

But I'm certainly in favour of the plan to make it on/fast by default.

> If you were to change your source file to:
>
> #pragma STDC FP_CONTRACT ON
> int foo(float a, float b, float c) { return a*b+c; }

I wasn't aware you needed the pragma for -ffp-contract=on. I assumed it
would enable contraction for all FP math that the standard allows (thus
maybe needing some annotation on the operations). I thought that the
pragma was there to avoid using the command-line argument...

> I'm certainly in favor of this plan. My users generally find our current
> defaults confusing because they differ from those of all of our other
> compilers (GCC and vendor compilers), and they give poor performance.

Right, I also agree: following the principle of least surprise, we
should default to "fast". Let's just make sure the infrastructure isn't
going to crumble, and do it.

cheers,
--renato
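The pragma behaviour discussed in the messages above can be sketched in C as follows. This is an illustration under stated assumptions, not code from the thread or from r282259: the function names are hypothetical, and whether a fused instruction is actually emitted depends on the compiler, the target, and the -ffp-contract setting. Some compilers ignore this pragma entirely and may warn about it. The sketch relies on two points from the C standard: the pragma can be placed at the start of a compound statement, and contraction applies within a single floating expression.

```c
/* Contraction permitted: a*b+c is a single expression, so a conforming
   compiler may (but need not) evaluate it as one fused operation. */
float fused_candidate(float a, float b, float c) {
#pragma STDC FP_CONTRACT ON
    return a * b + c;
}

/* Contraction forbidden inside this function body by the pragma. */
float never_fused(float a, float b, float c) {
#pragma STDC FP_CONTRACT OFF
    return a * b + c;
}

/* The multiply and the add are in separate statements. The standard's
   contraction rule covers one expression at a time, so the pragma alone
   does not license fusing these two. As I understand it,
   -ffp-contract=fast may still fuse them. */
float split_statements(float a, float b, float c) {
#pragma STDC FP_CONTRACT ON
    float product = a * b;
    return product + c;
}
```

For inputs where every intermediate value is exactly representable, such as (2, 3, 4), all three functions return 10 regardless of whether the compiler fuses anything. Differences only show up when the product needs rounding, which is why the choice of default matters for tests that compare results bit for bit.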
Adam Nemet via llvm-dev
2017-Mar-15 17:10 UTC
[llvm-dev] [cfe-dev] [RFC] FP Contract = fast?
Relevant to this discussion is
http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does
not work with LTO). I am working on adding function attributes for
fp-contract=fast which should fix this.

Also, now that we have backend optimization remarks, I am planning to
report a missed optimization when we can’t fuse FMAs due to “fast” not
being on. This will show up in the opt-viewer. Then the user can opt
in either with the command-line switch or the new function attribute.

Adam
Hal Finkel via llvm-dev
2017-Mar-15 17:13 UTC
[llvm-dev] [cfe-dev] [RFC] FP Contract = fast?
On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
> Relevant to this discussion is
> http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does
> not work with LTO). I am working on adding function attributes for
> fp-contract=fast which should fix this.

Great!

>
> Also, now that we have backend optimization remarks, I am planning to
> report a missed optimization when we can’t fuse FMAs due to “fast” not
> being on. This will show up in the opt-viewer. Then the user can opt
> in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal