Chris Lattner via llvm-dev
2016-Sep-11 01:18 UTC
[llvm-dev] defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive and to make Clang`s optimization settings closer to having the same meaning as when they are given to GCC [at least for "-O3"]
On Sep 10, 2016, at 3:33 AM, Steve Canon <scanon at apple.com> wrote:
>>>
>>> Pretty much. In particular, imagine a user trying to debug an unexpected floating point result caused by conversion of a*b + c into fma(a, b, c).
>>
>> I think that’s unavoidable, because of the way the optimization levels work. Even if fma contraction is on by default (something I’d like to see), at -O0, we wouldn't be doing contraction for:
>>
>>   auto x = a*b;
>>   auto y = x+c;
>>
>> but we would do that at -O2 since we do mem2reg on x.
>
> In C, we don't contract (the equivalent of) this unless we're passed fp-contract=fast. The pragma only licenses contraction within a statement.

Ah ok. What’s GCC’s policy on this?

> IIRC, the situation in C++ is somewhat different, and the standard allows contraction across statement boundaries, though I don't think we take advantage of it at present.

Is language standard pedanticism what we want to base our policies on? It’s great to not violate the standard of course, but it would be suboptimal for switching a .c file to .cpp to change its behavior. I’m not sure which way this cuts on this topic though, or if the cost is worth bearing.

> TLDR: yeah, let's do this.

Nice :-)

-Chris
Abe Skolnik via llvm-dev
2016-Sep-12 18:58 UTC
[llvm-dev] defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive and to make Clang`s optimization settings closer to having the same meaning as when they are given to GCC [at least for "-O3"]
The now-ungated-by-O3-or-higher passes with no new unexpected failures when run on Ubuntu 14.04.1 on a Xeon-based server in 64-bit mode. [No known unexpected failures when testing on any other platform.]

-- Abe


The below patch is relative to...
---------------------------------
commit b0768e805d1d33d730e5bd44ba578df043dfbc66
Author: George Burgess IV <george.burgess.iv at gmail.com>
Date:   Wed Sep 7 20:03:19 2016 +0000


diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index 619ea9c..4b04937 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -2437,6 +2437,12 @@ bool CompilerInvocation::CreateFromArgs(CompilerInvocation &Res,
   if (Arch == llvm::Triple::spir || Arch == llvm::Triple::spir64) {
     Res.getDiagnosticOpts().Warnings.push_back("spir-compat");
   }
+
+  // If there will ever be e.g. "LangOpts.C", replace "LangOpts.C11 || LangOpts.C99" with "LangOpts.C" on the next line.
+  if ( (LangOpts.C11 || LangOpts.C99 || LangOpts.CPlusPlus) // ...
+       /*...*/ && ( CodeGenOptions::FPC_On == Res.getCodeGenOpts().getFPContractMode() ) ) // ... // just being careful
+    LangOpts.DefaultFPContract = 1;
+
   return Success;
 }
diff --git a/clang/test/CodeGen/fp-contract-pragma.cpp b/clang/test/CodeGen/fp-contract-pragma.cpp
index 1c5921a..0949272 100644
--- a/clang/test/CodeGen/fp-contract-pragma.cpp
+++ b/clang/test/CodeGen/fp-contract-pragma.cpp
@@ -13,6 +13,7 @@ float fp_contract_2(float a, float b, float c) {
 // CHECK: _Z13fp_contract_2fff
 // CHECK: %[[M:.+]] = fmul float %a, %b
 // CHECK-NEXT: fadd float %[[M]], %c
+  #pragma STDC FP_CONTRACT OFF
   {
     #pragma STDC FP_CONTRACT ON
   }
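The test change in the patch adds "#pragma STDC FP_CONTRACT OFF" so that the existing CHECK lines for a separate fmul and fadd keep holding once contraction defaults to on. The same opt-out is available to user code. Below is a minimal sketch; the function name is hypothetical, and it assumes a compiler that implements the pragma (one that does not may ignore it, possibly with a warning).

```c
/* Opt out of contraction for one function. Per the C standard, the pragma
   must appear at file scope or at the start of a compound statement,
   before any declarations or statements. */
double mul_add_no_contract(double a, double b, double c) {
    #pragma STDC FP_CONTRACT OFF
    /* With the pragma honored, the multiply and the add are rounded
       separately, whatever the command-line default is. */
    return a * b + c;
}
```

Placing the pragma inside the function limits its effect to that compound statement, which mirrors how the test file scopes OFF and ON to individual blocks.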
Abe Skolnik via llvm-dev
2016-Sep-12 19:42 UTC
[llvm-dev] defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive and to make Clang`s optimization settings closer to having the same meaning as when they are given to GCC [at least for "-O3"]
On 09/12/2016 01:58 PM, Abe Skolnik wrote:
> The now-ungated-by-O3-or-higher passes with no new unexpected failures when run on Ubuntu 14.04.1 on a
> Xeon-based server in 64-bit mode. [No known unexpected failures when testing on any other platform.]

Oops. I did a "make check" when I _should_ have done a "make check-all". Some test cases _are_ broken.

I will work on fixing them [as well as finishing my new test cases that will test in the future that the WIP improvement will not have regressed] and report again later.

-- Abe
Abe Skolnik via llvm-dev
2016-Sep-12 21:09 UTC
[llvm-dev] defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive
Dear all,

I have added 4 test cases that all fail on the "vanilla" [i.e. unmodified] compiler and succeed with my patch applied. Please see below, presented for comments/feedback.

The only difference across the non-O0 files is the -O<something> flag; would other people prefer that I factor this out into one include file and 3 short stubs, if I can?

The only difference other than -O<something> between the O0 test and all the rest is that in the -O0 case I have removed the "CHECK-NEXT" for "ret" immediately following the "fmadd" b/c at -O0 the optimizer is not eliminating the boilerplate stack-related code that in this case is not truly needed.

Regards,

Abe


diff --git a/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O0___aarch64-backend.c b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O0___aarch64-backend.c
new file mode 100644
index 0000000..fd4a979
--- /dev/null
+++ b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O0___aarch64-backend.c
@@ -0,0 +1,15 @@
+// RUN: %clang_cc1 -triple aarch64 -O0 -S -o - %s | FileCheck %s
+// REQUIRES: aarch64-registered-target
+
+// CHECK-LABEL: fmadd_double:
+// CHECK: fmadd d0, d{{[0-7]}}, d{{[0-7]}}, d{{[0-7]}}
+double fmadd_double(double a, double b, double c) {
+  return a*b+c;
+}
+
+// CHECK-LABEL: fmadd_single:
+// CHECK: fmadd s0, s{{[0-7]}}, s{{[0-7]}}, s{{[0-7]}}
+float fmadd_single(float a, float b, float c) {
+  return a*b+c;
+}
+
diff --git a/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O1___aarch64-backend.c b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O1___aarch64-backend.c
new file mode 100644
index 0000000..f5c55a9
--- /dev/null
+++ b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O1___aarch64-backend.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -triple aarch64 -O1 -S -o - %s | FileCheck %s
+// REQUIRES: aarch64-registered-target
+
+// CHECK-LABEL: fmadd_double:
+// CHECK: fmadd d0, d{{[0-7]}}, d{{[0-7]}}, d{{[0-7]}}
+// CHECK-NEXT: ret
+double fmadd_double(double a, double b, double c) {
+  return a*b+c;
+}
+
+// CHECK-LABEL: fmadd_single:
+// CHECK: fmadd s0, s{{[0-7]}}, s{{[0-7]}}, s{{[0-7]}}
+// CHECK-NEXT: ret
+float fmadd_single(float a, float b, float c) {
+  return a*b+c;
+}
+
diff --git a/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O2___aarch64-backend.c b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O2___aarch64-backend.c
new file mode 100644
index 0000000..98b82a6
--- /dev/null
+++ b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O2___aarch64-backend.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -triple aarch64 -O2 -S -o - %s | FileCheck %s
+// REQUIRES: aarch64-registered-target
+
+// CHECK-LABEL: fmadd_double:
+// CHECK: fmadd d0, d{{[0-7]}}, d{{[0-7]}}, d{{[0-7]}}
+// CHECK-NEXT: ret
+double fmadd_double(double a, double b, double c) {
+  return a*b+c;
+}
+
+// CHECK-LABEL: fmadd_single:
+// CHECK: fmadd s0, s{{[0-7]}}, s{{[0-7]}}, s{{[0-7]}}
+// CHECK-NEXT: ret
+float fmadd_single(float a, float b, float c) {
+  return a*b+c;
+}
+
diff --git a/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O3___aarch64-backend.c b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O3___aarch64-backend.c
new file mode 100644
index 0000000..4d64738
--- /dev/null
+++ b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O3___aarch64-backend.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -triple aarch64 -O3 -S -o - %s | FileCheck %s
+// REQUIRES: aarch64-registered-target
+
+// CHECK-LABEL: fmadd_double:
+// CHECK: fmadd d0, d{{[0-7]}}, d{{[0-7]}}, d{{[0-7]}}
+// CHECK-NEXT: ret
+double fmadd_double(double a, double b, double c) {
+  return a*b+c;
+}
+
+// CHECK-LABEL: fmadd_single:
+// CHECK: fmadd s0, s{{[0-7]}}, s{{[0-7]}}, s{{[0-7]}}
+// CHECK-NEXT: ret
+float fmadd_single(float a, float b, float c) {
+  return a*b+c;
+}
+
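On the question of factoring out the three near-identical non-O0 files: one possible alternative to an include file plus stubs, offered here as an assumption rather than something proposed in the thread, is a single test file with one RUN line per optimization level. lit executes every RUN line in a file, and all three invocations can share the same CHECK lines because the expected output is identical. The -O0 test would stay in its own file, since it omits the "CHECK-NEXT: ret" lines.

```c
// Sketch of a combined test for -O1, -O2 and -O3 (a hypothetical file,
// not part of the posted patch). Each RUN line is executed independently.
// RUN: %clang_cc1 -triple aarch64 -O1 -S -o - %s | FileCheck %s
// RUN: %clang_cc1 -triple aarch64 -O2 -S -o - %s | FileCheck %s
// RUN: %clang_cc1 -triple aarch64 -O3 -S -o - %s | FileCheck %s
// REQUIRES: aarch64-registered-target

// CHECK-LABEL: fmadd_double:
// CHECK: fmadd d0, d{{[0-7]}}, d{{[0-7]}}, d{{[0-7]}}
// CHECK-NEXT: ret
double fmadd_double(double a, double b, double c) {
  return a*b+c;
}

// CHECK-LABEL: fmadd_single:
// CHECK: fmadd s0, s{{[0-7]}}, s{{[0-7]}}, s{{[0-7]}}
// CHECK-NEXT: ret
float fmadd_single(float a, float b, float c) {
  return a*b+c;
}
```

The trade-off is that a failure report names the file but the failing optimization level has to be read from the failing RUN line, whereas separate files encode the level in the test name.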