thr3ads.net - llvm dev - [llvm-dev] defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive and to make Clang`s optimization settings closer to having the same meaning as when they are given to GCC [at least for "-O3"] [Sep 2016]

Chris Lattner via llvm-dev

2016-Sep-11 01:18 UTC

[llvm-dev] defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive and to make Clang`s optimization settings closer to having the same meaning as when they are given to GCC [at least for "-O3"]

On Sep 10, 2016, at 3:33 AM, Steve Canon <scanon at apple.com> wrote:
>>>
>>> Pretty much.  In particular, imagine a user trying to debug an
>>> unexpected floating point result caused by conversion of a*b + c into
>>> fma(a, b, c).
>>
>> I think that’s unavoidable, because of the way the optimization levels
>> work.  Even if fma contraction is on by default (something I’d like to
>> see), at -O0, we wouldn't be doing contraction for:
>>
>> auto x = a*b;
>> auto y = x+c;
>>
>> but we would do that at -O2 since we do mem2reg on x.
>
> In C, we don't contract (the equivalent of) this unless we're passed
> -ffp-contract=fast.  The pragma only licenses contraction within a
> statement.

Ah ok.  What’s GCC’s policy on this?

> IIRC, the situation in C++ is somewhat different, and the standard allows
> contraction across statement boundaries, though I don't think we take
> advantage of it at present.

Is language standard pedantry what we want to base our policies on?  It’s
great to not violate the standard of course, but it would be suboptimal for
switching a .c file to .cpp to change its behavior.  I’m not sure which way
this cuts on this topic though, or if the cost is worth bearing.

> TLDR: yeah, let's do this.

Nice :-)

-Chris

Abe Skolnik via llvm-dev

2016-Sep-12 18:58 UTC

The now-ungated-by-O3-or-higher patch passes with no new unexpected failures
when run on Ubuntu 14.04.1 on a Xeon-based server in 64-bit mode.  [No known
unexpected failures when testing on any other platform.]

-- Abe





The below patch is relative to...
---------------------------------
commit b0768e805d1d33d730e5bd44ba578df043dfbc66
Author: George Burgess IV <george.burgess.iv at gmail.com>
Date:   Wed Sep 7 20:03:19 2016 +0000






diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index 619ea9c..4b04937 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -2437,6 +2437,12 @@ bool CompilerInvocation::CreateFromArgs(CompilerInvocation &Res,
    if (Arch == llvm::Triple::spir || Arch == llvm::Triple::spir64) {
      Res.getDiagnosticOpts().Warnings.push_back("spir-compat");
    }
+
+  // If there will ever be e.g. "LangOpts.C", replace "LangOpts.C11 || LangOpts.C99" with "LangOpts.C" on the next line.
+  if (    (LangOpts.C11 || LangOpts.C99 || LangOpts.CPlusPlus)                  // ...
+  /*...*/ && ( CodeGenOptions::FPC_On == Res.getCodeGenOpts().getFPContractMode() ) ) // ... just being careful
+    LangOpts.DefaultFPContract = 1;
+
    return Success;
  }

diff --git a/clang/test/CodeGen/fp-contract-pragma.cpp b/clang/test/CodeGen/fp-contract-pragma.cpp
index 1c5921a..0949272 100644
--- a/clang/test/CodeGen/fp-contract-pragma.cpp
+++ b/clang/test/CodeGen/fp-contract-pragma.cpp
@@ -13,6 +13,7 @@ float fp_contract_2(float a, float b, float c) {
  // CHECK: _Z13fp_contract_2fff
  // CHECK: %[[M:.+]] = fmul float %a, %b
  // CHECK-NEXT: fadd float %[[M]], %c
+  #pragma STDC FP_CONTRACT OFF
    {
      #pragma STDC FP_CONTRACT ON
    }
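
As a reading aid, the condition added in the CompilerInvocation.cpp hunk can be modeled as a small standalone C function. This is only a sketch: the enum, the struct, and the function name below are hypothetical stand-ins rather than Clang's real types, and the real code is C++ operating on `LangOpts` and `CodeGenOptions`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the contraction modes named in the patch. */
enum fp_contract_mode { FPC_OFF, FPC_ON, FPC_FAST };

/* Hypothetical stand-in for the language options the patch inspects. */
struct lang_opts {
    bool c99;
    bool c11;
    bool cplusplus;
    bool default_fp_contract;
};

/* Models the added condition: turn the default on only for C99, C11,
   or C++ input, and only when the contraction mode is "on". */
void apply_fp_contract_default(struct lang_opts *opts,
                               enum fp_contract_mode mode) {
    bool is_c_or_cxx = opts->c11 || opts->c99 || opts->cplusplus;
    if (is_c_or_cxx && mode == FPC_ON)
        opts->default_fp_contract = true;
}

/* A few cases from the truth table of that condition. */
void check_fp_contract_default(void) {
    struct lang_opts c11 = { .c11 = true };
    apply_fp_contract_default(&c11, FPC_ON);
    assert(c11.default_fp_contract);

    struct lang_opts c11_off = { .c11 = true };
    apply_fp_contract_default(&c11_off, FPC_OFF);
    assert(!c11_off.default_fp_contract);

    struct lang_opts other = { 0 };  /* neither C99, C11, nor C++ */
    apply_fp_contract_default(&other, FPC_ON);
    assert(!other.default_fp_contract);
}
```

In this model the function never clears the flag, matching the patch, which only ever assigns 1.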

Abe Skolnik via llvm-dev

2016-Sep-12 19:42 UTC

On 09/12/2016 01:58 PM, Abe Skolnik wrote:
> The now-ungated-by-O3-or-higher passes with no new unexpected failures when
> run on Ubuntu 14.04.1 on a Xeon-based server in 64-bit mode.  [No known
> unexpected failures when testing on any other platform.]

Oops.  I did a "make check" when I _should_ have done a "make check-all".
Some test cases _are_ broken.  I will work on fixing them [as well as
finishing my new test cases that will test in the future that the WIP
improvement will not have regressed] and report again later.

-- Abe

Abe Skolnik via llvm-dev

2016-Sep-12 21:09 UTC

[llvm-dev] defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive

Dear all,

I have added 4 test cases that all fail on the "vanilla" [i.e. unmodified]
compiler and succeed with my patch applied.  Please see below, presented for
comments/feedback.

The only difference across the non-O0 files is the -O<something> flag; would
other people prefer that I factor this out into one include file and 3 short
stubs, if I can?

The only difference other than -O<something> between the O0 test and all the
rest is that in the -O0 case I have removed the "CHECK-NEXT" for "ret"
immediately following the "fmadd", because at -O0 the optimizer is not
eliminating the boilerplate stack-related code that in this case is not truly
needed.

Regards,

Abe






diff --git a/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O0___aarch64-backend.c b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O0___aarch64-backend.c
new file mode 100644
index 0000000..fd4a979
--- /dev/null
+++ b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O0___aarch64-backend.c
@@ -0,0 +1,15 @@
+// RUN: %clang_cc1 -triple aarch64 -O0 -S -o - %s | FileCheck %s
+// REQUIRES: aarch64-registered-target
+
+// CHECK-LABEL: fmadd_double:
+// CHECK: fmadd d0, d{{[0-7]}}, d{{[0-7]}}, d{{[0-7]}}
+double fmadd_double(double a, double b, double c) {
+  return a*b+c;
+}
+
+// CHECK-LABEL: fmadd_single:
+// CHECK: fmadd s0, s{{[0-7]}}, s{{[0-7]}}, s{{[0-7]}}
+float  fmadd_single(float  a, float  b, float  c) {
+  return a*b+c;
+}
+
diff --git a/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O1___aarch64-backend.c b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O1___aarch64-backend.c
new file mode 100644
index 0000000..f5c55a9
--- /dev/null
+++ b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O1___aarch64-backend.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -triple aarch64 -O1 -S -o - %s | FileCheck %s
+// REQUIRES: aarch64-registered-target
+
+// CHECK-LABEL: fmadd_double:
+// CHECK: fmadd d0, d{{[0-7]}}, d{{[0-7]}}, d{{[0-7]}}
+// CHECK-NEXT: ret
+double fmadd_double(double a, double b, double c) {
+  return a*b+c;
+}
+
+// CHECK-LABEL: fmadd_single:
+// CHECK: fmadd s0, s{{[0-7]}}, s{{[0-7]}}, s{{[0-7]}}
+// CHECK-NEXT: ret
+float  fmadd_single(float  a, float  b, float  c) {
+  return a*b+c;
+}
+
diff --git a/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O2___aarch64-backend.c b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O2___aarch64-backend.c
new file mode 100644
index 0000000..98b82a6
--- /dev/null
+++ b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O2___aarch64-backend.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -triple aarch64 -O2 -S -o - %s | FileCheck %s
+// REQUIRES: aarch64-registered-target
+
+// CHECK-LABEL: fmadd_double:
+// CHECK: fmadd d0, d{{[0-7]}}, d{{[0-7]}}, d{{[0-7]}}
+// CHECK-NEXT: ret
+double fmadd_double(double a, double b, double c) {
+  return a*b+c;
+}
+
+// CHECK-LABEL: fmadd_single:
+// CHECK: fmadd s0, s{{[0-7]}}, s{{[0-7]}}, s{{[0-7]}}
+// CHECK-NEXT: ret
+float  fmadd_single(float  a, float  b, float  c) {
+  return a*b+c;
+}
+
diff --git a/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O3___aarch64-backend.c b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O3___aarch64-backend.c
new file mode 100644
index 0000000..4d64738
--- /dev/null
+++ b/clang/test/CodeGen/fp-contract-pragma___on-by-default___-O3___aarch64-backend.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -triple aarch64 -O3 -S -o - %s | FileCheck %s
+// REQUIRES: aarch64-registered-target
+
+// CHECK-LABEL: fmadd_double:
+// CHECK: fmadd d0, d{{[0-7]}}, d{{[0-7]}}, d{{[0-7]}}
+// CHECK-NEXT: ret
+double fmadd_double(double a, double b, double c) {
+  return a*b+c;
+}
+
+// CHECK-LABEL: fmadd_single:
+// CHECK: fmadd s0, s{{[0-7]}}, s{{[0-7]}}, s{{[0-7]}}
+// CHECK-NEXT: ret
+float  fmadd_single(float  a, float  b, float  c) {
+  return a*b+c;
+}
+
