thr3ads.net - llvm dev - [llvm-dev] [RFC] FP Contract = fast? [Mar 2017]

Renato Golin via llvm-dev

2017-Mar-15 13:27 UTC

[llvm-dev] [RFC] FP Contract = fast?

Folks,

I've been asking around about the state of FP contract, which
seems to be "on" but is not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

I haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (fewer rounding steps)
* They perform as fast or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

In a recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge off GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
doing on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
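
A minimal sketch (not part of the original thread) of why fewer rounding steps can matter, as the first rationale bullet claims. The function names and the sample inputs are illustrative assumptions. The single-rounding path is emulated in double rather than calling fmaf from <math.h>, so it is not a general fused multiply-add: it matches one only when the double addition itself does not round, which holds for the sample inputs described after the code.

```c
#include <assert.h>
#include <float.h>

/* Multiply, then add, with two roundings. The volatile store forces the
 * product to be rounded to float before the addition, so the compiler
 * cannot contract the two operations into one fused instruction. */
float mul_add_two_roundings(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

/* Multiply-add with one rounding to float, emulated in double.
 * The product of two binary32 values is exact in binary64. The double
 * addition may round in general, so this is only a stand-in for a real
 * fused multiply-add on inputs where that addition is exact. */
float mul_add_one_rounding(float a, float b, float c) {
    /* Assumption: float and double are IEEE 754 binary32 and binary64. */
    assert(FLT_MANT_DIG == 24 && DBL_MANT_DIG == 53);
    double exact_product = (double)a * (double)b;
    return (float)(exact_product + (double)c);
}
```

With a = b = 0x1.001p0f and c = -0x1.002p0f, the exact product is 1 + 2^-11 + 2^-24. Rounding it to float drops the last term, so mul_add_two_roundings returns 0, while mul_add_one_rounding returns 2^-24.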

Hal Finkel via llvm-dev

2017-Mar-15 14:58 UTC

[llvm-dev] [RFC] FP Contract = fast?

On 03/15/2017 08:27 AM, Renato Golin wrote:
> Folks,
>
> I've been asking around about the state of FP contract, which
> seems to be "on" but is not really behaving like it, at least not as
> I would expect:
>
> int foo(float a, float b, float c) { return a*b+c; }
>
> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
> (...)
> fmul s0, s0, s1
> fadd s0, s0, s2
> (...)
When you reverted r282259 in r282289, you also reverted the functional
fix to make the command-line option actually work. Right now it is
broken. Regardless of what else we do, we should fix this (we should
probably recommit r282259, with the default flipped, to pick up the
fixes). If you were to change your source file to:

#pragma STDC FP_CONTRACT ON
int foo(float a, float b, float c) { return a*b+c; }

then running the following produces an fmadd:

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmadd    s0, s0, s1, s2
(...)
>
> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
> (...)
> fmadd s0, s0, s1, s2
> (...)
>
> I'm not sure this works in Fortran either, but defaulting to "on" when
> (I believe) the language should allow contraction and not doing it is
> not a good default.
>
> I haven't worked out what would be necessary to make it work on a
> case-by-case basis (what kinds of fusions does C allow?) to make sure
> we don't do all or nothing, but if we don't want to start that
> conversation now, then I'd recommend we just turn it all the way to 11
> (like GCC) and let people turn it off if they really mean it.
>
> The rationale is that:
>
> * Contracted operations increase precision (fewer rounding steps)
> * They perform as fast or faster on all architectures I know (true everywhere?)
> * Users already expect that (certainly, GCC users do)
> * Makes us look good on benchmarks :)
>
> In a recent SPEC2k6 comparison Linaro did for AArch64, enabling
> -ffp-contract=fast took the edge off GCC in a number of cases and in
> some of them made them comparable in performance. So, any reasons not
> to?
>
> If we go with it, we need to first finish the job that Sebastian was
> doing on the test-suite, then just turn it on by default. A second
> stage would be to add tests/benchmarks that explicitly test FP
> precision, so that we have some extra guarantee that we're doing the
> right thing.
>
> Opinions?
I'm certainly in favor of this plan. My users generally find our current
defaults confusing because they differ from those of all of our other compilers
(GCC and vendor compilers), plus they give poor performance.

Thanks again,
Hal
>
> cheers,
> --renato
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Sebastian Pop via llvm-dev

2017-Mar-15 15:35 UTC

[llvm-dev] [RFC] FP Contract = fast?

On 03/15/2017 09:58 AM, Hal Finkel wrote:
> On 03/15/2017 08:27 AM, Renato Golin wrote:
>
>> Folks,
>>
>> I've been asking around about the state of FP contract, which
>> seems to be "on" but is not really behaving like it, at least not as
>> I would expect:
>>
>> int foo(float a, float b, float c) { return a*b+c; }
>>
>> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
>> (...)
>> fmul s0, s0, s1
>> fadd s0, s0, s2
>> (...)
>
> When you reverted r282259 in r282289, you also reverted the functional
> fix to make the command-line option actually work. Right now it is 
> broken. Regardless of what else we do, we should fix this (we should 
> probably recommit r282259, with the default flipped, to pick up the 
> fixes). If you were to change your source file to:
>
> #pragma STDC FP_CONTRACT ON
> int foo(float a, float b, float c) { return a*b+c; }
>
I would like to see https://reviews.llvm.org/rL282259 re-enabled,
and the few miscompares left in the aarch64 run of the test-suite
fixed by adding the FP_CONTRACT pragma in the source code.

The commit log of r282259 states the problem that it fixed:
> Clang has the default FP contraction setting of “-ffp-contract=on”, which
> doesn't really mean “on” in the conventional sense of the word, but rather
> really means “according to the per-statement effective value of the relevant
> pragma”.
Thanks,
Sebastian
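
A small sketch (not from the thread) of what "the per-statement effective value of the relevant pragma" can look like in source. ISO C allows `#pragma STDC FP_CONTRACT` at the start of a compound statement, where it applies until the end of that block. The function names are illustrative assumptions, and compilers that do not implement the pragma may ignore it or warn about it.

```c
/* Contraction permitted in this function body: the compiler may, but is
 * not required to, emit a single fused multiply-add for a * b + c. */
float mul_add_contract_allowed(float a, float b, float c) {
#pragma STDC FP_CONTRACT ON
    return a * b + c;
}

/* Contraction disallowed in this function body: a compiler that honors
 * the pragma has to round the product before performing the addition. */
float mul_add_contract_forbidden(float a, float b, float c) {
#pragma STDC FP_CONTRACT OFF
    return a * b + c;
}
```

For most inputs the two functions return the same value. They can differ only when the intermediate product is not exactly representable in float and the target actually has a fused instruction.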

Renato Golin via llvm-dev

2017-Mar-15 16:35 UTC

[llvm-dev] [RFC] FP Contract = fast?

On 15 March 2017 at 14:58, Hal Finkel <hfinkel at anl.gov> wrote:
> When you reverted r282259 in r282289, you also reverted the functional fix
> to make the command-line option actually work. Right now it is broken.
> Regardless of what else we do, we should fix this (we should probably
> recommit r282259, with the default flipped, to pick up the fixes).
We should run check-all and the test-suite with it on/fast before
flipping the switch, and make sure that the behaviour Sebastian
encountered is dealt with before this goes live, or we'd be breaking
too many test-suites and reverting and reapplying too often.

But I'm certainly in favour of the plan to make it on/fast by default.

> If you were to change your source file to:
>
> #pragma STDC FP_CONTRACT ON
> int foo(float a, float b, float c) { return a*b+c; }
I wasn't aware you needed the pragma for -ffp-contract=on. I assumed
it would enable contraction on all FP math that the standard allowed
(thus maybe needing some annotation on the operations).

I thought that the pragma was to avoid using the command line argument...

> I'm certainly in favor of this plan. My users generally find our current
> defaults confusing because they differ from those of all of our other
> compilers (GCC and vendor compilers), plus they give poor performance.
Right, I also agree: following the principle of least surprise, we
should default to "fast". Let's just make sure the infrastructure
isn't going to crumble and do it.

cheers,
--renato
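
A short sketch (not from the thread) of the distinction between "on" and "fast" being discussed. It assumes the semantics described in the documentation of later Clang releases, where "on" permits fusion only within a single expression, as ISO C does, and "fast" also permits it across statements. As Hal notes above, the command-line option did not behave this way at the time of the thread. The function names are illustrative assumptions.

```c
/* Multiply and add in ONE expression: a candidate for contraction
 * under both -ffp-contract=on and -ffp-contract=fast. */
float fused_candidate_same_expr(float a, float b, float c) {
    return a * b + c;
}

/* Multiply and add in SEPARATE statements: ISO C only permits
 * contraction within an expression, so "on" is expected to leave this
 * as a multiply followed by an add, while "fast" may still fuse it. */
float fused_candidate_across_stmts(float a, float b, float c) {
    float product = a * b;
    return product + c;
}
```

Comparing the assembly for these two functions under each setting (for example with the clang command lines quoted earlier in the thread) is one way to check which mode a given compiler build actually implements.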

Adam Nemet via llvm-dev

2017-Mar-15 17:10 UTC

[llvm-dev] [cfe-dev] [RFC] FP Contract = fast?

Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721
(-ffp-contract=fast does not work with LTO).  I am working on adding function
attributes for fp-contract=fast, which should fix this.

Also, now that we have backend optimization remarks, I am planning to report
a missed optimization when we can’t fuse FMAs because “fast” is not on.  This
will show up in the opt-viewer.  Then the user can opt in either with the
command-line switch or the new function attribute.

Adam

Hal Finkel via llvm-dev

2017-Mar-15 17:13 UTC

[llvm-dev] [cfe-dev] [RFC] FP Contract = fast?

On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
> Relevant to this discussion is
> http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does
> not work with LTO).  I am working on adding function attributes for
> fp-contract=fast which should fix this.
Great!
>
> Also, now that we have backend optimization remarks, I am planning to
> report a missed optimization when we can’t fuse FMAs because “fast” is
> not on.  This will show up in the opt-viewer.  Then the user can opt
> in either with the command-line switch or the new function attribute.
That seems useful.

Thanks again,
Hal
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
