Renato Golin via llvm-dev
2016-Oct-12 13:35 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> Correct me if I misunderstood: you would be ok changing the
> reference output to exactly match the output of "-O0 -ffp-contract=off".

No, that's not at all what I said.

Matching identical outputs to FP tests makes no sense because there's
*always* an error bar.

The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
boundaries of an average and its associated error bar.

By understanding what the *expected* output is and its associated error
range, we can accurately predict what the correct reference_output and
tolerance for each individual test will be.

Your solution 2 "works" because you're doing the matching yourself, in
the code, and for that you pay the penalty of running it twice. But
it's not easy to control the tolerance, nor is it stable for all
platforms where we don't yet run the test suite.

My original proposal, and what I'm still proposing here, is to
understand the tests and make them right, by giving them proper
references and tolerances. If the output is too large, reduce/sample
in a way that doesn't increase the error ranges too much, enough to
keep the tolerance low, so we can still catch bugs in the FP
transformations.

cheers,
--renato
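A minimal sketch of the per-test tolerance check being proposed (illustrative
only: the test-suite actually compares printed output against a
reference_output file at the harness level, and the names and values below
are placeholders, not anything from the test-suite itself):

    /* Illustrative sketch: compare sampled/reduced results against a
       per-test reference with an absolute tolerance. ABS_TOL and ref[]
       are placeholders, not the actual harness. */
    #include <math.h>
    #include <stdio.h>

    #define ABS_TOL 1e-6  /* chosen from the expected error bar of the test */

    static int check_within_tolerance(const double *got, const double *ref, int n)
    {
      for (int i = 0; i < n; i++) {
        if (fabs(got[i] - ref[i]) > ABS_TOL) {
          fprintf(stderr, "mismatch at %d: got %g, expected %g\n",
                  i, got[i], ref[i]);
          return 1;
        }
      }
      return 0;
    }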
Sebastian Pop via llvm-dev
2016-Oct-12 14:00 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 9:35 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>> Correct me if I misunderstood: you would be ok changing the
>> reference output to exactly match the output of "-O0 -ffp-contract=off".
>
> No, that's not at all what I said.

Thanks for clarifying your previous statement: I stand corrected.

> Matching identical outputs to FP tests makes no sense because there's
> *always* an error bar.

Agreed.

> The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
> boundaries of an average and its associated error bar.

Agreed.

> By understanding what's the *expected* output and its associated error
> range we can accurately predict what will be the correct
> reference_output and the tolerance for each individual test.

Agreed.

> Your solution 2 "works" because you're doing the matching yourself, in
> the code, and for that, you pay the penalty of running it twice. But
> it's not easy to control the tolerance, nor it's stable for all
> platforms where we don't yet run the test suite.
>
> My original proposal, and what I'm still proposing here, is to
> understand the tests and make them right, by giving them proper
> references and tolerances.

This goes in the same direction as what you said earlier in:

> To simplify the analysis, you can reduce the output into a single
> number, say, adding all the results up. This will generate more
> inaccuracies than comparing each value, and if that's too large an
> error, then you reduce the number of samples.
>
> For example, on cholesky, we sampled every 16th item of the array:
>
>   for (i = 0; i < n; i++) {
>     for (j = 0; j < n; j++)
>       print_element(A[i][j], j*16, printmat);
>     fputs(printmat, stderr);
>   }

Wrt "we sampled every 16th item of the array", not really in that test,
but I get your point:

  k = 0;
  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j += 16) {
      print_element(A[i][j], k, printmat);
      k += 16;
    }
    fputs(printmat, stderr);
  }

Ok, let's do this for the 5 benchmarks that do not exactly match.

Thanks,
Sebastian
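For comparison, the other reduction Renato mentioned above (collapsing the
whole output into a single number) would look roughly like this; it is only a
sketch reusing the usual polybench names (A, n, the stderr printing), not the
change that was committed:

    /* Sketch of the "reduce the output into a single number" alternative:
       print one checksum instead of every element. Illustrative only. */
    double sum = 0.0;
    for (i = 0; i < n; i++)
      for (j = 0; j < n; j++)
        sum += A[i][j];
    fprintf(stderr, "checksum: %0.6f\n", sum);

As noted in the quoted text, a single accumulated sum widens the error bar
more than per-element comparison does, which is why sampling every Nth
element is the preferred reduction here.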
Hal Finkel via llvm-dev
2016-Oct-12 14:05 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
----- Original Message -----
> From: "Renato Golin" <renato.golin at linaro.org>
> To: "Sebastian Pop" <sebpop.llvm at gmail.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "Sebastian Paul Pop" <s.pop at samsung.com>, "llvm-dev" <llvm-dev at lists.llvm.org>,
> "Matthias Braun" <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>, "nd" <nd at arm.com>, "Abe Skolnik"
> <a.skolnik at samsung.com>
> Sent: Wednesday, October 12, 2016 8:35:16 AM
> Subject: Re: [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
>
> On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com>
> wrote:
> > Correct me if I misunderstood: you would be ok changing the
> > reference output to exactly match the output of "-O0
> > -ffp-contract=off".
>
> No, that's not at all what I said.
>
> Matching identical outputs to FP tests makes no sense because there's
> *always* an error bar.

This is something we need to understand. No, there's not always an error
bar. With FMA formation and without non-IEEE-compliant optimizations
(i.e. fast-math), the optimized answer should be identical to the
non-optimized answer. If these don't match, then we should understand
why. This used to be a large problem because of fp80-related issues on
x86 processors, but even on x86 if we stick to SSE (etc.) FP
instructions, this is not an issue any more. We still do see
cross-system discrepancies sometimes because of differences in denormal
handling, but on the same system that should be consistent (aside,
perhaps, from compiler-level constant-folding issues).

 -Hal

> The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
> boundaries of an average and its associated error bar.
>
> By understanding what's the *expected* output and its associated
> error range we can accurately predict what will be the correct
> reference_output and the tolerance for each individual test.
>
> Your solution 2 "works" because you're doing the matching yourself,
> in the code, and for that, you pay the penalty of running it twice.
> But it's not easy to control the tolerance, nor it's stable for all
> platforms where we don't yet run the test suite.
>
> My original proposal, and what I'm still proposing here, is to
> understand the tests and make them right, by giving them proper
> references and tolerances. If the output is too large, reduce/sample
> in a way that doesn't increase the error ranges too much, enough to
> keep the tolerance low, so we can still catch bugs in the FP
> transformations.
>
> cheers,
> --renato

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
Renato Golin via llvm-dev
2016-Oct-12 14:16 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 15:05, Hal Finkel <hfinkel at anl.gov> wrote:
> This is something we need to understand. No, there's not always an
> error bar. With FMA formation and without non-IEEE-compliant
> optimizations (i.e. fast-math), the optimized answer should be
> identical to the non-optimized answer.

What about architectures where this is never respected, like Darwin?

In the general case, indeed, optimisation levels should not change the
IEEE representation and the tests should be deterministic. But we can't
guarantee this will always be the case.

> We still do see cross-system discrepancies sometimes because of
> differences in denormal handling, but on the same system that should
> be consistent (aside, perhaps, from compiler-level constant-folding
> issues).

But the test-suite doesn't run on a single system, nor does it have one
reference_output for each system.

cheers,
--renato
Sebastian Pop via llvm-dev
2016-Oct-12 14:19 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 9:35 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>> Correct me if I misunderstood: you would be ok changing the
>> reference output to exactly match the output of "-O0 -ffp-contract=off".
>
> No, that's not at all what I said.
>
> Matching identical outputs to FP tests makes no sense because there's
> *always* an error bar.
>
> The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
> boundaries of an average and its associated error bar.
>
> By understanding what's the *expected* output and its associated error
> range we can accurately predict what will be the correct
> reference_output and the tolerance for each individual test.
>
> Your solution 2 "works" because you're doing the matching yourself, in
> the code, and for that, you pay the penalty of running it twice. But
> it's not easy to control the tolerance, nor it's stable for all
> platforms where we don't yet run the test suite.
>
> My original proposal, and what I'm still proposing here, is to
> understand the tests and make them right, by giving them proper
> references and tolerances.

There is also the problem that I documented for 5 of the benchmarks,
where the error margin between -Ofast and "-O0 -ffp-contract=off" is
too big:

  polybench/linear-algebra/kernels/symm, FP_ABSTOLERANCE=1e1
  polybench/linear-algebra/solvers/gramschmidt, FP_ABSTOLERANCE=1e0
  polybench/medley/reg_detect, FP_ABSTOLERANCE=1e4
  polybench/stencils/adi, FP_ABSTOLERANCE=1e4

These differences come from the fact that these benchmarks contain
reductions over 1000+ values. The reductions accumulate errors, making
the end result diverge as the problem size increases.

> If the output is too large, reduce/sample
> in a way that doesn't increase the error ranges too much, enough to
> keep the tolerance low, so we can still catch bugs in the FP
> transformations.
>
> cheers,
> --renato
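A self-contained illustration of that divergence (not taken from the
benchmarks themselves): summing the same terms with two different
associations, as -Ofast allows for a reduction, typically gives different
float results, and the gap grows with the number of terms:

    /* Illustrative only: reassociating a long reduction changes the
       rounding at every step. Build at -O0 so the two loops keep the
       association as written; the exact difference is platform-dependent. */
    #include <stdio.h>

    int main(void) {
      float v[4096], seq = 0.0f, pair = 0.0f;
      for (int i = 0; i < 4096; i++)
        v[i] = 1000.0f + 1.0f / (float)(i + 1);
      for (int i = 0; i < 4096; i++)      /* strict left-to-right sum */
        seq += v[i];
      for (int i = 0; i < 4096; i += 2)   /* pairwise partial sums */
        pair += v[i] + v[i + 1];
      printf("sequential %f vs pairwise %f, diff %f\n", seq, pair, seq - pair);
      return 0;
    }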
Joerg Sonnenberger via llvm-dev
2016-Oct-12 14:37 UTC
[llvm-dev] [cfe-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 02:35:16PM +0100, Renato Golin via cfe-dev wrote:
> Matching identical outputs to FP tests makes no sense because there's
> *always* an error bar.

That is plainly wrong and a very common misconception about floating
point. A very good example of something that is *required* to give the
very same result all the time is strtod. If a compiler change results in
different output, it is a bug. It is surprisingly difficult to ensure
that, but yes, there are floating point routines where absolutely no
change must be introduced.

This doesn't mean that the rest of the proposal is wrong -- FMA
formation is, after all, valid inside expressions, so variance is
possible.

Joerg
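The strtod case is easy to check, since on an IEEE-754 (Annex F)
implementation the result is the correctly rounded nearest double and is
fully determined by the input string; a small sketch:

    /* strtod's result is fully specified on IEEE-754 / Annex F systems:
       the nearest double to the decimal input. Any compiler-induced change
       in this value is a bug, not FP noise. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
      double d = strtod("0.1", NULL);
      /* Always the same double; with the common %a normalization it
         prints as 0x1.999999999999ap-4. */
      printf("%a\n", d);
      return 0;
    }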
Renato Golin via llvm-dev
2016-Oct-12 14:41 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 15:19, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> polybench/linear-algebra/kernels/symm, FP_ABSTOLERANCE=1e1
> polybench/linear-algebra/solvers/gramschmidt, FP_ABSTOLERANCE=1e0
> polybench/medley/reg_detect, FP_ABSTOLERANCE=1e4
> polybench/stencils/adi, FP_ABSTOLERANCE=1e4

It would be interesting to understand what's at play here. 1e4 may be
very large if the individual results are small, but acceptable if
they're all big anyway.

We don't want to have a large tolerance just because the reduced value
is large, so sampling may be a better strategy (it normally is).

cheers,
--renato
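One way to picture the tolerance-vs-magnitude concern (a sketch of the idea
only, not a statement about what the test-suite harness exposes): a relative
bound scales with the size of the values being compared.

    /* Sketch: a relative check keeps the bound meaningful whether the
       results are of order 1 or of order 1e9 (1e4 absolute error on a
       value of 1e9 is only ~1e-5 relative). Illustrative only. */
    #include <math.h>

    static int close_enough(double got, double ref, double rel_tol)
    {
      return fabs(got - ref) <= rel_tol * fmax(fabs(got), fabs(ref));
    }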
Renato Golin via llvm-dev
2016-Oct-12 15:10 UTC
[llvm-dev] [cfe-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 15:37, Joerg Sonnenberger via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> That is plainly wrong and a very common misconception about floating
> point. A very good example of something that is *required* to give the
> very same result all the time is strtod. If a compiler change results in
> different output, it is a bug. It is surprisingly difficult to ensure
> that, but yes, there are floating point routines where absolutely no
> change must be introduced.

That was a general remark, not an absolute one, as both you and Hal
interpreted it. :)

But I'll repeat my response to Hal: not all hardware / systems are the
same. For example, Darwin has -ffast-math always enabled, so O3 will
produce different results than O0.

I have added a lot of extra logic to the tests to minimise the
uncertainties, for example not relying on the platform's libraries for
printing, trigonometric or RNG functions, sampling results, etc., in
order to reduce the variability *across* platforms.

The reference_output is per test for all platforms, not per platform.

cheers,
--renato
Mehdi Amini via llvm-dev
2016-Oct-12 18:29 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
> On Oct 12, 2016, at 7:05 AM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> ----- Original Message -----
>> From: "Renato Golin" <renato.golin at linaro.org>
>> To: "Sebastian Pop" <sebpop.llvm at gmail.com>
>> Cc: "Hal Finkel" <hfinkel at anl.gov>, "Sebastian Paul Pop" <s.pop at samsung.com>, "llvm-dev" <llvm-dev at lists.llvm.org>,
>> "Matthias Braun" <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>, "nd" <nd at arm.com>, "Abe Skolnik"
>> <a.skolnik at samsung.com>
>> Sent: Wednesday, October 12, 2016 8:35:16 AM
>> Subject: Re: [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
>>
>> On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>>> Correct me if I misunderstood: you would be ok changing the
>>> reference output to exactly match the output of "-O0
>>> -ffp-contract=off".
>>
>> No, that's not at all what I said.
>>
>> Matching identical outputs to FP tests makes no sense because there's
>> *always* an error bar.
>
> This is something we need to understand. No, there's not always an
> error bar. With FMA formation and without non-IEEE-compliant
> optimizations (i.e. fast-math), the optimized answer should be
> identical to the non-optimized answer.

Can you clarify: in my mind the F in FMA is for "fused", i.e. no
intermediate truncation, i.e. not the same numerical result. But you
imply the opposite above?

— Mehdi

> If these don't match, then we should understand why. This used to be a
> large problem because of fp80-related issues on x86 processors, but
> even on x86 if we stick to SSE (etc.) FP instructions, this is not an
> issue any more. We still do see cross-system discrepancies sometimes
> because of differences in denormal handling, but on the same system
> that should be consistent (aside, perhaps, from compiler-level
> constant-folding issues).
>
> -Hal
>
>> The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
>> boundaries of an average and its associated error bar.
>>
>> By understanding what's the *expected* output and its associated
>> error range we can accurately predict what will be the correct
>> reference_output and the tolerance for each individual test.
>>
>> Your solution 2 "works" because you're doing the matching yourself,
>> in the code, and for that, you pay the penalty of running it twice.
>> But it's not easy to control the tolerance, nor it's stable for all
>> platforms where we don't yet run the test suite.
>>
>> My original proposal, and what I'm still proposing here, is to
>> understand the tests and make them right, by giving them proper
>> references and tolerances. If the output is too large, reduce/sample
>> in a way that doesn't increase the error ranges too much, enough to
>> keep the tolerance low, so we can still catch bugs in the FP
>> transformations.
>>
>> cheers,
>> --renato
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
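A small example of the distinction Mehdi raises: the fused form rounds only
once, so it can legitimately differ from the separate multiply-and-add. This
is only a sketch; build it with -ffp-contract=off (and link -lm) so the
compiler does not contract the plain expression itself:

    /* fma() keeps the exact product a*b before the add and rounds once;
       the separate form rounds the product first. Here the exact product
       1 - 2^-104 rounds to 1.0 in double, so the two forms differ. */
    #include <math.h>
    #include <stdio.h>

    int main(void) {
      double a = 1.0 + 0x1p-52;
      double b = 1.0 - 0x1p-52;
      double c = -1.0;
      printf("mul+add: %a\n", a * b + c);     /* expected 0.0       */
      printf("fma:     %a\n", fma(a, b, c));  /* expected -0x1p-104 */
      return 0;
    }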