Renato Golin via llvm-dev
2016-Oct-12 13:35 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> Correct me if I misunderstood: you would be ok changing the
> reference output to exactly match the output of "-O0 -ffp-contract=off".

No, that's not at all what I said.

Matching identical outputs to FP tests makes no sense because there's
*always* an error bar.

The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
boundaries of an average and its associated error bar.

By understanding what the *expected* output is and its associated error
range, we can accurately predict what the correct reference_output and
tolerance for each individual test will be.

Your solution 2 "works" because you're doing the matching yourself, in
the code, and for that you pay the penalty of running it twice. But
it's not easy to control the tolerance, nor is it stable for all
platforms where we don't yet run the test suite.

My original proposal, and what I'm still proposing here, is to
understand the tests and make them right, by giving them proper
references and tolerances. If the output is too large, reduce/sample
in a way that doesn't increase the error ranges too much, enough to
keep the tolerance low, so we can still catch bugs in the FP
transformations.

cheers,
--renato
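A minimal sketch of the per-test tolerance check being proposed (illustrative
only: the test-suite actually compares printed output against a
reference_output file at the harness level, and the names and values below
are placeholders, not anything from the test-suite itself):

    /* Illustrative sketch: compare sampled/reduced results against a
       per-test reference with an absolute tolerance. ABS_TOL and ref[]
       are placeholders, not the actual harness. */
    #include <math.h>
    #include <stdio.h>

    #define ABS_TOL 1e-6  /* chosen from the expected error bar of the test */

    static int check_within_tolerance(const double *got, const double *ref, int n)
    {
      for (int i = 0; i < n; i++) {
        if (fabs(got[i] - ref[i]) > ABS_TOL) {
          fprintf(stderr, "mismatch at %d: got %g, expected %g\n",
                  i, got[i], ref[i]);
          return 1;
        }
      }
      return 0;
    }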
Sebastian Pop via llvm-dev
2016-Oct-12 14:00 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 9:35 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>> Correct me if I misunderstood: you would be ok changing the
>> reference output to exactly match the output of "-O0 -ffp-contract=off".
>
> No, that's not at all what I said.

Thanks for clarifying your previous statement: I stand corrected.

> Matching identical outputs to FP tests makes no sense because there's
> *always* an error bar.

Agreed.

> The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
> boundaries of an average and its associated error bar.

Agreed.

> By understanding what's the *expected* output and its associated error
> range we can accurately predict what will be the correct
> reference_output and the tolerance for each individual test.

Agreed.

> Your solution 2 "works" because you're doing the matching yourself, in
> the code, and for that, you pay the penalty of running it twice. But
> it's not easy to control the tolerance, nor it's stable for all
> platforms where we don't yet run the test suite.
>
> My original proposal, and what I'm still proposing here, is to
> understand the tests and make them right, by giving them proper
> references and tolerances.

This goes in the same direction as what you said earlier in:

> To simplify the analysis, you can reduce the output into a single
> number, say, adding all the results up. This will generate more
> inaccuracies than comparing each value, and if that's too large an
> error, then you reduce the number of samples.
>
> For example, on cholesky, we sampled every 16th item of the array:
>
>   for (i = 0; i < n; i++) {
>     for (j = 0; j < n; j++)
>       print_element(A[i][j], j*16, printmat);
>     fputs(printmat, stderr);
>   }

Wrt "we sampled every 16th item of the array", not really in that test,
but I get your point:

  k = 0;
  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j += 16) {
      print_element(A[i][j], k, printmat);
      k += 16;
    }
    fputs(printmat, stderr);
  }

Ok, let's do this for the 5 benchmarks that do not exactly match.

Thanks,
Sebastian
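For comparison, the other reduction Renato mentioned above (collapsing the
whole output into a single number) would look roughly like this; it is only a
sketch reusing the usual polybench names (A, n, the stderr printing), not the
change that was committed:

    /* Sketch of the "reduce the output into a single number" alternative:
       print one checksum instead of every element. Illustrative only. */
    double sum = 0.0;
    for (i = 0; i < n; i++)
      for (j = 0; j < n; j++)
        sum += A[i][j];
    fprintf(stderr, "checksum: %0.6f\n", sum);

As noted in the quoted text, a single accumulated sum widens the error bar
more than per-element comparison does, which is why sampling every Nth
element is the preferred reduction here.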
Hal Finkel via llvm-dev
2016-Oct-12 14:05 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
----- Original Message -----
> From: "Renato Golin" <renato.golin at linaro.org>
> To: "Sebastian Pop" <sebpop.llvm at gmail.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "Sebastian Paul Pop" <s.pop at samsung.com>, "llvm-dev" <llvm-dev at lists.llvm.org>,
> "Matthias Braun" <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>, "nd" <nd at arm.com>, "Abe Skolnik"
> <a.skolnik at samsung.com>
> Sent: Wednesday, October 12, 2016 8:35:16 AM
> Subject: Re: [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
>
> On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com>
> wrote:
> > Correct me if I misunderstood: you would be ok changing the
> > reference output to exactly match the output of "-O0
> > -ffp-contract=off".
>
> No, that's not at all what I said.
>
> Matching identical outputs to FP tests makes no sense because there's
> *always* an error bar.

This is something we need to understand. No, there's not always an error
bar. With FMA formation and without non-IEEE-compliant optimizations
(i.e. fast-math), the optimized answer should be identical to the
non-optimized answer. If these don't match, then we should understand
why. This used to be a large problem because of fp80-related issues on
x86 processors, but even on x86 if we stick to SSE (etc.) FP
instructions, this is not an issue any more. We still do see
cross-system discrepancies sometimes because of differences in denormal
handling, but on the same system that should be consistent (aside,
perhaps, from compiler-level constant-folding issues).

 -Hal

> The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
> boundaries of an average and its associated error bar.
>
> By understanding what's the *expected* output and its associated
> error range we can accurately predict what will be the correct
> reference_output and the tolerance for each individual test.
>
> Your solution 2 "works" because you're doing the matching yourself,
> in the code, and for that, you pay the penalty of running it twice.
> But it's not easy to control the tolerance, nor it's stable for all
> platforms where we don't yet run the test suite.
>
> My original proposal, and what I'm still proposing here, is to
> understand the tests and make them right, by giving them proper
> references and tolerances. If the output is too large, reduce/sample
> in a way that doesn't increase the error ranges too much, enough to
> keep the tolerance low, so we can still catch bugs in the FP
> transformations.
>
> cheers,
> --renato

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
Renato Golin via llvm-dev
2016-Oct-12 14:16 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 15:05, Hal Finkel <hfinkel at anl.gov> wrote:
> This is something we need to understand. No, there's not always an
> error bar. With FMA formation and without non-IEEE-compliant
> optimizations (i.e. fast-math), the optimized answer should be
> identical to the non-optimized answer.

What about architectures where this is never respected, like Darwin?

In the general case, indeed, optimisation levels should not change the
IEEE representation and the tests should be deterministic. But we can't
guarantee this will always be the case.

> We still do see cross-system discrepancies sometimes because of
> differences in denormal handling, but on the same system that should
> be consistent (aside, perhaps, from compiler-level constant-folding
> issues).

But the test-suite doesn't run on a single system, nor does it have one
reference_output for each system.

cheers,
--renato
Sebastian Pop via llvm-dev
2016-Oct-12 14:19 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 9:35 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>> Correct me if I misunderstood: you would be ok changing the
>> reference output to exactly match the output of "-O0 -ffp-contract=off".
>
> No, that's not at all what I said.
>
> Matching identical outputs to FP tests makes no sense because there's
> *always* an error bar.
>
> The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
> boundaries of an average and its associated error bar.
>
> By understanding what's the *expected* output and its associated error
> range we can accurately predict what will be the correct
> reference_output and the tolerance for each individual test.
>
> Your solution 2 "works" because you're doing the matching yourself, in
> the code, and for that, you pay the penalty of running it twice. But
> it's not easy to control the tolerance, nor it's stable for all
> platforms where we don't yet run the test suite.
>
> My original proposal, and what I'm still proposing here, is to
> understand the tests and make them right, by giving them proper
> references and tolerances.

There is also the problem that I documented for 5 of the benchmarks,
where the error margin between -Ofast and "-O0 -ffp-contract=off" is
too big:

  polybench/linear-algebra/kernels/symm, FP_ABSTOLERANCE=1e1
  polybench/linear-algebra/solvers/gramschmidt, FP_ABSTOLERANCE=1e0
  polybench/medley/reg_detect, FP_ABSTOLERANCE=1e4
  polybench/stencils/adi, FP_ABSTOLERANCE=1e4

These differences come from the fact that these benchmarks contain
reductions over 1000+ values. The reductions accumulate errors, making
the end result diverge as the problem size increases.

> If the output is too large, reduce/sample
> in a way that doesn't increase the error ranges too much, enough to
> keep the tolerance low, so we can still catch bugs in the FP
> transformations.
>
> cheers,
> --renato
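A self-contained illustration of that divergence (not taken from the
benchmarks themselves): summing the same terms with two different
associations, as -Ofast allows for a reduction, typically gives different
float results, and the gap grows with the number of terms:

    /* Illustrative only: reassociating a long reduction changes the
       rounding at every step. Build at -O0 so the two loops keep the
       association as written; the exact difference is platform-dependent. */
    #include <stdio.h>

    int main(void) {
      float v[4096], seq = 0.0f, pair = 0.0f;
      for (int i = 0; i < 4096; i++)
        v[i] = 1000.0f + 1.0f / (float)(i + 1);
      for (int i = 0; i < 4096; i++)      /* strict left-to-right sum */
        seq += v[i];
      for (int i = 0; i < 4096; i += 2)   /* pairwise partial sums */
        pair += v[i] + v[i + 1];
      printf("sequential %f vs pairwise %f, diff %f\n", seq, pair, seq - pair);
      return 0;
    }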
Joerg Sonnenberger via llvm-dev
2016-Oct-12 14:37 UTC
[llvm-dev] [cfe-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 02:35:16PM +0100, Renato Golin via cfe-dev wrote:
> Matching identical outputs to FP tests makes no sense because there's
> *always* an error bar.

That is plainly wrong and a very common misconception about floating
point. A very good example of something that is *required* to give the
very same result all the time is strtod. If a compiler change results in
different output, it is a bug. It is surprisingly difficult to ensure
that, but yes, there are floating point routines where absolutely no
change must be introduced.

This doesn't mean that the rest of the proposal is wrong -- FMA
formation is, after all, valid inside expressions, so variance is
possible.

Joerg
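The strtod case is easy to check, since on an IEEE-754 (Annex F)
implementation the result is the correctly rounded nearest double and is
fully determined by the input string; a small sketch:

    /* strtod's result is fully specified on IEEE-754 / Annex F systems:
       the nearest double to the decimal input. Any compiler-induced change
       in this value is a bug, not FP noise. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
      double d = strtod("0.1", NULL);
      /* Always the same double; with the common %a normalization it
         prints as 0x1.999999999999ap-4. */
      printf("%a\n", d);
      return 0;
    }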
Renato Golin via llvm-dev
2016-Oct-12 14:41 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 15:19, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> polybench/linear-algebra/kernels/symm, FP_ABSTOLERANCE=1e1
> polybench/linear-algebra/solvers/gramschmidt, FP_ABSTOLERANCE=1e0
> polybench/medley/reg_detect, FP_ABSTOLERANCE=1e4
> polybench/stencils/adi, FP_ABSTOLERANCE=1e4

It would be interesting to understand what's at play here. 1e4 may be
very large if the individual results are small, but acceptable if
they're all big anyway.

We don't want to have a large tolerance just because the reduced value
is large, so sampling may be a better strategy (it normally is).

cheers,
--renato
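One way to picture the tolerance-vs-magnitude concern (a sketch of the idea
only, not a statement about what the test-suite harness exposes): a relative
bound scales with the size of the values being compared.

    /* Sketch: a relative check keeps the bound meaningful whether the
       results are of order 1 or of order 1e9 (1e4 absolute error on a
       value of 1e9 is only ~1e-5 relative). Illustrative only. */
    #include <math.h>

    static int close_enough(double got, double ref, double rel_tol)
    {
      return fabs(got - ref) <= rel_tol * fmax(fabs(got), fabs(ref));
    }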
Renato Golin via llvm-dev
2016-Oct-12 15:10 UTC
[llvm-dev] [cfe-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 15:37, Joerg Sonnenberger via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> That is plainly wrong and a very common misconception about floating
> point. A very good example of something that is *required* to give the
> very same result all the time is strtod. If a compiler change results in
> different output, it is a bug. It is surprisingly difficult to ensure
> that, but yes, there are floating point routines where absolutely no
> change must be introduced.

That was a general remark, not an absolute one, as both you and Hal
interpreted it. :)

But I'll repeat my response to Hal: not all hardware / systems are the
same. For example, Darwin has -ffast-math always enabled, so O3 will
produce different results than O0.

I have added a lot of extra logic to the tests to minimise the
uncertainties, for example not relying on the platform's libraries for
printing, trigonometric or RNG functions, sampling results, etc., in
order to reduce the variability *across* platforms.

The reference_output is per test for all platforms, not per platform.

cheers,
--renato
Mehdi Amini via llvm-dev
2016-Oct-12 18:29 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
> On Oct 12, 2016, at 7:05 AM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> ----- Original Message -----
>> From: "Renato Golin" <renato.golin at linaro.org>
>> To: "Sebastian Pop" <sebpop.llvm at gmail.com>
>> Cc: "Hal Finkel" <hfinkel at anl.gov>, "Sebastian Paul Pop" <s.pop at samsung.com>, "llvm-dev" <llvm-dev at lists.llvm.org>,
>> "Matthias Braun" <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>, "nd" <nd at arm.com>, "Abe Skolnik"
>> <a.skolnik at samsung.com>
>> Sent: Wednesday, October 12, 2016 8:35:16 AM
>> Subject: Re: [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
>>
>> On 12 October 2016 at 14:26, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>>> Correct me if I misunderstood: you would be ok changing the
>>> reference output to exactly match the output of "-O0
>>> -ffp-contract=off".
>>
>> No, that's not at all what I said.
>>
>> Matching identical outputs to FP tests makes no sense because there's
>> *always* an error bar.
>
> This is something we need to understand. No, there's not always an
> error bar. With FMA formation and without non-IEEE-compliant
> optimizations (i.e. fast-math), the optimized answer should be
> identical to the non-optimized answer.

Can you clarify: in my mind the F in FMA is for "fused", i.e. no
intermediate truncation, i.e. not the same numerical result. But you
imply the opposite above?

— Mehdi

> If these don't match, then we should understand why. This used to be a
> large problem because of fp80-related issues on x86 processors, but
> even on x86 if we stick to SSE (etc.) FP instructions, this is not an
> issue any more. We still do see cross-system discrepancies sometimes
> because of differences in denormal handling, but on the same system
> that should be consistent (aside, perhaps, from compiler-level
> constant-folding issues).
>
> -Hal
>
>> The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within the
>> boundaries of an average and its associated error bar.
>>
>> By understanding what's the *expected* output and its associated
>> error range we can accurately predict what will be the correct
>> reference_output and the tolerance for each individual test.
>>
>> Your solution 2 "works" because you're doing the matching yourself,
>> in the code, and for that, you pay the penalty of running it twice.
>> But it's not easy to control the tolerance, nor it's stable for all
>> platforms where we don't yet run the test suite.
>>
>> My original proposal, and what I'm still proposing here, is to
>> understand the tests and make them right, by giving them proper
>> references and tolerances. If the output is too large, reduce/sample
>> in a way that doesn't increase the error ranges too much, enough to
>> keep the tolerance low, so we can still catch bugs in the FP
>> transformations.
>>
>> cheers,
>> --renato
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
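A small example of the distinction Mehdi raises: the fused form rounds only
once, so it can legitimately differ from the separate multiply-and-add. This
is only a sketch; build it with -ffp-contract=off (and link -lm) so the
compiler does not contract the plain expression itself:

    /* fma() keeps the exact product a*b before the add and rounds once;
       the separate form rounds the product first. Here the exact product
       1 - 2^-104 rounds to 1.0 in double, so the two forms differ. */
    #include <math.h>
    #include <stdio.h>

    int main(void) {
      double a = 1.0 + 0x1p-52;
      double b = 1.0 - 0x1p-52;
      double c = -1.0;
      printf("mul+add: %a\n", a * b + c);     /* expected 0.0       */
      printf("fma:     %a\n", fma(a, b, c));  /* expected -0x1p-104 */
      return 0;
    }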