thr3ads.net - llvm dev - [llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on" [Oct 2016]

Sebastian Pop via llvm-dev

2016-Oct-12 04:20 UTC

[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"

On Tue, Oct 11, 2016 at 10:39 PM, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> On Tue, Oct 11, 2016 at 10:20 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> ----- Original Message -----
>>> From: "Renato Golin" <renato.golin at linaro.org>
>>> To: "Sebastian Pop" <sebpop.llvm at gmail.com>
>>> Cc: "Hal Finkel" <hfinkel at anl.gov>, "Sebastian Paul Pop" <s.pop at samsung.com>, "llvm-dev" <llvm-dev at lists.llvm.org>,
>>> "Matthias Braun" <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>, "nd" <nd at arm.com>, "Abe Skolnik"
>>> <a.skolnik at samsung.com>
>>> Sent: Tuesday, October 11, 2016 6:33:43 AM
>>> Subject: Re: [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
>>>
>>> On 11 October 2016 at 12:15, Sebastian Pop <sebpop.llvm at gmail.com>
>>> wrote:
>>> >>  1. Only test the non-FP-contracted output
>>> >
>>> > Yes, this is what I'm doing.
>>>
>>> If the whole test is about testing multiplications, what's the point
>>> of this?
>>>
>>>
>>> >>  2. Run the FP-contracted test only for a very small size (so that
>>> >>  we'll stay within some reasonable tolerance of the reference
>>> >>  output)
>>> >>  3. Change the matrix to something that will make the test
>>> >>  numerically stable (it does not look like the matrix itself
>>> >>  matters to the performance; where do the values come from?).
>>>
>>> 3 is more sound, 2 may be more practical.
>>>
>>>
>>> > -      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i*j) / ni;
>>> > -      B[i][j] = ((DATA_TYPE) i*j) / ni;
>>> > +      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i-j) / ni;
>>> > +      B[i][j] = ((DATA_TYPE) i-j) / ni;
>>> >      }
>>> >    for (i = 0; i < nj; i++)
>>> >      for (j = 0; j < nj; j++)
>>> > -      A[i][j] = ((DATA_TYPE) i*j) / ni;
>>> > +      A[i][j] = ((DATA_TYPE) i-j) / ni;
>>>
>>> Changing from multiplication to subtraction changes completely the
>>> nature of the test and goes towards "return 0;", ie, fiddling with
>>> the
>>> code so that the compiler "behaves" better. This is *not* a solution.

It is not uncommon to see in several polybench tests adjustments to
the initial values:

  /*
  LLVM: This change ensures we do not calculate nan values, which are
        formatted differently on different platforms and which may also
        be optimized unexpectedly.
  Original code:
  for (i = 0; i < ni; i++)
    for (j = 0; j < nj; j++) {
      A[i][j] = ((DATA_TYPE) i*j) / ni;
      Q[i][j] = ((DATA_TYPE) i*(j+1)) / nj;
    }
  for (i = 0; i < nj; i++)
    for (j = 0; j < nj; j++)
      R[i][j] = ((DATA_TYPE) i*(j+2)) / nj;
  */
  for (i = 0; i < ni; i++)
    for (j = 0; j < nj; j++) {
      A[i][j] = ((DATA_TYPE) i*j+ni) / ni;
      Q[i][j] = ((DATA_TYPE) i*(j+1)+nj) / nj;
    }
  for (i = 0; i < nj; i++)
    for (j = 0; j < nj; j++)
      R[i][j] = ((DATA_TYPE) i*(j+2)+nj) / nj;

git grepping gives us:

linear-algebra/kernels/cholesky/cholesky.c:  LLVM: This change ensures we do not calculate nan values, which are
linear-algebra/kernels/cholesky/cholesky.c:      LLVM: This change ensures we do not calculate nan values, which are
linear-algebra/kernels/cholesky/cholesky.c:      LLVM: This change ensures we do not calculate nan values, which are
linear-algebra/kernels/trisolv/trisolv.c:  LLVM: This change ensures we do not calculate nan values, which are
linear-algebra/solvers/gramschmidt/gramschmidt.c:  LLVM: This change ensures we do not calculate nan values, which are
linear-algebra/solvers/lu/lu.c:  LLVM: This change ensures we do not calculate nan values, which are

Sebastian Pop via llvm-dev

2016-Oct-12 04:35 UTC

[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"

polybench/linear-algebra/solvers/gramschmidt/ exposes the same problems as symm.
It does not match the reference output at -O0 -ffp-contract=off,
and it only passes all elements comparisons for FP_ABSTOLERANCE=1 for
"-Ofast" vs. "-O0 -ffp-contract=off".

Renato Golin via llvm-dev

2016-Oct-12 08:38 UTC

[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"

On 12 October 2016 at 05:20, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>   /*
>   LLVM: This change ensures we do not calculate nan values, which are
>         formatted differently on different platforms and which may also
>         be optimized unexpectedly.

This comment is there since it was originally introduced by Tobias.
We'll have to ask him what changes were done to understand how this is
relevant to your current proposal.

cheers,
--renato

Renato Golin via llvm-dev

2016-Oct-12 09:01 UTC

[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"

On 12 October 2016 at 05:35, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> polybench/linear-algebra/solvers/gramschmidt/ exposes the same problems as symm.
> It does not match the reference output at -O0 -ffp-contract=off,
> and it only passes all elements comparisons for FP_ABSTOLERANCE=1 for
> "-Ofast" vs. "-O0 -ffp-contract=off".

I think we're going about this in a completely wrong way.

The current reference output is specific to fp-contract=off, and
making it work for fp-contract=on makes no sense at all.

For all we know, fp-contract=on generates *more accurate* results, not
less. But it may also have less predictable results *across* different
targets, thus the need for a tolerance.

FP_TOLERANCE is *not* about making the new results match an old
reference, but about showing the *real* uncertainties of FP
transformation on *different* targets.

So, if you want to fix this test for good, here are the steps you need to take:

1. Check out the test-suite on different platforms: x86_64, ARM,
AArch64, PPC, MIPS. The more the merrier.
2. Enable fp-contract=on, run the tests on all platforms, record the
outputs, ignore the differences.
3. Collate each platform's output for each test and see how different they are.

To make it easier to compare, in the past, I've used this trick:

1. Run on one platform, e.g. x86_64, ignoring the reference
2. Copy the output of those tests back to the reference_output
3. Run on a different platform, tweaking the tolerance until it "passes"
4. Run on yet another platform, making sure you don't need to tweak
the tolerance yet again

If the tolerance is "too high" for that test, we can further discuss
how to change it to make it better. If not, you found a solution.
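
The "tweak the tolerance until it passes" step can be sketched as a direct measurement, assuming both outputs have already been parsed into arrays of doubles. The function names are hypothetical and this is not the test-suite's own comparison tool.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Largest absolute element-wise difference between a run and the
   reference: the smallest absolute tolerance under which the run
   would "pass". NaN differences compare false and are not detected. */
static double min_abs_tolerance(const double *out, const double *ref, size_t n)
{
  assert(out != NULL && ref != NULL);
  double worst = 0.0;
  for (size_t i = 0; i < n; i++) {
    double d = fabs(out[i] - ref[i]);
    if (d > worst)
      worst = d;
  }
  return worst;
}

/* Returns 1 if every element is within tol of the reference, else 0. */
static int passes(const double *out, const double *ref, size_t n, double tol)
{
  return min_abs_tolerance(out, ref, n) <= tol;
}
```

Running `min_abs_tolerance` for each additional platform against the copied reference shows at once whether the tolerance chosen for the second platform still holds for the third.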

If you want to make it even better, do some analysis on the
distribution of the results, per test, and pick the average as the
reference output and one or two standard deviations as the tolerance.
This should pass on most architectures.

To simplify the analysis, you can reduce the output into a single
number, say, adding all the results up. This will generate more
inaccuracies than comparing each value, and if that's too large an
error, then you reduce the number of samples.
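
As a rough sketch of that reduction, assuming the result is an n x n matrix of doubles stored row-major (the function names and the `stride` parameter are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Collapse an n x n result matrix into one number by summing it. */
static double checksum(const double *a, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n * n; i++)
    sum += a[i];
  return sum;
}

/* The same reduction over every stride-th element only, which
   reduces the number of samples that accumulate rounding error. */
static double sampled_checksum(const double *a, size_t n, size_t stride)
{
  assert(stride > 0);
  double sum = 0.0;
  for (size_t i = 0; i < n * n; i += stride)
    sum += a[i];
  return sum;
}
```

A single number per test and per platform is easy to tabulate, at the cost of letting element-wise differences partly cancel or pile up in the sum.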

For example, on cholesky, we sampled every 16th item of the array:

  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++)
      print_element(A[i][j], j*16, printmat);
    fputs(printmat, stderr);
  }

using "print_element" because calling printf sucks.

These modifications are ok, because they neither change the tests nor
hide them from compiler changes.

cheers,
--renato

Sebastian Pop via llvm-dev

2016-Oct-12 13:42 UTC

[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"

On Wed, Oct 12, 2016 at 4:38 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 12 October 2016 at 05:20, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>>   /*
>>   LLVM: This change ensures we do not calculate nan values, which are
>>         formatted differently on different platforms and which may also
>>         be optimized unexpectedly.
>
> This comment is there since it was originally introduced by Tobias.
> We'll have to ask him what changes were done to understand how this is
> relevant to your current proposal.
>
The code before is in the comments: we know exactly what Tobi has changed.
Most of these changes are in the initialization of the arrays, though
there are also changes to the computational kernel.

Polybench was designed to stress loop optimizations in the polyhedral model.
The intent of adding Polybench to the test-suite was to stress loop
optimizations in Polly.
Those initial changes by Tobi reflect this intent: neither the FP
computation, nor the initial values matter much.
I would appreciate it if Tobi could share his point of view on Polybench:
I added him to the CC list.

We are currently trying to modify Polybench to test something
different than what it was designed for.
This goes along with my earlier comment about the SPEC benchmarks:
there are benchmarks that have been designed to test FP computations.
If we need more FP benchmarks in the test-suite, we should try to
identify and add benchmarks in which "FP expert" people put thought into
correctly designing the tests to check FP computations.

Sebastian
