thr3ads.net - llvm dev - [llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on" [Oct 2016]

Sebastian Pop via llvm-dev

2016-Oct-11 11:15 UTC

[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"

On Mon, Oct 10, 2016 at 5:02 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>> From: "Sebastian Pop" <sebpop.llvm at gmail.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "Sebastian Paul Pop" <s.pop at samsung.com>, "llvm-dev" <llvm-dev at lists.llvm.org>,
>> "Matthias Braun" <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>,
>> "nd" <nd at arm.com>, "Abe Skolnik" <a.skolnik at samsung.com>,
>> "Renato Golin" <renato.golin at linaro.org>
>> Sent: Monday, October 10, 2016 9:10:01 AM
>> Subject: [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
>>
>> Hi,
>>
>> I would need some help to fix polybench/symm:
>>
>> void kernel_symm(int ni, int nj,
>> DATA_TYPE alpha,
>> DATA_TYPE beta,
>> DATA_TYPE POLYBENCH_2D(C,NI,NJ,ni,nj),
>> DATA_TYPE POLYBENCH_2D(A,NJ,NJ,nj,nj),
>> DATA_TYPE POLYBENCH_2D(B,NI,NJ,ni,nj))
>> {
>>   int i, j, k;
>>   DATA_TYPE acc;
>>
>>   /*  C := alpha*A*B + beta*C, A is symmetric */
>>   for (i = 0; i < _PB_NI; i++)
>>     for (j = 0; j < _PB_NJ; j++)
>>       {
>>         acc = 0;
>>         for (k = 0; k < j - 1; k++)
>>           {
>>              C[k][j] += alpha * A[k][i] * B[i][j];
>>              acc += B[k][j] * A[k][i];
>>           }
>>         C[i][j] = beta * C[i][j] + alpha * A[i][i] * B[i][j] + alpha
>>         * acc;
>>       }
>> }
>>
>> Compiling this kernel with __attribute__((optnone)) and outputting the
>> contents of the C[][] array does not match the reference output.
>
> Why is this? What compiler are you using? Are we not using IEEE FP @ -O0
> (e.g. using x87 floating point)? IEEE FP, without FMA, should be completely
> deterministic. Sounds like a bug.

This is with clang top of tree, on x86_64-linux.
I created https://reviews.llvm.org/D25465 with the changes that I have
to the symm benchmark.
>
>> Furthermore, compiling this kernel at -Ofast and comparing against
>> -O0
>> only passes for FP_ABSTOLERANCE=10.
>> All the 10 other polybench tests that I have transformed to check FP
>> are passing at FP_ABSTOLERANCE=1e-5 (and most likely they could pass
>> at an even more reduced tolerance.)
>>
>> The symm benchmark seems to accumulate all the errors as it is a big
>> reduction from the first elements of the C[][] array into the last
>> elements.
>> I'm not sure we can rely on this benchmark to check FP correctness.
>>
>> One option is to completely specify which optimization flags have
>> been
>> used to compute the reference output and only use that to compile
>> this
>> benchmark.
>>
>> Please share your ideas on how to deal with this particular test.
>
> If the test is not numerically stable, we can:
>
>  1. Only test the non-FP-contracted output

Yes, this is what I'm doing.

>  2. Run the FP-contracted test only for a very small size (so that
>     we'll stay within some reasonable tolerance of the reference output)
>  3. Change the matrix to something that will make the test numerically
>     stable (it does not look like the matrix itself matters to the
>     performance; where do the values come from?).
>
The values may be very large towards the end of the C array.
The test now passes with FP_ABSTOLERANCE=1e-5 when lowering the values
in the input arrays with this patch:

diff --git a/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.c b/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.c
index 0a1bdf3..7fc3cb1 100644
--- a/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.c
+++ b/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.c
@@ -35,12 +35,12 @@ void init_array(int ni, int nj,
   *beta = 2123;
   for (i = 0; i < ni; i++)
     for (j = 0; j < nj; j++) {
-      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i*j) / ni;
-      B[i][j] = ((DATA_TYPE) i*j) / ni;
+      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i-j) / ni;
+      B[i][j] = ((DATA_TYPE) i-j) / ni;
     }
   for (i = 0; i < nj; i++)
     for (j = 0; j < nj; j++)
-      A[i][j] = ((DATA_TYPE) i*j) / ni;
+      A[i][j] = ((DATA_TYPE) i-j) / ni;
 }

Of course we need to update the reference output hash if we decide to
use this patch.

Sebastian
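
To illustrate why the patch above lowers the values flowing into the kernel, here is a minimal sketch that measures the largest magnitude each initialization formula produces. The helper name max_abs_init is hypothetical and is not part of PolyBench; it assumes a square n x n array and DATA_TYPE being double.

```c
#include <assert.h>

/* Hypothetical helper (not in PolyBench): largest |value| produced by an
   n x n initialization of the form (i * j) / n or (i - j) / n. */
double max_abs_init(int n, int use_product)
{
  double max = 0.0;

  assert(n > 0);
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++) {
      /* As in symm.c, the cast binds to i only, so the rest of the
         expression is evaluated in double. */
      double v = use_product ? ((double) i * j) / n
                             : ((double) i - j) / n;
      if (v < 0.0)
        v = -v;
      if (v > max)
        max = v;
    }
  return max;
}
```

With the product form the maximum is (n-1)*(n-1)/n, which grows roughly like n, while with the difference form it is (n-1)/n and stays below 1. The kernel then multiplies and accumulates these inputs, so smaller inputs keep the absolute error smaller as well.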

Renato Golin via llvm-dev

2016-Oct-11 11:33 UTC

On 11 October 2016 at 12:15, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>>  1. Only test the non-FP-contracted output
>
> Yes, this is what I'm doing.

If the whole test is about testing multiplications, what's the point of
this?

>>  2. Run the FP-contracted test only for a very small size (so that
>>     we'll stay within some reasonable tolerance of the reference output)
>>  3. Change the matrix to something that will make the test numerically
>>     stable (it does not look like the matrix itself matters to the
>>     performance; where do the values come from?).

3 is more sound, 2 may be more practical.

> -      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i*j) / ni;
> -      B[i][j] = ((DATA_TYPE) i*j) / ni;
> +      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i-j) / ni;
> +      B[i][j] = ((DATA_TYPE) i-j) / ni;
>      }
>    for (i = 0; i < nj; i++)
>      for (j = 0; j < nj; j++)
> -      A[i][j] = ((DATA_TYPE) i*j) / ni;
> +      A[i][j] = ((DATA_TYPE) i-j) / ni;
Changing from multiplication to subtraction changes completely the
nature of the test and goes towards "return 0;", ie, fiddling with the
code so that the compiler "behaves" better. This is *not* a solution.

Hal,

For large-scale numerical programs, if fp-contract can result in
large-scale differences, we need to think carefully about enabling it by
default.

If the loop above cannot be contained within a 1e-8 range for double
values over a large dataset, then I guess the transformation is going
a bit too far.

If not, we should be able to come up with a reasonable tolerance that
makes the test still be relevant.

cheers,
--renato
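
The question of a tolerance that keeps the test relevant depends on whether the difference is measured in absolute or relative terms. The following is only a sketch with hypothetical helper names; it is not the comparison code used by the test-suite.

```c
/* Hypothetical helpers, for illustration only. */

/* Passes when |ref - val| <= tol, regardless of the size of ref. */
int within_abs_tol(double ref, double val, double tol)
{
  double diff = ref - val;

  if (diff < 0.0)
    diff = -diff;
  return diff <= tol;
}

/* Passes when |ref - val| <= tol * |ref|; a zero reference only
   matches an identical value. */
int within_rel_tol(double ref, double val, double tol)
{
  double diff = ref - val;
  double mag = ref < 0.0 ? -ref : ref;

  if (diff < 0.0)
    diff = -diff;
  if (mag == 0.0)
    return diff == 0.0;
  return diff <= tol * mag;
}
```

For a reference value of 1e12 and a result of 1e12 + 10, the absolute check fails at 1e-5 while the relative check passes at 1e-8. An absolute tolerance therefore becomes harder to meet as the array values grow, which matches the behaviour reported for the large values at the end of C.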

Sebastian Pop via llvm-dev

2016-Oct-11 11:46 UTC

On Tue, Oct 11, 2016 at 6:33 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 11 October 2016 at 12:15, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>>>  1. Only test the non-FP-contracted output
>>
>> Yes, this is what I'm doing.
>
> If the whole test is about testing multiplications, what's the point of
> this?
>
>
>>>  2. Run the FP-contracted test only for a very small size (so that
>>>     we'll stay within some reasonable tolerance of the reference output)
>>>  3. Change the matrix to something that will make the test
>>>     numerically stable (it does not look like the matrix itself matters
>>>     to the performance; where do the values come from?).
>
> 3 is more sound, 2 may be more practical.

2 sounds like you are asking to only run checkFP on the first elements
of the array.
In that case what would be the last element to check?
>
>
>> -      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i*j) / ni;
>> -      B[i][j] = ((DATA_TYPE) i*j) / ni;
>> +      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i-j) / ni;
>> +      B[i][j] = ((DATA_TYPE) i-j) / ni;
>>      }
>>    for (i = 0; i < nj; i++)
>>      for (j = 0; j < nj; j++)
>> -      A[i][j] = ((DATA_TYPE) i*j) / ni;
>> +      A[i][j] = ((DATA_TYPE) i-j) / ni;
>
> Changing from multiplication to subtraction changes completely the
> nature of the test and goes towards "return 0;", ie, fiddling with the
> code so that the compiler "behaves" better. This is *not* a solution.

Another observation: when replacing * with + the test only passes at
-Ofast with FP_ABSTOLERANCE=1e-4.

Sebastian
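
Besides contraction, -Ofast turns on fast-math, which lets the compiler reassociate floating-point additions. A small sketch of why the order matters follows; the function names are made up for this example, and the volatile temporaries are there only to pin the evaluation order whatever flags are used.

```c
/* Computes (a + b) + c; the volatile store fixes the order of the
   two additions. */
double sum_left_first(double a, double b, double c)
{
  volatile double t = a + b;

  return t + c;
}

/* Computes a + (b + c) with the order fixed in the same way. */
double sum_right_first(double a, double b, double c)
{
  volatile double t = b + c;

  return a + t;
}
```

With a = 1e20, b = -1e20 and c = 1.0, the first function returns 1.0 and the second returns 0.0, because 1.0 is lost when it is added to -1e20. A long reduction such as the one in symm gives the optimizer many such choices.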

Hal Finkel via llvm-dev

2016-Oct-12 03:20 UTC

----- Original Message -----
> From: "Renato Golin" <renato.golin at linaro.org>
> To: "Sebastian Pop" <sebpop.llvm at gmail.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "Sebastian Paul Pop" <s.pop at samsung.com>,
> "llvm-dev" <llvm-dev at lists.llvm.org>, "Matthias Braun" <matze at braunis.de>,
> "Clang Dev" <cfe-dev at lists.llvm.org>, "nd" <nd at arm.com>, "Abe Skolnik"
> <a.skolnik at samsung.com>
> Sent: Tuesday, October 11, 2016 6:33:43 AM
> Subject: Re: [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
> 
> On 11 October 2016 at 12:15, Sebastian Pop <sebpop.llvm at gmail.com>
> wrote:
> >>  1. Only test the non-FP-contracted output
> >
> > Yes, this is what I'm doing.
> 
> If the whole test is about testing multiplications, what's the point
> of this?
> 
> 
> >>  2. Run the FP-contracted test only for a very small size (so that
> >>  we'll stay within some reasonable tolerance of the reference
> >>  output)
> >>  3. Change the matrix to something that will make the test
> >>  numerically stable (it does not look like the matrix itself
> >>  matters to the performance; where do the values come from?).
> 
> 3 is more sound, 2 may be more practical.
> 
> 
> > -      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i*j) / ni;
> > -      B[i][j] = ((DATA_TYPE) i*j) / ni;
> > +      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i-j) / ni;
> > +      B[i][j] = ((DATA_TYPE) i-j) / ni;
> >      }
> >    for (i = 0; i < nj; i++)
> >      for (j = 0; j < nj; j++)
> > -      A[i][j] = ((DATA_TYPE) i*j) / ni;
> > +      A[i][j] = ((DATA_TYPE) i-j) / ni;
> 
> Changing from multiplication to subtraction changes completely the
> nature of the test and goes towards "return 0;", ie, fiddling with
> the code so that the compiler "behaves" better. This is *not* a
> solution.
> 
> Hal,
> 
> For large scale numerical programs, if fp-contract can result in
> large
> scale differences, we need to think about this approach by default.

Obviously a lot of people have done an awful lot of thinking about this over
many years, and contractions-by-default is the reality on many systems. If you
have a program that is numerically unstable, simulating a chaotic system, etc.
then any difference, often no matter how small, will lead to large-scale
differences in the output. As a result, there will be some tests that don't
have a useful tolerance; sometimes these are badly-implemented tests, but
sometimes the sensitivity represents an underlying physical reality of a
simulated system (there's a lot of very interesting mathematical theory
behind this, e.g.
https://en.wikipedia.org/wiki/Chaos_theory#Sensitivity_to_initial_conditions).

From a user-experience perspective, this can be very unfortunate. It can be
hard to understand why compiler optimizations, or different compilers, produce
executables that produce different outputs for identical input configurations.
It contributes to feelings that floating point is hard and confusing. However,
not using the contractions also leads to equally-confusing performance
discrepancies between our compiler and others (and between the observed and
expected performance). We have a classic "Damned if you do, damned if you
don't" situation. However, I lean toward enabling the contractions by
default because other compilers do it (so users need to learn about what's
going on anyway - we can't shield them from this regardless of what we do)
and it gives users the performance they expect (which increases our user base
and makes many users happier).

 -Hal
> 
> If the loop above cannot be contained in an 1e-8 range for double
> values over a large dataset, than I guess the transformation is going
> a bit too far.
> 
> If not, we should be able to come up with a reasonable tolerance that
> makes the test still be relevant.
> 
> cheers,
> --renato
> 
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
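
As a concrete illustration of how a contraction can change a result, here is a minimal sketch for float operands. It emulates the single rounding of a fused multiply-add by computing in double, where the product of two floats is exact. The function names are hypothetical, and the emulation can differ from a real fmaf() in rare double-rounding cases.

```c
/* Unfused: the product is rounded to float before the addition.
   The volatile store keeps the compiler from contracting a * b + c. */
float mul_add_unfused(float a, float b, float c)
{
  volatile float p = a * b;

  return p + c;
}

/* Emulated fused: the float-by-float product is exact in double, so
   the result is rounded to float only once, at the final conversion. */
float mul_add_fused_emulated(float a, float b, float c)
{
  return (float) ((double) a * (double) b + (double) c);
}
```

With a = 1 + 2^-13, b = 1 - 2^-13 and c = -1, the exact product is 1 - 2^-26, which rounds to 1.0f in float. The unfused version therefore returns 0, while the fused version returns -2^-26. Neither answer is a compiler bug; the two versions simply round at different points, and a numerically unstable kernel can amplify that difference.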
