Sebastian Pop via llvm-dev
2016-Oct-12 04:20 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Tue, Oct 11, 2016 at 10:39 PM, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> On Tue, Oct 11, 2016 at 10:20 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> ----- Original Message -----
>>> From: "Renato Golin" <renato.golin at linaro.org>
>>> To: "Sebastian Pop" <sebpop.llvm at gmail.com>
>>> Cc: "Hal Finkel" <hfinkel at anl.gov>, "Sebastian Paul Pop" <s.pop at samsung.com>, "llvm-dev" <llvm-dev at lists.llvm.org>,
>>> "Matthias Braun" <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>, "nd" <nd at arm.com>, "Abe Skolnik"
>>> <a.skolnik at samsung.com>
>>> Sent: Tuesday, October 11, 2016 6:33:43 AM
>>> Subject: Re: [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
>>>
>>> On 11 October 2016 at 12:15, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>>> >> 1. Only test the non-FP-contracted output
>>> >
>>> > Yes, this is what I'm doing.
>>>
>>> If the whole test is about testing multiplications, what's the point
>>> of this?
>>>
>>> >> 2. Run the FP-contracted test only for a very small size (so that
>>> >> we'll stay within some reasonable tolerance of the reference
>>> >> output)
>>> >> 3. Change the matrix to something that will make the test
>>> >> numerically stable (it does not look like the matrix itself
>>> >> matters to the performance; where do the values come from?).
>>>
>>> 3 is more sound, 2 may be more practical.
>>>
>>> > -      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i*j) / ni;
>>> > -      B[i][j] = ((DATA_TYPE) i*j) / ni;
>>> > +      C_StrictFP[i][j] = C[i][j] = ((DATA_TYPE) i-j) / ni;
>>> > +      B[i][j] = ((DATA_TYPE) i-j) / ni;
>>> >      }
>>> >    for (i = 0; i < nj; i++)
>>> >      for (j = 0; j < nj; j++)
>>> > -      A[i][j] = ((DATA_TYPE) i*j) / ni;
>>> > +      A[i][j] = ((DATA_TYPE) i-j) / ni;
>>>
>>> Changing from multiplication to subtraction changes completely the
>>> nature of the test and goes towards "return 0;", ie, fiddling with the
>>> code so that the compiler "behaves" better. This is *not* a solution.

It is not uncommon to see in several polybench tests adjustments to the initial values:

  /* LLVM: This change ensures we do not calculate nan values, which are
           formatted differently on different platforms and which may also
           be optimized unexpectedly.
     Original code:
    for (i = 0; i < ni; i++)
      for (j = 0; j < nj; j++) {
        A[i][j] = ((DATA_TYPE) i*j) / ni;
        Q[i][j] = ((DATA_TYPE) i*(j+1)) / nj;
      }
    for (i = 0; i < nj; i++)
      for (j = 0; j < nj; j++)
        R[i][j] = ((DATA_TYPE) i*(j+2)) / nj;
  */
    for (i = 0; i < ni; i++)
      for (j = 0; j < nj; j++) {
        A[i][j] = ((DATA_TYPE) i*j+ni) / ni;
        Q[i][j] = ((DATA_TYPE) i*(j+1)+nj) / nj;
      }
    for (i = 0; i < nj; i++)
      for (j = 0; j < nj; j++)
        R[i][j] = ((DATA_TYPE) i*(j+2)+nj) / nj;

git grepping gives us:

  linear-algebra/kernels/cholesky/cholesky.c: LLVM: This change ensures we do not calculate nan values, which are
  linear-algebra/kernels/cholesky/cholesky.c: LLVM: This change ensures we do not calculate nan values, which are
  linear-algebra/kernels/cholesky/cholesky.c: LLVM: This change ensures we do not calculate nan values, which are
  linear-algebra/kernels/trisolv/trisolv.c: LLVM: This change ensures we do not calculate nan values, which are
  linear-algebra/solvers/gramschmidt/gramschmidt.c: LLVM: This change ensures we do not calculate nan values, which are
  linear-algebra/solvers/lu/lu.c: LLVM: This change ensures we do not calculate nan values, which are
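
As an illustration of why such an adjustment can matter, here is a minimal sketch in C. It assumes a kernel that divides each column by its norm, as a Gram-Schmidt step does; the size N and the function names init_original, init_adjusted and column_sum_squares are hypothetical and are not taken from the test-suite. With the original initialization column 0 is entirely zero, so a normalization step would compute 0/0; with the adjusted initialization every entry of column 0 is 1.

```c
#include <assert.h>

#define N 8 /* hypothetical small problem size, for illustration only */

/* Original Polybench-style initialization: A[i][0] == 0 for every i. */
void init_original(double A[N][N]) {
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      A[i][j] = ((double) i * j) / N;
}

/* Adjusted initialization: adding N to the numerator shifts each value by 1. */
void init_adjusted(double A[N][N]) {
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      A[i][j] = ((double) i * j + N) / N;
}

/* Sum of squares of column k, i.e. the squared norm a normalization
   step would take the square root of and then divide by. */
double column_sum_squares(double A[N][N], int k) {
  assert(k >= 0 && k < N);
  double sum = 0.0;
  for (int i = 0; i < N; i++)
    sum += A[i][k] * A[i][k];
  return sum;
}
```

Calling column_sum_squares(A, 0) after init_original yields zero, while after init_adjusted it yields a strictly positive value, so the division is well defined.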
Sebastian Pop via llvm-dev
2016-Oct-12 04:35 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
polybench/linear-algebra/solvers/gramschmidt/ exposes the same problems as symm. It does not match the reference output at -O0 -ffp-contract=off, and it only passes all element comparisons with FP_ABSTOLERANCE=1 when comparing "-Ofast" against "-O0 -ffp-contract=off".
Renato Golin via llvm-dev
2016-Oct-12 08:38 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 05:20, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> /*
>   LLVM: This change ensures we do not calculate nan values, which are
>   formatted differently on different platforms and which may also
>   be optimized unexpectedly.

This comment is there since it was originally introduced by Tobias. We'll have to ask him what changes were done to understand how this is relevant to your current proposal.

cheers,
--renato
Renato Golin via llvm-dev
2016-Oct-12 09:01 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On 12 October 2016 at 05:35, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> polybench/linear-algebra/solvers/gramschmidt/ exposes the same problems as symm.
> It does not match the reference output at -O0 -ffp-contract=off,
> and it only passes all elements comparisons for FP_ABSTOLERANCE=1 for
> "-Ofast" vs. "-O0 -ffp-contract=off".

I think we're going about this in a completely wrong way.

The current reference output is specific to fp-contract=off, and making it work for fp-contract=on makes no sense at all.

For all we know, fp-contract=on generates *more accurate* results, not less. But it may also have less predictable results *across* different targets, thus the need for a tolerance.

FP_TOLERANCE is *not* about making the new results match an old reference, but about showing the *real* uncertainties of FP transformation on *different* targets.

So, if you want to fix this test for good, here are the steps you need to take:

1. Check out the test-suite on different platforms: x86_64, ARM, AArch64, PPC, MIPS. The more the merrier.
2. Enable fp-contract=on, run the tests on all platforms, record the outputs, ignore the differences.
3. Collate each platform's output for each test and see how different they are.

To make it easier to compare, in the past, I've used this trick:

1. Run on one platform, e.g. x86_64, ignoring the reference
2. Copy the output of those tests back to the reference_output
3. Run on a different platform, tweaking the tolerance until it "passes"
4. Run on yet another platform, making sure you don't need to tweak the tolerance yet again

If the tolerance is "too high" for that test, we can further discuss how to change it to make it better. If not, you found a solution.

If you want to make it even better, do some analysis on the distribution of the results, per test, and pick the average as the reference output and one or two standard deviations as the tolerance. This should pass on most architectures.
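
The statistical suggestion above can be sketched in C as follows. This is only an illustration under stated assumptions: the helper names mean_of, variance_of and within_k_sigma are hypothetical, the samples stand for one output value recorded on several platforms, and the population variance (dividing by n) is used as one possible choice. The check compares squared distances, so it avoids calling sqrt().

```c
#include <assert.h>
#include <stddef.h>

/* Mean of one output value as recorded on n platforms. */
double mean_of(const double *samples, size_t n) {
  assert(n > 0);
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += samples[i];
  return sum / (double) n;
}

/* Population variance of the same samples around their mean. */
double variance_of(const double *samples, size_t n) {
  double mean = mean_of(samples, n);
  double acc = 0.0;
  for (size_t i = 0; i < n; i++) {
    double d = samples[i] - mean;
    acc += d * d;
  }
  return acc / (double) n;
}

/* Returns 1 if value is within k standard deviations of the reference.
   Equivalent to |value - reference| <= k * sqrt(variance), written with
   squares so that no square root is required. */
int within_k_sigma(double value, double reference, double variance, double k) {
  double d = value - reference;
  return d * d <= k * k * variance;
}
```

With the mean as the reference output, k = 1 or k = 2 corresponds to the "one or two standard deviations" tolerance mentioned above.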

To simplify the analysis, you can reduce the output into a single number, say, adding all the results up. This will generate more inaccuracies than comparing each value, and if that's too large an error, then you reduce the number of samples.

For example, on cholesky, we sampled every 16th item of the array:

  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++)
      print_element(A[i][j], j*16, printmat);
    fputs(printmat, stderr);
  }

using "print_element" because calling printf sucks.

These modifications are ok, because they neither change the tests nor hide them from compiler changes.

cheers,
--renato
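
A minimal sketch in C of reducing an output array to a single number, optionally over a reduced number of samples. The helper strided_checksum is hypothetical and is not part of the test-suite; a stride of 1 adds up every result, and a larger stride adds up fewer samples.

```c
#include <assert.h>
#include <stddef.h>

/* Sum every stride-th element of values[0..n-1] into one number.
   stride == 1 sums all elements; larger strides sum fewer samples. */
double strided_checksum(const double *values, size_t n, size_t stride) {
  assert(stride > 0);
  double sum = 0.0;
  for (size_t i = 0; i < n; i += stride)
    sum += values[i];
  return sum;
}
```

The resulting number can then be compared against a reference with a tolerance instead of comparing each element separately.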
Sebastian Pop via llvm-dev
2016-Oct-12 13:42 UTC
[llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 4:38 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 12 October 2016 at 05:20, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>> /*
>>   LLVM: This change ensures we do not calculate nan values, which are
>>   formatted differently on different platforms and which may also
>>   be optimized unexpectedly.
>
> This comment is there since it was originally introduced by Tobias.
> We'll have to ask him what changes were done to understand how this is
> relevant to your current proposal.
>

The code before is in the comments: we know exactly what Tobi has changed. Most of these changes are in the initialization of the arrays, though there are also changes to the computational kernel.

Polybench was designed to stress loop optimizations in the polyhedral model. The intent of adding Polybench to the test-suite was to stress loop optimizations in Polly. Those initial changes by Tobi reflect this intent: neither the FP computation nor the initial values matter much.

I would appreciate it if Tobi could share his point of view on Polybench: I added him to the CC list.

We are currently trying to modify Polybench to test something different from what it was designed for. This goes along with my earlier comment about the SPEC benchmarks: there are benchmarks that have been designed to test FP computations. If we need more FP benchmarks in the test-suite, we should try to identify and add benchmarks in which "FP expert" people put thought into correctly designing the tests to check FP computations.

Sebastian