Sebastian Pop via llvm-dev
2016-Oct-06 15:11 UTC
[llvm-dev] test-suite: a new proposal for how to move forward to make "test-suite" more automatic, more flexible, and more maintainable, especially WRT reference outputs
On Thu, Oct 6, 2016 at 5:02 AM, Kristof Beyls via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi Abe,
>
> My 2 cents:
> I have been using the test-suite mainly in benchmarking mode as a convenient
> way to track performance changes in top-of-trunk.
> I've observed that some of the programs (IIRC, especially the ones in
> SingleSource/Benchmarks/Polybench/) produce a lot of output (megabytes).
> This caused a lot of noise in performance measurements, as the execution
> time was dominated by printing out the data, rather than the actual useful
> computations. Renato removed the worst noise in
> http://reviews.llvm.org/D10991.
>
> That experience made me think that for the programs in the test-suite,
> ideally they should print out only a small amount of output to be checked.
> For example, by adapting individual programs that output a lot of data to
> only print a summary/aggregate of the data, that somehow is likely to change
> when a miscomputation happened.
>
> If we could go in that direction, I don't see much need for storing hashes
> or even compressed output as reference data.
> I think that needing compressed reference data may make the test-suite ever
> so slightly harder to set up: another dependency on an external tool. Not
> that I can imagine that having a dependency on e.g. gzip would be
> problematic on any platform.
>
> Anyway, I thought I'd just share my opinion of it being ideal that the
> programs in the test-suite would only produce small outputs, to avoid noisy
> benchmark results. If that would be a direction we could go into, there may
> not be much needed for storing hashes or compressed reference output.
>

Kristof, I agree with your point of view.

There is a very easy way to output only one double from the polybench:
- compile the kernel with fp-contract=off and -fno-fast-math
- add a "+" reduction loop of all the elements in the output array
  (also compiled with strict FP computations such that the output is
  deterministic)
- print the result of the reduction instead of printing the full array.

Thanks,
Sebastian
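The reduction proposed in this message could look roughly like the following C sketch. It is only an illustration, not code from Polybench or the test-suite: the names `checksum_sum` and `print_checksum` and the printed precision are assumptions. It assumes the output array can be viewed as a flat array of doubles, and that the file is built with strict FP settings (spelled `-ffp-contract=off` and `-fno-fast-math` in Clang and GCC) so the fixed summation order is preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of Polybench): add up all n elements of a
   flattened output array in a fixed left-to-right order. */
double checksum_sum(const double *data, size_t n)
{
    assert(data != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += data[i];
    return sum;
}

/* Print a single value instead of dumping the whole array.
   The "%.6f" precision is an arbitrary choice for this sketch. */
void print_checksum(const double *data, size_t n)
{
    printf("%.6f\n", checksum_sum(data, n));
}
```

A kernel's array-printing routine would then be replaced by a single call such as `print_checksum(&A[0][0], rows * cols)`, where `A`, `rows` and `cols` stand in for whatever the benchmark actually uses.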
Sebastian Pop via llvm-dev
2016-Oct-06 15:15 UTC
[llvm-dev] test-suite: a new proposal for how to move forward to make "test-suite" more automatic, more flexible, and more maintainable, especially WRT reference outputs
Adding Tobi in CC to get his review about the proposed change to Polybench.

Thanks,
Sebastian

On Thu, Oct 6, 2016 at 11:11 AM, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
> There is a very easy way to output only one double from the polybench:
> - compile the kernel with fp-contract=off and -fno-fast-math
> - add a "+" reduction loop of all the elements in the output array
>   (also compiled with strict FP computations such that the output is
>   deterministic)
> - print the result of the reduction instead of printing the full array.
Renato Golin via llvm-dev
2016-Oct-06 15:38 UTC
[llvm-dev] [cfe-dev] test-suite: a new proposal for how to move forward to make "test-suite" more automatic, more flexible, and more maintainable, especially WRT reference outputs
On 6 October 2016 at 16:11, Sebastian Pop via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> There is a very easy way to output only one double from the polybench:
> - compile the kernel with fp-contract=off and -fno-fast-math

Sebastian, please stop crossing the wires. This is a separate discussion.

> - add a "+" reduction loop of all the elements in the output array
> (also compiled with strict FP computations such that the output is
> deterministic)

Addition can saturate/overflow and lose precision, especially if we have hundreds of thousands of results or if the type is float, not double. Whatever aggregation function we use has to be meaningful.

One way I did it in the past was to aggregate in blocks when the results weren't likely to saturate/overflow/lose precision, i.e. the end result had a similar magnitude to the individual results. This gave us huge benefits in I/O and comparison times, and can work with polybench, but someone will have to go through it and make sure the aggregated numbers are not orders of magnitude greater than the individual results.

cheers,
--renato
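The block-wise aggregation described in this message might be sketched in C as below. This is a guess at the shape of such a scheme, not the code Renato used: the function name `block_sums`, its signature, and the choice of a plain sum per block are all assumptions. The block size has to be picked per benchmark so that each partial sum stays at a magnitude similar to the individual results.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reduce n values to one partial sum per block of
   `block` consecutive elements. The last block may be shorter.
   Writes the sums to out[0..nblocks-1] and returns nblocks. */
size_t block_sums(const double *data, size_t n, size_t block,
                  double *out, size_t out_cap)
{
    assert(block > 0);
    size_t nblocks = (n + block - 1) / block;   /* ceil(n / block) */
    assert(nblocks <= out_cap);

    for (size_t b = 0; b < nblocks; ++b) {
        size_t begin = b * block;
        size_t end = (n - begin < block) ? n : begin + block;
        double sum = 0.0;
        for (size_t i = begin; i < end; ++i)
            sum += data[i];
        out[b] = sum;
    }
    return nblocks;
}
```

The benchmark would print the `nblocks` partial sums instead of all `n` elements, which shrinks the output by roughly a factor of `block` while keeping each printed number close in scale to the data it summarizes.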
Sebastian Pop via llvm-dev
2016-Oct-06 18:17 UTC
[llvm-dev] [cfe-dev] test-suite: a new proposal for how to move forward to make "test-suite" more automatic, more flexible, and more maintainable, especially WRT reference outputs
On Thu, Oct 6, 2016 at 11:38 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 6 October 2016 at 16:11, Sebastian Pop via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
>> There is a very easy way to output only one double from the polybench:
>> - compile the kernel with fp-contract=off and -fno-fast-math
>
> Sebastian, please stop crossing the wires. This is a separate discussion.

We need to get deterministic output for all possible combinations of CFLAGS the users will compile the test-suite with.

>> - add a "+" reduction loop of all the elements in the output array
>> (also compiled with strict FP computations such that the output is
>> deterministic)
>
> addition can saturate/overflow and lose precision, especially if we
> have hundreds of thousands of results or if the type is float, not
> double. Whatever the aggregation function we use has to be meaningful.

Agreed. I'm also fine with using any stable hashing function and linking the polybench tests against it.
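As one possible reading of "any stable hashing function", the sketch below hashes the bit patterns of the output doubles with 64-bit FNV-1a. The thread does not name a hash, so FNV-1a is a swapped-in choice for illustration, and `fnv1a_doubles` is a hypothetical name. Bytes are fed in a fixed least-significant-first order so the result does not depend on host endianness. Note that hashing bit patterns is stricter than comparing values: 0.0 and -0.0 hash differently, and any last-bit rounding difference changes the hash, so this only helps once the FP results themselves are bit-for-bit reproducible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 64-bit FNV-1a over the IEEE-754 bit patterns of
   n doubles, consuming each value's bytes least-significant first. */
uint64_t fnv1a_doubles(const double *data, size_t n)
{
    assert(sizeof(double) == sizeof(uint64_t));
    assert(data != NULL || n == 0);

    uint64_t hash = 0xcbf29ce484222325ULL;       /* FNV-1a offset basis */
    for (size_t i = 0; i < n; ++i) {
        uint64_t bits;
        memcpy(&bits, &data[i], sizeof bits);    /* well-defined type pun */
        for (int shift = 0; shift < 64; shift += 8) {
            hash ^= (bits >> shift) & 0xffU;
            hash *= 0x100000001b3ULL;            /* FNV-1a 64-bit prime */
        }
    }
    return hash;
}
```

A test would print the returned value (for example with `PRIx64` from `<inttypes.h>`) and compare that single line against the reference output.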