Pankaj Kukreja via llvm-dev
2018-May-25 18:49 UTC
[llvm-dev] Using Google Benchmark Library
Hi,

I am adding some benchmarks to the test-suite as a part of my GSoC project. I am planning to use the Google Benchmark library on some benchmarks. I would like to know your opinion/suggestion on how I should proceed with this library and how the design should be (like limiting the number of times a kernel is executed so that the overall runtime of the test-suite can be controlled, multiple input sizes, adding it to small kernels, etc.). I would love to hear any other suggestions that you may have.

Thanks,
Pankaj
Hal Finkel via llvm-dev
2018-May-25
[llvm-dev] Using Google Benchmark Library
On 05/25/2018 01:49 PM, Pankaj Kukreja via llvm-dev wrote:
> Hi,
> I am adding some benchmarks to the test-suite as a part of my GSoC
> project. I am planning to use the Google Benchmark library on some
> benchmarks. I would like to know your opinion/suggestion on how I
> should proceed

Hi, Pankaj,

We have a directory in the test suite, MicroBenchmarks, for things which depend on Google's benchmark library. Your benchmarks should likely go there. You'll need to post a patch for review for the particular things that you'd like to add.

> with this library and how the design should be (like limiting the
> number of times a kernel is executed so that the overall runtime
> of the test-suite can be controlled, multiple input sizes, adding it to
> small kernels, etc.).

The benchmark library can dynamically pick the number of iterations, and we definitely need to keep a handle on the overall runtime of the test suite. I don't think that we yet have guidelines for microbenchmark timing, but for full applications we try to keep the runtime down to ~1s.

 -Hal

> I would love to hear any other suggestions that you may have.

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
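For reference, a minimal microbenchmark of the kind that lives under MicroBenchmarks might look roughly like the sketch below (the kernel, its name, and the data size are made up for illustration); the library itself decides how many times the timed loop body runs in order to get a stable measurement:

    #include <benchmark/benchmark.h>
    #include <vector>

    // Hypothetical kernel under test; stands in for whatever the real
    // microbenchmark exercises.
    static int kernel(const std::vector<int> &Data) {
      int Sum = 0;
      for (int V : Data)
        Sum += V;
      return Sum;
    }

    static void BM_Kernel(benchmark::State &State) {
      std::vector<int> Data(1024, 1); // setup happens outside the timed loop
      for (auto _ : State) {
        // The library chooses how many iterations of this loop to run.
        benchmark::DoNotOptimize(kernel(Data));
      }
    }
    BENCHMARK(BM_Kernel);

    BENCHMARK_MAIN();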
Michael Kruse via llvm-dev
2018-May-25 20:09 UTC
[llvm-dev] Using Google Benchmark Library
2018-05-25 13:49 GMT-05:00 Pankaj Kukreja via llvm-dev <llvm-dev at lists.llvm.org>:
> Hi,
> I am adding some benchmarks to the test-suite as a part of my GSoC project.
> I am planning to use the Google Benchmark library on some benchmarks. I
> would like to know your opinion/suggestion on how I should proceed with this
> library and how the design should be (like limiting the number of times a
> kernel is executed so that the overall runtime of the test-suite can be
> controlled, multiple input sizes, adding it to small kernels, etc.).
> I would love to hear any other suggestions that you may have.

I would like to add some details about the intent.

We would like to benchmark common algorithms that appear in many sources (think of linear algebra, image processing, etc.). Usually they consist of only a single function. Only the kernel itself is of interest, not e.g. the data initialization. Depending on the optimization (e.g. by Polly, parallelization, offloading), the execution time can vary widely.

Pankaj already added a review at https://reviews.llvm.org/D46735 where the idea to use Google Benchmark came up. However, as such it does not fulfill all the requirements; in particular, it does not check for correctness.

Here is a list of things I would like to see:
- Check correct output
- Execution time
- Compilation time
- LLVM pass statistics
- Code size
- Hardware performance counters
- Multiple problem sizes
- Measuring the above for the kernel only
- Optional very large problem sizes (that test every cache level), disabled by default
- Repeated execution to average out noise
- Taking cold/warm caches into account

Here is an idea of how this could be implemented: every benchmark consists of two files, the source for the kernel and a driver that initializes the input data, knows how to call the kernel in the other file, and can check for correct output. The framework recognizes all benchmarks and their drivers and does the following:
- Compile the driver.
- Time the compilation of the kernel using -save-stats=obj.
- Get the kernel's code size using llvm-size.
- Link the driver, the kernel and Google Benchmark together.
- Instruct the driver to run the kernel with a small problem size and check the correctness.
- Instruct Google Benchmark to run the kernel to get a reliable average execution time of the kernel (without the input data initialization).
- LNT's --exec-multisample does not need to run the benchmarks multiple times, as Google Benchmark already does so.

Michael
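As a rough, self-contained sketch of how the driver half of this design could be wired up with Google Benchmark (the kernel, the reference check, and the problem sizes here are invented for illustration; in the proposed layout the kernel would be compiled in its own file and only linked in):

    #include <benchmark/benchmark.h>
    #include <cstdlib>
    #include <vector>

    // In the two-file layout this kernel would live in a separate source
    // file; it is defined here only to keep the sketch self-contained.
    void kernel(int N, const float *In, float *Out) {
      for (int I = 0; I < N; ++I)
        Out[I] = In[I] * 2.0f;
    }

    // Reference implementation, used only for the correctness check.
    static void reference(int N, const float *In, float *Out) {
      for (int I = 0; I < N; ++I)
        Out[I] = In[I] * 2.0f;
    }

    static void BM_Kernel(benchmark::State &State) {
      const int N = static_cast<int>(State.range(0)); // problem size chosen below
      std::vector<float> In(N, 1.0f), Out(N, 0.0f);   // initialization, not timed
      for (auto _ : State)
        kernel(N, In.data(), Out.data());             // only this call is timed
    }
    // Several problem sizes; a very large, cache-busting size could be added
    // behind an option that is disabled by default.
    BENCHMARK(BM_Kernel)->Arg(1 << 8)->Arg(1 << 12)->Arg(1 << 16);

    int main(int argc, char **argv) {
      // Run the kernel once on a small problem size and verify the output
      // against the reference before any timing happens.
      const int N = 256;
      std::vector<float> In(N), Expected(N), Actual(N);
      for (int I = 0; I < N; ++I)
        In[I] = static_cast<float>(I);
      reference(N, In.data(), Expected.data());
      kernel(N, In.data(), Actual.data());
      if (Expected != Actual)
        return EXIT_FAILURE;

      benchmark::Initialize(&argc, argv);
      benchmark::RunSpecifiedBenchmarks();
      return EXIT_SUCCESS;
    }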
Dean Michael Berris via llvm-dev
2018-May-27 10:19 UTC
[llvm-dev] Using Google Benchmark Library
> On 26 May 2018, at 06:09, Michael Kruse via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> 2018-05-25 13:49 GMT-05:00 Pankaj Kukreja via llvm-dev
> <llvm-dev at lists.llvm.org>:
>> Hi,
>> I am adding some benchmarks to the test-suite as a part of my GSoC project.
>> I am planning to use the Google Benchmark library on some benchmarks. I
>> would like to know your opinion/suggestion on how I should proceed with this
>> library and how the design should be (like limiting the number of times a
>> kernel is executed so that the overall runtime of the test-suite can be
>> controlled, multiple input sizes, adding it to small kernels, etc.).
>> I would love to hear any other suggestions that you may have.
>
> I would like to add some details about the intent.
>
> We would like to benchmark common algorithms that appear in many
> sources (think of linear algebra, image processing, etc.). Usually
> they consist of only a single function. Only the kernel itself is of
> interest, not e.g. the data initialization. Depending on the
> optimization (e.g. by Polly, parallelization, offloading), the
> execution time can vary widely.
>
> Pankaj already added a review at
> https://reviews.llvm.org/D46735
> where the idea to use Google Benchmark came up. However, as such it
> does not fulfill all the requirements; in particular, it does not
> check for correctness.
>
> Here is a list of things I would like to see:
> - Check correct output
> - Execution time
> - Compilation time
> - LLVM pass statistics
> - Code size
> - Hardware performance counters
> - Multiple problem sizes
> - Measuring the above for the kernel only
> - Optional very large problem sizes (that test every cache level),
>   disabled by default
> - Repeated execution to average out noise
> - Taking cold/warm caches into account
>
> Here is an idea of how this could be implemented: every benchmark
> consists of two files, the source for the kernel and a driver that
> initializes the input data, knows how to call the kernel in the other
> file, and can check for correct output. The framework recognizes all
> benchmarks and their drivers and does the following:
> - Compile the driver.
> - Time the compilation of the kernel using -save-stats=obj.
> - Get the kernel's code size using llvm-size.
> - Link the driver, the kernel and Google Benchmark together.

I think you might run into artificial overhead here if you're not careful. In particular you might run into:

- Missed inlining opportunity in the benchmark. If you expect the kernels to be potentially inlined, this might be a problem.
- The link order might cause interference depending on the linker being used.
- If you're doing LTO, then that would add an additional wrinkle.

They're not show-stoppers, but these are some of the things to look out for and consider.

> - Instruct the driver to run the kernel with a small problem size and
>   check the correctness.

In practice, what I've seen is mixing unit tests which perform correctness checks (using Google Test/Mock) and then co-locating the benchmarks in the same file. This way you can choose to run just the tests or the benchmarks in the same compilation mode. I'm not sure whether there's already a copy of the Google Test/Mock libraries in the test-suite, but I'd think those shouldn't be too hard (nor controversial) to add.

> - Instruct Google Benchmark to run the kernel to get a reliable
>   average execution time of the kernel (without the input data
>   initialization).

There are ways to write the benchmarks so that you only measure a small part of the actual benchmark.
The manuals will be really helpful in pointing out how to do that:

https://github.com/google/benchmark#passing-arguments

In particular, you can pause the timing when you're doing the data initialisation and then resume just before you run the kernel.

> - LNT's --exec-multisample does not need to run the benchmarks
>   multiple times, as Google Benchmark already does so.

I thought recent patches already do some of this. Hal would know.

Cheers

PS. I'd be happy to do some reviews of uses of the Google Benchmark library, if you need additional reviewers.

-- Dean
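For the record, the pause/resume idiom described above looks roughly like the sketch below (the kernel and the initialization helper are invented for illustration); everything between PauseTiming() and ResumeTiming() is excluded from the measured time:

    #include <benchmark/benchmark.h>
    #include <vector>

    // Hypothetical helpers standing in for the real driver code.
    static void initInput(std::vector<float> &In) {
      for (std::size_t I = 0; I < In.size(); ++I)
        In[I] = static_cast<float>(I);
    }
    static float kernel(const std::vector<float> &In) {
      float Sum = 0.0f;
      for (float V : In)
        Sum += V;
      return Sum;
    }

    static void BM_KernelOnly(benchmark::State &State) {
      std::vector<float> In(4096);
      for (auto _ : State) {
        State.PauseTiming();   // data initialization is not measured
        initInput(In);
        State.ResumeTiming();  // only the kernel call below is measured
        benchmark::DoNotOptimize(kernel(In));
      }
    }
    BENCHMARK(BM_KernelOnly);
    BENCHMARK_MAIN();

Note that pausing and resuming the timer has some overhead of its own, so for very short kernels it can be preferable to hoist the initialization out of the timed loop entirely, as in the driver sketch earlier in the thread.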