Kristof Beyls
2013-Oct-29 15:54 UTC
[LLVMdev] [RFC] Performance tracking and benchmarking infrastructure BoF
Hi,

Next week at the developers meeting, I'm chairing a BoF session on improving our performance tracking and benchmarking infrastructure. I'd like to make the most of the 45-minute slot. Therefore, I'd like to start the discussion a bit earlier here, giving everyone who can't come to the BoF a chance to put in their 2 cents. At the same time, I hope this will give me a chance to collect and structure ideas and opinions, resulting in a better starting point for the discussion during the BoF session.

The main motivation for organizing this BoF is my impression that it's harder to collaborate on patches that improve the quality of generated code than on patches that fix bugs. I think that with enhancements to the buildbot infrastructure, it should be possible to make collaborating on performance-enhancing patches easier. I'd like to discuss here, and during the BoF session, which enhancements we need most, why we need them, and what the main difficulties are that we'd have to overcome to get them implemented.

To kick off the discussion, let me propose a number of enhancements in functionality that I think would do the most to enable easier collaboration on performance-enhancing patches. I'd very much welcome feedback and more ideas:

* Early and automated detection of significant performance regressions, with a low rate of false positives (as a small illustration, see the sketch in the P.S. below).
  Rationale: The buildbots currently do a great job of catching accidental correctness regressions automatically, with a reasonably low false-positive rate. It would be great if they also caught significant performance regressions automatically, with a comparably low false-positive rate.

* Common public performance data, available before a patch is committed, enabling everyone to review and evaluate the positive and negative effects of optimization patches.
  Rationale: Currently, very little performance data is typically provided when a patch is put up for review. Having a reasonable set of performance numbers would make it easier for reviewers to evaluate the value of a patch.

* The ability to evaluate, before committing a patch, its performance impact on architectures or platforms the developer doesn't have access to.
  Rationale: Most developers probably do not have access to all the architectures and platforms the community as a whole cares about. Being able to verify that a patch doesn't regress performance on other platforms is probably as useful as testing basic correctness on them. The regression tests already let a developer check for serious correctness regressions on platforms they don't have access to; a comparable way to verify that performance isn't negatively affected would be very useful. One way to achieve this would be to allow a top-of-trunk+patch build to be run on all benchmarks in the test-suite, on all boards reserved for benchmarking in the buildbot setup.

I'm sure that quite a few technical and non-technical issues will have to be resolved to get the above functionality implemented. In the interest of keeping this email focused, I've decided not to list the issues I'm expecting yet, but only the functional enhancements that I think are the most useful.

Thanks,

Kristof
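P.S. To make the first bullet a bit more concrete: below is a minimal Python sketch of the kind of before/after comparison a benchmarking bot could run per benchmark. It is purely illustrative and not existing buildbot or LNT code; the 2% noise floor, the significance level, and the use of Welch's t-test (via scipy) are my own assumptions about how one might trade sensitivity against false positives.

from statistics import mean
from scipy import stats  # assumed available on the bot; used only for the t-test

MIN_RELATIVE_SLOWDOWN = 0.02  # ignore changes below 2% (assumed noise floor)
SIGNIFICANCE_LEVEL = 0.01     # demand strong statistical evidence before reporting

def is_regression(baseline, candidate):
    """baseline, candidate: execution times (seconds) from repeated runs
    of the same benchmark, before and after the commit under test."""
    slowdown = (mean(candidate) - mean(baseline)) / mean(baseline)
    if slowdown < MIN_RELATIVE_SLOWDOWN:
        return False  # too small to report, whatever the statistics say
    # Welch's t-test: do the two samples plausibly have different means,
    # given their (possibly unequal) variances?
    _, p_value = stats.ttest_ind(baseline, candidate, equal_var=False)
    return p_value < SIGNIFICANCE_LEVEL

# Noisy but unchanged timings are not flagged:
print(is_regression([1.00, 1.02, 0.99, 1.01], [1.01, 1.00, 1.02, 0.99]))  # False
# A consistent ~5% slowdown is:
print(is_regression([1.00, 1.02, 0.99, 1.01], [1.05, 1.07, 1.04, 1.06]))  # True

Requiring both a minimum relative slowdown and statistical significance is one simple way to keep the false-positive rate down; the right thresholds would of course have to be tuned per board.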