similar to: [LLVMdev] RFC:LNT Improvements

2014 Apr 30
2
[LLVMdev] RFC:LNT Improvements
On 30 April 2014 10:50, Tobias Grosser <tobias at grosser.es> wrote: > Only then we can judge the effects of changes that are aimed to increase the > quality. Agreed. > My proposal is to do this right ahead. As there is enough data from the > public X86 -O3 runs (10 samples each run, with 3-5 commits between each > run), the only missing piece seems to be the LNT changes to
2014 Apr 30
4
[LLVMdev] RFC:LNT Improvements
On 30/04/2014 16:20, Yi Kong wrote: > Hi Tobias, Renato, > > Thanks for your attention to my RFC. > On 30 April 2014 07:50, Tobias Grosser <tobias at grosser.es> wrote: > >> - Show and graph total compile time > >> There is no obvious way to scale up the compile time of > >> individual benchmarks, so total time is the best thing we can do to >
2014 Apr 30
2
[LLVMdev] RFC:LNT Improvements
On 30 April 2014 07:50, Tobias Grosser <tobias at grosser.es> wrote: > In general, I see such changes as a second step. First, we want to have a > system in place that allows us to reliably detect if a benchmark is noisy or > not, second we want to increase the number of benchmarks that are not noisy > and where we can use the results. I personally use the test-suite for
2014 Apr 30
2
[LLVMdev] RFC:LNT Improvements
On 30 April 2014 10:21, Tobias Grosser <tobias at grosser.es> wrote: > To my understanding, the first patches should just improve LNT to report how > reliable the results it reports are. So there is no way that this can affect > the test suite runs, which means I do not see why we would want to delay > such changes. > > In fact, if we have a good idea which kernels are
2013 Jun 27
7
[LLVMdev] [LNT] Question about results reliability in LNT infrustructure
There are a few things we have looked at with LNT runs, so I will share the insights we have had so far. A lot of the problems we have are artificially created by our test protocols instead of the compiler changes themselves. I have been doing a lot of large sample runs of single benchmarks to characterize them better. Some key points: 1) Some benchmarks are bi-modal or multi-modal, single
2017 Feb 27
3
Noisy benchmark results?
Two other things: 1) I get massively more stable execution times on 16.04 than on 14.04 on both x86 and ARM because 16.04 does far fewer gratuitous moves from one core to another, even without explicit pinning. 2) turn off ASLR: "echo 0 > /proc/sys/kernel/randomize_va_space". As well as getting stable addresses for debugging repeatability, it also stabilizes execution time
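As a hedged illustration of the ASLR tip above, the following Python sketch reads the same kernel setting before a benchmark run. The function names `parse_aslr_setting` and `aslr_status` are hypothetical helpers, not part of LNT; the value meanings (0 off, 1 partial, 2 full) follow the Linux kernel sysctl documentation. The sketch only inspects the setting and does not change it.

```python
from pathlib import Path

# Linux sysctl file mentioned in the post above.
ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")


def parse_aslr_setting(text):
    """Translate the file contents into a short description (hypothetical helper)."""
    value = int(text.strip())
    meanings = {
        0: "disabled",  # no address space randomization
        1: "partial",   # stack, mmap base and VDSO are randomized
        2: "full",      # as 1, plus heap (brk) randomization
    }
    return meanings.get(value, "unknown")


def aslr_status(path=ASLR_PATH):
    """Return the ASLR state, or 'unavailable' if the file cannot be read or parsed."""
    try:
        return parse_aslr_setting(Path(path).read_text())
    except (OSError, ValueError):
        return "unavailable"


if __name__ == "__main__":
    print("ASLR:", aslr_status())
```

A benchmarking script could call `aslr_status()` first and warn when the result is not `"disabled"`, so that noisy runs are not mistaken for regressions.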
2013 Jun 27
0
[LLVMdev] [LNT] Question about results reliability in LNT infrustructure
On Jun 27, 2013, at 9:27 AM, Renato Golin <renato.golin at linaro.org> wrote: > On 27 June 2013 17:05, Tobias Grosser <tobias at grosser.es> wrote: > We are looking for a good way/value to show the reliability of individual results in the UI. Do you have some experience, what a good measure of the reliability of test results is? > > Hi Tobi, > > I had a look at
2017 Feb 27
8
Noisy benchmark results?
Hi, I'm trying to run the benchmark suite: http://llvm.org/docs/TestingGuide.html#test-suite-quickstart I'm doing it the lnt way, as described at: http://llvm.org/docs/lnt/quickstart.html I don't know what to expect but the results seem to be quite noisy and unstable. E.g. I've done two runs on two different commits that only differ by a space in CODE_OWNERS.txt on my 12
2017 Feb 28
2
Noisy benchmark results?
> On Feb 27, 2017, at 1:36 AM, Kristof Beyls via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi Mikael, > > Some noisiness in benchmark results is expected, but the numbers you see seem to be higher than I'd expect. > A number of tricks people use to get lower noise results are (with the lnt runtest nt command line options to enable it between brackets): > *
2015 May 15
6
[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
tl;dr in low data situations we don’t look at past information, and that increases the false positive regression rate. We should look at the possibly incorrect recent past runs to fix that. Motivation: LNT’s current regression detection system has a false positive rate that is too high to make it useful. With test suites as large as the llvm “test-suite” a single report will show hundreds of
2013 Jun 27
0
[LLVMdev] [LNT] Question about results reliability in LNT infrustructure
Just forwarding this to the list, my original reply was bounced. On Jun 27, 2013, at 11:14 AM, Chris Matthews <chris.matthews at apple.com> wrote: > There are a few things we have looked at with LNT runs, so I will share the insights we have had so far. A lot of the problems we have are artificially created by our test protocols instead of the compiler changes themselves. I have been
2014 Oct 16
3
[LLVMdev] Segfault on AArch64 LNT
Hi, Have you guys seen this? http://lab.llvm.org:8011/builders/clang-aarch64-lnt/builds/1522 There are a lot of commits in there, and I'm far away from ARM64 hardware for a few days, so if one of you guys could have a look, it'd be great. :) cheers, --renato
2015 May 18
2
[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
Hi Chris and others! I totally support any work in this direction. In the current state LNT’s regression detection system is too noisy, which makes it almost impossible to use in some cases. If after each run a developer gets a dozen ‘regressions’, none of which happens to be real, he/she won’t care about such reports after a while. We clearly need to filter out as much noise as we can - and
2013 Jun 27
2
[LLVMdev] [LNT] Question about results reliability in LNT infrustructure
On 27 June 2013 17:05, Tobias Grosser <tobias at grosser.es> wrote: > We are looking for a good way/value to show the reliability of individual > results in the UI. Do you have some experience, what a good measure of the > reliability of test results is? > Hi Tobi, I had a look at this a while ago, but never got around to actually working on it. My idea was to never use
2015 May 19
5
[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
The reruns flag already does that. It helps a bit, but only as long as the benchmark is flagged as regressed. > On May 18, 2015, at 8:28 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Mon, May 18, 2015 at 11:24 AM, Mikhail Zolotukhin <mzolotukhin at apple.com> wrote: > Hi Chris and others! > > I
2014 Jan 17
4
[LLVMdev] Why is the default LNT aggregation function min instead of mean
On Thu, Jan 16, 2014 at 5:32 PM, Tobias Grosser <tobias at grosser.es> wrote: > On 01/17/2014 02:17 AM, David Blaikie wrote: > >> Right - you usually won't see a normal distribution in the noise of test >> results. You'll see results clustered around the lower bound with a long >> tail of slower and slower results. Depending on how many samples you do it
2014 Aug 01
11
[LLVMdev] Dev Meeting BOF: Performance Tracking
All, I'm curious to know if anyone is interested in tracking performance (compile-time and/or execution-time) from a community perspective? This is a much loftier goal than just supporting build bots. If so, I'd be happy to propose a BOF at the upcoming Dev Meeting. Chad
2014 Jan 17
2
[LLVMdev] Why is the default LNT aggregation function min instead of mean
Right - you usually won't see a normal distribution in the noise of test results. You'll see results clustered around the lower bound with a long tail of slower and slower results. Depending on how many samples you do it might be appropriate to take the mean of the best 3, for example - but the general approach of taking the fastest N does have some basis in any case. Not necessarily the
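To make the aggregation choices in this thread concrete, here is a minimal Python sketch comparing the minimum, the plain mean, and the "mean of the best N" suggested above. The function names and the sample timings are invented for illustration and are not taken from LNT.

```python
from statistics import mean


def aggregate_min(samples):
    """Fastest observed run: least affected by one-sided slowdown noise."""
    return min(samples)


def aggregate_mean(samples):
    """Plain average: pulled upwards by the long tail of slow runs."""
    return mean(samples)


def aggregate_best_n(samples, n=3):
    """Mean of the n fastest runs: a compromise between min and mean."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return mean(sorted(samples)[:n])


if __name__ == "__main__":
    # Invented timings in seconds: clustered near a lower bound, with a slow tail.
    timings = [1.00, 1.01, 1.02, 1.05, 1.40, 2.10]
    print("min:        ", aggregate_min(timings))
    print("mean:       ", aggregate_mean(timings))
    print("best-3 mean:", aggregate_best_n(timings, 3))
```

With skewed data like this, the plain mean lands well above the cluster of fast runs, while the minimum and the best-3 mean stay close to the lower bound described in the post.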
2017 Mar 01
2
Noisy benchmark results?
On 28 Feb 2017, at 22:50, Michael Zolotukhin <mzolotukhin at apple.com> wrote: I also usually rerun suspiciously improved or regressed tests to verify the performance change. Most of the time, if it was just noise, the test doesn’t appear on another run. I wish LNT (or any other script) could do that for me :) Michael Doesn't the lnt runtest nt
2015 May 27
2
[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
Update: in that same block of 10,000 LLVM/Clang revisions, this is the number of distinct SHA1 hashes for the binaries of the following benchmarks: 7 MultiSource/Applications/aha/aha 2 MultiSource/Benchmarks/BitBench/drop3/drop3 10 MultiSource/Benchmarks/BitBench/five11/five11 7 MultiSource/Benchmarks/BitBench/uudecode/uudecode 3 MultiSource/Benchmarks/BitBench/uuencode/uuencode 5
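The counting described in this update can be sketched in Python as follows. This is an assumption-laden illustration, not the script used in the thread: `count_distinct_binaries` is a hypothetical helper, and the input is modelled as (benchmark name, binary contents) pairs, one per built revision. One plausible use, not stated in the excerpt, is to skip performance comparisons between revisions whose benchmark binary is byte-for-byte identical.

```python
import hashlib
from collections import defaultdict


def sha1_of_bytes(data):
    """Hex SHA1 digest of a binary's contents."""
    return hashlib.sha1(data).hexdigest()


def count_distinct_binaries(builds):
    """Count distinct binaries per benchmark.

    builds: iterable of (benchmark_name, binary_bytes) pairs, one pair per
    revision at which that benchmark was built (hypothetical input format).
    """
    seen = defaultdict(set)
    for name, data in builds:
        seen[name].add(sha1_of_bytes(data))
    return {name: len(hashes) for name, hashes in seen.items()}


if __name__ == "__main__":
    # Invented stand-ins for real binaries built at successive revisions.
    builds = [
        ("bench/a", b"code-v1"),
        ("bench/a", b"code-v1"),  # unchanged binary at the next revision
        ("bench/a", b"code-v2"),
        ("bench/b", b"code-v1"),
    ]
    for name, count in sorted(count_distinct_binaries(builds).items()):
        print(count, name)
```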