Renato Golin
2015-May-21 08:43 UTC
[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
On 20 May 2015 at 23:31, Sean Silva <chisophugis at gmail.com> wrote:
> In the last 10,000 revisions of LLVM+Clang, only 10 revisions actually
> caused the binary of MultiSource/Benchmarks/BitBench/five11 to change. So if
> we just store a hash of the binary in the database, we should be able to pool
> all samples we have collected while the binary is the same as it
> currently is, which will let us use significantly more datapoints for the
> reference.

+1

> Also, we can trivially eliminate running the regression detection algorithm
> if the binary hasn't changed.

+2!

--renato
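A minimal sketch of the pooling idea quoted above, assuming each sample is stored as a (revision, binary hash, runtime) tuple. The record layout and the function names are hypothetical and are not LNT's actual schema or API; the median is just one possible choice of reference statistic.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample records: (revision, binary_hash, runtime_seconds).
# This layout is illustrative only; it is not how LNT stores samples.


def pool_by_hash(samples):
    """Group runtime samples by the hash of the binary that produced them."""
    pools = defaultdict(list)
    for _revision, binary_hash, runtime in samples:
        pools[binary_hash].append(runtime)
    return pools


def reference_for(samples, current_hash):
    """Median of every sample taken with the current binary.

    Returns None when this hash has not been seen before.
    """
    pool = pool_by_hash(samples).get(current_hash)
    return median(pool) if pool else None


def binary_changed(previous_hash, current_hash):
    """True when the binary differs from the previous run's binary."""
    return previous_hash != current_hash
```

Whether an unchanged hash is enough to skip regression detection entirely is a policy decision; the sketch only reports whether the binary changed and leaves that decision to the caller.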
Chris Matthews
2015-May-21 18:24 UTC
[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
I agree this is a great idea. I think it needs to be fleshed out a little though.

It would still be wise to run the regression detection algorithm, because the test suite changes and the machines change, and the algorithm is not perfect yet. It would be a valuable source of information though.

This is not a small change to how LNT works, so I think some due diligence is necessary. Is clang *really* that deterministic, especially over successive revs? I know it is supposed to be. Does anyone have any data to show this is going to be an effective approach? It seems like there are benchmarks in the test-suite which use __DATE__ and __TIME__ in them. I assume that will be a problem?
Renato Golin
2015-May-21 18:30 UTC
[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
We also need to consider multiple binaries, such as shared libraries that are also compiled, for instance libc++.

Maybe a simple binary diff, plus ldd, plus a binary diff on all deps would work...

Cheers,
Renato
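One way to read the "binary diff + ldd + binary diff on all deps" suggestion is to fold the benchmark binary and its resolved shared libraries into a single hash. The sketch below assumes the usual Linux `ldd` line format (`name => /path (address)`); the function names are hypothetical, SHA-256 is an arbitrary choice, and the `ldd` output is passed in as text rather than obtained by running the tool.

```python
import hashlib
import re


def parse_ldd_paths(ldd_output):
    """Extract resolved library paths from `ldd` output.

    Only lines of the form `name => /path (address)` are used. Entries
    without a resolved path (the vDSO, the bare dynamic loader line,
    `not found`) are skipped.
    """
    paths = []
    for line in ldd_output.splitlines():
        match = re.match(r"\s*\S+ => (/\S+)", line)
        if match:
            paths.append(match.group(1))
    return paths


def file_digest(path):
    """SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def combined_hash(binary_path, dep_paths):
    """Hash the binary together with its dependencies, in a stable order."""
    h = hashlib.sha256()
    for path in [binary_path] + sorted(dep_paths):
        h.update(file_digest(path).encode())
    return h.hexdigest()
```

With this approach a change in libc++ alone would change the combined hash even when the benchmark executable is byte-identical.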
Sean Silva
2015-May-21 21:13 UTC
[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
On Thu, May 21, 2015 at 11:24 AM, Chris Matthews <chris.matthews at apple.com> wrote:
> I agree this is a great idea. I think it needs to be fleshed out a little
> though.
>
> It would still be wise to run the regression detection algorithm, because
> the test suite changes and the machines change, and the algorithm is not
> perfect yet. It would be a valuable source of information though.

How would running it as part of regular testing change anything? Presumably the only purpose it would serve is retrospectively going back and seeing false positives in the aggregate. But if we are already doing offline analysis, we can run the regression detection algorithm (or any prospective ones) offline on the raw data; it doesn't take that long.

> This is not a small change to how LNT works, so I think some due diligence
> is necessary. Is clang *really* that deterministic, especially over
> successive revs?

Yes. Actually, Google's build system depends on this for its caching strategy to work, so the Google guys are usually on top of any issues in this respect (thanks, Google guys!).

> I know it is supposed to be. Does anyone have any data to show this is
> going to be an effective approach? It seems like there are benchmarks in
> the test-suite which use __DATE__ and __TIME__ in them. I assume that will
> be a problem?

__DATE__ and __TIME__ should be easy to solve by modifying the benchmark, or by teaching clang to always return a fixed value for them (maybe we already have this? IIRC Google's build system does something like this; or maybe they do it at the OS level).

-- Sean Silva
Kristof Beyls via llvm-dev
2015-Oct-02 07:44 UTC
[llvm-dev] [LLVMdev] Proposal: change LNT's regression detection algorithm and how it is used to reduce false positives
FWIW - the patch to record the hash of binaries from the test-suite into the LNT database finally landed yesterday; see r249026, r249034, r249035.

So far, LNT only records the hash data into its database, but doesn't use it in any analysis or chart yet. If you upgrade your instance of LNT now, hashes will start being recorded. Future uses of these hashes in LNT analyses will be able to make use of historical hashes from the point in time you started using the now top-of-trunk LNT.

One idea on how to use the data, besides the automatic noise analysis algorithm, is to color the background of charts based on the hash value, so that it's immediately visible for which time periods the binary remained the same. At least for the sparklines on the daily report page, this shouldn't be too hard to do.

We ought to also upgrade the instance of LNT running at llvm.org/perf, but I'm still a bit confused over who knows how to do that. Tanya or Daniel, could you do that?

Thanks,
Kristof
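The chart-coloring idea comes down to finding the contiguous runs of identical hashes along the x-axis. A minimal sketch, assuming the hashes are available as a list in run order; the function names and the two-color palette are hypothetical and not part of LNT.

```python
from itertools import groupby


def hash_runs(ordered_hashes):
    """Return (start_index, end_index, hash) for each run of equal hashes.

    `ordered_hashes` lists the binary hash of each run in chronological
    order; both indices are inclusive.
    """
    runs = []
    start = 0
    for value, group in groupby(ordered_hashes):
        length = len(list(group))
        runs.append((start, start + length - 1, value))
        start += length
    return runs


def band_colors(runs, palette=("#f0f0f0", "#d8e8ff")):
    """Alternate background colors so adjacent runs are distinguishable."""
    return [
        (start, end, palette[i % len(palette)])
        for i, (start, end, _hash) in enumerate(runs)
    ]
```

Alternating between two colors is enough to show where the binary changed; mapping each distinct hash to its own color would additionally show when an earlier binary reappears.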