thr3ads.net - llvm dev - [LLVMdev] Recording hash of binaries in test-suite and LNT. [Jul 2015]

Kristof Beyls

2015-Jul-07 18:37 UTC

[LLVMdev] Recording hash of binaries in test-suite and LNT.

I've implemented a test-suite patch and an LNT patch to calculate a hash
for each binary in the test-suite & to store it in the LNT database.

The test-suite patch is surprisingly simple. The only thing I had to do
to get stable hashes is to strip out the .comment and all .note sections.
The attached spreadsheet shows the calculated hashes by the patch across
the test-suite for a range of LLVM svn revisions from last week, each
roughly a day apart from each other. It does show indeed that on about
half of the days the binaries didn't change. The hashes were collected
on a linux-x86_64 system.
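
A minimal sketch of the strip-then-hash step described above; it is not the attached patch. It assumes GNU objcopy is available on the PATH and uses SHA-256. The tool, the hash algorithm, and the names strip_command, file_digest and stable_hash are all assumptions made for illustration.

```python
import hashlib
import subprocess
import tempfile
from pathlib import Path


def strip_command(src, dst, objcopy="objcopy"):
    """Build an objcopy command that drops .comment and all .note* sections."""
    return [
        objcopy,
        "--remove-section=.comment",
        "--remove-section=.note*",  # objcopy expands the wildcard itself
        str(src),
        str(dst),
    ]


def file_digest(path, algorithm="sha256"):
    """Hash a file in chunks so large binaries are not read in one go."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def stable_hash(binary, objcopy="objcopy"):
    """Hash a copy of `binary` with the unstable sections removed."""
    with tempfile.TemporaryDirectory() as tmp:
        stripped = Path(tmp) / "stripped"
        # Assumption: objcopy is installed; the original binary is left untouched.
        subprocess.run(strip_command(binary, stripped, objcopy), check=True)
        return file_digest(stripped)
```

Working on a temporary copy keeps the benchmark binary itself unchanged, so the hash can be computed before or after the timing run.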

The attached lnt patch is quite a bit bigger - adding a new type of
sample field (hash) and adapting the rest of LNT to make LNT's regression
tests pass. I didn't attempt to make use of the hash values in any of
LNT's analyses or reports in this patch. I've got a vague idea that maybe
the first easy & useful additions could be to color-code the background
in the run-chart with the hash-value of the binary. That way, you could
see which sample points were produced by identical binaries. The same
could be done for the spark lines on the daily report page.
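
One way the color-coding idea could work, as a rough sketch only: derive a background color deterministically from the hash, so that identical binaries always get the same color. The function name hash_to_color and the lightening step are assumptions for illustration; nothing here reflects LNT's actual charting code.

```python
def hash_to_color(hash_hex):
    """Map a hex digest to a CSS color, lightened for use as a background."""
    # Use the first three bytes of the digest as red, green and blue.
    r, g, b = (int(hash_hex[i:i + 2], 16) for i in (0, 2, 4))
    # Blend each channel halfway towards white so plotted data stays readable.
    r, g, b = ((c + 255) // 2 for c in (r, g, b))
    return "#{:02x}{:02x}{:02x}".format(r, g, b)
```

Two different hashes can still map to similar colors, so this helps a reader spot runs of identical binaries but does not replace comparing the hashes themselves.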

Bottom line: at least on linux platforms, it seems pretty straightforward
to compute useful hashes from binaries; see the attached
test-suite patch. I'm assuming that on Darwin platforms the exact same
patch - or maybe with some tweaks on which sections to strip - should
work too, but don't know enough about Darwin to know for sure.

The LNT changes are indeed more invasive. I've attached my current version
of the patch I've got for that.

What do you think of this approach?

Thanks,

Kristof
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Chris Matthews
> Sent: 21 May 2015 19:25
> To: Renato Golin
> Cc: LLVM Developers Mailing List
> Subject: Re: [LLVMdev] Proposal: change LNT's regression detection
> algorithm and how it is used to reduce false positives
> 
> I agree this is a great idea.  I think it needs to be fleshed out a
> little though.
> 
> It would still be wise to run the regression detection algorithm,
> because the test suite changes and the machines change, and the
> algorithm is not perfect yet.  It would be a valuable source of
> information though.
> 
> This is not a small change to how LNT works, so I think some due
> diligence is necessary.  Is clang *really* that deterministic,
> especially over successive revs?  I know it is supposed to be.  Does
> anyone have any data to show this is going to be an effective approach?
> It seems like there are benchmarks in the test-suite which use __DATE__
> and __TIME__ in them. I assume that will be a problem?
> 
> > On May 21, 2015, at 1:43 AM, Renato Golin <renato.golin at linaro.org> wrote:
> >
> > On 20 May 2015 at 23:31, Sean Silva <chisophugis at gmail.com> wrote:
> >> In the last 10,000 revisions of LLVM+Clang, only 10 revisions
> >> actually caused the binary of MultiSource/Benchmarks/BitBench/five11
> >> to change. So if we just store a hash of the binary in the database, we
> >> should be able to pool all samples we have collected while the binary
> >> is the same as it currently is, which will let us use
> >> significantly more datapoints for the reference.
> >
> > +1
> >
> >
> >> Also, we can trivially eliminate running the regression detection
> >> algorithm if the binary hasn't changed.
> >
> > +2!
> >
> > --renato
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
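
The pooling idea quoted above could be sketched roughly as follows. This is an illustration only, assuming samples are available as (hash, value) pairs; the names pool_by_hash and reference_samples are hypothetical and do not come from LNT.

```python
from collections import defaultdict


def pool_by_hash(samples):
    """Group (hash, value) samples so runs of an identical binary share a pool."""
    pools = defaultdict(list)
    for binary_hash, value in samples:
        pools[binary_hash].append(value)
    return dict(pools)


def reference_samples(samples, current_hash):
    """Return every recorded value measured on the same binary as the current one."""
    return pool_by_hash(samples).get(current_hash, [])
```

With this grouping, the reference for a new run is every earlier measurement with the same hash rather than only the samples from the previous revision.
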
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-support-for-storing-hash-of-test-binaries.patch
Type: application/octet-stream
Size: 60554 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150707/bb86ee1f/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-suite-hash-binaries.patch
Type: application/octet-stream
Size: 2093 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150707/bb86ee1f/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-suite_hash_comparisons.xlsx
Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size: 301435 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150707/bb86ee1f/attachment.xlsx>

Chris Matthews

2015-Jul-07 21:47 UTC

[LLVMdev] Recording hash of binaries in test-suite and LNT.

This is a big patch; it might take me a while to review it.
> On Jul 7, 2015, at 11:37 AM, Kristof Beyls <kristof.beyls at arm.com> wrote:
> 
> <0001-Add-support-for-storing-hash-of-test-binaries.patch>

Sean Silva

2015-Jul-08 02:59 UTC

[LLVMdev] Recording hash of binaries in test-suite and LNT.

Is there a way to avoid running the perf test for binaries that haven't
changed? I guess that it might be useful for a bit of redundancy, but for
doing the analysis I was doing, which involved bisecting back through
history to pinpoint at which revisions the hashes changed, it would be
useful to avoid wasting time benchmarking programs known to be the same
binary (if that matters, then there is a bug in how the perf is being
measured, or it is an unrelated system problem which, while it might be
interesting to dive into, may not be the focus).

It's interesting that you had to strip out the .comment and .note. I didn't
have to do that on mac. Do you know if there is a linker flag or compiler
flag on linux that we can use to avoid outputting them in the first place?

-- Sean Silva


Kristof Beyls

2015-Jul-08 09:51 UTC

[LLVMdev] Recording hash of binaries in test-suite and LNT.

Not running the LNT perf tests for binaries that haven’t changed: I don’t think
there currently is a way to do that. If someone added that, I guess the most
complex part would be implementing the different format(s) in which to
communicate to LNT what the hashes of the previous versions of the tests are. There
is already logic to do the build and run step separately – search for
“config.build_threads” in lnt/tests/nt.py. There is also already logic to only
run sub-parts of the test-suite. The logic that needs adding is filtering the
tests to run based on comparing the hash values from the build step with
wherever the predefined uninteresting hash values come from.

I don’t know of compiler or linker flags to not produce or remove those .comment
and .note sections. It’s easy to strip them out before producing a hash, so I
think stripping them is better than requiring LNT to inject the necessary
command line options for all possible compilers and linkers in use.

Thanks,

Kristof


Kristof Beyls

2015-Jul-15 18:20 UTC

[LLVMdev] Recording hash of binaries in test-suite and LNT.

Sure - no problem.

To make the review easier, I've uploaded it into phabricator at
http://reviews.llvm.org/D11231.

Thanks,

Kristof
> -----Original Message-----
> From: Chris Matthews [mailto:chris.matthews at apple.com]
> Sent: 07 July 2015 22:47
> To: Kristof Beyls
> Cc: Renato Golin; LLVM Developers Mailing List; llvm-commits
> Subject: Re: [LLVMdev] Recording hash of binaries in test-suite and LNT.
> 
> This is a big patch, it might take me a while to review it.
> 
> > On Jul 7, 2015, at 11:37 AM, Kristof Beyls <kristof.beyls at arm.com> wrote:
> >
> > <0001-Add-support-for-storing-hash-of-test-binaries.patch>
