thr3ads.net - llvm dev - [LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives [Jun 2015]

Chris Matthews

2015-Jun-02 19:04 UTC

[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives

I like that idea!

> On Jun 2, 2015, at 12:00 PM, Smith, Kevin B <kevin.b.smith at intel.com> wrote:
>
> The code for cmpimage and getdep consists of five source files, with the following sizes:
>  
> $ wc *
>   5912  20353 191869 cmpimage.cpp
>    290   1328  10668 elf.h
>   1496   5006  41691 getdep.cpp
>    233    959   7692 macho.h
>    403   1831  18394 pecoff.h
>   8334  29477 270314 total
>  
> Building each of them is just a simple compilation with whatever C++ compiler you happen to be using (clang, icc, cl, g++):
>
> $(CXX) -o cmpimage -O2 cmpimage.cpp
> $(CXX) -o getdep -O2 getdep.cpp
>  
> This seems like it would fit rather easily into test-suite/tools, which already exists and has a Makefile that the commands to build these could be integrated into.
>
> This is my best guess/opinion based on a cursory look over the test-suite directory structure.
>  
> Kevin
>  
>  
> From: Chris Matthews [mailto:chris.matthews at apple.com]
> Sent: Thursday, May 28, 2015 1:02 PM
> To: Smith, Kevin B
> Cc: Philip Reames; Sean Silva; LLVM Developers Mailing List
> Subject: Re: [LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
>
> Where is the best place to keep this?
>
> - As a third-party tool we all use?
> - Contribute as a new project?
> - Lives in test-suite/utils?
> - Lives in llvm/utils?
>
> On May 28, 2015, at 11:22 AM, Smith, Kevin B <kevin.b.smith at intel.com> wrote:
>
> OK, there is interest from at least a couple of people.  What should next steps be?
>
> Kevin
>
> From: Chris Matthews [mailto:chris.matthews at apple.com]
> Sent: Thursday, May 28, 2015 10:57 AM
> To: Philip Reames
> Cc: Smith, Kevin B; Sean Silva; LLVM Developers Mailing List
> Subject: Re: [LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
>
> I agree. I think there are a lot of exciting uses for this tool.  A stage 3 build bot would be another one.
>
> On May 28, 2015, at 10:14 AM, Philip Reames <listmail at philipreames.com> wrote:
>
> I'd love to see this tool contributed, even if it isn't used for regression detection work.  I've got a couple of hacked-up scripts which do similar things, and having a robust tool available for this would be very useful.
> 
> Philip
> 
> On 05/26/2015 09:53 AM, Smith, Kevin B wrote:
> Intel has a binary comparator tool that we have been using for several years for comparing output binaries to see if the code within them is considered identical.  We use it to eliminate runs (and therefore some performance noise) from our own performance tracking tools.
>
> We are willing to contribute the source code for this to the LLVM community if there is interest.
>
> There are two programs involved: getdep, which displays the list of DLL/.so dependencies of the image in question, and cmpimage itself, which does the comparison ignoring the parts not contributed by the compiler.  The cmpimage program is also almost completely derived from the published object format descriptions.
>
> Let me know if there is interest in these pieces of tooling, and if so, what you think next steps should be.
>  
> Kevin B. Smith
>  
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Sean Silva
> Sent: Thursday, May 21, 2015 2:14 PM
> To: Chris Matthews
> Cc: LLVM Developers Mailing List
> Subject: Re: [LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
>
> On Thu, May 21, 2015 at 11:24 AM, Chris Matthews <chris.matthews at apple.com> wrote:
> > I agree this is a great idea.  I think it needs to be fleshed out a little though.
> >
> > It would still be wise to run the regression detection algorithm, because the test suite changes and the machines change, and the algorithm is not perfect yet.  It would be a valuable source of information though.
>
> How would running it as part of regular testing change anything? Presumably the only purpose it would serve is retrospectively going back and seeing false positives in the aggregate. But if we are already doing offline analysis, we can run the regression detection algorithm (or any prospective ones) offline on the raw data; it doesn't take that long.
>
> > This is not a small change to how LNT works, so I think some due diligence is necessary.  Is clang *really* that deterministic, especially over successive revs?
>
> Yes. Actually, Google's build system depends on this for its caching strategy to work, and so the Google guys are usually on top of any issues in this respect (thanks, Google guys!).
>
> > I know it is supposed to be.  Does anyone have any data to show this is going to be an effective approach?  It seems like there are benchmarks in the test-suite which use __DATE__ and __TIME__ in them. I assume that will be a problem?
>
> __DATE__ and __TIME__ should be easy to solve by modifying the benchmark, or teaching clang to always return a fixed value for them (maybe we already have this? IIRC Google's build system does something like this; or maybe they do it at the OS level).
>
> -- Sean Silva
>  
> 
> > On May 21, 2015, at 1:43 AM, Renato Golin <renato.golin at linaro.org> wrote:
> >
> > On 20 May 2015 at 23:31, Sean Silva <chisophugis at gmail.com> wrote:
> >> In the last 10,000 revisions of LLVM+Clang, only 10 revisions actually
> >> caused the binary of MultiSource/Benchmarks/BitBench/five11 to change. So if
> >> we just store a hash of the binary in the database, we should be able to pool
> >> all samples we have collected while the binary is the same as it
> >> currently is, which will let us use significantly more datapoints for the
> >> reference.
> >
> > +1
> >
> >> Also, we can trivially eliminate running the regression detection algorithm
> >> if the binary hasn't changed.
> >
> > +2!
> >
> > --renato
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>        
http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>
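
Sean's pooling idea quoted above could be sketched roughly as follows. This is only an illustration in Python (the language LNT is written in), not LNT code: `binary_hash`, `pool_by_hash`, and `needs_regression_check` are hypothetical names, and hashing the raw file contents with SHA-256 is an assumption. A raw hash is stricter than what cmpimage is described as doing, since cmpimage ignores the parts of the image not contributed by the compiler.

```python
import hashlib
from collections import defaultdict


def binary_hash(data: bytes) -> str:
    """Return a hex digest identifying one build of a benchmark binary."""
    return hashlib.sha256(data).hexdigest()


def pool_by_hash(runs):
    """Group timing samples by the hash of the binary that produced them.

    `runs` is an iterable of (digest, sample) pairs, in submission order.
    """
    pools = defaultdict(list)
    for digest, sample in runs:
        pools[digest].append(sample)
    return dict(pools)


def needs_regression_check(current_digest: str, previous_digest: str) -> bool:
    """Skip detection entirely when the binary is byte-for-byte unchanged."""
    return current_digest != previous_digest
```

With something like this, every sample collected while the digest stays the same can serve as reference data for the current binary, and the detection step only runs when the digest changes.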

Philip Reames

2015-Jun-02 21:24 UTC

[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives

Personally, I would prefer this either live in its own repository, or 
llvm/tools/.  None of my use cases will likely involve the test-suite.

p.s. If this is going to end up an llvm tool, it will need to follow 
LLVM style.

p.p.s. We should probably start a new thread with the proposed addition 
since I imagine many folks are ignoring this one by now given how deep 
it's gotten.

Philip


Sean Silva

2015-Jun-03 01:04 UTC

[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives

On Tue, Jun 2, 2015 at 2:24 PM, Philip Reames <listmail at philipreames.com> wrote:
> Personally, I would prefer this either live in its own repository, or
> llvm/tools/.  None of my use cases will likely involve the test-suite.
>
> p.s. If this is going to end up an llvm tool, it will need to follow LLVM
> style.
>
> p.p.s. We should probably start a new thread with the proposed addition
> since I imagine many folks are ignoring this one by now given how deep it's
> gotten.
>
Maybe just putting it on GitHub for now is the easiest way to at least make it
generally available for review. If we later want to officially pull it in
or integrate it with our build system we can do that.

-- Sean Silva


Chris Matthews

2015-Jun-03 21:44 UTC

[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives

On the original thread topic, in r238965 I committed a much better detection
algorithm, which uses a min-of-diffs approach. And in r238968 I updated the
daily report to pass more data to make use of this.  The change brings the false
positive rate down to about 15% on our internal reports.  For the last two
weeks I have actually been able to detect real regressions in the 1% range!

This approach only works when the previous sample set is large enough to have
some meaningful previous information in it.  Call sites of the regression
detection have to be changed to pass more data.  For the daily report I changed
it from comparing last run on current and previous days, to comparing last run
on first day to all runs on previous day.
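
One way to read the min-of-diffs idea, as a minimal sketch and not the code committed in r238965: compare the current sample against every previous sample and keep the smallest absolute difference, so the current run is only flagged if it is far from all of the previous runs. The function names, the 1% default threshold, and the assumption that larger values are worse (execution time) are all illustrative.

```python
def min_of_diffs(current, previous_samples):
    """Return (delta, reference) for the previous sample closest to `current`.

    `delta` is current - reference, so a positive delta means the current
    run was slower than its nearest previous sample.
    """
    if not previous_samples:
        raise ValueError("need at least one previous sample")
    reference = min(previous_samples, key=lambda prev: abs(current - prev))
    return current - reference, reference


def is_regression(current, previous_samples, threshold=0.01):
    """Flag a regression only if even the closest previous sample is more
    than `threshold` (as a fraction of that sample) faster than `current`.
    """
    delta, reference = min_of_diffs(current, previous_samples)
    return reference > 0 and delta / reference > threshold
```

This also shows why the call sites need to pass more data: with a single previous sample the function degenerates to a plain pairwise comparison.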

If this works out well, let's consider changing Run comparisons and Field
comparisons to work in a similar way.
