thr3ads.net - llvm dev - [llvm-dev] RFC: LNT/Test-suite support for custom metrics and test parameterization [May 2016]


Elena Lepilkina via llvm-dev

2016-May-13 06:21 UTC

[llvm-dev] RFC: LNT/Test-suite support for custom metrics and test parameterization

Hi all,

As we understand it, major changes are planned for LNT, so we have stopped our
work on LNT and are waiting for the new version.

One more question, about using the test-suite on its own with CMake. CMake can
only build all the tests and generate lit tests. After that we can run lit and
get a report, but it is not the same as the (simple) report produced with make.
The CMake version of the test-suite has no way to collect custom metrics or
generate other report types, right?

Are there plans to add these features of the make version of the test-suite?

Thanks,

Elena.

From: daniel.dunbar at gmail.com [mailto:daniel.dunbar at gmail.com] On Behalf
Of Daniel Dunbar
Sent: Wednesday, April 27, 2016 10:15 AM
To: Elena Lepilkina <Elena.Lepilkina at synopsys.com>
Cc: Kristof Beyls <Kristof.Beyls at arm.com>; James Molloy <james at
jamesmolloy.co.uk>; llvm-dev <llvm-dev at lists.llvm.org>; nd <nd at
arm.com>
Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics and test
parameterization

Hi all,

First off, let me ask one question about the use case you intend to support:

Is your expectation for LLVM statistics style data that this would be present
for almost all runs in a database, or that it would only be present for a small
subset of runs?


Second, here is my general perspective on the LNT system architecture:

1. It is very important to me that LNT continue to support a
"turn-key" experience. In fact, I *wish* that experience would get
easier, not harder, for example by moving all interactions with the server to a
true admin interface. I would prefer not to introduce additional dependencies in
that case.

I will add that I am not completely opposed to moving to a "turn-key"
architecture which requires substantially more components (e.g., PostgreSQL,
memcached, etc.) as long as it was packaged in such a way that it could offer a
nice turn key experience. For example, if someone was willing to implement &
document a Vagrant or Docker based development model, that would be ok with me,
as long as there was still the option to do fully native deployments on OS X
with real system administrator support.

2. Our internal application architecture is severely lacking. I believe the
standard architecture these days would be to have (load-balancer + front-end +
[load-balancer] + back-end + database), and I think partly we are suffering from
missing most of that infrastructure. In particular, we are missing two important
components:
- There should be a separate backend, which would allow us to implement improved
caching, and a clear model for long-lived or persistent state. I filed this as:
https://llvm.org/bugs/show_bug.cgi?id=27534
- This would give us a place to manage background processing tasks, for example
jobs that reprocess the raw sample data for efficient queries.
If we had such a thing, we could consider using something like memcached for
improving caching (particularly in a larger deployment).

3. My intention was always to use JSON blobs for situations where a small % of
samples want to include arbitrary blobs of extra data. I think standardizing on
PostgreSQL/JSON is the right approach here, given the standard PaaS support for
PostgreSQL and its existing use within our deployments. SQLAlchemy has native
support for this.

4. I don't actually think that a NoSQL database buys us much if anything
(particularly when we have PostgreSQL/JSON available). We are not in a situation
where we would need something like eventual consistency around writes, which
leaves us wanting raw query performance over large sets of relatively
well-structured data. I believe that is a situation which is best served by SQL
with properly designed tables. This is also an area where having an
infrastructure that could handle background processing to load query-optimized
tables & indices would be valuable.

I think it is a mistake to believe that using NoSQL and a schema-less model
without also introducing substantial caching will somehow give better
performance than a real SQL database, and would be very interested to see
performance data showing otherwise (e.g., compare your MongoDB implementation to
one in which all data was in a full-schematized table with proper indices).

5. The database schema used by LNT (which is dynamically instantiated in
response to test suite configurations) is admittedly unorthodox. However, I
still believe this is the right answer to support a turn-key performance
tracking solution that can be customized by the user based on the fields they
wish to track (in fact, if Elena's data is present for almost all samples
then it is exactly the designed use case). I'm not yet convinced that the
right answer here isn't to improve the actual implementation of this
approach; for example, we could require the definition to be created when the
LNT instance is set up, which might solve some of the issues Chris mentioned.
Or, we could expose improved tools or a UI for interacting with the schema via
an admin interface, for example to promote individual fields from a JSON blob to
being first class for improved performance/reporting.

6. To echo one of James's points, I also do think that ultimately the most
complicated part of adding support for arbitrary metrics is not the database
implementation, but managing the complexity in the reporting and graphing logic
(which is already cumbersome). I am fine seeing a layered implementation
approach where we first focus on how just to get the data in, but I do think it
is worth spending some time looking at how to manage the visualizations of that
data.
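
Points 3 and 5 above can be sketched together. The following is only an illustration, assuming a hypothetical `sample` table with one first-class metric column and an `extra` column holding a JSON-encoded blob; the table, column, and metric names are invented for the example, SQLite stands in for PostgreSQL/JSON so that the sketch is self-contained, and none of this is LNT's actual schema. It stores rarely used metrics in the blob and then promotes one of them to a first-class column so that it can be indexed and filtered in SQL.

```python
import json
import sqlite3

# Hypothetical table and column names; LNT's real schema is generated
# dynamically from the test suite configuration and differs from this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample ("
    " id INTEGER PRIMARY KEY,"
    " test TEXT NOT NULL,"
    " compile_time REAL,"  # first-class metric column
    " extra TEXT)"         # JSON-encoded blob for rarely used metrics
)

rows = [
    ("objinst", 1.25, json.dumps({"regalloc_spills": 7})),
    ("nbody", 0.80, None),  # most samples carry no extra data
    ("fib", 0.40, json.dumps({"regalloc_spills": 2, "power_mw": 310})),
]
conn.executemany(
    "INSERT INTO sample (test, compile_time, extra) VALUES (?, ?, ?)", rows
)


def promote_field(conn, field, sql_type):
    """Move one key out of the JSON blob into a new first-class column."""
    # Identifiers cannot be bound as SQL parameters, so 'field' and
    # 'sql_type' must come from a trusted schema definition.
    conn.execute(f"ALTER TABLE sample ADD COLUMN {field} {sql_type}")
    cursor = conn.execute("SELECT id, extra FROM sample WHERE extra IS NOT NULL")
    for sample_id, blob in cursor.fetchall():
        data = json.loads(blob)
        if field in data:
            value = data.pop(field)
            conn.execute(
                f"UPDATE sample SET {field} = ?, extra = ? WHERE id = ?",
                (value, json.dumps(data) if data else None, sample_id),
            )


promote_field(conn, "regalloc_spills", "INTEGER")

# The promoted metric can now be indexed and filtered like any other column.
conn.execute("CREATE INDEX idx_sample_spills ON sample (regalloc_spills)")
heavy = conn.execute(
    "SELECT test FROM sample WHERE regalloc_spills > 5 ORDER BY test"
).fetchall()
remaining = conn.execute("SELECT extra FROM sample WHERE test = 'fib'").fetchone()[0]
```

With PostgreSQL the blob column would presumably be a native JSON/JSONB column rather than text, but the promotion step has the same shape: add a column, backfill it from the blob, then index it.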

 - Daniel


On Tue, Apr 26, 2016 at 11:17 PM, Elena Lepilkina via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
Of course it will run as it does now, but the user will need to have MongoDB
installed.

Installation on Linux distributions that support .deb packages is quite easy:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
echo "deb http://repo.mongodb.org/apt/debian wheezy/mongodb-org/3.2 main" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
sudo apt-get update
sudo apt-get install -y mongodb-org
sudo service mongod start

After that, MongoDB will be running as a service on localhost:27017. The user
should then create a database:
mongo
use <db name>

and set the database name in the config file. The config file will also have
host and port fields for users who configure their own server; these will
default to localhost:27017.

After that, the old steps apply:
~/mysandbox/bin/python ~/lnt/setup.py develop
lnt create ~/myperfdb
lnt runserver ~/myperfdb

MongoDB has detailed installation instructions for all operating systems.

The extra 6-7 commands to install MongoDB take about 2 minutes and only need to
be run once, so they shouldn't be a big problem for new users who would like
to try LNT.

Thanks,

Elena.

-----Original Message-----
From: Kristof Beyls [mailto:Kristof.Beyls at arm.com]
Sent: Tuesday, April 26, 2016 5:15 PM
To: James Molloy <james at jamesmolloy.co.uk>
Cc: Elena Lepilkina <Elena.Lepilkina at synopsys.com>; Chris Matthews
<chris.matthews at apple.com>; llvm-dev <llvm-dev at lists.llvm.org>;
nd <nd at arm.com>
Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics and test
parameterization
I think it's important that it remains simple to get a simple local instance
up and running.
That will make it easier for people to give LNT a try, and also makes it easier
for LNT developers to have everything they need for developing locally.
I have no experience with NoSQL database engines. Would it be possible, assuming
you have the MongoDB/other packages installed for your system, to just run

$ ~/mysandbox/bin/python ~/lnt/setup.py develop
$ lnt create ~/myperfdb
$ lnt runserver ~/myperfdb

and be up and running (which is roughly what is required at the moment)?
Of course good documentation could go a long way if a few extra steps would be
needed.

I do agree with James that if there are no major concerns for using a NoSQL
database, it would be easiest if we only supported one database engine.
For example, I had to do quite a bit of LNT regression test mangling to make
them work on both sqlite and postgres, and it remains harder than it should be
to write regression tests that test the database interface.

Thanks,

Kristof

> On 26 Apr 2016, at 11:37, James Molloy via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>
> Hi Elena,
>
> Thanks for pushing forward with this. I like the idea of using a NoSQL
solution.
>
> My primary reservation is about adding the new NoSQL stuff as an extra
backend. I think the current database backend and its use of SQLAlchemy is
extremely complex and is in fact the most complex part of LNT. Adding something
more (as opposed to *replacing* it) would just make this worse and make it more
likely that contributors wouldn't be able to test LNT very well (having
three engines to test: SQLite, PostgreSQL and MongoDB).
>
> I think it'd be far better all around, if we decide to go with the
NoSQL solution, to just bite the bullet and force users who want to run a server
to install MongoDB.
>
> In my experience most of the teams I've seen using LNT have a single
LNT server instance and submit results to that, rather than launching small
instances to run "lnt viewcomparison".
>
> Cheers,
>
> James
>
> On Tue, 26 Apr 2016 at 09:15 Elena Lepilkina via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> Hi everyone.
>
>
>
> Thanks to everyone who took part in the discussion of this proposal.
>
> After the discussion we understand how other users use LNT and how large
their datasets can be.
>
>
>
> So here is a new, updated proposal.
>
> (Google docs version with some images -
> https://docs.google.com/document/d/11qHNWRIQ2gc2aWH3gNBeeCUB3-JPe7AoMt
> n7n9JoCeY/edit?usp=sharing)
>
>
>
> The goal is the same.
>
> Enable LNT support for custom metrics such as user-defined run-time and
static metrics (power, etc.) and LLVM pass statistic counters. Provide
integration with the LLVM test-suite to automatically collect LLVM statistic
counters or custom metrics.
>
>
>
> Analysis of the current database
>
>
>
> Limitations
>
> 1.      The structure isn’t flexible.
>
> There is no way to run any test-suite other than the simple one.
>
> 2.      Performance is quite poor when the database has a lot of records.
>
> For example, rendering a graph is too slow. On
green-dragon-07-x86_64-O3-flto:42, rendering the compile_time graph for
SingleSource/Benchmarks/Shootout/objinst takes 191.8 seconds.
>
> 3.      It’s difficult to add new features that need to query the sample
table in the database (if we use a BLOB field for custom metrics).
>
> Queries will be needed for more complex analysis. For example, if we would
like to add an additional check for tests whose compile time is too long, we
need the result of a query where this metric is greater than some constant.
>
> Or we may want to compare tests run with different options, so we should
fetch only some tests rather than all of them.
>
> A BLOB field would preserve the current structure and make the system a bit
more flexible, but in the near future it will not be enough.
>
> Fetching all metrics of all tests will make things slow on large datasets,
so this approach is far from optimal.
>
> So we would prefer not to add a BLOB field, since it wouldn’t help us add
new features or have a flexible system in the future.
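
The kind of query described under limitation 3 can be sketched as follows. This is a minimal illustration, assuming a hypothetical `sample` table in which `compile_time` is an ordinary indexed column; the table layout, test names, and numbers are invented for the example and are not taken from LNT. The point is that the filter "metric greater than some constant" runs inside the database, which is not directly possible when the value is hidden in an opaque BLOB.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample (run_id INTEGER, test TEXT, compile_time REAL)"
)
conn.execute("CREATE INDEX idx_sample_compile_time ON sample (compile_time)")
conn.executemany(
    "INSERT INTO sample (run_id, test, compile_time) VALUES (?, ?, ?)",
    [
        (1, "objinst", 12.5),
        (1, "nbody", 0.8),
        (1, "sqlite3", 42.0),
        (2, "objinst", 30.0),
    ],
)


def slow_compiles(conn, run_id, threshold):
    """Return (test, compile_time) pairs of one run above the threshold."""
    cursor = conn.execute(
        "SELECT test, compile_time FROM sample"
        " WHERE run_id = ? AND compile_time > ?"
        " ORDER BY compile_time DESC",
        (run_id, threshold),
    )
    return cursor.fetchall()


slow = slow_compiles(conn, run_id=1, threshold=10.0)
```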
>
>
>
> Proposal
>
>
>
> We suggest adding a third part to LNT (as Chris Matthews suggested). This part
will be used for collecting custom metrics and running any test-suite.
>
> We suggest using a NoSQL database for this part (for example, MongoDB, or the
JSON/JSONB extension of PostgreSQL, which lets PostgreSQL be used as a NoSQL
database). This part will be enabled if the config file contains a path to a
NoSQL database.
>
> This makes it possible to have one Sample table (a collection in NoSQL
terms). If we use the schemaless feature of MongoDB, for example, then new
fields can be added when a new test-suite is run. There would then be one table
with a lot of fields, some of which are empty. The schema of the table
(document) can be changed at any time.
>
> A small prototype was made with MongoDB and the MongoEngine ORM. This ORM was
chosen because MongoAlchemy doesn’t support schemaless features and the latest
MongoKit version has an error with the latest pymongo release.
>
> I tried it on a virtual machine and got the following results on 5,000,000
records:
>
> Current schema – 13.72 seconds
>
> MongoDB – 1.35 seconds.
>
> The results will of course be better on a real server machine.
>
>
>
> To use a test-suite, the user should describe its fields in a file with the
.fields extension, like this:
>
> {
>   "Fields" : [{
>     "TestSuiteName" : "Bytecode",
>     "Type" : "integer",
>     "BiggerIsBetter" : 0,
>     "Show" : true
>   },
>   {
>     "TestSuiteName" : "GCC",
>     "Type" : "real",
>     "BiggerIsBetter" : 0,
>     "Name" : "GCC time"
>   },
>   {
>     "TestSuiteName" : "JIT",
>     "Type" : "real",
>     "BiggerIsBetter" : 0,
>     "Name" : "JIT Compile time",
>     "Show" : true
>   },
>   {
>     "TestSuiteName" : "GCC/LLC",
>     "Type" : "string",
>     "BiggerIsBetter" : 0
>   }]
> }
>
>
>
> A “Show” field was added to describe whether the metric should be shown by
default on the web page (as James Molloy suggested). Other metrics would be
added to the page if the user selects them in the view options.
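
A short sketch of how such a .fields description might be consumed. The JSON is the example from the proposal; everything else is an assumption made for illustration: the `load_fields` helper is hypothetical, a missing "Show" is treated as false, and a missing "Name" falls back to "TestSuiteName". The proposal does not specify either default.

```python
import json

# The field description from the proposal, inlined so the sketch is
# self-contained; in practice it would be read from a .fields file.
FIELDS_TEXT = """
{
  "Fields": [
    {"TestSuiteName": "Bytecode", "Type": "integer", "BiggerIsBetter": 0,
     "Show": true},
    {"TestSuiteName": "GCC", "Type": "real", "BiggerIsBetter": 0,
     "Name": "GCC time"},
    {"TestSuiteName": "JIT", "Type": "real", "BiggerIsBetter": 0,
     "Name": "JIT Compile time", "Show": true},
    {"TestSuiteName": "GCC/LLC", "Type": "string", "BiggerIsBetter": 0}
  ]
}
"""


def load_fields(text):
    """Parse a .fields description into a list of normalized dicts."""
    fields = []
    for entry in json.loads(text)["Fields"]:
        fields.append({
            "key": entry["TestSuiteName"],
            # Assumed default: use the test-suite name as the display name.
            "name": entry.get("Name", entry["TestSuiteName"]),
            "type": entry["Type"],
            "bigger_is_better": bool(entry["BiggerIsBetter"]),
            # Assumed default: metrics are hidden unless "Show" is true.
            "show": entry.get("Show", False),
        })
    return fields


fields = load_fields(FIELDS_TEXT)
default_columns = [f["name"] for f in fields if f["show"]]
optional_columns = [f["name"] for f in fields if not f["show"]]
```

Here `default_columns` holds the metrics displayed by default and `optional_columns` the ones a user would enable through the view options.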
>
>
>
> Conclusion
>
>
>
> This change will let the user choose between the flexible, powerful system
and the limited version with an SQLite database.
>
> If the user chooses the NoSQL version, the data can be copied from the old
database to the new one, so the new features can be used without losing old
data.
>
>
>
> The open question is which NoSQL database would be best for LNT. We are
interested in the opinions of people who know LNT's features better.
>
>
>
> Thanks,
>
>
>
> Elena.
>
>
>
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> Elena Lepilkina via llvm-dev
> Sent: Tuesday, April 26, 2016 9:07 AM
> To: chris.matthews at apple.com
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> and test parameterization
>
>
>
> Hi, Chris.
>
>
>
> Thank you for your answer about the compile tests. From looking through the
code of the compile tests, I understood that they don’t use the test-suite at
all. Am I right? The LNT documentation lacks information and examples on
running compile tests.
>
> We understood that there are two groups of users: users who run servers and
collect a lot of data, and SQLite users. I think the latter would not have
millions of sample records.
>
> I think it’s obvious that there is no universal solution that offers both a
simple installation process and a flexible, high-load system.
>
> I will update the proposal and take into consideration your suggestion about
a third part of the test-suite.
>
>
>
> Thanks
>
>
>
> Elena.
>
>
>
> From: chris.matthews at apple.com [mailto:chris.matthews at apple.com]
> Sent: Monday, April 25, 2016 8:06 PM
> To: Elena Lepilkina <Elena.Lepilkina at synopsys.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> and test parameterization
>
>
>
> I am really torn about this.
>
>
>
> When I implemented the regression tracking stuff recently, it really showed
me how badly we are scaling.  On our production server, the run ingestion can
take well over 100s.  Time is mostly spent in FieldChange generation and
regression grouping. Both have to access a lot of recent samples. This is not
the end of the world, because it runs in a background process.  Where this
really sucks is when a regression has a lot of indicators. The web interface
renders these in a graph, and just trying to pull down 100 graphs worth of data
kills the server.  I ended up limiting those to a max of 10 datasets, and even
that takes 30s to load.
>
>
>
> So I do think we need some improvements to the scalability.
>
>
>
> LNT usage is spread between two groups. The first is users who set up big
servers, with Postgres and apache/Gunicorn. For those users I think NoSQL is the
way to go.  However, our second (and probably more common) group is the people
running a little instance on their own machine to do some local compiler
benchmarking.  Their setup process needs to be dead simple, and I think
requiring a NoSQL database to be set up on their machine first is a
non-starter.  Like we do with sqlite, I think we need a transparent fallback for
people who don’t have a NoSQL database.
>
>
>
> Would it be helpful to anyone if I got a dump of the llvm.org LNT Postgres
database?  It is a good, big dataset to test with, and I assume everyone is okay
with it being public, since the LNT server already is.
>
>
>
>
>
> On Apr 25, 2016, at 4:33 AM, Elena Lepilkina via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>
>
>
>
>
>
>
> From: Elena Lepilkina
> Sent: Monday, April 25, 2016 2:33 PM
> To: 'James Molloy' <james at jamesmolloy.co.uk>; Kristof Beyls
> <Kristof.Beyls at arm.com>; Mehdi Amini <mehdi.amini at apple.com>
> Cc: nd <nd at arm.com>; Matthias Braun <matze at braunis.de>
> Subject: RE: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> and test parameterization
>
>
>
> Hi everyone,
>
>
>
> Thank you for your answer. The BLOB format adds extra steps to working with
metrics. We know that the ComparisonResult class does the analysis work. But
since it fetches all metrics from the database on request, we would need
additional time to process the fields during analysis in the ComparisonResult
class. Maybe it would be better to have one Sample table for each test-suite, as
was suggested before. It should be faster, shouldn’t it? Moreover, the LNT
changes we would like to make next will need to fetch some metrics separately,
and the BLOB format will add delay to those queries.
>
>
>
> As we can see, the performance problem is real now: rendering the graph page
takes about 3 minutes.
>
> <image001.png>
>
> So maybe it would be better to start working with NoSQL databases? I made a
small prototype with TestSuite, TestSuiteFields, Test, Run and Sample tables for
getting time metrics. It works quickly. Using NoSQL also helps solve the problem
of different test-suites having different sample metric fields: it becomes
possible to store different metrics for different test-suites in one table.
>
> What do you think about this proposal?
>
> I used MongoDB, but I know that there is a NoSQL extension for PostgreSQL with
JSONB fields, which are more effective than a JSON-encoded BLOB because they can
be included in queries very simply and can use indexes.
>
>
>
> About the proposal that not all metrics should be shown: this can be added as
a field in the JSON in the .fields file, which describes the fields collected
from the test-suite. To see other metrics, the user would select them with
checkboxes in the view options. Would this solution be suitable?
>
> We can do as you wrote:
>
> “I'd also suggest that if we're adding many more metrics to a test,
we should create a "test sample information" page that the test link
goes to instead of just the graph. This page could contain all counter/metric
data, historic sparklines, the full graph and profiling links.”
>
> But the render time of this page would be too long because of the graph
render time. In my opinion, some users wouldn’t want to wait so long to see some
additional metrics.
>
>
>
> Thanks for your suggestions,
>
>
>
> Elena.
>
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> James Molloy via llvm-dev
> Sent: Monday, April 25, 2016 12:43 PM
> To: Kristof Beyls <Kristof.Beyls at arm.com>; Mehdi Amini
> <mehdi.amini at apple.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>; nd <nd at arm.com>; Matthias
> Braun <matze at braunis.de>
> Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> and test parameterization
>
>
>
> Hi Sergey, Elena,
>
>
>
> Firstly, thanks for this RFC. It's great to see more people actively
using and modifying LNT and the test metrics support in general is rather weak
currently.
>
>
>
> Metrics
>
> -------
>
>
>
> I agree with Daniel and Kristof that your proposed schema changes have the
potential to make many queries extremely slow. Certainly for the metrics
enhancements, I don't see a reason why we need such a radical change in
schema.
>
>
>
> To add custom metrics on the fly, we need to change the schema for the
Sample table. Currently this consists of a column for each metric, but actually
we never ever query those metric values. We never query for example for
"all failing tests in a run" - when we do analyses we use the
ComparisonResult class which reads *all* samples from the database for a run and
performs the analysis entirely in Python.
>
>
>
> Therefore, having a semi-structured format where some fields are
first-class columns and the rest are in a JSON-encoded BLOB (as Daniel suggests)
seems totally acceptable. There is certainly an argument now that we're
using the wrong backend storage solution and that a key-value store might be
more suitable, but that's a very invasive change and I don't think
we've reached the point where we need to force a move from the simplicity of
SQLite.
>
>
>
> Adding an extra BLOB column would be easy - there would just need to be
logic in testsuitedb.py for reading and writing it - the Sample model class
would expose the JSON-encoded fields as normal python fields so the rest of LNT
would be isolated from this change.
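
The idea of exposing JSON-encoded fields as normal Python fields can be sketched like this. The `Sample` class below is a hypothetical stand-in written for illustration, not the model class in testsuitedb.py; the constructor arguments and the `dump_extra` helper are assumptions. First-class columns are plain attributes, and any other attribute lookup falls back to the decoded blob.

```python
import json


class Sample:
    """Hypothetical stand-in for a Sample model; not LNT's real class."""

    def __init__(self, test, compile_time=None, extra_json=None):
        self.test = test                  # first-class column
        self.compile_time = compile_time  # first-class column
        # Decode the JSON blob once; an absent blob means no extra metrics.
        self._extra = json.loads(extra_json) if extra_json else {}

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails, so
        # first-class columns never reach this point.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._extra[name]
        except KeyError:
            raise AttributeError(name) from None

    def set_extra(self, name, value):
        """Record a metric that has no first-class column."""
        self._extra[name] = value

    def dump_extra(self):
        """Return the blob to write back to the database, or None if empty."""
        return json.dumps(self._extra, sort_keys=True) if self._extra else None


sample = Sample("objinst", compile_time=1.25,
                extra_json='{"regalloc_spills": 7}')
sample.set_extra("power_mw", 310)
```

Callers read `sample.compile_time` and `sample.regalloc_spills` the same way, which is what keeps the rest of the code unaware of where a metric is stored.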
>
>
>
> But I think this is a small detail compared to the bigger problem of how to
effectively *display* all this new data. Currently every new metric gets its own
separate table in the report/run views, and this does not scale well at all.
>
>
>
> I think we need some more concepts in the metric system to make it
scalable:
>
>
>
>   * What "attribute" of the test is this metric measuring? For
example, both "exec_time" and "score" measure the same
attribute; performance of the generated code. It's superfluous to have them
displayed in separate tables. However mem_size and compile_time both measure
completely different aspects of the test.
>
>   * Is this metric useful to display at the top level? or should it only be
exposed when more data about a test result is requested?
>
>     * An example of this is the pass statistics. I don't want my daily
report view cluttered by the time spent in register allocation for every test!
OK, this is useful information when debugging a problem, but it should be
available when requested rather than by default.
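
The two concepts in the list above can be sketched as metadata attached to each metric. This is only an illustration: the `Metric` class, the attribute labels, and the `regalloc_time` metric are assumptions made for the example, not existing LNT code. Metrics that measure the same attribute share one report table, and metrics not marked as top-level appear only when details are requested.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    attribute: str          # which aspect of the test the metric measures
    top_level: bool = True  # shown in the default report view?


# Hypothetical metric registry; the attribute labels are illustrative.
METRICS = [
    Metric("exec_time", "performance"),
    Metric("score", "performance"),
    Metric("mem_size", "memory"),
    Metric("compile_time", "compile time"),
    Metric("regalloc_time", "compile time", top_level=False),
]


def report_tables(metrics, include_details=False):
    """Group metric names into one table per measured attribute."""
    tables = defaultdict(list)
    for metric in metrics:
        if metric.top_level or include_details:
            tables[metric.attribute].append(metric.name)
    return dict(tables)


daily_view = report_tables(METRICS)
detail_view = report_tables(METRICS, include_details=True)
```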
>
>
>
> An example of why we need the above is your screenshots in your google doc.
I'm looking at the last screenshot, and it's incredibly difficult to
read and get useful information out of.
>
>
>
> I'd also suggest that if we're adding many more metrics to a test,
we should create a "test sample information" page that the test link
goes to instead of just the graph. This page could contain all counter/metric
data, historic sparklines, the full graph and profiling links.
>
>
>
> Cheers,
>
>
>
> James
>
>
>
> On Fri, 22 Apr 2016 at 10:17 Kristof Beyls via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>
>
>
> On 22 Apr 2016, at 11:14, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>
>
>
> On Apr 22, 2016, at 12:45 AM, Kristof Beyls via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>
>
>
>
> On 21 Apr 2016, at 17:44, Sergey Yakoushkin <sergey.yakoushkin at
gmail.com> wrote:
>
>
>
> Hi Kristof,
>
>
>
>        The way we use LNT, we would run different configuration (e.g. -O3
vs -Os) as different "machines" in LNT's model.
>
>
>
> O2/O3 is indeed a bad example. We're also using different machines for
Os/O3 - such parameters apply to all tests and we don't propose major
changes.
>
> Elena was only extending the LNT interface a bit to ease LLVM test-suite
execution with different compiler or HW flags.
>
>
>
> Oh I see, this boils down to extending the lnt runtest interface to be
able to specify a set of configurations, rather than a single configuration, and
making sure configurations get submitted under different machine names? We kick
off the different configuration runs through a script invoking lnt runtest
multiple times. I don't see a big problem with extending the lnt runtest
interface to do this, assuming it doesn't break the underlying concepts assumed
throughout LNT. Maybe the only downside is that this will add even more command
line options to lnt runtest, which already has a lot (too many?) command line
options.
>
>
>
> Maybe some changes are required to analyze and compare metrics between
"machines": e.g. code size/performance between Os/O2/O3.
>
> Do you perform such comparisons?
>
>
>
> We typically do these kinds of comparisons when we test our patches
pre-commit, i.e. comparing for example '-O3' with '-O3 -mllvm
-enable-my-new-pass'.
>
> To stick with the LNT concepts, tests enabling new passes are stored as a
different "machine".
>
> The only way I know to be able to do a comparison between runs on 2
different "machine"s is to manually edit the URL for run vs run comparison and
fill in the runids of the 2 runs you want to compare.
>
> For example, the following URL is a comparison of
green-dragon-07-x86_64-O3-flto vs green-dragon-06-x86_64-O0-g on the public
llvm.org/perf server:
>
> http://llvm.org/perf/db_default/v4/nts/70644?compare_to=70634
>
> I had to manually look up and fill in the run ids 70644 and 70634.
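
The manual step can be sketched as a small helper that builds the comparison URL from two run ids. The URL pattern is the one shown in the example above; the `compare_url` function itself is hypothetical and not part of LNT.

```python
from urllib.parse import urlencode


def compare_url(base, run_id, compare_to):
    """Build a run-vs-run comparison URL in the pattern shown above."""
    query = urlencode({"compare_to": compare_to})
    return f"{base.rstrip('/')}/{run_id}?{query}"


url = compare_url("http://llvm.org/perf/db_default/v4/nts", 70644, 70634)
```

This only saves the URL editing; finding the two run ids still requires looking them up on the server.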
>
> It would be great if there was a better way to do these kinds of
comparisons - i.e. not having to manually fill in run ids, but having a webui to
easily find and pick the runs you want to compare.
>
> (As an aside: I find it intriguing that the URL above suggests that there
are quite a few cases where "-O0 -g" produces faster code than
"-O3 -flto").
>
>
>
> Can you be more explicit which ones? I don't see any regression (other
than compared to the baseline, or on the compile time).
>
>
>
> --
>
> Mehdi
>
>
>
> D'Oh! I was misinterpreting the compile time differences as execution
time differences. Indeed, there is no unexpected result in there.
>
> Sorry for the noise!
>
>
>
> Kristof
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>


James Molloy via llvm-dev

2016-May-13 07:45 UTC

[llvm-dev] RFC: LNT/Test-suite support for custom metrics and test parameterization

Hi Elena,

I'm sorry to hear you're stopping work on LNT for the moment :(
> Are these features of make-version of test-suite planned to be added?
The test-suite runner has support for custom metrics. What features in
particular (can you be more specific?) about the Make-based test-suite do
you miss in the CMake-based one?

Cheers,

James

On Fri, 13 May 2016 at 07:21 Elena Lepilkina <Elena.Lepilkina at synopsys.com>
wrote:
> Hi all,
>
>
>
> As we understand it, major changes are planned for LNT, so we have stopped
> our work on LNT and are waiting for the new version.
>
>
>
> One more question, about using the test-suite on its own with CMake. CMake
> can only build all the tests and generate lit tests. After that we can run
> lit and get a report, but it is not the same as the (simple) report produced
> with make. The CMake version of the test-suite has no way to collect custom
> metrics or generate other report types, right?
>
>
>
> Are there plans to add these features of the make version of the test-suite?
>
>
>
> Thanks,
>
>
>
> Elena.
>
>
>
> *From:* daniel.dunbar at gmail.com [mailto:daniel.dunbar at gmail.com] *On
> Behalf Of *Daniel Dunbar
>
>
> *Sent:* Wednesday, April 27, 2016 10:15 AM
>
> *To:* Elena Lepilkina <Elena.Lepilkina at synopsys.com>
> *Cc:* Kristof Beyls <Kristof.Beyls at arm.com>; James Molloy <
> james at jamesmolloy.co.uk>; llvm-dev <llvm-dev at lists.llvm.org>; nd <
> nd at arm.com>
>
>
> *Subject:* Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> and test parameterization
>
>
>
> Hi all,
>
>
>
> First off, let me ask one question about the use case you intend to
> support:
>
>
>
> Is your expectation for LLVM statistics style data that this would be
> present for almost all runs in a database, or that it would only be present
> for a small subset of runs?
>
>
>
>
>
> Second, here is my general perspective on the LNT system architecture:
>
>
>
> 1. It is very important to me that LNT continue to support a "turn-key"
> experience. In fact, I *wish* that experience would get easier, not harder,
> for example by moving all interactions with the server to a true admin
> interface. I would prefer not to introduce additional dependencies in that
> case.
>
>
>
> I will add that I am not completely opposed to moving to a "turn-key"
> architecture which requires substantially more components (e.g.,
> PostgreSQL, memcached, etc.) as long as it was packaged in such a way that
> it could offer a nice turn key experience. For example, if someone was
> willing to implement & document a Vagrant or Docker based development
> model, that would be ok with me, as long as there was still the option to
> do fully native deployments on OS X with real system administrator support.
>
>
>
> 2. Our internal application architecture is severely lacking. I believe
> the standard architecture these days would be to have (load-balancer +
> front-end + [load-balancer] + back-end + database), and I think partly we
> are suffering from missing most of that infrastructure. In particular, we
> are missing two important components:
>
> - There should be a separate backend, which would allow us to implement
> improved caching, and a clear model for long-lived or persistent state. I
> filed this as: https://llvm.org/bugs/show_bug.cgi?id=27534
>
> - This would give us a place to manage background processing tasks, for
> example jobs that reprocess the raw sample data for efficient queries.
>
> If we had such a thing, we could consider using something like memcached
> for improving caching (particularly in a larger deployment).
>
>
>
> 3. My intention was always to use JSON blobs for situations where a small
> % of samples want to include arbitrary blobs of extra data. I think
> standardizing on PostgreSQL/JSON is the right approach here, given the
> standard PaaS support for PostgreSQL and its existing use within our
> deployments. SQLAlchemy has native support for this.
>
>
>
> 4. I don't actually think that a NoSQL database buys us much, if anything
> (particularly when we have PostgreSQL/JSON available). We are not in a
> situation where we would need something like eventual consistency around
> writes, which leaves us wanting raw query performance over large sets of
> relatively well-structured data. I believe that is a situation which is
> best served by SQL with properly designed tables. This is also an area
> where having an infrastructure that could handle background processing to
> load query-optimized tables & indices would be valuable.
>
>
>
> I think it is a mistake to believe that using NoSQL and a schema-less
> model without also introducing substantial caching will somehow give better
> performance than a real SQL database, and would be very interested to see
> performance data showing otherwise (e.g., compare your MongoDB
> implementation to one in which all data was in a fully schematized table
> with proper indices).
>
>
>
> 5. The database schema used by LNT (which is dynamically instantiated in
> response to test suite configurations) is admittedly unorthodox. However, I
> still believe this is the right answer to support a turn-key performance
> tracking solution that can be customized by the user based on the fields
> they wish to track (in fact, if Elena's data is present for almost all
> samples then it is exactly the designed use case). I'm not yet convinced
> that the right answer here isn't to improve the actual implementation of
> this approach; for example, we could require the definition to be created
> when the LNT instance is set up, which might solve some of the issues Chris
> mentioned. Or, we could expose improved tools or a UI for interacting with
> the schema via an admin interface, for example to promote individual fields
> from a JSON blob to being first class for improved performance/reporting.
>
>
>
> 6. To echo one of James's points, I also do think that ultimately the most
> complicated part of adding support for arbitrary metrics is not the
> database implementation, but managing the complexity in the reporting and
> graphing logic (which is already cumbersome). I am fine seeing a layered
> implementation approach where we first focus on how just to get the data
> in, but I do think it is worth spending some time looking at how to manage
> the visualizations of that data.
>
>
>
>  - Daniel
>
>
>
>
>
> On Tue, Apr 26, 2016 at 11:17 PM, Elena Lepilkina via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Of course it will run as it does now, but the user will need to have
> MongoDB installed.
>
> Installation on Linux distributions that support .deb packages is quite easy:
> sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
> echo "deb http://repo.mongodb.org/apt/debian wheezy/mongodb-org/3.2 main" \
>   | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
> sudo apt-get update
> sudo apt-get install -y mongodb-org
> sudo service mongod start
>
> After that, MongoDB will be running as a service on localhost:27017. The
> user should then create a database:
> mongo
> use <db name>
>
> and set the database name in the config file. Host and port will be
> additional config fields, with default values (localhost:27017), for users
> who need to configure their own server.
>
> After that, the old steps apply:
> ~/mysandbox/bin/python ~/lnt/setup.py develop
> lnt create ~/myperfdb
> lnt runserver ~/myperfdb
>
> MongoDB has detailed instructions for installing for all operating systems.
>
> The extra 6-7 commands to install MongoDB take about 2 minutes and only
> need to be executed once, so they shouldn't be a great problem for new
> users who would like to try LNT.
>
> Thanks,
>
> Elena.
>
> -----Original Message-----
> From: Kristof Beyls [mailto:Kristof.Beyls at arm.com]
> Sent: Tuesday, April 26, 2016 5:15 PM
> To: James Molloy <james at jamesmolloy.co.uk>
> Cc: Elena Lepilkina <Elena.Lepilkina at synopsys.com>; Chris Matthews
> <chris.matthews at apple.com>; llvm-dev <llvm-dev at lists.llvm.org>;
> nd <nd at arm.com>
> Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics and
> test parameterization
>
> I think it's important that it remains simple to get a simple local
> instance up and running.
> That will make it easier for people to give LNT a try, and also makes it
> easier for LNT developers to have everything they need for developing
> locally.
> I have no experience with NoSQL database engines. Would it be possible,
> assuming you have the MongoDB/other packages installed for your system, to
> just run
>
> $ ~/mysandbox/bin/python ~/lnt/setup.py develop
> $ lnt create ~/myperfdb
> $ lnt runserver ~/myperfdb
>
> and be up and running (which is roughly what is required at the moment)?
> Of course good documentation could go a long way if a few extra steps
> would be needed.
>
> I do agree with James that if there are no major concerns for using a
> NoSQL database, it would be easiest if we only supported one database
> engine.
> For example, I had to do quite a bit of LNT regression test mangling to
> make them work on both sqlite and postgres, and it remains harder than it
> should be to write regression tests that test the database interface.
>
> Thanks,
>
> Kristof
>
>
> > On 26 Apr 2016, at 11:37, James Molloy via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > Hi Elena,
> >
> > Thanks for pushing forward with this. I like the idea of using a NoSQL
> solution.
> >
> > My primary reservation is about adding the new NoSQL stuff as an extra
> > backend. I think the current database backend and its use of SQLAlchemy is
> > extremely complex and is in fact the most complex part of LNT. Adding
> > something more (as opposed to *replacing* it) would just make this worse
> > and make it more likely that contributors wouldn't be able to test LNT very
> > well (having three engines to test: SQLite, PostgreSQL and MongoDB).
> >
> > I think it'd be far better all around, if we decide to go with the NoSQL
> > solution, to just bite the bullet and force users who want to run a server
> > to install MongoDB.
> >
> > In my experience most of the teams I've seen using LNT have a single LNT
> > server instance and submit results to that, rather than launching small
> > instances to run "lnt viewcomparison".
> >
> > Cheers,
> >
> > James
> >
> > On Tue, 26 Apr 2016 at 09:15 Elena Lepilkina via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> > Hi everyone.
> >
> >
> >
> > Thanks to everyone who took part in the discussion of this proposal.
> >
> > After the discussion we understand how other users use LNT and how large
> > the datasets can be.
> >
> >
> >
> > So here is a new, updated proposal.
> >
> > (Google docs version with some images -
> > https://docs.google.com/document/d/11qHNWRIQ2gc2aWH3gNBeeCUB3-JPe7AoMtn7n9JoCeY/edit?usp=sharing)
> >
> >
> >
> > The goal is the same.
> >
> > Enable LNT support for custom metrics such as user-defined run-time and
> > static metrics (power, etc.) and LLVM pass statistic counters. Provide
> > integration with the LLVM test-suite to automatically collect LLVM
> > statistic counters or custom metrics.
> >
> >
> >
> > Analysis of current Database
> >
> >
> >
> > Limitations
> >
> > 1. The structure isn’t flexible.
> >
> > There is no way to run any test-suite other than the simple one.
> >
> > 2. Performance is quite bad when the database has a lot of records.
> >
> > For example, rendering a graph is too slow. On
> > green-dragon-07-x86_64-O3-flto:42, rendering the compile_time graph for
> > SingleSource/Benchmarks/Shootout/objinst takes 191.8 seconds.
> >
> > 3. It’s difficult to add new features that need to query the sample
> > table in the database (if we use a BLOB field for custom metrics).
> >
> > Queries will be needed for more complex analysis. For example, if we
> > would like to add an additional check for tests whose compile time is too
> > long, we need the result of a query where this metric is greater than some
> > constant.
> >
> > Or we may want to compare tests with different run options, so we need to
> > fetch only some tests rather than all of them.
> >
> > A BLOB field would preserve the current structure and make the system a bit
> > more flexible, but in the near future it will not be enough.
> >
> > Fetching all metrics of all tests will make work slow on large datasets,
> > and this approach is far from optimal.
> >
> > So we would rather not add a BLOB field, which wouldn’t help us add new
> > features or have a flexible system in the future.
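
To make the threshold-query concern in limitation 3 concrete, here is a minimal sketch using only the Python standard library. The `sample` table, its column names and the `power` metric are assumptions made up for illustration; this is not LNT's real schema. It contrasts a metric stored as a first-class column, which the database can filter, with a metric stored only inside a JSON-encoded BLOB, where every row has to be fetched and decoded first.

```python
import json
import sqlite3

# Hypothetical, simplified schema: one first-class metric column plus a
# JSON-encoded BLOB holding any extra metrics. Not LNT's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (test TEXT, compile_time REAL, extra TEXT)")
rows = [
    ("objinst", 12.5, json.dumps({"power": 3.1})),
    ("hash", 0.4, json.dumps({"power": 7.9})),
    ("lists", 30.0, json.dumps({"power": 1.2})),
]
conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", rows)

THRESHOLD = 10.0

# First-class column: the database does the filtering.
slow = [
    test
    for (test,) in conn.execute(
        "SELECT test FROM sample WHERE compile_time > ? ORDER BY test",
        (THRESHOLD,),
    )
]

# BLOB-only metric: every row is fetched and decoded in Python first.
POWER_LIMIT = 3.0
hungry = sorted(
    test
    for test, extra in conn.execute("SELECT test, extra FROM sample")
    if json.loads(extra).get("power", 0.0) > POWER_LIMIT
)

print(slow)
print(hungry)
```

On three rows the difference is invisible; the cost of the second form grows with the number of samples, which is the scaling worry raised above.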
> >
> >
> >
> > Proposal
> >
> >
> >
> > We suggest adding a third part to LNT (as Chris Matthews suggested). This
> > part will be used for collecting custom metrics and running any test-suite.
> >
> > We suggest using a NoSQL database for this part (for example, MongoDB, or
> > the JSON/JSONB extension of PostgreSQL, which lets PostgreSQL be used as a
> > NoSQL database). This part will be enabled if there is a path to a NoSQL
> > database in the config file.
> >
> > This makes it possible to have a single Sample table (a collection in
> > NoSQL terms). If we use the schemaless feature of MongoDB, for example,
> > new fields can be added when a new test-suite is run. There would then be
> > one table with a lot of fields, some of which are empty. The schema of the
> > table (document) can be changed at any time.
> >
> > A small prototype was made with MongoDB and the MongoEngine ORM. This ORM
> > was chosen because MongoAlchemy doesn’t support schemaless features and the
> > latest MongoKit version has an error with the latest pymongo release.
> >
> > I tried it on a virtual machine and got the following results on 5 000 000
> > records:
> >
> > Current scheme: 13.72 seconds
> >
> > MongoDB: 1.35 seconds
> >
> > The results will of course be better on a real server machine.
> >
> >
> >
> > To use a test-suite, the user should describe its fields in a file with
> > the .fields extension, like this:
> >
> > {
> >   "Fields" : [{
> >     "TestSuiteName" : "Bytecode",
> >     "Type" : "integer",
> >     "BiggerIsBetter" : 0,
> >     "Show" : true
> >   },
> >   {
> >     "TestSuiteName" : "GCC",
> >     "Type" : "real",
> >     "BiggerIsBetter" : 0,
> >     "Name" : "GCC time"
> >   },
> >   {
> >     "TestSuiteName" : "JIT",
> >     "Type" : "real",
> >     "BiggerIsBetter" : 0,
> >     "Name" : "JIT Compile time",
> >     "Show" : true
> >   },
> >   {
> >     "TestSuiteName" : "GCC/LLC",
> >     "Type" : "string",
> >     "BiggerIsBetter" : 0
> >   }]
> > }
> >
> >
> >
> > One field, “Show”, was added to describe whether the metric should be
> > shown by default on the web page (as James Molloy suggested). Other metrics
> > are added to the page if the user selects them in the view options.
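
As a rough illustration of how such a .fields description could be consumed, the sketch below parses a trimmed copy of the example and lists the metrics that would be shown by default. The function name `default_visible` is hypothetical, and two behaviours are assumptions the proposal does not state: a missing "Show" is treated as hidden, and a missing "Name" falls back to "TestSuiteName".

```python
import json

# Trimmed copy of the .fields example from the message above.
FIELDS_TEXT = """
{
  "Fields": [
    {"TestSuiteName": "Bytecode", "Type": "integer",
     "BiggerIsBetter": 0, "Show": true},
    {"TestSuiteName": "GCC", "Type": "real",
     "BiggerIsBetter": 0, "Name": "GCC time"},
    {"TestSuiteName": "JIT", "Type": "real",
     "BiggerIsBetter": 0, "Name": "JIT Compile time", "Show": true}
  ]
}
"""


def default_visible(fields_text):
    """Return display names of metrics whose "Show" flag is true.

    Assumptions (not stated in the proposal): a missing "Show" means hidden,
    and a missing "Name" falls back to "TestSuiteName".
    """
    fields = json.loads(fields_text)["Fields"]
    return [
        f.get("Name", f["TestSuiteName"])
        for f in fields
        if f.get("Show", False)
    ]


visible = default_visible(FIELDS_TEXT)
print(visible)
```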
> >
> >
> >
> > Conclusion
> >
> >
> >
> > This change lets users choose between the flexible, powerful system and
> > the limited version with an SQLite database.
> >
> > If a user chooses the NoSQL version, their data can be copied from the old
> > database to the new one, so new features can be used without losing old
> > data.
> >
> >
> >
> > The open question is which NoSQL database would be best for LNT. We are
> > interested in the opinions of people who know LNT's features better.
> >
> >
> >
> > Thanks,
> >
> >
> >
> > Elena.
> >
> >
> >
> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> > Elena Lepilkina via llvm-dev
> > Sent: Tuesday, April 26, 2016 9:07 AM
> > To: chris.matthews at apple.com
> >
> >
> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
> > Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> > and test parameterization
> >
> >
> >
> > Hi, Chris.
> >
> >
> >
> > Thank you for your answer about compile tests. From looking through the
> > code of the compile tests, I understood that they don’t use the test suite
> > at all. Am I right? The LNT documentation lacks information and examples on
> > running compile tests.
> >
> > We understand that there are two groups of users: users who run servers
> > and collect a lot of data, and SQLite users, who I think would not have
> > millions of sample records.
> >
> > I think it’s clear that there is no universal solution that offers both a
> > simple installation process and a flexible, high-load system.
> >
> > I will update the proposal and take into account your suggestion about a
> > third part of the test-suite.
> >
> >
> >
> > Thanks
> >
> >
> >
> > Elena.
> >
> >
> >
> > From: chris.matthews at apple.com [mailto:chris.matthews at apple.com]
> > Sent: Monday, April 25, 2016 8:06 PM
> > To: Elena Lepilkina <Elena.Lepilkina at synopsys.com>
> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
> > Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> > and test parameterization
> >
> >
> >
> > I am really torn about this.
> >
> >
> >
> > When I implemented the regression tracking stuff recently, it really
> > showed me how badly we are scaling.  On our production server, the run
> > ingestion can take well over 100s.  Time is mostly spent in FieldChange
> > generation and regression grouping. Both have to access a lot of recent
> > samples. This is not the end of the world, because it runs in a background
> > process.  Where this really sucks is when a regression has a lot of
> > indicators. The web interface renders these in a graph, and just trying to
> > pull down 100 graphs worth of data kills the server.  I ended up limiting
> > those to a max of 10 datasets, and even that takes 30s to load.
> >
> >
> >
> > So I do think we need some improvements to the scalability.
> >
> >
> >
> > LNT usage is spread between two groups. The first is users who set up big
> > servers, with Postgres and apache/Gunicorn. For those users I think NoSQL
> > is the way to go.   However, our second (and probably more common) group
> > is the people running a little instance on their own machine to do some
> > local compiler benchmarking.  Their setup process needs to be dead simple,
> > and I think requiring a NoSQL database to be set up on their machine first
> > is a non-starter.  Like we do with sqlite, I think we need a transparent
> > fallback for people who don’t have a NoSQL database.
> >
> >
> >
> > Would it be helpful to anyone if I got a dump of the llvm.org LNT
> > Postgres database?  It is a good, big dataset to test with, and I
> > assume everyone is okay with it being public, since the LNT server already
> > is.
> >
> >
> >
> >
> >
> > On Apr 25, 2016, at 4:33 AM, Elena Lepilkina via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> >
> >
> >
> >
> >
> >
> > From: Elena Lepilkina
> > Sent: Monday, April 25, 2016 2:33 PM
> > To: 'James Molloy' <james at jamesmolloy.co.uk>; Kristof Beyls
> > <Kristof.Beyls at arm.com>; Mehdi Amini <mehdi.amini at apple.com>
> > Cc: nd <nd at arm.com>; Matthias Braun <matze at braunis.de>
> > Subject: RE: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> > and test parameterization
> >
> >
> >
> > Hi everyone,
> >
> >
> >
> > Thank you for your answer. The BLOB format adds extra steps for working
> > with metrics. We know that the ComparisonResult class does the analysis
> > work, but since it fetches all metrics from the database on each request,
> > we would need additional time to process the fields during analysis in the
> > ComparisonResult class. Maybe it would be better to have one Sample table
> > per test-suite, as was suggested before. It should be faster, shouldn’t it?
> > Moreover, upcoming LNT changes will need to fetch some metrics separately,
> > and the BLOB format will add some delay to those queries.
> >
> >
> >
> > As we can see, the performance problem is real: rendering the graph page
> > takes about 3 minutes.
> >
> > <image001.png>
> >
> > So maybe it would be better to start working with NoSQL databases? I made
> > a small prototype with TestSuite, TestSuiteFields, Test, Run and Sample
> > tables for collecting time metrics. It works quickly. Using NoSQL also
> > helps solve the problem of samples having different metric fields: it
> > becomes possible to store different metrics for different test-suites in
> > one table.
> >
> > What do you think about this proposal?
> >
> > I used MongoDB, but I know that there is a NoSQL extension for PostgreSQL
> > with JSONB fields, which are more effective than a JSON-encoded BLOB
> > because they can be used in queries very simply and allow indexes.
> >
> >
> >
> > About the proposal that not all metrics should be shown: this can be added
> > as a field in the JSON of the .fields file, which describes the fields
> > obtained from the test-suite. To see other metrics, the user would select
> > them with checkboxes in the view options. Would this solution be suitable?
> >
> > We can do as you wrote:
> >
> > “I'd also suggest that if we're adding many more metrics to a test, we
> > should create a "test sample information" page that the test link goes to
> > instead of just the graph. This page could contain all counter/metric data,
> > historic sparklines, the full graph and profiling links.”
> >
> > But the render time of this page would be too long because of the graph
> > render time. In my opinion, some users wouldn’t want to wait so long to
> > see some additional metrics.
> >
> >
> >
> > Thanks for your suggestions,
> >
> >
> >
> > Elena.
> >
> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> > James Molloy via llvm-dev
> > Sent: Monday, April 25, 2016 12:43 PM
> > To: Kristof Beyls <Kristof.Beyls at arm.com>; Mehdi Amini
> > <mehdi.amini at apple.com>
> > Cc: llvm-dev <llvm-dev at lists.llvm.org>; nd <nd at arm.com>; Matthias
> > Braun <matze at braunis.de>
> > Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> > and test parameterization
> >
> >
> >
> > Hi Sergey, Elena,
> >
> >
> >
> > Firstly, thanks for this RFC. It's great to see more people actively
> > using and modifying LNT, and the test metrics support in general is rather
> > weak currently.
> >
> >
> >
> > Metrics
> >
> > -------
> >
> >
> >
> > I agree with Daniel and Kristof that your proposed schema changes have
> > the potential to make many queries extremely slow. Certainly for the
> > metrics enhancements, I don't see a reason why we need such a radical
> > change in schema.
> >
> >
> >
> > To add custom metrics on the fly, we need to change the schema for the
> > Sample table. Currently this consists of a column for each metric, but
> > actually we never ever query those metric values. We never query for
> > example for "all failing tests in a run" - when we do analyses we use the
> > ComparisonResult class which reads *all* samples from the database for a
> > run and performs the analysis entirely in Python.
> >
> >
> >
> > Therefore, having a semi-structured format where some fields are
> > first-class columns and the rest are in a JSON-encoded BLOB (as Daniel
> > suggests) seems totally acceptable. There is certainly an argument now that
> > we're using the wrong backend storage solution and that a key-value store
> > might be more suitable, but that's a very invasive change and I don't think
> > we've reached the point where we need to force a move from the simplicity
> > of SQLite.
> >
> >
> >
> > Adding an extra BLOB column would be easy - there would just need to be
> > logic in testsuitedb.py for reading and writing it - the Sample model class
> > would expose the JSON-encoded fields as normal python fields so the rest of
> > LNT would be isolated from this change.
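
The idea of a model class that exposes JSON-encoded fields as normal Python fields can be sketched as follows. This is only an illustration under stated assumptions: the `Sample` class here is a hypothetical stand-in, not the class in testsuitedb.py, and the metric names and the `set_metric`/`to_row` helpers are invented for the example.

```python
import json


class Sample:
    """Hypothetical stand-in for the Sample model; not LNT's actual class."""

    # Metrics stored as real columns in this sketch.
    FIRST_CLASS = ("compile_time", "exec_time")

    def __init__(self, compile_time=None, exec_time=None, extra_blob="{}"):
        self.compile_time = compile_time
        self.exec_time = exec_time
        # Decoded once on load; re-encoded by to_row() on save.
        self._extra = json.loads(extra_blob)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so first-class
        # columns never reach this method.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._extra[name]
        except KeyError:
            raise AttributeError(name) from None

    def set_metric(self, name, value):
        """Store a metric in its column if it has one, else in the blob."""
        if name in self.FIRST_CLASS:
            setattr(self, name, value)
        else:
            self._extra[name] = value

    def to_row(self):
        """Return (compile_time, exec_time, extra_blob) for storage."""
        return (
            self.compile_time,
            self.exec_time,
            json.dumps(self._extra, sort_keys=True),
        )


s = Sample(compile_time=1.5, extra_blob='{"regalloc_time": 0.2}')
s.set_metric("power", 3.0)
print(s.compile_time, s.regalloc_time, s.power)
```

Callers read `s.regalloc_time` the same way they read `s.compile_time`, so code outside the model does not need to know which metrics live in the blob.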
> >
> >
> >
> > But I think this is a small detail compared to the bigger problem of how
> > to effectively *display* all this new data. Currently every new metric gets
> > its own separate table in the report/run views, and this does not scale
> > well at all.
> >
> >
> >
> > I think we need some more concepts in the metric system to make it
> > scalable:
> >
> >
> >
> >   * What "attribute" of the test is this metric measuring? For example,
> >     both "exec_time" and "score" measure the same attribute; performance of
> >     the generated code. It's superfluous to have them displayed in separate
> >     tables. However mem_size and compile_time both measure completely
> >     different aspects of the test.
> >
> >   * Is this metric useful to display at the top level? or should it only
> >     be exposed when more data about a test result is requested?
> >
> >     * An example of this is the pass statistics. I don't want my daily
> >       report view cluttered by the time spent in register allocation for
> >       every test! OK, this is useful information when debugging a problem,
> >       but it should be available when requested rather than by default.
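
One way to picture the two concepts above is a small metric registry in which each metric carries the attribute it measures and a top-level flag. The sketch below is hypothetical: the `Metric` class, the attribute labels and `regalloc_time` are invented for illustration and do not correspond to existing LNT code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    attribute: str   # what aspect of the test the metric measures
    top_level: bool  # shown by default, or only on request


# Illustrative registry; the attribute labels are made up for this sketch.
METRICS = [
    Metric("exec_time", "performance", True),
    Metric("score", "performance", True),
    Metric("compile_time", "compile_time", True),
    Metric("mem_size", "memory", True),
    Metric("regalloc_time", "pass_statistics", False),
]


def tables_for_report(metrics, include_detail=False):
    """Group metrics by attribute so each attribute gets one table."""
    tables = defaultdict(list)
    for m in metrics:
        if m.top_level or include_detail:
            tables[m.attribute].append(m.name)
    return dict(tables)


print(tables_for_report(METRICS))
```

With this grouping, "exec_time" and "score" land in one table, and the pass statistic only appears when `include_detail=True` is passed.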
> >
> >
> >
> > An example of why we need the above is your screenshots in your google
> > doc. I'm looking at the last screenshot, and it's incredibly difficult to
> > read and get useful information out of.
> >
> >
> >
> > I'd also suggest that if we're adding many more metrics to a test, we
> > should create a "test sample information" page that the test link goes to
> > instead of just the graph. This page could contain all counter/metric data,
> > historic sparklines, the full graph and profiling links.
> >
> >
> >
> > Cheers,
> >
> >
> >
> > James
> >
> >
> >
> > On Fri, 22 Apr 2016 at 10:17 Kristof Beyls via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> >
> >
> > On 22 Apr 2016, at 11:14, Mehdi Amini <mehdi.amini at apple.com> wrote:
> >
> >
> >
> >
> > On Apr 22, 2016, at 12:45 AM, Kristof Beyls via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> >
> >
> >
> > On 21 Apr 2016, at 17:44, Sergey Yakoushkin
> > <sergey.yakoushkin at gmail.com> wrote:
> >
> >
> >
> > Hi Kristof,
> >
> >
> >
> >        The way we use LNT, we would run different configuration (e.g.
> -O3 vs -Os) as different "machines" in LNT's model.
> >
> >
> >
> > O2/O3 is indeed a bad example. We're also using different machines for
> > Os/O3 - such parameters apply to all tests and we don't propose major
> > changes.
> >
> > Elena was only extending LNT interface a bit to ease LLVM-testsuite
> execution with different compiler or HW flags.
> >
> >
> >
> > Oh I see, this boils down to extending the lnt runtest interface to be
> > able to specify a set of configurations, rather than a single
> > configuration, and making sure configurations get submitted under
> > different machine names? We kick off the different configuration runs
> > through a script invoking lnt runtest multiple times. I don't see a big
> > problem with extending the lnt runtest interface to do this, assuming it
> > doesn't break the underlying concepts assumed throughout LNT. Maybe the
> > only downside is that this will add even more command line options to lnt
> > runtest, which already has a lot (too many?) command line options.
> >
> >
> >
> > Maybe some changes are required to analyze and compare metrics between
> "machines": e.g. code size/performance between Os/O2/O3.
> >
> > Do you perform such comparisons?
> >
> >
> >
> > We typically do these kinds of comparisons when we test our patches
> > pre-commit, i.e. comparing for example '-O3' with '-O3 -mllvm
> > -enable-my-new-pass'.
> >
> > To stick with the LNT concepts, tests enabling new passes are stored as
> > a different "machine".
> >
> > The only way I know to be able to do a comparison between runs on 2
> > different "machine"s is to manually edit the URL for run vs run
> > comparison and fill in the runids of the 2 runs you want to compare.
> >
> > For example, the following URL is a comparison of
> green-dragon-07-x86_64-O3-flto vs green-dragon-06-x86_64-O0-g on the public
> llvm.org/perf server:
> >
> > http://llvm.org/perf/db_default/v4/nts/70644?compare_to=70634
> >
> > I had to manually look up and fill in the run ids 70644 and 70634.
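
Until a web UI exists for this, the manual URL editing can at least be scripted. The helper below is a small sketch: `compare_url` is a hypothetical name, and the URL layout is assumed only from the single example URL quoted above, not from LNT documentation.

```python
from urllib.parse import urlencode


def compare_url(base, run_id, compare_to, db="db_default", suite="nts"):
    """Build a run-vs-run comparison URL in the shape quoted above."""
    query = urlencode({"compare_to": compare_to})
    return f"{base}/{db}/v4/{suite}/{run_id}?{query}"


url = compare_url("http://llvm.org/perf", 70644, 70634)
print(url)
```

The two run ids still have to be looked up by hand; the sketch only removes the error-prone string editing.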
> >
> > It would be great if there was a better way to be able to do these kinds
> > of comparisons - i.e. not having to manually fill in run ids, but having a
> > webui to easily find and pick the runs you want to compare.
> >
> > (As an aside: I find it intriguing that the URL above suggests that
> > there are quite a few cases where "-O0 -g" produces faster code than "-O3
> > -flto").
> >
> >
> >
> > Can you be more explicit which ones? I don't see any regression (other
> > than compared to the baseline, or on the compile time).
> >
> >
> >
> > --
> >
> > Mehdi
> >
> >
> >
> > D'Oh! I was misinterpreting the compile time differences as execution
> > time differences. Indeed, there is no unexpected result in there.
> >
> > Sorry for the noise!
> >
> >
> >
> > Kristof
> >
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
>
>
>

Martin J. O'Riordan via llvm-dev

2016-May-14 19:55 UTC


[llvm-dev] RFC: LNT/Test-suite support for custom metrics and test parameterization

A slight diversion if you don’t mind, though related.

 

Currently LNT is not branched along with other components commonly used with
LLVM such as ‘clang’ and ‘compiler-rt’.  I generally try to synch LNT to the
same SVN revision number as a corresponding LLVM branch on the possibly invalid
assumption that this is the version of LNT that LLVM was tested with for that
branch.  Would it be possible to include LNT under the LLVM branch umbrella,
especially if significant changes are being made to LNT that might not be
backward or forward compatible with various branches and versions of LLVM?

 

Thanks,

 

            MartinO

 

From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Elena
Lepilkina via llvm-dev
Sent: 13 May 2016 07:21
To: Daniel Dunbar <daniel at zuster.org>; llvm-dev <llvm-dev at
lists.llvm.org>
Cc: nd <nd at arm.com>
Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics and test
parameterization

 

Hi all,

 

As we understand it, major changes are going to be made to LNT, so we are
waiting for the new LNT version and have stopped our work on LNT.

 

One more question, about using test-suite separately with CMake. CMake can only
build all tests and generate lit tests. After that we can run lit and get a
report, which is not the same as the (simple) report produced with make. The
CMake version of test-suite has no features to run custom metrics or generate
other report types, right?

 

Are there plans to add these features of the make version of test-suite?

 

Thanks,

 

Elena.

 

From: daniel.dunbar at gmail.com [mailto:daniel.dunbar at gmail.com] On Behalf
Of Daniel Dunbar
Sent: Wednesday, April 27, 2016 10:15 AM
To: Elena Lepilkina <Elena.Lepilkina at synopsys.com>
Cc: Kristof Beyls <Kristof.Beyls at arm.com>; James Molloy <james at
jamesmolloy.co.uk>; llvm-dev <llvm-dev at lists.llvm.org>; nd <nd at
arm.com>
Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics and test
parameterization

 

Hi all,

 

First off, let me ask one question about the use case you intend to support:

 

Is your expectation for LLVM statistics style data that this would be present
for almost all runs in a database, or that it would only be present for a small
subset of runs?

 

 

Second, here is my general perspective on the LNT system architecture:

 

1. It is very important to me that LNT continue to support a
"turn-key" experience. In fact, I *wish* that experience would get
easier, not harder, for example by moving all interactions with the server to a
true admin interface. I would prefer not to introduce additional dependencies in
that case.

 

I will add that I am not completely opposed to moving to a "turn-key"
architecture which requires substantially more components (e.g., PostgreSQL,
memcached, etc.) as long as it was packaged in such a way that it could offer a
nice turn key experience. For example, if someone was willing to implement &
document a Vagrant or Docker based development model, that would be ok with me,
as long as there was still the option to do fully native deployments on OS X
with real system administrator support.

 

2. Our internal application architecture is severely lacking. I believe the
standard architecture these days would be to have (load-balancer + front-end +
[load-balancer] + back-end + database), and I think partly we are suffering from
missing most of that infrastructure. In particular, we are missing two important
components:

- There should be a separate backend, which would allow us to implement improved
caching, and a clear model for long-lived or persistent state. I filed this as:
https://llvm.org/bugs/show_bug.cgi?id=27534

- This would give us a place to manage background processing tasks, for example
jobs that reprocess the raw sample data for efficient queries.

If we had such a thing, we could consider using something like memcached for
improving caching (particularly in a larger deployment).

 

3. My intention was always to use JSON blobs for situations where a small % of
samples want to include arbitrary blobs of extra data. I think standardizing on
PostgreSQL/JSON is the right approach here, given the standard PaaS support for
PostgreSQL and its existing use within our deployments. SQLAlchemy has native
support for this.

 

4. I don't actually think that a NoSQL database buys us much if anything
(particularly when we have PostgreSQL/JSON available). We are not in a situation
where we would need something like eventual consistency around writes, which
leaves us wanting raw query performance over large sets of relatively
well-structured data. I believe that is a situation which is best served by SQL
with properly designed tables. This is also an area where having an
infrastructure that could handle background processing to load query-optimized
tables & indices would be valuable.

 

I think it is a mistake to believe that using NoSQL and a schema-less model
without also introducing substantial caching will somehow give better
performance than a real SQL database, and would be very interested to see
performance data showing otherwise (e.g., compare your MongoDB implementation to
one in which all data was in a fully schematized table with proper indices).

 

5. The database schema used by LNT (which is dynamically instantiated in
response to test suite configurations) is admittedly unorthodox. However, I
still believe this is the right answer to support a turn-key performance
tracking solution that can be customized by the user based on the fields they
wish to track (in fact, if Elena's data is present for almost all samples
then it is exactly the designed use case). I'm not yet convinced that the
right answer here isn't to improve the actual implementation of this
approach; for example, we could require the definition to be created when the
LNT instance is set up, which might solve some of the issues Chris mentioned.
Or, we could expose improved tools or a UI for interacting with the schema via
an admin interface, for example to promote individual fields from a JSON blob to
being first class for improved performance/reporting.

 

6. To echo one of James's points, I also do think that ultimately the most
complicated part of adding support for arbitrary metrics is not the database
implementation, but managing the complexity in the reporting and graphing logic
(which is already cumbersome). I am fine seeing a layered implementation
approach where we first focus on how just to get the data in, but I do think it
is worth spending some time looking at how to manage the visualizations of that
data.

 

 - Daniel

 

 

On Tue, Apr 26, 2016 at 11:17 PM, Elena Lepilkina via llvm-dev <llvm-dev at
lists.llvm.org> wrote:

Of course it will run as it does now, but the user will need to have MongoDB
installed.

Installation on Linux distributions that support .deb packages is quite easy:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
echo "deb http://repo.mongodb.org/apt/debian wheezy/mongodb-org/3.2 main" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
sudo apt-get update
sudo apt-get install -y mongodb-org
sudo service mongod start

After that, mongod will be running as a service on localhost:27017. The user
should then create a database:
mongo
use <db name>

Then set the database name in the config file. Host and port will be additional
fields for users who configure their own server; they default to
localhost:27017.

After that, the existing steps apply:
~/mysandbox/bin/python ~/lnt/setup.py develop
lnt create ~/myperfdb
lnt runserver ~/myperfdb

MongoDB has detailed installation instructions for all operating systems.

The extra 6-7 commands to install MongoDB take about 2 minutes and only need to
be executed once, so they shouldn't be a great problem for new users who would
like to try LNT.

Thanks,

Elena.

-----Original Message-----
From: Kristof Beyls [mailto:Kristof.Beyls at arm.com]
Sent: Tuesday, April 26, 2016 5:15 PM
To: James Molloy <james at jamesmolloy.co.uk>
Cc: Elena Lepilkina <Elena.Lepilkina at synopsys.com>; Chris Matthews
<chris.matthews at apple.com>; llvm-dev <llvm-dev at lists.llvm.org>; nd <nd at
arm.com>
Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics and test
parameterization

I think it's important that it remains simple to get a simple local instance
up and running.
That will make it easier for people to give LNT a try, and also makes it easier
for LNT developers to have everything they need for developing locally.
I have no experience with NoSQL database engines. Would it be possible, assuming
you have the MongoDB/other packages installed for your system, to just run

$ ~/mysandbox/bin/python ~/lnt/setup.py develop
$ lnt create ~/myperfdb
$ lnt runserver ~/myperfdb

and be up and running (which is roughly what is required at the moment)?
Of course good documentation could go a long way if a few extra steps would be
needed.

I do agree with James that if there are no major concerns for using a NoSQL
database, it would be easiest if we only supported one database engine.
For example, I had to do quite a bit of LNT regression test mangling to make
them work on both sqlite and postgres, and it remains harder than it should be
to write regression tests that test the database interface.

Thanks,

Kristof

> On 26 Apr 2016, at 11:37, James Molloy via llvm-dev <llvm-dev at
> lists.llvm.org> wrote:
>
> Hi Elena,
>
> Thanks for pushing forward with this. I like the idea of using a NoSQL
solution.
>
> My primary reservation is about adding the new NoSQL stuff as an extra
backend. I think the current database backend and its use of SQLAlchemy is
extremely complex and is in fact the most complex part of LNT. Adding something
more (as opposed to *replacing* it) would just make this worse and make it more
likely that contributors wouldn't be able to test LNT very well (having
three engines to test: SQLite, PostgreSQL and MongoDB).
>
> I think it'd be far better all around, if we decide to go with the
NoSQL solution, to just bite the bullet and force users who want to run a server
to install MongoDB.
>
> In my experience most of the teams I've seen using LNT have a single
LNT server instance and submit results to that, rather than launching small
instances to run "lnt viewcomparison".
>
> Cheers,
>
> James
>
> On Tue, 26 Apr 2016 at 09:15 Elena Lepilkina via llvm-dev <llvm-dev at
> lists.llvm.org> wrote:
> Hi everyone.
>
>
>
> Thanks to everyone who took part in the discussion of this proposal.
>
> After the discussion we understood how other users use LNT and how large the
> datasets can be.
>
>
>
> So here is an updated proposal.
>
> (Google docs version with some images -
> https://docs.google.com/document/d/11qHNWRIQ2gc2aWH3gNBeeCUB3-JPe7AoMt
> n7n9JoCeY/edit?usp=sharing)
>
>
>
> The goal is the same:
>
> Enable LNT support for custom metrics such as user-defined run-time and
> static metrics (power, etc.) and LLVM pass statistic counters. Provide
> integration with the LLVM testsuite to automatically collect LLVM statistic
> counters or custom metrics.
>
>
>
> Analysis of current Database
>
>
>
> Limitations
>
> 1.      This structure isn’t flexible.
>
> There is no way to run any test-suite other than the simple one.
>
> 2.      Performance is quite bad when the database has a lot of records.
>
> For example, rendering a graph is too slow. On
> green-dragon-07-x86_64-O3-flto:42, rendering
> SingleSource/Benchmarks/Shootout/objinst compile_time takes 191.8 seconds.
>
> 3.      It’s difficult to add new features that need to query the sample
> table in the database (if we use a BLOB field for custom metrics).
>
> Queries will be needed for more complex analysis. For example, if we would
> like to add an additional check for tests whose compile time is too long, we
> need the result of a query where this metric is greater than some constant.
>
> Or we may want to compare tests with different run options, so we should
> fetch only some tests rather than all of them.
>
> A BLOB field would preserve the current structure and make the system a bit
> more flexible, but in the near future it will not be enough.
>
> Fetching all metrics of all tests will be slow on large datasets, and this
> approach is far from optimal.
>
> So we would rather not add a BLOB field, which wouldn’t help us add new
> features or keep the system flexible in the future.
>
>
>
> Proposal
>
>
>
> We suggest adding a third part to LNT (as Chris Matthews suggested). This part
> will be used for collecting custom metrics and running any test-suite.
>
> We suggest using a NoSQL database for this part (for example, MongoDB or the
> JSON/JSONB extension of PostgreSQL, which lets PostgreSQL be used as a NoSQL
> database). This part will be enabled if the config file contains a path to a
> NoSQL database.
>
> This allows a single Sample table (a collection in NoSQL). If we use the
> schemaless feature in MongoDB, for example, then new fields can be added
> when a new testsuite is run. There would then be one table with a lot of
> fields, some of which are empty. The schema of the table (document) can be
> changed at any time.
>
> A small prototype was made with MongoDB and the MongoEngine ORM. This ORM was
> chosen because MongoAlchemy doesn’t support schemaless features and the latest
> MongoKit version has an error with the latest pymongo release.
>
> I tried it on a virtual machine and got the following results on 5 000 000
> records:
>
> Current scheme - 13.72 seconds
>
> MongoDB - 1.35 seconds
>
> Results will of course be better on a real server machine.
>
>
>
> To use a test-suite, the user should describe its fields in a .fields file
> like this:
>
> {
>   "Fields" : [{
>     "TestSuiteName" : "Bytecode",
>     "Type" : "integer",
>     "BiggerIsBetter" : 0,
>     "Show" : true
>   },
>   {
>     "TestSuiteName" : "GCC",
>     "Type" : "real",
>     "BiggerIsBetter" : 0,
>     "Name" : "GCC time"
>   },
>   {
>     "TestSuiteName" : "JIT",
>     "Type" : "real",
>     "BiggerIsBetter" : 0,
>     "Name" : "JIT Compile time",
>     "Show" : true
>   },
>   {
>     "TestSuiteName" : "GCC/LLC",
>     "Type" : "string",
>     "BiggerIsBetter" : 0
>   }]
> }
>
>
>
> One field, “Show”, was added to describe whether this metric should be shown
> by default on the web page (as James Molloy suggested). Other metrics are
> added to the page if the user chooses them in the view options.
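
As a rough illustration of how such a .fields file might be consumed, here is a minimal Python sketch. It is not code from the prototype: the function names, the set of allowed types (taken only from the values in the example above), and the defaults for a missing "Name" or "Show" are all assumptions.

```python
import json

# Assumption: only the three type names seen in the example are valid.
ALLOWED_TYPES = {"integer", "real", "string"}


def load_fields(text):
    """Parse a .fields description and return a list of normalized dicts."""
    fields = []
    for entry in json.loads(text)["Fields"]:
        if entry["Type"] not in ALLOWED_TYPES:
            raise ValueError(f"unknown type {entry['Type']!r}")
        fields.append({
            "key": entry["TestSuiteName"],
            # Assumption: a missing "Name" falls back to the test-suite name.
            "name": entry.get("Name", entry["TestSuiteName"]),
            "type": entry["Type"],
            "bigger_is_better": bool(entry.get("BiggerIsBetter", 0)),
            # Assumption: a missing "Show" means hidden by default.
            "show": bool(entry.get("Show", False)),
        })
    return fields


def default_columns(fields):
    """Names of the metrics that would appear on the page without opting in."""
    return [f["name"] for f in fields if f["show"]]


EXAMPLE = """
{"Fields": [
  {"TestSuiteName": "Bytecode", "Type": "integer", "BiggerIsBetter": 0,
   "Show": true},
  {"TestSuiteName": "GCC", "Type": "real", "BiggerIsBetter": 0,
   "Name": "GCC time"},
  {"TestSuiteName": "JIT", "Type": "real", "BiggerIsBetter": 0,
   "Name": "JIT Compile time", "Show": true},
  {"TestSuiteName": "GCC/LLC", "Type": "string", "BiggerIsBetter": 0}
]}
"""

if __name__ == "__main__":
    print(default_columns(load_fields(EXAMPLE)))
```

With the example file, only the "Bytecode" and "JIT Compile time" metrics would be shown by default; the other two would need to be selected in the view options.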
>
>
>
> Conclusion
>
>
>
> This change will let the user choose between the flexible, powerful system
> and the limited version with an SQLite database.
>
> If the user chooses the NoSQL version, the data can be copied from the old
> database to the new one, so the new features can be used without losing old
> data.
>
>
>
> The open question is which NoSQL database would be better for LNT. We are
> interested in the opinions of people who know LNT's features better.
>
>
>
> Thanks,
>
>
>
> Elena.
>
>
>
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> Elena Lepilkina via llvm-dev
> Sent: Tuesday, April 26, 2016 9:07 AM
> To: chris.matthews at apple.com
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> and test parameterization
>
>
>
> Hi, Chris.
>
>
>
> Thank you for your answer about compile tests. As I understood from looking
> through the code, compile tests don’t use the test suite at all. Am I right?
> The LNT documentation lacks information and examples on running compile
> tests.
>
> We understood that there are two groups of users: users running servers and
> collecting a lot of data, and SQLite users, who I think wouldn’t have
> millions of sample records.
>
> I think it’s obvious that there is no universal solution that offers both a
> simple installation process and a flexible, high-load system.
>
> I will update the proposal and take into consideration your suggestion about
> a third part of test-suite.
>
>
>
> Thanks
>
>
>
> Elena.
>
>
>
> From: chris.matthews at apple.com [mailto:chris.matthews at apple.com]
> Sent: Monday, April 25, 2016 8:06 PM
> To: Elena Lepilkina <Elena.Lepilkina at synopsys.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> and test parameterization
>
>
>
> I am really torn about this.
>
>
>
> When I implemented the regression tracking stuff recently, it really showed
> me how badly we are scaling.  On our production server, the run ingestion can
> take well over 100s.  Time is mostly spent in FieldChange generation and
> regression grouping. Both have to access a lot of recent samples. This is not
> the end of the world, because it runs in a background process.  Where this
> really sucks is when a regression has a lot of indicators. The web interface
> renders these in a graph, and just trying to pull down 100 graphs worth of
> data kills the server.  I ended up limiting those to a max of 10 datasets, and
> even that takes 30s to load.
>
>
>
> So I do think we need some improvements to the scalability.
>
>
>
> LNT usage is spread between two groups. The first is users who set up big
> servers, with Postgres and apache/Gunicorn. For those users I think NoSQL is
> the way to go. However, our second (and probably more common) group is the
> people running a little instance on their own machine to do some local
> compiler benchmarking. Their setup process needs to be dead simple, and I
> think requiring a NoSQL database to be set up on their machine first is a
> non-starter.  Like we do with sqlite, I think we need a transparent fallback
> for people who don’t have a NoSQL database.
>
>
>
> Would it be helpful to anyone if I got a dump of the llvm.org LNT Postgres
> database?  It is a good, big dataset to test with, and I assume everyone is
> okay with it being public, since the LNT server already is.
>
>
>
>
>
> On Apr 25, 2016, at 4:33 AM, Elena Lepilkina via llvm-dev <llvm-dev at
> lists.llvm.org> wrote:
>
> From: Elena Lepilkina
> Sent: Monday, April 25, 2016 2:33 PM
> To: 'James Molloy' <james at jamesmolloy.co.uk>; Kristof Beyls
> <Kristof.Beyls at arm.com>; Mehdi Amini <mehdi.amini at apple.com>
> Cc: nd <nd at arm.com>; Matthias Braun <matze at braunis.de>
> Subject: RE: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> and test parameterization
>
>
>
> Hi everyone,
>
>
>
> Thank you for your answers. The BLOB format adds extra steps for working
> with metrics. We know that the ComparisonResult class does the analysis work.
> But since it fetches all metrics from the database on request, we will need
> additional time to unpack the fields during analysis in the ComparisonResult
> class. Maybe it would be better to have one Sample table per testsuite, as
> was suggested before. It should be quicker, shouldn’t it? Moreover, the LNT
> changes we would like next will need to fetch some metrics separately, and
> the BLOB format will add delay to those queries.
>
>
>
> As we can see, the performance problem is real: rendering the graph page
> takes about 3 minutes.
>
> <image001.png>
>
> So maybe it would be better to start working with NoSQL databases? I made a
> small prototype with TestSuite, TestSuiteFields, Test, Run and Sample tables
> for collecting time metrics. It works quickly. Using NoSQL also helps solve
> the problem of samples having different metric fields: it becomes possible to
> store different metrics for different testsuites in one table.
>
> What do you think about this proposal?
>
> I used MongoDB, but I know that there is a NoSQL extension for PostgreSQL
> with JSONB fields, which are more effective than a JSON-encoded BLOB because
> they can be used in queries very simply and allow indexes.
>
>
>
> About the proposal that not all metrics should be shown: this can be added as
> a field in the JSON .fields file, which describes the fields collected from
> the test-suite. To see other metrics, the user would select them with
> checkboxes in the view options. Would this solution be suitable?
>
> We could do as you wrote:
>
> “I'd also suggest that if we're adding many more metrics to a test,
we should create a "test sample information" page that the test link
goes to instead of just the graph. This page could contain all counter/metric
data, historic sparklines, the full graph and profiling links.
>
> ”
>
> But the render time of this page will be too long because of the graph render
> time. In my opinion, some users wouldn’t like to wait so long to see some
> additional metrics.
>
>
>
> Thanks for your suggestions,
>
>
>
> Elena.
>
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> James Molloy via llvm-dev
> Sent: Monday, April 25, 2016 12:43 PM
> To: Kristof Beyls <Kristof.Beyls at arm.com>; Mehdi Amini
> <mehdi.amini at apple.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>; nd <nd at arm.com>; Matthias
> Braun <matze at braunis.de>
> Subject: Re: [llvm-dev] RFC: LNT/Test-suite support for custom metrics
> and test parameterization
>
>
>
> Hi Sergey, Elena,
>
>
>
> Firstly, thanks for this RFC. It's great to see more people actively
using and modifying LNT and the test metrics support in general is rather weak
currently.
>
>
>
> Metrics
>
> -------
>
>
>
> I agree with Daniel and Kristof that your proposed schema changes have the
potential to make many queries extremely slow. Certainly for the metrics
enhancements, I don't see a reason why we need such a radical change in
schema.
>
>
>
> To add custom metrics on the fly, we need to change the schema for the
Sample table. Currently this consists of a column for each metric, but actually
we never ever query those metric values. We never query for example for
"all failing tests in a run" - when we do analyses we use the
ComparisonResult class which reads *all* samples from the database for a run and
performs the analysis entirely in Python.
>
>
>
> Therefore, having a semi-structured format where some fields are
first-class columns and the rest are in a JSON-encoded BLOB (as Daniel suggests)
seems totally acceptable. There is certainly an argument now that we're
using the wrong backend storage solution and that a key-value store might be
more suitable, but that's a very invasive change and I don't think
we've reached the point where we need to force a move from the simplicity of
SQLite.
>
>
>
> Adding an extra BLOB column would be easy - there would just need to be
logic in testsuitedb.py for reading and writing it - the Sample model class
would expose the JSON-encoded fields as normal python fields so the rest of LNT
would be isolated from this change.
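
A minimal sketch of that idea follows, using only the Python standard library. It is not LNT's testsuitedb.py or its SQLAlchemy model: the table layout, the column names, and the extra metrics (regalloc_spills, power_mw) are hypothetical and chosen only to show first-class columns living next to a JSON-encoded column whose keys read like normal attributes.

```python
import json
import sqlite3


class Sample:
    """One row of a hypothetical sample table.

    compile_time and exec_time stand in for first-class columns; every other
    metric is stored in a JSON-encoded text column named extra.
    """

    def __init__(self, test, compile_time=None, exec_time=None, **extra):
        self.test = test
        self.compile_time = compile_time
        self.exec_time = exec_time
        self._extra = dict(extra)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so first-class
        # fields never reach this point.
        extra = self.__dict__.get("_extra", {})
        if name in extra:
            return extra[name]
        raise AttributeError(name)

    def to_row(self):
        """Flatten to the tuple stored in the database."""
        return (self.test, self.compile_time, self.exec_time,
                json.dumps(self._extra, sort_keys=True))

    @classmethod
    def from_row(cls, row):
        """Rebuild a Sample, decoding the JSON column back into attributes."""
        test, compile_time, exec_time, blob = row
        return cls(test, compile_time, exec_time, **json.loads(blob or "{}"))


def demo():
    """Round-trip one sample through an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sample (test TEXT, compile_time REAL, "
               "exec_time REAL, extra TEXT)")
    original = Sample("objinst", compile_time=0.42, exec_time=1.5,
                      regalloc_spills=17, power_mw=950.0)
    db.execute("INSERT INTO sample VALUES (?, ?, ?, ?)", original.to_row())
    row = db.execute("SELECT * FROM sample").fetchone()
    return Sample.from_row(row)


if __name__ == "__main__":
    sample = demo()
    print(sample.exec_time, sample.regalloc_spills)
```

Code that reads sample.exec_time or sample.regalloc_spills does not need to know which of the two lives in a real column, which is the isolation described above. The cost is that the database itself cannot filter or index on the keys inside the encoded column.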
>
>
>
> But I think this is a small detail compared to the bigger problem of how to
effectively *display* all this new data. Currently every new metric gets its own
separate table in the report/run views, and this does not scale well at all.
>
>
>
> I think we need some more concepts in the metric system to make it
> scalable:
>
>
>
>   * What "attribute" of the test is this metric measuring? For
example, both "exec_time" and "score" measure the same
attribute; performance of the generated code. It's superfluous to have them
displayed in separate tables. However mem_size and compile_time both measure
completely different aspects of the test.
>
>   * Is this metric useful to display at the top level? or should it only be
exposed when more data about a test result is requested?
>
>     * An example of this is the pass statistics. I don't want my daily
report view cluttered by the time spent in register allocation for every test!
OK, this is useful information when debugging a problem, but it should be
available when requested rather than by default.
>
>
>
> An example of why we need the above is your screenshots in your google doc.
I'm looking at the last screenshot, and it's incredibly difficult to
read and get useful information out of.
>
>
>
> I'd also suggest that if we're adding many more metrics to a test,
we should create a "test sample information" page that the test link
goes to instead of just the graph. This page could contain all counter/metric
data, historic sparklines, the full graph and profiling links.
>
>
>
> Cheers,
>
>
>
> James
>
>
>
> On Fri, 22 Apr 2016 at 10:17 Kristof Beyls via llvm-dev <llvm-dev at
> lists.llvm.org> wrote:
>
> On 22 Apr 2016, at 11:14, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
> On Apr 22, 2016, at 12:45 AM, Kristof Beyls via llvm-dev <llvm-dev at
> lists.llvm.org> wrote:
>
> On 21 Apr 2016, at 17:44, Sergey Yakoushkin <sergey.yakoushkin at
> gmail.com> wrote:
>
>
>
> Hi Kristof,
>
>
>
>        The way we use LNT, we would run different configuration (e.g. -O3
vs -Os) as different "machines" in LNT's model.
>
>
>
> O2/O3 is indeed bad example. We're also using different machines for
Os/O3 - such parameters apply to all tests and we don't propose major
changes.
>
> Elena was only extending LNT interface a bit to ease LLVM-testsuite
execution with different compiler or HW flags.
>
>
>
> Oh I see, this boils down to extending the lnt runtest interface to be
> able to specify a set of configurations, rather than a single
> configuration, and making sure configurations get submitted under different
> machine names? We kick off the different configuration runs through a script
> invoking lnt runtest multiple times. I don't see a big problem with
> extending the lnt runtest interface to do this, assuming it doesn't break
> the underlying concepts assumed throughout LNT. Maybe the only downside is
> that this will add even more command line options to lnt runtest, which
> already has a lot (too many?) command line options.
>
>
>
> Maybe some changes are required to analyze and compare metrics between
"machines": e.g. code size/performance between Os/O2/O3.
>
> Do you perform such comparisons?
>
>
>
> We typically do these kinds of comparisons when we test our patches
> pre-commit, i.e. comparing for example '-O3' with '-O3 -mllvm
> -enable-my-new-pass'.
>
> To stick with the LNT concepts, tests enabling new passes are stored as a
> different "machine".
>
> The only way I know to do a comparison between runs on 2 different
> "machine"s is to manually edit the URL for run vs run comparison and fill
> in the runids of the 2 runs you want to compare.
>
> For example, the following URL is a comparison of
> green-dragon-07-x86_64-O3-flto vs green-dragon-06-x86_64-O0-g on the public
> llvm.org/perf server:
>
> http://llvm.org/perf/db_default/v4/nts/70644?compare_to=70634
>
> I had to manually look up and fill in the run ids 70644 and 70634.
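
The manual URL editing described above can be scripted. This is a small Python sketch that assumes nothing beyond the URL shape shown in the example; the function name and the default base path are illustrative, and the run ids still have to be looked up by hand.

```python
from urllib.parse import urlencode


def comparison_url(run_id, compare_to,
                   base="http://llvm.org/perf/db_default/v4/nts"):
    """Build a run-vs-run comparison URL from two run ids."""
    query = urlencode({"compare_to": int(compare_to)})
    return f"{base}/{int(run_id)}?{query}"


if __name__ == "__main__":
    print(comparison_url(70644, 70634))
```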
>
> It would be great if there was a better way to do these kinds of
> comparisons, i.e. not having to manually fill in run ids, but having a web UI
> to easily find and pick the runs you want to compare.
>
> (As an aside: I find it intriguing that the URL above suggests that there
are quite a few cases where "-O0 -g" produces faster code than
"-O3 -flto").
>
>
>
> Can you be more explicit which ones? I don't see any regression (other
than compared to the baseline, or on the compile time).
>
>
>
> --
>
> Mehdi
>
>
>
> D'Oh! I was misinterpreting the compile time differences as execution
time differences. Indeed, there is no unexpected result in there.
>
> Sorry for the noise!
>
>
>
> Kristof
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> 
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

 


Matthias Braun via llvm-dev

2016-May-20 17:02 UTC


[llvm-dev] RFC: LNT/Test-suite support for custom metrics and test parameterization

> On May 12, 2016, at 11:21 PM, Elena Lepilkina via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> Hi all,
>  
> As we understood great changes will be done in LNT, so we are waiting to
new LNT version and stopped our work in LNT.
>  
> One more question about using test-suite separately with cmake. Cmake can
only build all tests and generate lit tests. After that we can run LIT and get
report which is not equal with report (simple) got with make. Cmake test-suite
version has no features to run custom metrics and generate other report type,
right?
>  
> Are these features of make-version of test-suite planned to be added?

The lit test-suite runner supports arbitrary metrics; it already features
codesizes for different segments, compiletime, linktime, executable hash, and
execution time. I designed it to be easily extensible with further metrics. Not
all of these metrics are understood by LNT yet, so they may get lost after
submission to an LNT database.

We do not use GenerateReport.pl and friends in the cmake/lit version anymore, as
the main data collection and reporting mechanism in llvm has mostly shifted to
LNT these days. In any case it should be easy to use “lit -o result.json” and
process the resulting json file in your favorite scripting language to generate
arbitrary reports.
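
As one possible illustration of that last point, here is a short Python sketch that turns such a JSON file into a per-test table for a single metric. The layout it assumes (a top-level "tests" list whose entries carry "name" and an optional "metrics" dictionary) should be checked against the file your lit version actually writes; the sample data and function names are made up for the example.

```python
import json


def summarize(report, metric):
    """Return {test name: value} for every test that recorded `metric`."""
    values = {}
    for test in report.get("tests", []):
        metrics = test.get("metrics", {})
        if metric in metrics:
            values[test["name"]] = metrics[metric]
    return values


def format_report(values, metric):
    """Render the collected values as a plain-text table, sorted by test."""
    lines = [f"{'test':<64} {metric:>12}"]
    for name in sorted(values):
        lines.append(f"{name:<64} {values[name]!s:>12}")
    return "\n".join(lines)


# Stand-in for json.load(open("result.json")); this layout is an assumption.
SAMPLE = json.loads("""
{"tests": [
  {"name": "test-suite :: SingleSource/Benchmarks/Shootout/objinst.test",
   "code": "PASS",
   "metrics": {"compile_time": 0.25, "exec_time": 1.5}},
  {"name": "test-suite :: SingleSource/Benchmarks/Shootout/hash.test",
   "code": "PASS",
   "metrics": {"exec_time": 0.75}}
]}
""")

if __name__ == "__main__":
    print(format_report(summarize(SAMPLE, "exec_time"), "exec_time"))
```

Grouping metrics under a custom report name would then be a matter of calling summarize once per metric in the group.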

As far as the TEST.*.Makefile things go: Many of those variants have no
equivalent in cmake/lit. But I have the strong feeling that the majority of
those are not really used or are even broken, so there are no plans to add them.
On the other hand I would love to hear from people that actually use any of
those besides TEST.simple.Makefile; it should be possible to transition most of
them, but I’d first like to hear what they are used for.

- Matthias


Elena Lepilkina via llvm-dev

2016-May-25 08:54 UTC


[llvm-dev] RFC: LNT/Test-suite support for custom metrics and test parameterization

Hi Matthias,

Thank you for your answer.
Could you answer a few more questions?
First of all, LNT currently runs tests make-style and parses the results from
a result CSV file. Are there any plans to move to cmake?
As I understand it, lit will run and collect all metrics, but there is no
way to configure which metrics I would like to collect. Test report files let
me choose which report I would like: one time I can use one, another time I can
use another. With cmake I can do this only by changing test_modules in a file.
So I can’t group some metrics and give them a name.

Am I right?

Thanks,
Elena.



