Fernando Magno Quintao Pereira via llvm-dev
2020-Apr-06 18:53 UTC
[llvm-dev] Adding a new External Suite to test-suite
Hi Johannes,

> I'd also like to know what the intention here is. What is tested and how?

We have a few uses for these benchmarks in the technical report,
http://lac.dcc.ufmg.br/pubs/TechReports/LaC_TechReport012020.pdf, but since
then we have come up with other applications. All these programs produce
object files without external dependencies. We have been using them to train
a predictive compiler that reduces code size (the technical report has more
about that). In addition, you can use them to compare compilation time, for
instance, as Michael had asked. We have also used these benchmarks in two
studies:

1) http://cuda.dcc.ufmg.br/angha/chordAnalysis
2) http://cuda.dcc.ufmg.br/angha/staticProperties

A few other applications that I know about (outside our research group)
include:

* Comparing the size of code produced by three HLS tools: Intel HLS, Vivado
  and LegUp.
* Testing the Ultimate Buchi Automizer, to see which kinds of C constructs it
  handles.
* Comparing the compilation time of gcc vs. clang.

A few other studies that I would like to carry out:

* Checking the runtime of different C parsers that we have.
* Trying to infer, empirically, the complexity of compiler analyses and
  optimizations.

> Looking at a few of these it seems there is not much you can do as it is
> little code with a lot of unknown function calls and global symbols.

Most of the programs are small (avg. 63 bytecodes, std. dev. 97); however,
among these 1M C functions, we have a few large ones, with more than 40K
bytecodes.

Regards,

Fernando
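P.S.: to make the compilation-time and code-size comparisons concrete, below
is a rough sketch of the kind of driver one can write on top of the suite.
It is not the exact script we use; the directory layout ("anghabench/*.c"),
the flags, and the 30-second timeout are illustrative.

# Sketch: compare compilation time and object-file size of clang vs. gcc
# over the single-function C files in the suite.
import glob
import os
import subprocess
import time

COMPILERS = ["clang", "gcc"]
FLAGS = ["-c", "-O2", "-w"]   # -w: the extracted functions trigger many warnings

def compile_one(compiler, src):
    """Compile one file to an object file; return (seconds, size) or None."""
    obj = src + "." + compiler + ".o"
    start = time.perf_counter()
    try:
        proc = subprocess.run([compiler] + FLAGS + [src, "-o", obj],
                              capture_output=True, timeout=30)
    except subprocess.TimeoutExpired:
        return None
    elapsed = time.perf_counter() - start
    if proc.returncode != 0:
        return None
    return elapsed, os.path.getsize(obj)

for compiler in COMPILERS:
    total_time, total_size, failures = 0.0, 0, 0
    for src in sorted(glob.glob("anghabench/*.c")):
        outcome = compile_one(compiler, src)
        if outcome is None:
            failures += 1
            continue
        seconds, size = outcome
        total_time += seconds
        total_size += size
    print("%s: %.1f s total, %d bytes of objects, %d failures"
          % (compiler, total_time, total_size, failures))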
Johannes Doerfert via llvm-dev
2020-Apr-06 23:30 UTC
[llvm-dev] Adding a new External Suite to test-suite
Hi Fernando,

On 4/6/20 1:53 PM, Fernando Magno Quintao Pereira via llvm-dev wrote:
> Hi Johannes,
>
>> I'd also like to know what the intention here is. What is tested and how?
>
> We have a few uses for these benchmarks in the technical report,
> http://lac.dcc.ufmg.br/pubs/TechReports/LaC_TechReport012020.pdf, but since
> then we have come up with other applications. All these programs produce
> object files without external dependencies. We have been using them to
> train a predictive compiler that reduces code size (the technical report
> has more about that). In addition, you can use them to compare compilation
> time, for instance, as Michael had asked. We have also used these
> benchmarks in two studies:
>
> 1) http://cuda.dcc.ufmg.br/angha/chordAnalysis
> 2) http://cuda.dcc.ufmg.br/angha/staticProperties
>
> A few other applications that I know about (outside our research group)
> include:
>
> * Comparing the size of code produced by three HLS tools: Intel HLS,
>   Vivado and LegUp.
> * Testing the Ultimate Buchi Automizer, to see which kinds of C constructs
>   it handles.
> * Comparing the compilation time of gcc vs. clang.
>
> A few other studies that I would like to carry out:
>
> * Checking the runtime of different C parsers that we have.
> * Trying to infer, empirically, the complexity of compiler analyses and
>   optimizations.

All the use cases sound reasonable, but why do we need these kinds of "weird
files" to do this?

I mean, why would you train or measure something on single-definition
translation units and not on the original ones, potentially one function at
a time?

To me this looks like a really good way to skew the input data set, e.g.,
you don't ever see a call that can be inlined or for which inter-procedural
reasoning is performed. As a consequence, each function is way smaller than
it would be in a real run, with all the consequences on the results obtained
from such benchmarks. Again, why can't we take the original programs
instead?

>> Looking at a few of these it seems there is not much you can do as it is
>> little code with a lot of unknown function calls and global symbols.
>
> Most of the programs are small (avg. 63 bytecodes, std. dev. 97); however,
> among these 1M C functions, we have a few large ones, with more than 40K
> bytecodes.

How many duplicates are there among the small functions? I mean, close to 1M
functions of such a small size (and with similar pro- and epilogues).

Cheers,

Johannes
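P.S.: to make the duplicate question concrete, even a crude check along the
lines of the sketch below would give a lower bound, by hashing the files
after collapsing whitespace (the "anghabench/*.c" path is made up; renamed
or re-typed copies would of course not be caught).

# Sketch: lower-bound the number of duplicates among the extracted C files
# by hashing each file with all whitespace collapsed.
import glob
import hashlib
from collections import Counter

def fingerprint(path):
    with open(path, "rb") as f:
        normalized = b" ".join(f.read().split())   # collapse whitespace
    return hashlib.sha256(normalized).hexdigest()

counts = Counter(fingerprint(p) for p in glob.glob("anghabench/*.c"))
duplicates = sum(n - 1 for n in counts.values() if n > 1)
print("%d files are identical (modulo whitespace) to some other file" % duplicates)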
Fernando Magno Quintao Pereira via llvm-dev
2020-Apr-07 00:24 UTC
[llvm-dev] Adding a new External Suite to test-suite
Hi Johannes,

> All the use cases sound reasonable, but why do we need these kinds of
> "weird files" to do this?
>
> I mean, why would you train or measure something on single-definition
> translation units and not on the original ones, potentially one function
> at a time?

I think that's the fundamental question :) The short answer is that it is
hard to compile the files from open-source repositories automatically. The
weird files that you mentioned appear due to the type inference that we run
on them. Let me give you some data and tell you the whole story.

One of the benchmark collections distributed on our website consists of
529,498 C functions and their respective LLVM bytecodes. It was produced as
follows. Out of the C files that we downloaded from open-source
repositories, we extracted 698,449 functions, with sizes varying from one
line to 45,263 lines of code (the largest one comes from Radare2's
assembler). Thus, we produced an initial code base of 698,449 C files, each
file containing a single function. We ran Psyche-C
(http://cuda.dcc.ufmg.br/psyche-c/) on this code base, with a timeout of 30
seconds per function. Psyche-C was able to reconstruct the dependencies of
529,498 functions, thus ensuring their compilation. Compilation here means
generating an object file out of the function.

Out of the 698,449 functions, 31,935 were directly compilable as-is, that
is, without type inference. (To check this, we invoke clang on a whole C
file; in case of success, we count as compilable every function with a body
within that file.) Hence, without type inference, we could ensure
compilation of 4.6% of the programs; with type inference, we could ensure
compilation of 75.8% of all the programs. Failures to reconstruct types were
mostly due to macros that are not syntactically valid C without
preprocessing. Only 3,666 functions could not be reconstructed within the
allotted 30-second time slot.

So, without type inference we can automatically compile only about 5% of the
functions that we download, even considering all the dependencies in the C
files where these functions live. Nevertheless, given that we can download
millions of functions, 5% is already enough to give us a non-negligible
number of benchmarks. However, these compilable functions tend to be very
small: the median number of LLVM bytecodes is seven (in contrast with more
than 60 once we use type inference). Such functions are unlikely to contain
features such as arrays of structs, type casts, recursive types,
double-pointer dereferences, etc.

> To me this looks like a really good way to skew the input data set, e.g.,
> you don't ever see a call that can be inlined or for which inter-procedural
> reasoning is performed. As a consequence, each function is way smaller than
> it would be in a real run, with all the consequences on the results
> obtained from such benchmarks. Again, why can't we take the original
> programs instead?

Well, in the end, using just the naturally compilable functions leads to
poor predictions. For instance, trained on those functions, YaCoS (the
framework that we have been using) reduces the size of MiBench's Bitcount by
10%, whereas trained on AnghaBench it achieves 16.9%. On Susan, the
naturally compilable functions lead to an increase in code size (5.4%),
whereas AnghaBench reduces size by 1.7%. Although there are MiBench
benchmarks where the naturally compilable functions lead to better code
reduction, such cases are rare, and the gains tend to be very close to those
obtained with AnghaBench.

About inlining, you are right: there will be no inlining.
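Coming back to the compilation numbers: the check itself does nothing more
than the sketch below (this is an illustration, not our actual pipeline; the
type-reconstruction step with Psyche-C is elided, and the directory name is
made up).

# Sketch: count how many single-function C files compile to an object file.
# Type reconstruction (Psyche-C) is assumed to have run already.
import glob
import subprocess

def compiles(src):
    """Return True if clang can turn this C file into an object file."""
    try:
        proc = subprocess.run(["clang", "-c", "-w", src, "-o", "/dev/null"],
                              capture_output=True, timeout=30)
    except subprocess.TimeoutExpired:
        return False
    return proc.returncode == 0

files = sorted(glob.glob("angha_functions/*.c"))
ok = sum(compiles(f) for f in files)
print("%d of %d single-function files compile to object code" % (ok, len(files)))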
To get around the lack of inlining, we also provide a database of 15K whole
files, i.e., files that may contain multiple functions. The programs are
available here:
http://cuda.dcc.ufmg.br/angha/files/suites/angha_wholefiles_all_15k.tar.gz

Regards,

Fernando