Dean Michael Berris via llvm-dev
2016-Sep-09 02:35 UTC
[llvm-dev] [XRay][RFC] Tooling for XRay Trace Analysis
> On 7 Sep 2016, at 01:21, David Blaikie <dblaikie at gmail.com> wrote:
>
> (sorry for the delay)
>

All good, thanks Dave!

> On Tue, Aug 23, 2016 at 1:05 AM Dean Michael Berris <dean.berris at gmail.com <mailto:dean.berris at gmail.com>> wrote:
> Hi llvm-dev,
>
> I've been implementing a tool for analysing XRay traces. A recap of XRay's original RFC [0] mentions a tool that does function call accounting as a starting point. This is currently implemented in D21987 [1], and is being reviewed by David Blaikie.
>
> One key issue in that review is the dependency between the log format determined by the XRay runtime implementation in compiler-rt [2] and the tool reading these log entries.
>
> While it seems obvious that we should clearly document the file format of the traces (even supporting different versions), there's a clear dependency between the writer (XRay in compiler-rt) and the reader (the tool under development in LLVM). In this RFC, I'd like to explore some options for coordinating these two moving pieces, which live in two places -- compiler-rt and the LLVM tools.
>
> # Problem Statement & Background
>
> XRay traces are only as useful as the analysis you can perform on them. While it's great to be able to look at stack traces, sometimes basic statistics and summaries are more digestible and give a more immediate picture of the operations performed by one run of a particular binary (or multiple runs of the same binary on different inputs). Recently I've shared some initial results of the analysis available in [1] on an instrumented build of Clang [3] -- and this is just one example of the kinds of analysis possible with the data. However, there's one wrinkle here:
>
> The analysis should be developed independently of the logging implementation.
>
> There are many reasons for doing this, and while it's certainly possible to implement a custom logging handler for XRay-instrumented binaries that generates statistics on the fly instead of logging the function calls, this increases the cost and friction of getting value out of using XRay.
>
> Given this constraint, here are a few problems:
>
> - The runtime library and the tools reading the log should have a common understanding of the log format. For now we use a naive binary dump log file format. We understand that there are platform and encoding issues that come with this (endianness being one, field sizes across platforms being another), but this could be mitigated with enough metadata at the beginning of the file to describe the encoding in a portable manner. Still, this is not easy, and more complex schemes impose a heavier cost on the runtime implementation.
> - The analysis tools should be able to read different executable file formats -- currently we only support 64-bit ELF. Some analyses would be much more useful if they could translate the function ids generated by the XRay runtime, and having the instrumentation map from the instrumented executable goes a long way towards converting function ids to function pointers, and eventually to demangled function names. This means the tool will have to support the multiple object file formats that XRay-instrumented binaries ought to eventually be ported to (COFF, MachO, and ELF).
> - Having the analysis work on common in-memory (or on-disk) data structures ensures maximum applicability.
> This means that even if the log file format changes, the analysis should still work as long as the log keeps at least the information the analyses require. For example, a hypothetical tool for generating just a graph of the function calls encountered in a trace, with counts, ought to be feasible without being tied to the format of the XRay trace being fed to the tool.
>
> This last requirement is a bit that I'm slightly confused by/trying to better understand. I could picture tools as taking a dependency on some LLVM API for reading the original, platform-specific, binary format. This would make the tool neutral to versioning and target.
>
> But I take it you mean (as detailed later) to have a separate format (could be a portable binary format, but currently discussing it as JSON/YAML/etc) that things are converted from that makes them portable?
>
> One of the reasons, I think you mentioned, is that while the log is already a separate file, you really want the instrumentation map along with it, and that's in the whole binary which you probably don't need. Am I following correctly?

So there are two pieces here:

1) The instrumentation map that's in the binary.
2) The log file written out by an XRay-instrumented binary.

For 1, we certainly should (and already do) rely on the LLVM libraries for dealing with all supported binary file formats.

For 2, we are currently using a very simple "naive" format which is just a binary dump of in-memory data onto disk. The main driver for this is the minimal cost in space and time in the XRay runtime. We'd like to minimise processing of the in-memory data from the XRay runtime's perspective, so we do the simplest thing that could possibly work here and write fixed-size records into a binary file. In the future we might have different log structures (non-fixed-size records as described in the whitepaper, records with different types, etc.), but those would still need to be minimal cost to store. This complicates the tool slightly, because it has to support all the log formats that are first-class supported by the XRay runtime.

Whether the instrumentation map is available is a separate issue -- it should be possible to take an XRay log/trace and run analysis against just the log file, and simply not get the translation from function id to function names and debugging info.

This binary log format will be platform-specific due to endianness and the sizes of the fields/records, and I'm saying the tool should be able to handle these log files too. Of course this means we need to indicate these details in the log file itself, which would be a little challenging if they stay in binary form (see the sketch after the summary below for one way a header could carry that metadata).

> Should this extraction then be an extract and merge? (creating a file containing a log and instrumentation map together in this generic format?)

It could be, but it really doesn't need to be -- the intent is that the tools should be able to work on the log file without the instrumentation map at all, albeit with reduced functionality (i.e. we can't get function names and debug info, but we can still get function ids).

> More concisely:
>
> 1. We ought to be able to share log writer/reader code between LLVM and compiler-rt.
> 2. Converting the trace format from platform-specific to platform-agnostic (and vice versa) ought to be possible.
> 3. The tooling ought to be extensible with more analysis implementations without being tied to the log format.
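To make the idea concrete, here is a minimal sketch of what a fixed-size record and a metadata-carrying file header for the naive binary log could look like. The layout is purely illustrative -- the actual format is whatever the XRay runtime in compiler-rt writes out, and the field names and sizes below are assumptions, not the real thing.

  // Hypothetical layout for illustration; the real format is defined by the
  // XRay runtime in compiler-rt and may differ in fields, sizes, and order.
  #include <cstdint>

  // Fixed-size per-event record, appended verbatim by the runtime so that
  // logging stays a simple memcpy/write with no per-record processing.
  struct RecordSketch {
    uint64_t TSC;         // timestamp counter at function entry/exit
    uint32_t FuncId;      // index into the binary's xray_instr_map
    uint32_t ThreadId;    // thread the event was recorded on
    uint16_t RecordType;  // e.g. 0 = function entry, 1 = function exit
    uint8_t  CPU;         // CPU the event was recorded on
    uint8_t  Reserved[5]; // explicit padding keeps the record exactly 24 bytes
  };
  static_assert(sizeof(RecordSketch) == 24, "records must stay fixed-size");

  // File header carrying just enough metadata for a reader on a different
  // platform to interpret the records that follow.
  struct FileHeaderSketch {
    uint16_t Version;        // log format version
    uint16_t Type;           // which record layout variant follows
    uint8_t  BigEndian;      // endianness of the multi-byte fields
    uint8_t  Reserved[3];
    uint64_t CycleFrequency; // TSC ticks per second, for converting to time
  };
  static_assert(sizeof(FileHeaderSketch) == 16, "header must stay fixed-size");

With something along these lines the runtime keeps its minimal-cost write path, while the header gives a reader enough information to cope with traces produced on a different platform.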
> # Proposed Solution
>
> In [1] I've gone ahead and implemented a tool, currently named 'llvm-xray', which supports sub-commands to do the following:
>
> - `llvm-xray extract <xray-instrumented binary>` : Converts the xray_instr_map in the binary into more human- and machine-readable text (currently does JSON, but I understand YAML is already better supported in LLVM).
> - `llvm-xray account <xray trace> -m <xray-instrumented binary>` : Performs function call accounting with basic statistics.
>
> In the near future, we're looking to extend this tool with the following (and similar) functionality:
>
> - `llvm-xray dump <xray trace> -format={yaml,json,...}` : Takes an xray trace and turns it into some human-readable text format.
> - `llvm-xray ingest <xray trace> -input-format={yaml,json...}` : Takes an xray trace in some human-readable text format and turns it into the binary format.
>
> What's the need for this direction? Only for LLVM test purposes? Other reasons?

Mostly for testing purposes, and for "portability". If for example we'd like to share these traces around in a neutral human-readable format (for whatever reason), then having them in these text formats is much more convenient to inspect and reason about (and snip into emails, serve on web pages, make searchable, etc.).

> (for DWARF for example, we just generate DWARF from existing code and test that, rather than having a separate/independent format for generating DWARF more directly - but we don't have complicated DWARF tools (we have llvm-dwp which is as close as we get, and in that case I just checked in binary object files along with the source used to create them)
>
> This is certainly an area of discussion - with tools like lld taking a few different approaches (including a YAML format for specifying object files, or using assembly files and just assembling them on the fly in test cases, or checking in binary object files). So there's no clear pre-existing answer in LLVM for this situation, for sure)

I agree -- it's the pioneer's curse, I guess; being there first is both good and bad. :) That said, I think this ability to round-trip is an important one in principle. And in the case of the XRay tooling, it seems essential so that we're able to write tests and inspect the binary traces in a human-readable format.

> - `llvm-xray stack <xray trace> -input-format=... -format=...` : Recreates stack traces from an xray trace.
> - `llvm-xray graph <xray trace> -input-format=...` : Creates a graph (in dot format) of the function call interactions from the trace file.
>
> This allows us to do a few things:
>
> 1. When testing xray in compiler-rt, use the "dump" tool to inspect the contents of the log generated from xray-instrumented binaries.
>
> Might be worth considering whether dumping for testability should be YAML/JSON or something else. (the DWARF and object dumping used in LLVM isn't in any such format - it's just a format designed for humans which works well enough for our FileCheck testing, etc)
>
> But if we need a format change to feed it in to other tools, then yes - testing on that format (rather than having a JSON/YAML then a separate dumping format) makes sense. I'm just trying to separate out the different requirements and what implications they have on the design, etc.

That's an interesting thought. I was trying to avoid defining yet another text format because there's already seemingly mature support for YAML I/O in LLVM.
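For a sense of what leaning on LLVM's YAML I/O could look like for a `dump`/`ingest`-style round trip, here is a minimal sketch. The record fields are made up for the example (they are not the actual XRay record layout, nor the code under review in [1]); the YAML I/O usage follows the pattern documented for llvm::yaml::MappingTraits.

  // Sketch: round-tripping a hypothetical trace record with LLVM's YAML I/O.
  #include "llvm/Support/YAMLTraits.h"
  #include "llvm/Support/raw_ostream.h"
  #include <cstdint>
  #include <string>
  #include <vector>

  struct RecordSketch {
    uint32_t FuncId;
    uint64_t TSC;
    uint32_t ThreadId;
    std::string Type; // "enter" or "exit"
  };

  namespace llvm {
  namespace yaml {
  template <> struct MappingTraits<RecordSketch> {
    static void mapping(IO &Io, RecordSketch &R) {
      Io.mapRequired("func-id", R.FuncId);
      Io.mapRequired("tsc", R.TSC);
      Io.mapRequired("thread", R.ThreadId);
      Io.mapRequired("type", R.Type);
    }
  };
  } // namespace yaml
  } // namespace llvm

  LLVM_YAML_IS_SEQUENCE_VECTOR(RecordSketch)

  int main() {
    std::vector<RecordSketch> Records = {{1, 10002, 42, "enter"},
                                         {1, 10750, 42, "exit"}};
    // The 'dump' direction: emit the records as YAML for humans / FileCheck.
    llvm::yaml::Output Out(llvm::outs());
    Out << Records;
    // The 'ingest' direction would use llvm::yaml::Input over the YAML text
    // and then re-encode the parsed records into the binary log format.
    return 0;
  }

The appeal over a bespoke text format is exactly that the serialization library is already in tree and already tested.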
JSON is just one of those other formats that's useful for interchange in this web-connected world; it's really a "nice to have" that people have pointed to as a good format to use. If I was going to trim this down, I'd settle on just YAML for the purposes of human consumption. If push came to shove, a custom text format unique to the XRay requirements. If at all possible, I'd like to just use the libraries already available in LLVM. :)

> Similarly, be able to synthesise xray binary traces in llvm lit tests using "ingest".
> 2. Extend the tool with more functionality without having to be gated on the definition of and/or implementation of the trace format. Since we can define the reader and writer implementation in one place, we can use the tool to enforce the format in regression tests (and as we evolve the format further, be able to support backward compatibility).
>
> # Proposed Plan of Action
>
> If the proposed solution is acceptable, the proposed plan of action is as follows (in chronological order):
>
> 0. Break up [1] into smaller pieces, starting with the base llvm-xray tool that literally "does nothing".
> 1. Implement the 'dump' and 'ingest' sub-commands as a single patch, with defined tests.
> 2. Update the logging implementation in [2] to use the 'dump' sub-command to test that entries in the log are what we expect them to be.
> 3. Implement the 'account' sub-command with tests seeded with data in lit tests.
> 4. Implement the 'stack' sub-command with tests seeded with data similar to #3.
> 5. Implement the 'graph' sub-command similar to #3.
>
> Note that we do not actually solve the issue of sharing the log writer/reader code between LLVM and compiler-rt directly, but rather sidestep it in the meantime using the tool.
>
> # Open Questions
>
> - Is it possible to define the writer code in LLVM and have the compiler-rt implementation depend on it? I hear that this is going to be useful for something like the profiling library in compiler-rt too, so that the reader and writer implementations are both in LLVM. What are the technical roadblocks there, and in your opinion is this something worth fixing/enabling?
>
> Sounds like other people have some ideas on that mentioned in the thread - again, not an area I'm especially familiar with.
>
> - What is the preferred human-readable text file format to support in LLVM? I understand that there's already code to support parsing YAML, so this might be an obvious choice. OTOH JSON is really popular and there are a lot of parsers in other languages that can already deal with this file format. I'm happy to support both but was wondering whether there was a preference for YAML aside from the reason I already cited?
>
> I really don't have much/any context here to make a judgement - I've vaguely seen the existing YAML usage & know there was/is some in LLD, maybe some being used over in the codeview debug info support (for generating codeview debug info).
>
> - This proposal only talks about the tool itself, but the implementation of the tool involves some moving parts that are worth implementing as libraries and testing in isolation (or in combination, some mocked and faked, etc.). I'm a fan of writing unit tests for these things, but I don't see a unittests/tools directory for this kind of tool-internals testing. Is this something worth having? Any pointers on how to proceed with unit-testing of tool-specific internals?
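As an aside on the 'account' sub-command in the plan above, the core of such an analysis can be sketched as pairing entry and exit records per thread and accumulating counts and cumulative time per function. This is an illustration only, not the implementation in [1]; the record here is a simplified, already-decoded form in the spirit of the hypothetical records sketched earlier.

  // Sketch: per-function call accounting over a decoded trace.
  #include <cstdint>
  #include <map>
  #include <utility>
  #include <vector>

  struct Record {
    uint32_t FuncId;
    uint64_t TSC;
    uint32_t ThreadId;
    bool IsEntry;
  };

  struct FunctionStats {
    uint64_t Count = 0;            // completed calls observed
    uint64_t CumulativeCycles = 0; // total TSC delta across those calls
  };

  std::map<uint32_t, FunctionStats> account(const std::vector<Record> &Trace) {
    std::map<uint32_t, FunctionStats> Stats;
    // One shadow stack of (function id, entry TSC) per thread.
    std::map<uint32_t, std::vector<std::pair<uint32_t, uint64_t>>> Stacks;
    for (const Record &R : Trace) {
      auto &Stack = Stacks[R.ThreadId];
      if (R.IsEntry) {
        Stack.push_back({R.FuncId, R.TSC});
      } else if (!Stack.empty() && Stack.back().first == R.FuncId) {
        Stats[R.FuncId].Count += 1;
        Stats[R.FuncId].CumulativeCycles += R.TSC - Stack.back().second;
        Stack.pop_back();
      }
      // Unmatched exits (e.g. from truncated logs) are ignored in this sketch.
    }
    return Stats;
  }

A 'stack' sub-command would keep the same per-thread shadow stacks but print them rather than collapsing them into aggregates.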
> Generally we make the tools small and put any generically usable code in libraries in LLVM (see libDebugInfo, which was used for quite a while (& parts of it still are) exclusively for llvm-dwarfdump; some parts are now used in llvm-dwp and llvm-dsymutil).
>
> So if there's some reasonable library code you could put it in LLVM's lib directory in an appropriate spot. Or you can add unit tests for a tool - don't think there's any philosophical reason that'd be avoided.

That's a thought. We could make the log reading code a library that anybody should be able to use. If it was going to be a new library under LLVM, do I just create a directory plus CMake directives under include/llvm/ and move as much of the logic there as possible? Or put it, say, in include/llvm/Support/ ?

-- Dean
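For what it's worth, a library along those lines might expose an interface shaped roughly like the following. The names and namespace are hypothetical; this is just a shape to discuss, not an existing or agreed LLVM API.

  // Hypothetical interface shape only; not an existing LLVM library.
  #include "llvm/ADT/StringRef.h"
  #include "llvm/Support/Error.h"
  #include <cstdint>
  #include <vector>

  namespace xraytool {

  // One trace event, decoded out of whatever on-disk format was detected.
  struct TraceRecord {
    uint32_t FuncId;
    uint64_t TSC;
    uint32_t ThreadId;
    bool IsEntry;
  };

  // A loaded trace, independent of the on-disk encoding; analyses iterate
  // over records without knowing which log version produced them.
  using Trace = std::vector<TraceRecord>;

  // Reads the file header, detects the log version and endianness, and
  // dispatches to the matching reader; errors out on unknown/corrupt logs.
  llvm::Expected<Trace> loadTraceFile(llvm::StringRef Filename);

  // A separate, optional step would load the instrumentation map from the
  // instrumented binary (via LLVM's object libraries) to map function ids
  // to addresses and, eventually, symbol names.

  } // namespace xraytool

Keeping the readers behind an interface like this is also what would let lit tests and unit tests exercise the format handling without going through the llvm-xray front end.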
Dean Michael Berris via llvm-dev
2016-Sep-09 06:34 UTC
[llvm-dev] [XRay][RFC] Tooling for XRay Trace Analysis
> On 9 Sep 2016, at 12:35, Dean Michael Berris <dean.berris at gmail.com> wrote:
>
>> On 7 Sep 2016, at 01:21, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> But I take it you mean (as detailed later) to have a separate format (could be a portable binary format, but currently discussing it as JSON/YAML/etc) that things are converted from that makes them portable?
>>

One thing worth mentioning in this regard is that maybe we can use FlatBuffers (https://google.github.io/flatbuffers/) for the XRay log.

FlatBuffers is Apache 2 licensed though, and if we're going to use it in compiler-rt, the question is whether that raises issues when it's embedded in other people's applications.

This might potentially get us around the non-sharing of code if we're able to keep the flatbuffer definitions in sync at least across compiler-rt and LLVM. Also, it might be useful to use flatbuffers for other things in LLVM as well, so that might be something worth exploring (the bitcode comes to mind, which was being discussed on IRC recently).

I am not a lawyer nor do I play one on the Internet, so I'll let Danny Berlin chime in on whether/how we can use FlatBuffers for XRay.

-- Dean
David Blaikie via llvm-dev
2016-Sep-09 16:05 UTC
[llvm-dev] [XRay][RFC] Tooling for XRay Trace Analysis
On Thu, Sep 8, 2016 at 11:34 PM Dean Michael Berris <dean.berris at gmail.com> wrote:
>
> One thing worth mentioning in this regard is that maybe we can use FlatBuffers (https://google.github.io/flatbuffers/) for the XRay log.

For the secondary form (instead of YAML/JSON) or the primary form (the thing compiler-rt writes out)? Or as a form that would cover both use cases without the need to convert from one format to another?

> FlatBuffers is Apache 2 licensed though, and if we're going to use it in compiler-rt, the question is whether that raises issues when it's embedded in other people's applications.
>
> This might potentially get us around the non-sharing of code if we're able to keep the flatbuffer definitions in sync at least across compiler-rt and LLVM. Also, it might be useful to use flatbuffers for other things in LLVM as well, so that might be something worth exploring (the bitcode comes to mind, which was being discussed on IRC recently).
>
> I am not a lawyer nor do I play one on the Internet, so I'll let Danny Berlin chime in on whether/how we can use FlatBuffers for XRay.
>
> -- Dean