thr3ads.net - llvm dev - [llvm-dev] Linker option to dump dependency graph [Feb 2019]

If this information is useful, please help other people find it:
Share via:

Rui Ueyama via llvm-dev

2019-Feb-27 00:05 UTC

[llvm-dev] Linker option to dump dependency graph

On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer <bigcheesegs at gmail.com>
wrote:
> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi,
>>
>> I've heard people say that they want to analyze dependencies
between
>> object files at the linker level so that they can run a whole-program
>> analysis which cannot be done at the compiler that works for one
>> compilation unit at a time. I'd like to start a discussion as to
what we
>> can do with it and how to make it possible. I'm also sharing my
idea about
>> how to make it possible.
>>
>> *Dependency analyses*
>> First, let me start with a few examples of analyses I'm heard of or
>> thinking about. Dependencies between object files can be represented as
a
>> graph where vertices are input sections and edges are symbols and
>> relocations. Analyses would work on the dependency graph. Examples of
>> analyses include but not limited to the following:
>>
>>  - Figure out why some library or an object file gets linked.
>>
>>  - Finding a candidate to eliminate dependency by finding a
"weak" link
>> to a library. We can for example say the dependency to a library is
>> *weak* if the library in the graph can be unreachable if we remove *N*
edges
>> from the graph (which is likely to correspond to removing *N* function
>> calls from the code), where *N* is a small number.
>>
>>  - Understanding which of new dependencies increase the executable size
>> the most, compare to a previous build.
>>
>>  - Finding bad or circular dependencies between sub-components.
>>
>> There would be many more analyses you want to run at the linker input
>> level. Currently, lld doesn't actively support such analyses. There
are a
>> few options to make the linker emit dependency information (e.g. --cref
or
>> -Map), but the output of the options is not comprehensive; you cannot
>> reconstruct a dependency graph from the output of the options.
>>
>> *Dumping dependency graph*
>> So, I'm thinking if it would be desirable to add a new feature to
the
>> linker to dump an entire dependency graph in such a way that a graph
can be
>> reconstructed by reading it back. Once we have such feature, we can
link a
>> program with the feature enabled and run any kind of dependency
analysis on
>> the output. You can save dumps to compare to previous builds. You can
run
>> any number of analyses on a dump, instead of invoking the linker for
each
>> analysis.
>>
>> I don't have a concrete idea about the file output format, but I
believe
>> it is essentially enough to emit triplets of (<from input
section>,
>> <symbol>, <to input section>), which represents an edge, to
reconstruct a
>> graph.
>>
>> Thoughts?
>>
>
> Back when I worked on the linker I pretty much always had a way to dump a
> graphviz dot file to look at things.  Pretty much every graph library/tool
> can read dot files, and they are easy to hack up a parser for.  You can
> also add attributes to nodes and edges to store arbitrary data.
>
That's an interesting idea.

As for what to put it in, it really depends on how detailed it needs
to> be.  Should symbols and sections be collapsed together?  Should it include
> relocation types? Symbol types/binding/size/etc?
>
 Maybe everything? We can for example emit all symbols and input sections
first, and then emit a graph as the second half of the output. E.g.

Symbols:
  <list of symbols>
Sections:
  <list of sections>
Graph:
 1 2 3  // 1st section depends on 3rd section via 2nd symbol
 5 1 4  // likewise
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190226/3e2d5fbd/attachment.html>

Michael Spencer via llvm-dev

2019-Feb-27 00:31 UTC

head link

[llvm-dev] Linker option to dump dependency graph

On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com> wrote:
> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer <bigcheesegs at
gmail.com>
> wrote:
>
>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi,
>>>
>>> I've heard people say that they want to analyze dependencies
between
>>> object files at the linker level so that they can run a
whole-program
>>> analysis which cannot be done at the compiler that works for one
>>> compilation unit at a time. I'd like to start a discussion as
to what we
>>> can do with it and how to make it possible. I'm also sharing my
idea about
>>> how to make it possible.
>>>
>>> *Dependency analyses*
>>> First, let me start with a few examples of analyses I'm heard
of or
>>> thinking about. Dependencies between object files can be
represented as a
>>> graph where vertices are input sections and edges are symbols and
>>> relocations. Analyses would work on the dependency graph. Examples
of
>>> analyses include but not limited to the following:
>>>
>>>  - Figure out why some library or an object file gets linked.
>>>
>>>  - Finding a candidate to eliminate dependency by finding a
"weak" link
>>> to a library. We can for example say the dependency to a library is
>>> *weak* if the library in the graph can be unreachable if we remove
*N* edges
>>> from the graph (which is likely to correspond to removing *N*
function
>>> calls from the code), where *N* is a small number.
>>>
>>>  - Understanding which of new dependencies increase the executable
size
>>> the most, compare to a previous build.
>>>
>>>  - Finding bad or circular dependencies between sub-components.
>>>
>>> There would be many more analyses you want to run at the linker
input
>>> level. Currently, lld doesn't actively support such analyses.
There are a
>>> few options to make the linker emit dependency information (e.g.
--cref or
>>> -Map), but the output of the options is not comprehensive; you
cannot
>>> reconstruct a dependency graph from the output of the options.
>>>
>>> *Dumping dependency graph*
>>> So, I'm thinking if it would be desirable to add a new feature
to the
>>> linker to dump an entire dependency graph in such a way that a
graph can be
>>> reconstructed by reading it back. Once we have such feature, we can
link a
>>> program with the feature enabled and run any kind of dependency
analysis on
>>> the output. You can save dumps to compare to previous builds. You
can run
>>> any number of analyses on a dump, instead of invoking the linker
for each
>>> analysis.
>>>
>>> I don't have a concrete idea about the file output format, but
I believe
>>> it is essentially enough to emit triplets of (<from input
section>,
>>> <symbol>, <to input section>), which represents an
edge, to reconstruct a
>>> graph.
>>>
>>> Thoughts?
>>>
>>
>> Back when I worked on the linker I pretty much always had a way to dump
a
>> graphviz dot file to look at things.  Pretty much every graph
library/tool
>> can read dot files, and they are easy to hack up a parser for.  You can
>> also add attributes to nodes and edges to store arbitrary data.
>>
>
> That's an interesting idea.
>
> As for what to put it in, it really depends on how detailed it needs to
>> be.  Should symbols and sections be collapsed together?  Should it
include
>> relocation types? Symbol types/binding/size/etc?
>>
>
>  Maybe everything? We can for example emit all symbols and input sections
> first, and then emit a graph as the second half of the output. E.g.
>
> Symbols:
>   <list of symbols>
> Sections:
>   <list of sections>
> Graph:
>  1 2 3  // 1st section depends on 3rd section via 2nd symbol
>  5 1 4  // likewise
>
I suppose it's a question of if we want users to need to also read the
inputs if they want things like section size and other section/symbol
attributes.  It would be pretty trivial to include that data as long as we
have a format/syntax for it.

dot supports listing nodes first with attributes and then referring to them
by name later when listing edges.

- Michael Spencer
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190226/98b7748a/attachment.html>

Peter Smith via llvm-dev

2019-Feb-27 10:37 UTC

head link

[llvm-dev] Linker option to dump dependency graph

Hello,

I think outputting a dependency graph is a good idea and would enable
some offline analysis. I think that there is some advantage to
building some of the simpler ones in, particularly those that would
need heavy annotations to the dependency graph, in particular unless
we write a sample analysis tool that ships with the release, many
users are going to miss out on useful features as they aren't going to
have the time to build one. I've put some comments inline:

On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>
> On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com>
wrote:
>>
>> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer <bigcheesegs at
gmail.com> wrote:
>>>
>>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I've heard people say that they want to analyze
dependencies between object files at the linker level so that they can run a
whole-program analysis which cannot be done at the compiler that works for one
compilation unit at a time. I'd like to start a discussion as to what we can
do with it and how to make it possible. I'm also sharing my idea about how
to make it possible.
>>>>
>>>> Dependency analyses
>>>> First, let me start with a few examples of analyses I'm
heard of or thinking about. Dependencies between object files can be represented
as a graph where vertices are input sections and edges are symbols and
relocations. Analyses would work on the dependency graph. Examples of analyses
include but not limited to the following:
>>>>
>>>>  - Figure out why some library or an object file gets linked.
>>>>
Arm's proprietary linker has a very helpful feature in verbose mode
where it will report on object loading: global/weak definitions and
global/weak references. For libraries you'd get a message like
selecting member.o from library.a to define symbol S. This resulted in
quite an effective trace of the linker output that could answer most
"why did this library and object file get loaded question?" One thing
a dependency graph might not capture is the order in which events
occur, this can be very useful when debugging problems caused by
library selection order.
>>>>  - Finding a candidate to eliminate dependency by finding a
"weak" link to a library. We can for example say the dependency to a
library is weak if the library in the graph can be unreachable if we remove N
edges from the graph (which is likely to correspond to removing N function calls
from the code), where N is a small number.
>>>>
>>>>  - Understanding which of new dependencies increase the
executable size the most, compare to a previous build.
>>>>
Arm's linker, being focused on embedded systems has a useful feature
that summarises the amount of content taken from each object broken
down into code, ro-data, rw-date etc. This can be helpful in the face
of comdat group elimination and optimisations such as garbage
collection and ICF that can be difficult to predict from a dependency
graph. It is true that this information could be added as attributes
but again it may just be easier to write a simple analysis pass over
the output in the linker.
>>>>  - Finding bad or circular dependencies between sub-components.
>>>>
>>>> There would be many more analyses you want to run at the linker
input level. Currently, lld doesn't actively support such analyses. There
are a few options to make the linker emit dependency information (e.g. --cref or
-Map), but the output of the options is not comprehensive; you cannot
reconstruct a dependency graph from the output of the options.

>>>>
>>>> Dumping dependency graph
>>>> So, I'm thinking if it would be desirable to add a new
feature to the linker to dump an entire dependency graph in such a way that a
graph can be reconstructed by reading it back. Once we have such feature, we can
link a program with the feature enabled and run any kind of dependency analysis
on the output. You can save dumps to compare to previous builds. You can run any
number of analyses on a dump, instead of invoking the linker for each analysis.
>>>>
>>>> I don't have a concrete idea about the file output format,
but I believe it is essentially enough to emit triplets of (<from input
section>, <symbol>, <to input section>), which represents an
edge, to reconstruct a graph.
>>>>
>>>> Thoughts?
>>>
>>>
>>> Back when I worked on the linker I pretty much always had a way to
dump a graphviz dot file to look at things.  Pretty much every graph
library/tool can read dot files, and they are easy to hack up a parser for.  You
can also add attributes to nodes and edges to store arbitrary data.
>>
>>
>> That's an interesting idea.
>>
>>> As for what to put it in, it really depends on how detailed it
needs to be.  Should symbols and sections be collapsed together?  Should it
include relocation types? Symbol types/binding/size/etc?
>>
>>
>>  Maybe everything? We can for example emit all symbols and input
sections first, and then emit a graph as the second half of the output. E.g.
>>
>> Symbols:
>>   <list of symbols>
>> Sections:
>>   <list of sections>
>> Graph:
>>  1 2 3  // 1st section depends on 3rd section via 2nd symbol
>>  5 1 4  // likewise
>
>
> I suppose it's a question of if we want users to need to also read the
inputs if they want things like section size and other section/symbol
attributes.  It would be pretty trivial to include that data as long as we have
a format/syntax for it.
>
> dot supports listing nodes first with attributes and then referring to them
by name later when listing edges.
>
> - Michael Spencer
>
I've experimented with dot files for this type of thing in the past.
The difficulty is that they get too large to be realistically viewed
very quickly. At that point you need to write scripts to process the
output and in that case you may as well use JSON or XML, which I guess
could easily be processed into dot files. To summarise, I think we may
be able to do quite well with some very simple extra analysis in LLD,
a machine readable dependency graph would also be very useful for the
more complex cases.

Peter

 _______________________________________________> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

llvm dev - Feb 2019 - Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph