thr3ads.net - llvm dev - [llvm-dev] Linker option to dump dependency graph [Feb 2019]

If this information is useful, please help other people find it:
Share via:

Rui Ueyama via llvm-dev

2019-Feb-26 22:23 UTC

[llvm-dev] Linker option to dump dependency graph

Hi,

I've heard people say that they want to analyze dependencies between object
files at the linker level so that they can run a whole-program analysis
which cannot be done at the compiler that works for one compilation unit at
a time. I'd like to start a discussion as to what we can do with it and how
to make it possible. I'm also sharing my idea about how to make it possible.

*Dependency analyses*
First, let me start with a few examples of analyses I'm heard of or
thinking about. Dependencies between object files can be represented as a
graph where vertices are input sections and edges are symbols and
relocations. Analyses would work on the dependency graph. Examples of
analyses include but not limited to the following:

 - Figure out why some library or an object file gets linked.

 - Finding a candidate to eliminate dependency by finding a "weak"
link to
a library. We can for example say the dependency to a library is *weak* if
the library in the graph can be unreachable if we remove *N* edges from the
graph (which is likely to correspond to removing *N* function calls from
the code), where *N* is a small number.

 - Understanding which of new dependencies increase the executable size the
most, compare to a previous build.

 - Finding bad or circular dependencies between sub-components.

There would be many more analyses you want to run at the linker input
level. Currently, lld doesn't actively support such analyses. There are a
few options to make the linker emit dependency information (e.g. --cref or
-Map), but the output of the options is not comprehensive; you cannot
reconstruct a dependency graph from the output of the options.

*Dumping dependency graph*
So, I'm thinking if it would be desirable to add a new feature to the
linker to dump an entire dependency graph in such a way that a graph can be
reconstructed by reading it back. Once we have such feature, we can link a
program with the feature enabled and run any kind of dependency analysis on
the output. You can save dumps to compare to previous builds. You can run
any number of analyses on a dump, instead of invoking the linker for each
analysis.

I don't have a concrete idea about the file output format, but I believe it
is essentially enough to emit triplets of (<from input section>,
<symbol>,
<to input section>), which represents an edge, to reconstruct a graph.

Thoughts?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190226/b3fd9b27/attachment.html>

Michael Spencer via llvm-dev

2019-Feb-26 23:31 UTC

head link

[llvm-dev] Linker option to dump dependency graph

On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> I've heard people say that they want to analyze dependencies between
> object files at the linker level so that they can run a whole-program
> analysis which cannot be done at the compiler that works for one
> compilation unit at a time. I'd like to start a discussion as to what
we
> can do with it and how to make it possible. I'm also sharing my idea
about
> how to make it possible.
>
> *Dependency analyses*
> First, let me start with a few examples of analyses I'm heard of or
> thinking about. Dependencies between object files can be represented as a
> graph where vertices are input sections and edges are symbols and
> relocations. Analyses would work on the dependency graph. Examples of
> analyses include but not limited to the following:
>
>  - Figure out why some library or an object file gets linked.
>
>  - Finding a candidate to eliminate dependency by finding a
"weak" link to
> a library. We can for example say the dependency to a library is *weak*
> if the library in the graph can be unreachable if we remove *N* edges
> from the graph (which is likely to correspond to removing *N* function
> calls from the code), where *N* is a small number.
>
>  - Understanding which of new dependencies increase the executable size
> the most, compare to a previous build.
>
>  - Finding bad or circular dependencies between sub-components.
>
> There would be many more analyses you want to run at the linker input
> level. Currently, lld doesn't actively support such analyses. There are
a
> few options to make the linker emit dependency information (e.g. --cref or
> -Map), but the output of the options is not comprehensive; you cannot
> reconstruct a dependency graph from the output of the options.
>
> *Dumping dependency graph*
> So, I'm thinking if it would be desirable to add a new feature to the
> linker to dump an entire dependency graph in such a way that a graph can be
> reconstructed by reading it back. Once we have such feature, we can link a
> program with the feature enabled and run any kind of dependency analysis on
> the output. You can save dumps to compare to previous builds. You can run
> any number of analyses on a dump, instead of invoking the linker for each
> analysis.
>
> I don't have a concrete idea about the file output format, but I
believe
> it is essentially enough to emit triplets of (<from input section>,
> <symbol>, <to input section>), which represents an edge, to
reconstruct a
> graph.
>
> Thoughts?
>
Back when I worked on the linker I pretty much always had a way to dump a
graphviz dot file to look at things.  Pretty much every graph library/tool
can read dot files, and they are easy to hack up a parser for.  You can
also add attributes to nodes and edges to store arbitrary data.

As for what to put it in, it really depends on how detailed it needs to
be.  Should symbols and sections be collapsed together?  Should it include
relocation types? Symbol types/binding/size/etc?

- Michael Spencer
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190226/2a121a29/attachment.html>

Rui Ueyama via llvm-dev

2019-Feb-27 00:05 UTC

head link

[llvm-dev] Linker option to dump dependency graph

On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer <bigcheesegs at gmail.com>
wrote:
> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi,
>>
>> I've heard people say that they want to analyze dependencies
between
>> object files at the linker level so that they can run a whole-program
>> analysis which cannot be done at the compiler that works for one
>> compilation unit at a time. I'd like to start a discussion as to
what we
>> can do with it and how to make it possible. I'm also sharing my
idea about
>> how to make it possible.
>>
>> *Dependency analyses*
>> First, let me start with a few examples of analyses I'm heard of or
>> thinking about. Dependencies between object files can be represented as
a
>> graph where vertices are input sections and edges are symbols and
>> relocations. Analyses would work on the dependency graph. Examples of
>> analyses include but not limited to the following:
>>
>>  - Figure out why some library or an object file gets linked.
>>
>>  - Finding a candidate to eliminate dependency by finding a
"weak" link
>> to a library. We can for example say the dependency to a library is
>> *weak* if the library in the graph can be unreachable if we remove *N*
edges
>> from the graph (which is likely to correspond to removing *N* function
>> calls from the code), where *N* is a small number.
>>
>>  - Understanding which of new dependencies increase the executable size
>> the most, compare to a previous build.
>>
>>  - Finding bad or circular dependencies between sub-components.
>>
>> There would be many more analyses you want to run at the linker input
>> level. Currently, lld doesn't actively support such analyses. There
are a
>> few options to make the linker emit dependency information (e.g. --cref
or
>> -Map), but the output of the options is not comprehensive; you cannot
>> reconstruct a dependency graph from the output of the options.
>>
>> *Dumping dependency graph*
>> So, I'm thinking if it would be desirable to add a new feature to
the
>> linker to dump an entire dependency graph in such a way that a graph
can be
>> reconstructed by reading it back. Once we have such feature, we can
link a
>> program with the feature enabled and run any kind of dependency
analysis on
>> the output. You can save dumps to compare to previous builds. You can
run
>> any number of analyses on a dump, instead of invoking the linker for
each
>> analysis.
>>
>> I don't have a concrete idea about the file output format, but I
believe
>> it is essentially enough to emit triplets of (<from input
section>,
>> <symbol>, <to input section>), which represents an edge, to
reconstruct a
>> graph.
>>
>> Thoughts?
>>
>
> Back when I worked on the linker I pretty much always had a way to dump a
> graphviz dot file to look at things.  Pretty much every graph library/tool
> can read dot files, and they are easy to hack up a parser for.  You can
> also add attributes to nodes and edges to store arbitrary data.
>
That's an interesting idea.

As for what to put it in, it really depends on how detailed it needs
to> be.  Should symbols and sections be collapsed together?  Should it include
> relocation types? Symbol types/binding/size/etc?
>
 Maybe everything? We can for example emit all symbols and input sections
first, and then emit a graph as the second half of the output. E.g.

Symbols:
  <list of symbols>
Sections:
  <list of sections>
Graph:
 1 2 3  // 1st section depends on 3rd section via 2nd symbol
 5 1 4  // likewise
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190226/3e2d5fbd/attachment.html>

Peter Collingbourne via llvm-dev

2019-Feb-27 21:36 UTC

head link

[llvm-dev] Linker option to dump dependency graph

On Tue, Feb 26, 2019 at 2:24 PM Rui Ueyama via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> I've heard people say that they want to analyze dependencies between
> object files at the linker level so that they can run a whole-program
> analysis which cannot be done at the compiler that works for one
> compilation unit at a time. I'd like to start a discussion as to what
we
> can do with it and how to make it possible. I'm also sharing my idea
about
> how to make it possible.
>
> *Dependency analyses*
> First, let me start with a few examples of analyses I'm heard of or
> thinking about. Dependencies between object files can be represented as a
> graph where vertices are input sections and edges are symbols and
> relocations. Analyses would work on the dependency graph. Examples of
> analyses include but not limited to the following:
>
>  - Figure out why some library or an object file gets linked.
>
>  - Finding a candidate to eliminate dependency by finding a
"weak" link to
> a library. We can for example say the dependency to a library is *weak*
> if the library in the graph can be unreachable if we remove *N* edges
> from the graph (which is likely to correspond to removing *N* function
> calls from the code), where *N* is a small number.
>
>  - Understanding which of new dependencies increase the executable size
> the most, compare to a previous build.
>
>  - Finding bad or circular dependencies between sub-components.
>
> There would be many more analyses you want to run at the linker input
> level. Currently, lld doesn't actively support such analyses. There are
a
> few options to make the linker emit dependency information (e.g. --cref or
> -Map), but the output of the options is not comprehensive; you cannot
> reconstruct a dependency graph from the output of the options.
>
> *Dumping dependency graph*
> So, I'm thinking if it would be desirable to add a new feature to the
> linker to dump an entire dependency graph in such a way that a graph can be
> reconstructed by reading it back. Once we have such feature, we can link a
> program with the feature enabled and run any kind of dependency analysis on
> the output. You can save dumps to compare to previous builds. You can run
> any number of analyses on a dump, instead of invoking the linker for each
> analysis.
>
> I don't have a concrete idea about the file output format, but I
believe
> it is essentially enough to emit triplets of (<from input section>,
> <symbol>, <to input section>), which represents an edge, to
reconstruct a
> graph.
>
A couple of points:

- Symbols are not the only kind of dependency edge from one section to
another -- there's also SHF_LINK_ORDER. Maybe we can just call the edge
"SHF_LINK_ORDER" in that case.
- Do we want to mark up the GC roots in some way? I imagine that we could
do that with a synthetic node that represents the GC root, and then have
all roots include edges from it. With my proposal for partitioning, perhaps
we could have one GC root node per partition.

Peter

> Thoughts?
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190227/54897992/attachment-0001.html>

Rui Ueyama via llvm-dev

2019-Feb-28 00:40 UTC

head link

[llvm-dev] Linker option to dump dependency graph

On Wed, Feb 27, 2019 at 1:37 PM Peter Collingbourne <peter at pcc.me.uk>
wrote:
>
>
> On Tue, Feb 26, 2019 at 2:24 PM Rui Ueyama via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi,
>>
>> I've heard people say that they want to analyze dependencies
between
>> object files at the linker level so that they can run a whole-program
>> analysis which cannot be done at the compiler that works for one
>> compilation unit at a time. I'd like to start a discussion as to
what we
>> can do with it and how to make it possible. I'm also sharing my
idea about
>> how to make it possible.
>>
>> *Dependency analyses*
>> First, let me start with a few examples of analyses I'm heard of or
>> thinking about. Dependencies between object files can be represented as
a
>> graph where vertices are input sections and edges are symbols and
>> relocations. Analyses would work on the dependency graph. Examples of
>> analyses include but not limited to the following:
>>
>>  - Figure out why some library or an object file gets linked.
>>
>>  - Finding a candidate to eliminate dependency by finding a
"weak" link
>> to a library. We can for example say the dependency to a library is
>> *weak* if the library in the graph can be unreachable if we remove *N*
edges
>> from the graph (which is likely to correspond to removing *N* function
>> calls from the code), where *N* is a small number.
>>
>>  - Understanding which of new dependencies increase the executable size
>> the most, compare to a previous build.
>>
>>  - Finding bad or circular dependencies between sub-components.
>>
>> There would be many more analyses you want to run at the linker input
>> level. Currently, lld doesn't actively support such analyses. There
are a
>> few options to make the linker emit dependency information (e.g. --cref
or
>> -Map), but the output of the options is not comprehensive; you cannot
>> reconstruct a dependency graph from the output of the options.
>>
>> *Dumping dependency graph*
>> So, I'm thinking if it would be desirable to add a new feature to
the
>> linker to dump an entire dependency graph in such a way that a graph
can be
>> reconstructed by reading it back. Once we have such feature, we can
link a
>> program with the feature enabled and run any kind of dependency
analysis on
>> the output. You can save dumps to compare to previous builds. You can
run
>> any number of analyses on a dump, instead of invoking the linker for
each
>> analysis.
>>
>> I don't have a concrete idea about the file output format, but I
believe
>> it is essentially enough to emit triplets of (<from input
section>,
>> <symbol>, <to input section>), which represents an edge, to
reconstruct a
>> graph.
>>
>
> A couple of points:
>
> - Symbols are not the only kind of dependency edge from one section to
> another -- there's also SHF_LINK_ORDER. Maybe we can just call the edge
> "SHF_LINK_ORDER" in that case.
> - Do we want to mark up the GC roots in some way? I imagine that we could
> do that with a synthetic node that represents the GC root, and then have
> all roots include edges from it. With my proposal for partitioning, perhaps
> we could have one GC root node per partition.
>
I think we should mark up the GC root in some way. One thing to note is
that not only input sections but also symbols can be GC root. In terms of
the graph, both edge and vertex should have a "GC root" bit.

> Peter
>
>
>> Thoughts?
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
>
> --
> --
> Peter
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190227/91b326ac/attachment.html>

llvm dev - Feb 2019 - Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph