thr3ads.net - llvm dev - [llvm-dev] Linker option to dump dependency graph [Mar 2019]

If this information is useful, please help other people find it:
Share via:

Peter Collingbourne via llvm-dev

2019-Mar-01 05:18 UTC

[llvm-dev] Linker option to dump dependency graph

You might have realized this already but it's probably not a good idea to
use InputSection::Relocations for this because that ends up missing
anything that becomes a dynamic relocation. I reckon that the code should
be doing exactly what MarkLive.cpp is doing.

Peter

On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I hacked up a patch to make lld output a dependency graph in the graphviz
> "dot" format.
>
> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa -- this
> is a patch, which is just visiting all input sections and relocations. Note
> that this is far from completion but just a proof-of-concept.
>
> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is a
> result of static-linking a "hello world" program.
>
> https://rui314.github.io/hello.svg  -- I rendered the above dot file with
> graphviz `sfdp` engine. The rendered graph is too large and very hard to
> read. Apparently, I need a better visualization tool.
>
> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at
google.com> wrote:
>
>> +1 for graphviz dot format, so that it can be consumed by any one of
many
>> existing graph visualization tools.
>>
>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> >To summarise, I think we may
>>> > be able to do quite well with some very simple extra analysis
in LLD,
>>> > a machine readable dependency graph would also be very useful
for the
>>> > more complex cases.
>>>
>>> Strongly agree. The linker based dependency graph would be very
useful
>>> for Uefi firmware. Below are my usage examples:
>>> 1. I need to detect the redundant code in my firmware, and I once
wrote
>>> a analysis tool to compare the IR level symbols and call graph info
before
>>> any optimization and after full optimization (e.g. LTO). But the IR
level
>>> info does not support assembly code info well. So, there are many
>>> dependency information missing and false positive in my analysis
tool. It
>>> will be more sound if the linker can help output complete and
accurate
>>> dependency graph for final executable.
>>> 2. I need a tool to analyze and  track the firmware module accurate
>>> dependency for build cache soundness. Build performance is now a
pain point
>>> in our CI system because every patch need to verify on many build
targets
>>> in our side. We hope to enable the build cache (both module level
and file
>>> level) to accelerate the build time. For module level build cache
enabling,
>>> a very important problem is how to know the module's accurate
dependency
>>> efficiently. I'm looking forward to the linker based dependency
graph
>>> feature.
>>>
>>>
>>> Thanks
>>> Steven
>>> > -----Original Message-----
>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On
Behalf Of
>>> Peter
>>> > Smith via llvm-dev
>>> > Sent: Wednesday, February 27, 2019 6:37 PM
>>> > To: Michael Spencer <bigcheesegs at gmail.com>
>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
>>> > Subject: Re: [llvm-dev] Linker option to dump dependency graph
>>> >
>>> > Hello,
>>> >
>>> > I think outputting a dependency graph is a good idea and would
enable
>>> > some offline analysis. I think that there is some advantage to
>>> > building some of the simpler ones in, particularly those that
would
>>> > need heavy annotations to the dependency graph, in particular
unless
>>> > we write a sample analysis tool that ships with the release,
many
>>> > users are going to miss out on useful features as they
aren't going to
>>> > have the time to build one. I've put some comments inline:
>>> >
>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev
>>> > <llvm-dev at lists.llvm.org> wrote:
>>> > >
>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at
google.com> wrote:
>>> > >>
>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer
>>> > <bigcheesegs at gmail.com> wrote:
>>> > >>>
>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via
llvm-dev <llvm-
>>> > dev at lists.llvm.org> wrote:
>>> > >>>>
>>> > >>>> Hi,
>>> > >>>>
>>> > >>>> I've heard people say that they want to
analyze dependencies
>>> between
>>> > object files at the linker level so that they can run a
whole-program
>>> analysis
>>> > which cannot be done at the compiler that works for one
compilation
>>> unit at
>>> > a time. I'd like to start a discussion as to what we can
do with it
>>> and how to
>>> > make it possible. I'm also sharing my idea about how to
make it
>>> possible.
>>> > >>>>
>>> > >>>> Dependency analyses
>>> > >>>> First, let me start with a few examples of
analyses I'm heard of
>>> or
>>> > thinking about. Dependencies between object files can be
represented
>>> as a
>>> > graph where vertices are input sections and edges are symbols
and
>>> > relocations. Analyses would work on the dependency graph.
Examples of
>>> > analyses include but not limited to the following:
>>> > >>>>
>>> > >>>>  - Figure out why some library or an object
file gets linked.
>>> > >>>>
>>> >
>>> > Arm's proprietary linker has a very helpful feature in
verbose mode
>>> > where it will report on object loading: global/weak
definitions and
>>> > global/weak references. For libraries you'd get a message
like
>>> > selecting member.o from library.a to define symbol S. This
resulted in
>>> > quite an effective trace of the linker output that could
answer most
>>> > "why did this library and object file get loaded
question?" One thing
>>> > a dependency graph might not capture is the order in which
events
>>> > occur, this can be very useful when debugging problems caused
by
>>> > library selection order.
>>> >
>>> > >>>>  - Finding a candidate to eliminate
dependency by finding a
>>> "weak" link
>>> > to a library. We can for example say the dependency to a
library is
>>> weak if
>>> > the library in the graph can be unreachable if we remove N
edges from
>>> the
>>> > graph (which is likely to correspond to removing N function
calls from
>>> the
>>> > code), where N is a small number.
>>> > >>>>
>>> > >>>>  - Understanding which of new dependencies
increase the executable
>>> > size the most, compare to a previous build.
>>> > >>>>
>>> >
>>> > Arm's linker, being focused on embedded systems has a
useful feature
>>> > that summarises the amount of content taken from each object
broken
>>> > down into code, ro-data, rw-date etc. This can be helpful in
the face
>>> > of comdat group elimination and optimisations such as garbage
>>> > collection and ICF that can be difficult to predict from a
dependency
>>> > graph. It is true that this information could be added as
attributes
>>> > but again it may just be easier to write a simple analysis
pass over
>>> > the output in the linker.
>>> >
>>> > >>>>  - Finding bad or circular dependencies
between sub-components.
>>> > >>>>
>>> > >>>> There would be many more analyses you want to
run at the linker
>>> input
>>> > level. Currently, lld doesn't actively support such
analyses. There
>>> are a few
>>> > options to make the linker emit dependency information (e.g.
--cref or
>>> -Map),
>>> > but the output of the options is not comprehensive; you cannot
>>> reconstruct a
>>> > dependency graph from the output of the options.
>>> >
>>> >
>>> >
>>> > >>>>
>>> > >>>> Dumping dependency graph
>>> > >>>> So, I'm thinking if it would be desirable
to add a new feature to
>>> the
>>> > linker to dump an entire dependency graph in such a way that a
graph
>>> can be
>>> > reconstructed by reading it back. Once we have such feature,
we can
>>> link a
>>> > program with the feature enabled and run any kind of
dependency
>>> analysis
>>> > on the output. You can save dumps to compare to previous
builds. You
>>> can
>>> > run any number of analyses on a dump, instead of invoking the
linker
>>> for
>>> > each analysis.
>>> > >>>>
>>> > >>>> I don't have a concrete idea about the
file output format, but I
>>> believe
>>> > it is essentially enough to emit triplets of (<from input
section>,
>>> <symbol>,
>>> > <to input section>), which represents an edge, to
reconstruct a graph.
>>> > >>>>
>>> > >>>> Thoughts?
>>> > >>>
>>> > >>>
>>> > >>> Back when I worked on the linker I pretty much
always had a way to
>>> > dump a graphviz dot file to look at things.  Pretty much every
graph
>>> > library/tool can read dot files, and they are easy to hack up
a parser
>>> for.  You
>>> > can also add attributes to nodes and edges to store arbitrary
data.
>>> > >>
>>> > >>
>>> > >> That's an interesting idea.
>>> > >>
>>> > >>> As for what to put it in, it really depends on
how detailed it
>>> needs to be.
>>> > Should symbols and sections be collapsed together?  Should it
include
>>> > relocation types? Symbol types/binding/size/etc?
>>> > >>
>>> > >>
>>> > >>  Maybe everything? We can for example emit all
symbols and input
>>> > sections first, and then emit a graph as the second half of
the
>>> output. E.g.
>>> > >>
>>> > >> Symbols:
>>> > >>   <list of symbols>
>>> > >> Sections:
>>> > >>   <list of sections>
>>> > >> Graph:
>>> > >>  1 2 3  // 1st section depends on 3rd section via 2nd
symbol
>>> > >>  5 1 4  // likewise
>>> > >
>>> > >
>>> > > I suppose it's a question of if we want users to need
to also read
>>> the inputs
>>> > if they want things like section size and other section/symbol
>>> attributes.  It
>>> > would be pretty trivial to include that data as long as we
have a
>>> > format/syntax for it.
>>> > >
>>> > > dot supports listing nodes first with attributes and then
referring
>>> to them by
>>> > name later when listing edges.
>>> > >
>>> > > - Michael Spencer
>>> > >
>>> >
>>> > I've experimented with dot files for this type of thing in
the past.
>>> > The difficulty is that they get too large to be realistically
viewed
>>> > very quickly. At that point you need to write scripts to
process the
>>> > output and in that case you may as well use JSON or XML, which
I guess
>>> > could easily be processed into dot files. To summarise, I
think we may
>>> > be able to do quite well with some very simple extra analysis
in LLD,
>>> > a machine readable dependency graph would also be very useful
for the
>>> > more complex cases.
>>> >
>>> > Peter
>>> >
>>> >  _______________________________________________
>>> > > LLVM Developers mailing list
>>> > > llvm-dev at lists.llvm.org
>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>> > _______________________________________________
>>> > LLVM Developers mailing list
>>> > llvm-dev at lists.llvm.org
>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190228/36fb5e33/attachment.html>

Shi, Steven via llvm-dev

2019-Mar-01 12:57 UTC

head link

[llvm-dev] Linker option to dump dependency graph

Hi Rui,>https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is a result
of static-linking a "hello world" program.I cannot see the main() function symbol in the dependency result. Shouldn’t the
dependency result include all linked symbols?


Thanks
Steven

From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Peter
Collingbourne via llvm-dev
Sent: Friday, March 1, 2019 1:18 PM
To: Rui Ueyama <ruiu at google.com>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Linker option to dump dependency graph

You might have realized this already but it's probably not a good idea to
use InputSection::Relocations for this because that ends up missing anything
that becomes a dynamic relocation. I reckon that the code should be doing
exactly what MarkLive.cpp is doing.

Peter

On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev <llvm-dev at
lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote:
I hacked up a patch to make lld output a dependency graph in the graphviz
"dot" format.

https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa -- this is a
patch, which is just visiting all input sections and relocations. Note that this
is far from completion but just a proof-of-concept.

https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is a result of
static-linking a "hello world" program.

https://rui314.github.io/hello.svg  -- I rendered the above dot file with
graphviz `sfdp` engine. The rendered graph is too large and very hard to read.
Apparently, I need a better visualization tool.

On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at
google.com<mailto:zturner at google.com>> wrote:
+1 for graphviz dot format, so that it can be consumed by any one of many
existing graph visualization tools.

On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev <llvm-dev at
lists.llvm.org<mailto:llvm-dev at lists.llvm.org>>
wrote:>To summarise, I think we may
> be able to do quite well with some very simple extra analysis in LLD,
> a machine readable dependency graph would also be very useful for the
> more complex cases.
Strongly agree. The linker based dependency graph would be very useful for Uefi
firmware. Below are my usage examples:
1. I need to detect the redundant code in my firmware, and I once wrote a
analysis tool to compare the IR level symbols and call graph info before any
optimization and after full optimization (e.g. LTO). But the IR level info does
not support assembly code info well. So, there are many dependency information
missing and false positive in my analysis tool. It will be more sound if the
linker can help output complete and accurate dependency graph for final
executable.
2. I need a tool to analyze and  track the firmware module accurate dependency
for build cache soundness. Build performance is now a pain point in our CI
system because every patch need to verify on many build targets in our side. We
hope to enable the build cache (both module level and file level) to accelerate
the build time. For module level build cache enabling, a very important problem
is how to know the module's accurate dependency efficiently. I'm looking
forward to the linker based dependency graph feature.


Thanks
Steven> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at
lists.llvm.org<mailto:llvm-dev-bounces at lists.llvm.org>] On Behalf Of
Peter
> Smith via llvm-dev
> Sent: Wednesday, February 27, 2019 6:37 PM
> To: Michael Spencer <bigcheesegs at gmail.com<mailto:bigcheesegs at
gmail.com>>
> Cc: llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at
lists.llvm.org>>
> Subject: Re: [llvm-dev] Linker option to dump dependency graph
>
> Hello,
>
> I think outputting a dependency graph is a good idea and would enable
> some offline analysis. I think that there is some advantage to
> building some of the simpler ones in, particularly those that would
> need heavy annotations to the dependency graph, in particular unless
> we write a sample analysis tool that ships with the release, many
> users are going to miss out on useful features as they aren't going to
> have the time to build one. I've put some comments inline:
>
> On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev
> <llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>>
wrote:
> >
> > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at
google.com<mailto:ruiu at google.com>> wrote:
> >>
> >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer
> <bigcheesegs at gmail.com<mailto:bigcheesegs at gmail.com>>
wrote:
> >>>
> >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev
<llvm-
> dev at lists.llvm.org<mailto:dev at lists.llvm.org>> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I've heard people say that they want to analyze
dependencies between
> object files at the linker level so that they can run a whole-program
analysis
> which cannot be done at the compiler that works for one compilation unit at
> a time. I'd like to start a discussion as to what we can do with it and
how to
> make it possible. I'm also sharing my idea about how to make it
possible.
> >>>>
> >>>> Dependency analyses
> >>>> First, let me start with a few examples of analyses
I'm heard of or
> thinking about. Dependencies between object files can be represented as a
> graph where vertices are input sections and edges are symbols and
> relocations. Analyses would work on the dependency graph. Examples of
> analyses include but not limited to the following:
> >>>>
> >>>>  - Figure out why some library or an object file gets
linked.
> >>>>
>
> Arm's proprietary linker has a very helpful feature in verbose mode
> where it will report on object loading: global/weak definitions and
> global/weak references. For libraries you'd get a message like
> selecting member.o from library.a to define symbol S. This resulted in
> quite an effective trace of the linker output that could answer most
> "why did this library and object file get loaded question?" One
thing
> a dependency graph might not capture is the order in which events
> occur, this can be very useful when debugging problems caused by
> library selection order.
>
> >>>>  - Finding a candidate to eliminate dependency by finding
a "weak" link
> to a library. We can for example say the dependency to a library is weak if
> the library in the graph can be unreachable if we remove N edges from the
> graph (which is likely to correspond to removing N function calls from the
> code), where N is a small number.
> >>>>
> >>>>  - Understanding which of new dependencies increase the
executable
> size the most, compare to a previous build.
> >>>>
>
> Arm's linker, being focused on embedded systems has a useful feature
> that summarises the amount of content taken from each object broken
> down into code, ro-data, rw-date etc. This can be helpful in the face
> of comdat group elimination and optimisations such as garbage
> collection and ICF that can be difficult to predict from a dependency
> graph. It is true that this information could be added as attributes
> but again it may just be easier to write a simple analysis pass over
> the output in the linker.
>
> >>>>  - Finding bad or circular dependencies between
sub-components.
> >>>>
> >>>> There would be many more analyses you want to run at the
linker input
> level. Currently, lld doesn't actively support such analyses. There are
a few
> options to make the linker emit dependency information (e.g. --cref or
-Map),
> but the output of the options is not comprehensive; you cannot reconstruct
a
> dependency graph from the output of the options.
>
>
>
> >>>>
> >>>> Dumping dependency graph
> >>>> So, I'm thinking if it would be desirable to add a new
feature to the
> linker to dump an entire dependency graph in such a way that a graph can be
> reconstructed by reading it back. Once we have such feature, we can link a
> program with the feature enabled and run any kind of dependency analysis
> on the output. You can save dumps to compare to previous builds. You can
> run any number of analyses on a dump, instead of invoking the linker for
> each analysis.
> >>>>
> >>>> I don't have a concrete idea about the file output
format, but I believe
> it is essentially enough to emit triplets of (<from input section>,
<symbol>,
> <to input section>), which represents an edge, to reconstruct a
graph.
> >>>>
> >>>> Thoughts?
> >>>
> >>>
> >>> Back when I worked on the linker I pretty much always had a
way to
> dump a graphviz dot file to look at things.  Pretty much every graph
> library/tool can read dot files, and they are easy to hack up a parser for.
You
> can also add attributes to nodes and edges to store arbitrary data.
> >>
> >>
> >> That's an interesting idea.
> >>
> >>> As for what to put it in, it really depends on how detailed it
needs to be.
> Should symbols and sections be collapsed together?  Should it include
> relocation types? Symbol types/binding/size/etc?
> >>
> >>
> >>  Maybe everything? We can for example emit all symbols and input
> sections first, and then emit a graph as the second half of the output.
E.g.
> >>
> >> Symbols:
> >>   <list of symbols>
> >> Sections:
> >>   <list of sections>
> >> Graph:
> >>  1 2 3  // 1st section depends on 3rd section via 2nd symbol
> >>  5 1 4  // likewise
> >
> >
> > I suppose it's a question of if we want users to need to also read
the inputs
> if they want things like section size and other section/symbol attributes. 
It
> would be pretty trivial to include that data as long as we have a
> format/syntax for it.
> >
> > dot supports listing nodes first with attributes and then referring to
them by
> name later when listing edges.
> >
> > - Michael Spencer
> >
>
> I've experimented with dot files for this type of thing in the past.
> The difficulty is that they get too large to be realistically viewed
> very quickly. At that point you need to write scripts to process the
> output and in that case you may as well use JSON or XML, which I guess
> could easily be processed into dot files. To summarise, I think we may
> be able to do quite well with some very simple extra analysis in LLD,
> a machine readable dependency graph would also be very useful for the
> more complex cases.
>
> Peter
>
>  _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


--
--
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190301/256c0826/attachment.html>

Fāng-ruì Sòng via llvm-dev

2019-Mar-05 07:41 UTC

head link

[llvm-dev] Linker option to dump dependency graph

> One thing a dependency graph might not capture is the order in which events
occur, this can be very useful when debugging problems caused by library
selection order.
The event stream sounds like a more fine-grained --trace (-t).
> (<from input section>, <symbol>, <to input section>)
In --no-gc-sections mode and in some analysis, the file name part of
the input section should be good enough.
> section size and other section/symbol attributes
If such customization is favored and the complexity isn't a big issue,
it can probably be implemented as format specifiers (I'm thinking of
printf, ps -o, date, ...). The design of
https://github.com/Juniper/libxo can be used for reference.

We shall flesh out the possible vertex/edge types and additional
information that users may expect.

On Fri, Mar 1, 2019 at 1:18 PM Peter Collingbourne via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>
> You might have realized this already but it's probably not a good idea
to use InputSection::Relocations for this because that ends up missing anything
that becomes a dynamic relocation. I reckon that the code should be doing
exactly what MarkLive.cpp is doing.
>
> Peter
>
> On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>>
>> I hacked up a patch to make lld output a dependency graph in the
graphviz "dot" format.
>>
>> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa -- this
is a patch, which is just visiting all input sections and relocations. Note that
this is far from completion but just a proof-of-concept.
>>
>> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is a
result of static-linking a "hello world" program.
>>
>> https://rui314.github.io/hello.svg  -- I rendered the above dot file
with graphviz `sfdp` engine. The rendered graph is too large and very hard to
read. Apparently, I need a better visualization tool.
>>
>> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at
google.com> wrote:
>>>
>>> +1 for graphviz dot format, so that it can be consumed by any one
of many existing graph visualization tools.
>>>
>>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> >To summarise, I think we may
>>>> > be able to do quite well with some very simple extra
analysis in LLD,
>>>> > a machine readable dependency graph would also be very
useful for the
>>>> > more complex cases.
>>>>
>>>> Strongly agree. The linker based dependency graph would be very
useful for Uefi firmware. Below are my usage examples:
>>>> 1. I need to detect the redundant code in my firmware, and I
once wrote a analysis tool to compare the IR level symbols and call graph info
before any optimization and after full optimization (e.g. LTO). But the IR level
info does not support assembly code info well. So, there are many dependency
information missing and false positive in my analysis tool. It will be more
sound if the linker can help output complete and accurate dependency graph for
final executable.
>>>> 2. I need a tool to analyze and  track the firmware module
accurate dependency for build cache soundness. Build performance is now a pain
point in our CI system because every patch need to verify on many build targets
in our side. We hope to enable the build cache (both module level and file
level) to accelerate the build time. For module level build cache enabling, a
very important problem is how to know the module's accurate dependency
efficiently. I'm looking forward to the linker based dependency graph
feature.
>>>>
>>>>
>>>> Thanks
>>>> Steven
>>>> > -----Original Message-----
>>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org]
On Behalf Of Peter
>>>> > Smith via llvm-dev
>>>> > Sent: Wednesday, February 27, 2019 6:37 PM
>>>> > To: Michael Spencer <bigcheesegs at gmail.com>
>>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
>>>> > Subject: Re: [llvm-dev] Linker option to dump dependency
graph
>>>> >
>>>> > Hello,
>>>> >
>>>> > I think outputting a dependency graph is a good idea and
would enable
>>>> > some offline analysis. I think that there is some
advantage to
>>>> > building some of the simpler ones in, particularly those
that would
>>>> > need heavy annotations to the dependency graph, in
particular unless
>>>> > we write a sample analysis tool that ships with the
release, many
>>>> > users are going to miss out on useful features as they
aren't going to
>>>> > have the time to build one. I've put some comments
inline:
>>>> >
>>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev
>>>> > <llvm-dev at lists.llvm.org> wrote:
>>>> > >
>>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu
at google.com> wrote:
>>>> > >>
>>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer
>>>> > <bigcheesegs at gmail.com> wrote:
>>>> > >>>
>>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama
via llvm-dev <llvm-
>>>> > dev at lists.llvm.org> wrote:
>>>> > >>>>
>>>> > >>>> Hi,
>>>> > >>>>
>>>> > >>>> I've heard people say that they want
to analyze dependencies between
>>>> > object files at the linker level so that they can run a
whole-program analysis
>>>> > which cannot be done at the compiler that works for one
compilation unit at
>>>> > a time. I'd like to start a discussion as to what we
can do with it and how to
>>>> > make it possible. I'm also sharing my idea about how
to make it possible.
>>>> > >>>>
>>>> > >>>> Dependency analyses
>>>> > >>>> First, let me start with a few examples
of analyses I'm heard of or
>>>> > thinking about. Dependencies between object files can be
represented as a
>>>> > graph where vertices are input sections and edges are
symbols and
>>>> > relocations. Analyses would work on the dependency graph.
Examples of
>>>> > analyses include but not limited to the following:
>>>> > >>>>
>>>> > >>>>  - Figure out why some library or an
object file gets linked.
>>>> > >>>>
>>>> >
>>>> > Arm's proprietary linker has a very helpful feature in
verbose mode
>>>> > where it will report on object loading: global/weak
definitions and
>>>> > global/weak references. For libraries you'd get a
message like
>>>> > selecting member.o from library.a to define symbol S. This
resulted in
>>>> > quite an effective trace of the linker output that could
answer most
>>>> > "why did this library and object file get loaded
question?" One thing
>>>> > a dependency graph might not capture is the order in which
events
>>>> > occur, this can be very useful when debugging problems
caused by
>>>> > library selection order.
>>>> >
>>>> > >>>>  - Finding a candidate to eliminate
dependency by finding a "weak" link
>>>> > to a library. We can for example say the dependency to a
library is weak if
>>>> > the library in the graph can be unreachable if we remove N
edges from the
>>>> > graph (which is likely to correspond to removing N
function calls from the
>>>> > code), where N is a small number.
>>>> > >>>>
>>>> > >>>>  - Understanding which of new
dependencies increase the executable
>>>> > size the most, compare to a previous build.
>>>> > >>>>
>>>> >
>>>> > Arm's linker, being focused on embedded systems has a
useful feature
>>>> > that summarises the amount of content taken from each
object broken
>>>> > down into code, ro-data, rw-date etc. This can be helpful
in the face
>>>> > of comdat group elimination and optimisations such as
garbage
>>>> > collection and ICF that can be difficult to predict from a
dependency
>>>> > graph. It is true that this information could be added as
attributes
>>>> > but again it may just be easier to write a simple analysis
pass over
>>>> > the output in the linker.
>>>> >
>>>> > >>>>  - Finding bad or circular dependencies
between sub-components.
>>>> > >>>>
>>>> > >>>> There would be many more analyses you
want to run at the linker input
>>>> > level. Currently, lld doesn't actively support such
analyses. There are a few
>>>> > options to make the linker emit dependency information
(e.g. --cref or -Map),
>>>> > but the output of the options is not comprehensive; you
cannot reconstruct a
>>>> > dependency graph from the output of the options.
>>>> >
>>>> >
>>>> >
>>>> > >>>>
>>>> > >>>> Dumping dependency graph
>>>> > >>>> So, I'm thinking if it would be
desirable to add a new feature to the
>>>> > linker to dump an entire dependency graph in such a way
that a graph can be
>>>> > reconstructed by reading it back. Once we have such
feature, we can link a
>>>> > program with the feature enabled and run any kind of
dependency analysis
>>>> > on the output. You can save dumps to compare to previous
builds. You can
>>>> > run any number of analyses on a dump, instead of invoking
the linker for
>>>> > each analysis.
>>>> > >>>>
>>>> > >>>> I don't have a concrete idea about
the file output format, but I believe
>>>> > it is essentially enough to emit triplets of (<from
input section>, <symbol>,
>>>> > <to input section>), which represents an edge, to
reconstruct a graph.
>>>> > >>>>
>>>> > >>>> Thoughts?
>>>> > >>>
>>>> > >>>
>>>> > >>> Back when I worked on the linker I pretty
much always had a way to
>>>> > dump a graphviz dot file to look at things.  Pretty much
every graph
>>>> > library/tool can read dot files, and they are easy to hack
up a parser for.  You
>>>> > can also add attributes to nodes and edges to store
arbitrary data.
>>>> > >>
>>>> > >>
>>>> > >> That's an interesting idea.
>>>> > >>
>>>> > >>> As for what to put it in, it really depends
on how detailed it needs to be.
>>>> > Should symbols and sections be collapsed together?  Should
it include
>>>> > relocation types? Symbol types/binding/size/etc?
>>>> > >>
>>>> > >>
>>>> > >>  Maybe everything? We can for example emit all
symbols and input
>>>> > sections first, and then emit a graph as the second half
of the output. E.g.
>>>> > >>
>>>> > >> Symbols:
>>>> > >>   <list of symbols>
>>>> > >> Sections:
>>>> > >>   <list of sections>
>>>> > >> Graph:
>>>> > >>  1 2 3  // 1st section depends on 3rd section via
2nd symbol
>>>> > >>  5 1 4  // likewise
>>>> > >
>>>> > >
>>>> > > I suppose it's a question of if we want users to
need to also read the inputs
>>>> > if they want things like section size and other
section/symbol attributes.  It
>>>> > would be pretty trivial to include that data as long as we
have a
>>>> > format/syntax for it.
>>>> > >
>>>> > > dot supports listing nodes first with attributes and
then referring to them by
>>>> > name later when listing edges.
>>>> > >
>>>> > > - Michael Spencer
>>>> > >
>>>> >
>>>> > I've experimented with dot files for this type of
thing in the past.
>>>> > The difficulty is that they get too large to be
realistically viewed
>>>> > very quickly. At that point you need to write scripts to
process the
>>>> > output and in that case you may as well use JSON or XML,
which I guess
>>>> > could easily be processed into dot files. To summarise, I
think we may
>>>> > be able to do quite well with some very simple extra
analysis in LLD,
>>>> > a machine readable dependency graph would also be very
useful for the
>>>> > more complex cases.
>>>> >
>>>> > Peter
>>>> >
>>>> >  _______________________________________________
>>>> > > LLVM Developers mailing list
>>>> > > llvm-dev at lists.llvm.org
>>>> > >
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> > _______________________________________________
>>>> > LLVM Developers mailing list
>>>> > llvm-dev at lists.llvm.org
>>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
> --
> --
> Peter
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


-- 
宋方睿

Andrew Grieve via llvm-dev

2019-Jun-21 01:51 UTC

head link

[llvm-dev] Linker option to dump dependency graph

Just wanted to check in on this - did your patches make it past the
prototype phase?

On Tue, Mar 5, 2019 at 2:41 AM Fāng-ruì Sòng via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> > One thing a dependency graph might not capture is the order in which
> events occur, this can be very useful when debugging problems caused by
> library selection order.
>
> The event stream sounds like a more fine-grained --trace (-t).
>
> > (<from input section>, <symbol>, <to input section>)
>
> In --no-gc-sections mode and in some analysis, the file name part of
> the input section should be good enough.
>
> > section size and other section/symbol attributes
>
> If such customization is favored and the complexity isn't a big issue,
> it can probably be implemented as format specifiers (I'm thinking of
> printf, ps -o, date, ...). The design of
> https://github.com/Juniper/libxo can be used for reference.
>
> We shall flesh out the possible vertex/edge types and additional
> information that users may expect.
>
> On Fri, Mar 1, 2019 at 1:18 PM Peter Collingbourne via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > You might have realized this already but it's probably not a good
idea
> to use InputSection::Relocations for this because that ends up missing
> anything that becomes a dynamic relocation. I reckon that the code should
> be doing exactly what MarkLive.cpp is doing.
> >
> > Peter
> >
> > On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >>
> >> I hacked up a patch to make lld output a dependency graph in the
> graphviz "dot" format.
> >>
> >> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa --
> this is a patch, which is just visiting all input sections and relocations.
> Note that this is far from completion but just a proof-of-concept.
> >>
> >> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is
a
> result of static-linking a "hello world" program.
> >>
> >> https://rui314.github.io/hello.svg  -- I rendered the above dot
file
> with graphviz `sfdp` engine. The rendered graph is too large and very hard
> to read. Apparently, I need a better visualization tool.
> >>
> >> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at
google.com>
> wrote:
> >>>
> >>> +1 for graphviz dot format, so that it can be consumed by any
one of
> many existing graph visualization tools.
> >>>
> >>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >>>>
> >>>> >To summarise, I think we may
> >>>> > be able to do quite well with some very simple extra
analysis in
> LLD,
> >>>> > a machine readable dependency graph would also be
very useful for
> the
> >>>> > more complex cases.
> >>>>
> >>>> Strongly agree. The linker based dependency graph would be
very
> useful for Uefi firmware. Below are my usage examples:
> >>>> 1. I need to detect the redundant code in my firmware, and
I once
> wrote a analysis tool to compare the IR level symbols and call graph info
> before any optimization and after full optimization (e.g. LTO). But the IR
> level info does not support assembly code info well. So, there are many
> dependency information missing and false positive in my analysis tool. It
> will be more sound if the linker can help output complete and accurate
> dependency graph for final executable.
> >>>> 2. I need a tool to analyze and  track the firmware module
accurate
> dependency for build cache soundness. Build performance is now a pain point
> in our CI system because every patch need to verify on many build targets
> in our side. We hope to enable the build cache (both module level and file
> level) to accelerate the build time. For module level build cache enabling,
> a very important problem is how to know the module's accurate
dependency
> efficiently. I'm looking forward to the linker based dependency graph
> feature.
> >>>>
> >>>>
> >>>> Thanks
> >>>> Steven
> >>>> > -----Original Message-----
> >>>> > From: llvm-dev [mailto:llvm-dev-bounces at
lists.llvm.org] On Behalf
> Of Peter
> >>>> > Smith via llvm-dev
> >>>> > Sent: Wednesday, February 27, 2019 6:37 PM
> >>>> > To: Michael Spencer <bigcheesegs at gmail.com>
> >>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
> >>>> > Subject: Re: [llvm-dev] Linker option to dump
dependency graph
> >>>> >
> >>>> > Hello,
> >>>> >
> >>>> > I think outputting a dependency graph is a good idea
and would
> enable
> >>>> > some offline analysis. I think that there is some
advantage to
> >>>> > building some of the simpler ones in, particularly
those that would
> >>>> > need heavy annotations to the dependency graph, in
particular unless
> >>>> > we write a sample analysis tool that ships with the
release, many
> >>>> > users are going to miss out on useful features as
they aren't going
> to
> >>>> > have the time to build one. I've put some
comments inline:
> >>>> >
> >>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via
llvm-dev
> >>>> > <llvm-dev at lists.llvm.org> wrote:
> >>>> > >
> >>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama
<ruiu at google.com>
> wrote:
> >>>> > >>
> >>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael
Spencer
> >>>> > <bigcheesegs at gmail.com> wrote:
> >>>> > >>>
> >>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui
Ueyama via llvm-dev <llvm-
> >>>> > dev at lists.llvm.org> wrote:
> >>>> > >>>>
> >>>> > >>>> Hi,
> >>>> > >>>>
> >>>> > >>>> I've heard people say that they
want to analyze dependencies
> between
> >>>> > object files at the linker level so that they can run
a
> whole-program analysis
> >>>> > which cannot be done at the compiler that works for
one compilation
> unit at
> >>>> > a time. I'd like to start a discussion as to what
we can do with it
> and how to
> >>>> > make it possible. I'm also sharing my idea about
how to make it
> possible.
> >>>> > >>>>
> >>>> > >>>> Dependency analyses
> >>>> > >>>> First, let me start with a few
examples of analyses I'm heard
> of or
> >>>> > thinking about. Dependencies between object files can
be
> represented as a
> >>>> > graph where vertices are input sections and edges are
symbols and
> >>>> > relocations. Analyses would work on the dependency
graph. Examples
> of
> >>>> > analyses include but not limited to the following:
> >>>> > >>>>
> >>>> > >>>>  - Figure out why some library or an
object file gets linked.
> >>>> > >>>>
> >>>> >
> >>>> > Arm's proprietary linker has a very helpful
feature in verbose mode
> >>>> > where it will report on object loading: global/weak
definitions and
> >>>> > global/weak references. For libraries you'd get a
message like
> >>>> > selecting member.o from library.a to define symbol S.
This resulted
> in
> >>>> > quite an effective trace of the linker output that
could answer most
> >>>> > "why did this library and object file get loaded
question?" One
> thing
> >>>> > a dependency graph might not capture is the order in
which events
> >>>> > occur, this can be very useful when debugging
problems caused by
> >>>> > library selection order.
> >>>> >
> >>>> > >>>>  - Finding a candidate to eliminate
dependency by finding a
> "weak" link
> >>>> > to a library. We can for example say the dependency
to a library is
> weak if
> >>>> > the library in the graph can be unreachable if we
remove N edges
> from the
> >>>> > graph (which is likely to correspond to removing N
function calls
> from the
> >>>> > code), where N is a small number.
> >>>> > >>>>
> >>>> > >>>>  - Understanding which of new
dependencies increase the
> executable
> >>>> > size the most, compare to a previous build.
> >>>> > >>>>
> >>>> >
> >>>> > Arm's linker, being focused on embedded systems
has a useful feature
> >>>> > that summarises the amount of content taken from each
object broken
> >>>> > down into code, ro-data, rw-date etc. This can be
helpful in the
> face
> >>>> > of comdat group elimination and optimisations such as
garbage
> >>>> > collection and ICF that can be difficult to predict
from a
> dependency
> >>>> > graph. It is true that this information could be
added as attributes
> >>>> > but again it may just be easier to write a simple
analysis pass over
> >>>> > the output in the linker.
> >>>> >
> >>>> > >>>>  - Finding bad or circular
dependencies between sub-components.
> >>>> > >>>>
> >>>> > >>>> There would be many more analyses
you want to run at the
> linker input
> >>>> > level. Currently, lld doesn't actively support
such analyses. There
> are a few
> >>>> > options to make the linker emit dependency
information (e.g. --cref
> or -Map),
> >>>> > but the output of the options is not comprehensive;
you cannot
> reconstruct a
> >>>> > dependency graph from the output of the options.
> >>>> >
> >>>> >
> >>>> >
> >>>> > >>>>
> >>>> > >>>> Dumping dependency graph
> >>>> > >>>> So, I'm thinking if it would be
desirable to add a new feature
> to the
> >>>> > linker to dump an entire dependency graph in such a
way that a
> graph can be
> >>>> > reconstructed by reading it back. Once we have such
feature, we can
> link a
> >>>> > program with the feature enabled and run any kind of
dependency
> analysis
> >>>> > on the output. You can save dumps to compare to
previous builds.
> You can
> >>>> > run any number of analyses on a dump, instead of
invoking the
> linker for
> >>>> > each analysis.
> >>>> > >>>>
> >>>> > >>>> I don't have a concrete idea
about the file output format, but
> I believe
> >>>> > it is essentially enough to emit triplets of
(<from input section>,
> <symbol>,
> >>>> > <to input section>), which represents an edge,
to reconstruct a
> graph.
> >>>> > >>>>
> >>>> > >>>> Thoughts?
> >>>> > >>>
> >>>> > >>>
> >>>> > >>> Back when I worked on the linker I
pretty much always had a way
> to
> >>>> > dump a graphviz dot file to look at things.  Pretty
much every graph
> >>>> > library/tool can read dot files, and they are easy to
hack up a
> parser for.  You
> >>>> > can also add attributes to nodes and edges to store
arbitrary data.
> >>>> > >>
> >>>> > >>
> >>>> > >> That's an interesting idea.
> >>>> > >>
> >>>> > >>> As for what to put it in, it really
depends on how detailed it
> needs to be.
> >>>> > Should symbols and sections be collapsed together? 
Should it
> include
> >>>> > relocation types? Symbol types/binding/size/etc?
> >>>> > >>
> >>>> > >>
> >>>> > >>  Maybe everything? We can for example emit
all symbols and input
> >>>> > sections first, and then emit a graph as the second
half of the
> output. E.g.
> >>>> > >>
> >>>> > >> Symbols:
> >>>> > >>   <list of symbols>
> >>>> > >> Sections:
> >>>> > >>   <list of sections>
> >>>> > >> Graph:
> >>>> > >>  1 2 3  // 1st section depends on 3rd
section via 2nd symbol
> >>>> > >>  5 1 4  // likewise
> >>>> > >
> >>>> > >
> >>>> > > I suppose it's a question of if we want
users to need to also
> read the inputs
> >>>> > if they want things like section size and other
section/symbol
> attributes.  It
> >>>> > would be pretty trivial to include that data as long
as we have a
> >>>> > format/syntax for it.
> >>>> > >
> >>>> > > dot supports listing nodes first with attributes
and then
> referring to them by
> >>>> > name later when listing edges.
> >>>> > >
> >>>> > > - Michael Spencer
> >>>> > >
> >>>> >
> >>>> > I've experimented with dot files for this type of
thing in the past.
> >>>> > The difficulty is that they get too large to be
realistically viewed
> >>>> > very quickly. At that point you need to write scripts
to process the
> >>>> > output and in that case you may as well use JSON or
XML, which I
> guess
> >>>> > could easily be processed into dot files. To
summarise, I think we
> may
> >>>> > be able to do quite well with some very simple extra
analysis in
> LLD,
> >>>> > a machine readable dependency graph would also be
very useful for
> the
> >>>> > more complex cases.
> >>>> >
> >>>> > Peter
> >>>> >
> >>>> >  _______________________________________________
> >>>> > > LLVM Developers mailing list
> >>>> > > llvm-dev at lists.llvm.org
> >>>> > >
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>> > _______________________________________________
> >>>> > LLVM Developers mailing list
> >>>> > llvm-dev at lists.llvm.org
> >>>> >
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>> _______________________________________________
> >>>> LLVM Developers mailing list
> >>>> llvm-dev at lists.llvm.org
> >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org
> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
> >
> >
> > --
> > --
> > Peter
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
> --
> 宋方睿
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190620/cbf9682d/attachment.html>

llvm dev - Mar 2019 - Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph

[llvm-dev] Linker option to dump dependency graph