Rui Ueyama via llvm-dev
2019-Jun-21 12:10 UTC
[llvm-dev] Linker option to dump dependency graph
Sorry, I didn't notice that you were asking Fangrui, not me. Please disregard my previous email.

On Fri, Jun 21, 2019 at 9:08 PM Rui Ueyama <ruiu at google.com> wrote:
> No I didn't.
>
> On Fri, Jun 21, 2019 at 10:52 AM Andrew Grieve via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> Just wanted to check in on this - did your patches make it past the prototype phase?
>>
>> On Tue, Mar 5, 2019 at 2:41 AM Fāng-ruì Sòng via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>>> > One thing a dependency graph might not capture is the order in which events occur, this can be very useful when debugging problems caused by library selection order.
>>>
>>> The event stream sounds like a more fine-grained --trace (-t).
>>>
>>> > (<from input section>, <symbol>, <to input section>)
>>>
>>> In --no-gc-sections mode and in some analysis, the file name part of the input section should be good enough.
>>>
>>> > section size and other section/symbol attributes
>>>
>>> If such customization is favored and the complexity isn't a big issue, it can probably be implemented as format specifiers (I'm thinking of printf, ps -o, date, ...). The design of https://github.com/Juniper/libxo can be used for reference.
>>>
>>> We shall flesh out the possible vertex/edge types and additional information that users may expect.
>>>
>>> On Fri, Mar 1, 2019 at 1:18 PM Peter Collingbourne via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> >
>>> > You might have realized this already but it's probably not a good idea to use InputSection::Relocations for this because that ends up missing anything that becomes a dynamic relocation. I reckon that the code should be doing exactly what MarkLive.cpp is doing.
>>> >
>>> > Peter
>>> >
>>> > On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> >>
>>> >> I hacked up a patch to make lld output a dependency graph in the graphviz "dot" format.
>>> >>
>>> >> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa -- this is a patch, which is just visiting all input sections and relocations. Note that this is far from complete and is just a proof-of-concept.
>>> >>
>>> >> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is a result of static-linking a "hello world" program.
>>> >>
>>> >> https://rui314.github.io/hello.svg -- I rendered the above dot file with graphviz `sfdp` engine. The rendered graph is too large and very hard to read. Apparently, I need a better visualization tool.
>>> >>
>>> >> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at google.com> wrote:
>>> >>>
>>> >>> +1 for graphviz dot format, so that it can be consumed by any one of many existing graph visualization tools.
>>> >>>
>>> >>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> >>>>
>>> >>>> > To summarise, I think we may be able to do quite well with some very simple extra analysis in LLD, a machine readable dependency graph would also be very useful for the more complex cases.
>>> >>>>
>>> >>>> Strongly agree. The linker based dependency graph would be very useful for UEFI firmware. Below are my usage examples:
>>> >>>> 1. I need to detect the redundant code in my firmware, and I once wrote an analysis tool to compare the IR level symbols and call graph info before any optimization and after full optimization (e.g. LTO). But the IR level info does not support assembly code info well. So, a lot of dependency information is missing and there are false positives in my analysis tool. It will be more sound if the linker can help output a complete and accurate dependency graph for the final executable.
>>> >>>> 2. I need a tool to analyze and track the firmware module's accurate dependency for build cache soundness.
>>> >>>> Build performance is now a pain point in our CI system because every patch needs to be verified on many build targets on our side. We hope to enable the build cache (both module level and file level) to accelerate the build time. For module level build cache enabling, a very important problem is how to know the module's accurate dependency efficiently. I'm looking forward to the linker based dependency graph feature.
>>> >>>>
>>> >>>> Thanks
>>> >>>> Steven
>>> >>>>
>>> >>>> > -----Original Message-----
>>> >>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Peter Smith via llvm-dev
>>> >>>> > Sent: Wednesday, February 27, 2019 6:37 PM
>>> >>>> > To: Michael Spencer <bigcheesegs at gmail.com>
>>> >>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
>>> >>>> > Subject: Re: [llvm-dev] Linker option to dump dependency graph
>>> >>>> >
>>> >>>> > Hello,
>>> >>>> >
>>> >>>> > I think outputting a dependency graph is a good idea and would enable some offline analysis. I think that there is some advantage to building some of the simpler ones in, particularly those that would need heavy annotations to the dependency graph, in particular unless we write a sample analysis tool that ships with the release, many users are going to miss out on useful features as they aren't going to have the time to build one.
>>> >>>> > I've put some comments inline:
>>> >>>> >
>>> >>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> >>>> > >
>>> >>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com> wrote:
>>> >>>> > >>
>>> >>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer <bigcheesegs at gmail.com> wrote:
>>> >>>> > >>>
>>> >>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> >>>> > >>>>
>>> >>>> > >>>> Hi,
>>> >>>> > >>>>
>>> >>>> > >>>> I've heard people say that they want to analyze dependencies between object files at the linker level so that they can run a whole-program analysis which cannot be done at the compiler that works for one compilation unit at a time. I'd like to start a discussion as to what we can do with it and how to make it possible. I'm also sharing my idea about how to make it possible.
>>> >>>> > >>>>
>>> >>>> > >>>> Dependency analyses
>>> >>>> > >>>> First, let me start with a few examples of analyses I've heard of or am thinking about. Dependencies between object files can be represented as a graph where vertices are input sections and edges are symbols and relocations. Analyses would work on the dependency graph. Examples of analyses include but are not limited to the following:
>>> >>>> > >>>>
>>> >>>> > >>>> - Figure out why some library or an object file gets linked.
>>> >>>> > >>>>
>>> >>>> >
>>> >>>> > Arm's proprietary linker has a very helpful feature in verbose mode where it will report on object loading: global/weak definitions and global/weak references. For libraries you'd get a message like selecting member.o from library.a to define symbol S.
>>> >>>> > This resulted in quite an effective trace of the linker output that could answer most "why did this library and object file get loaded?" questions. One thing a dependency graph might not capture is the order in which events occur, this can be very useful when debugging problems caused by library selection order.
>>> >>>> >
>>> >>>> > >>>> - Finding a candidate to eliminate dependency by finding a "weak" link to a library. We can for example say the dependency to a library is weak if the library in the graph can be unreachable if we remove N edges from the graph (which is likely to correspond to removing N function calls from the code), where N is a small number.
>>> >>>> > >>>>
>>> >>>> > >>>> - Understanding which of the new dependencies increase the executable size the most, compared to a previous build.
>>> >>>> > >>>>
>>> >>>> >
>>> >>>> > Arm's linker, being focused on embedded systems, has a useful feature that summarises the amount of content taken from each object broken down into code, ro-data, rw-data etc. This can be helpful in the face of comdat group elimination and optimisations such as garbage collection and ICF that can be difficult to predict from a dependency graph. It is true that this information could be added as attributes but again it may just be easier to write a simple analysis pass over the output in the linker.
>>> >>>> >
>>> >>>> > >>>> - Finding bad or circular dependencies between sub-components.
>>> >>>> > >>>>
>>> >>>> > >>>> There would be many more analyses you want to run at the linker input level. Currently, lld doesn't actively support such analyses. There are a few options to make the linker emit dependency information (e.g. --cref or -Map), but the output of the options is not comprehensive; you cannot reconstruct a dependency graph from the output of the options.
>>> >>>> > >>>>
>>> >>>> > >>>> Dumping dependency graph
>>> >>>> > >>>> So, I'm thinking if it would be desirable to add a new feature to the linker to dump an entire dependency graph in such a way that a graph can be reconstructed by reading it back. Once we have such feature, we can link a program with the feature enabled and run any kind of dependency analysis on the output. You can save dumps to compare to previous builds. You can run any number of analyses on a dump, instead of invoking the linker for each analysis.
>>> >>>> > >>>>
>>> >>>> > >>>> I don't have a concrete idea about the file output format, but I believe it is essentially enough to emit triplets of (<from input section>, <symbol>, <to input section>), which represents an edge, to reconstruct a graph.
>>> >>>> > >>>>
>>> >>>> > >>>> Thoughts?
>>> >>>> > >>>
>>> >>>> > >>> Back when I worked on the linker I pretty much always had a way to dump a graphviz dot file to look at things. Pretty much every graph library/tool can read dot files, and they are easy to hack up a parser for. You can also add attributes to nodes and edges to store arbitrary data.
>>> >>>> > >>
>>> >>>> > >> That's an interesting idea.
>>> >>>> > >>
>>> >>>> > >>> As for what to put it in, it really depends on how detailed it needs to be. Should symbols and sections be collapsed together? Should it include relocation types? Symbol types/binding/size/etc?
>>> >>>> > >>
>>> >>>> > >> Maybe everything?
>>> >>>> > >> We can for example emit all symbols and input sections first, and then emit a graph as the second half of the output. E.g.
>>> >>>> > >>
>>> >>>> > >> Symbols:
>>> >>>> > >> <list of symbols>
>>> >>>> > >> Sections:
>>> >>>> > >> <list of sections>
>>> >>>> > >> Graph:
>>> >>>> > >> 1 2 3 // 1st section depends on 3rd section via 2nd symbol
>>> >>>> > >> 5 1 4 // likewise
>>> >>>> > >
>>> >>>> > > I suppose it's a question of if we want users to need to also read the inputs if they want things like section size and other section/symbol attributes. It would be pretty trivial to include that data as long as we have a format/syntax for it.
>>> >>>> > >
>>> >>>> > > dot supports listing nodes first with attributes and then referring to them by name later when listing edges.
>>> >>>> > >
>>> >>>> > > - Michael Spencer
>>> >>>> > >
>>> >>>> >
>>> >>>> > I've experimented with dot files for this type of thing in the past. The difficulty is that they get too large to be realistically viewed very quickly. At that point you need to write scripts to process the output and in that case you may as well use JSON or XML, which I guess could easily be processed into dot files. To summarise, I think we may be able to do quite well with some very simple extra analysis in LLD, a machine readable dependency graph would also be very useful for the more complex cases.
>>> >>>> >
>>> >>>> > Peter
>>> >>>> >
>>> >>>> > _______________________________________________
>>> >>>> > > LLVM Developers mailing list
>>> >>>> > > llvm-dev at lists.llvm.org
>>> >>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>> >
>>> > --
>>> > Peter
>>>
>>> --
>>> 宋方睿
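
The "Symbols / Sections / Graph" layout quoted above was only floated as an idea in the thread, so there is no real file format to parse. As a rough sketch, assuming one symbol or section name per line and 1-based indices in the `Graph:` rows (as the "1st section depends on 3rd section via 2nd symbol" comment suggests), a reader for such a dump and a converter to graphviz dot could look like this. The names `Edge`, `parse_dump`, `to_dot` and the sample data are hypothetical and are not part of lld.

```python
from __future__ import annotations

from typing import NamedTuple


class Edge(NamedTuple):
    src: str     # name of the referencing input section
    symbol: str  # symbol through which the reference goes
    dst: str     # name of the referenced input section


def parse_dump(text: str) -> list[Edge]:
    """Parse the hypothetical 'Symbols:/Sections:/Graph:' dump into edges."""
    tables: dict[str, list[str]] = {"Symbols": [], "Sections": [], "Graph": []}
    current = None
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing // comments
        if not line:
            continue
        if line.endswith(":") and line[:-1] in tables:
            current = line[:-1]  # start of a new table
            continue
        if current is None:
            raise ValueError(f"data before any header: {line!r}")
        tables[current].append(line)

    symbols, sections = tables["Symbols"], tables["Sections"]
    edges = []
    for row in tables["Graph"]:
        src, sym, dst = (int(field) for field in row.split())
        # Assumed 1-based: "1 2 3" is section 1 -> section 3 via symbol 2.
        edges.append(Edge(sections[src - 1], symbols[sym - 1], sections[dst - 1]))
    return edges


def to_dot(edges: list[Edge]) -> str:
    """Render edges as a graphviz digraph, one labelled arrow per edge."""
    lines = ["digraph deps {"]
    for e in edges:
        lines.append(f'  "{e.src}" -> "{e.dst}" [label="{e.symbol}"];')
    lines.append("}")
    return "\n".join(lines)


# Made-up sample input in the sketched layout.
SAMPLE = """\
Symbols:
main
puts
Sections:
crt1.o:(.text)
main.o:(.text)
libc.a(puts.o):(.text)
Graph:
1 1 2  // crt1.o's .text refers to main, defined in main.o
2 2 3  // main.o's .text refers to puts
"""

if __name__ == "__main__":
    print(to_dot(parse_dump(SAMPLE)))
```

Keeping the edge list as plain triplets and treating dot as just one output format matches the point made in the thread that large graphs are easier to process with scripts than to view. Names containing double quotes would need escaping before being written to dot; the sketch does not handle that.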
Andrew Grieve via llvm-dev
2019-Jun-21 14:08 UTC
[llvm-dev] Linker option to dump dependency graph
I didn't pay much attention when "replying all". I did actually mean to ask you :).

It's coming up repeatedly in Chrome that I want to be able to find the reason why a symbol is included, so even if there's a patch I could pull in myself to answer these queries, that would be appreciated :).

On Fri, Jun 21, 2019 at 8:10 AM Rui Ueyama <ruiu at google.com> wrote:
> Sorry, I didn't notice that you are asking not to me but to Fangrui. Please disregard my previous email.
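
If a dump of (<from input section>, <symbol>, <to input section>) triplets like the one proposed earlier in the thread were available, "why is this symbol included?" becomes a reachability query. The following is only a sketch under that assumption: `why_included`, the root list, and the sample edges are hypothetical, and the choice of roots (for example the section holding the entry point) is left to the caller.

```python
from __future__ import annotations

from collections import deque


def why_included(edges, roots, target):
    """Return one shortest chain of (from, symbol, to) edges from a root to target.

    Returns an empty list if target is itself a root, or None if it is unreachable.
    """
    # Adjacency list: section -> [(symbol, referenced section), ...]
    out = {}
    for src, sym, dst in edges:
        out.setdefault(src, []).append((sym, dst))

    came_from = {root: None for root in roots}  # section -> (previous section, symbol)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        if node == target:
            chain = []
            while came_from[node] is not None:
                prev, sym = came_from[node]
                chain.append((prev, sym, node))
                node = prev
            return chain[::-1]
        for sym, dst in out.get(node, []):
            if dst not in came_from:  # first visit is the shortest in BFS
                came_from[dst] = (node, sym)
                queue.append(dst)
    return None


# Made-up edges for illustration only.
EDGES = [
    ("crt1.o:(.text)", "main", "main.o:(.text)"),
    ("main.o:(.text)", "helper", "util.o:(.text.helper)"),
    ("util.o:(.text.helper)", "puts", "libc.a(puts.o):(.text)"),
    ("main.o:(.text)", "puts", "libc.a(puts.o):(.text)"),
]

if __name__ == "__main__":
    for src, sym, dst in why_included(EDGES, ["crt1.o:(.text)"], "libc.a(puts.o):(.text)"):
        print(f"{src} --{sym}--> {dst}")
```

Breadth-first search reports only one shortest chain. A section is often kept alive by several chains, so removing the reported reference does not guarantee the section goes away.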
Fāng-ruì Sòng via llvm-dev
2019-Jun-21 15:15 UTC
[llvm-dev] Linker option to dump dependency graph
You may use -y foo to trace a symbol. -t and -M are also useful.

On Fri, Jun 21, 2019 at 10:08 PM Andrew Grieve <agrieve at chromium.org> wrote:
> I didn't pay much attention when "replying all". I did actually mean to ask you :).
>
> It's coming up repeatedly in Chrome that I want to be able to find the reason why a symbol is included, so even if there's a patch I could pull in myself to answer these queries, that would be appreciated :).
>>>>> Examples of >>>>> >>>> > analyses include but not limited to the following: >>>>> >>>> > >>>> >>>>> >>>> > >>>> - Figure out why some library or an object file gets >>>>> linked. >>>>> >>>> > >>>> >>>>> >>>> > >>>>> >>>> > Arm's proprietary linker has a very helpful feature in verbose >>>>> mode >>>>> >>>> > where it will report on object loading: global/weak definitions >>>>> and >>>>> >>>> > global/weak references. For libraries you'd get a message like >>>>> >>>> > selecting member.o from library.a to define symbol S. This >>>>> resulted in >>>>> >>>> > quite an effective trace of the linker output that could answer >>>>> most >>>>> >>>> > "why did this library and object file get loaded question?" One >>>>> thing >>>>> >>>> > a dependency graph might not capture is the order in which >>>>> events >>>>> >>>> > occur, this can be very useful when debugging problems caused by >>>>> >>>> > library selection order. >>>>> >>>> > >>>>> >>>> > >>>> - Finding a candidate to eliminate dependency by finding >>>>> a "weak" link >>>>> >>>> > to a library. We can for example say the dependency to a >>>>> library is weak if >>>>> >>>> > the library in the graph can be unreachable if we remove N >>>>> edges from the >>>>> >>>> > graph (which is likely to correspond to removing N function >>>>> calls from the >>>>> >>>> > code), where N is a small number. >>>>> >>>> > >>>> >>>>> >>>> > >>>> - Understanding which of new dependencies increase the >>>>> executable >>>>> >>>> > size the most, compare to a previous build. >>>>> >>>> > >>>> >>>>> >>>> > >>>>> >>>> > Arm's linker, being focused on embedded systems has a useful >>>>> feature >>>>> >>>> > that summarises the amount of content taken from each object >>>>> broken >>>>> >>>> > down into code, ro-data, rw-date etc. 
This can be helpful in >>>>> the face >>>>> >>>> > of comdat group elimination and optimisations such as garbage >>>>> >>>> > collection and ICF that can be difficult to predict from a >>>>> dependency >>>>> >>>> > graph. It is true that this information could be added as >>>>> attributes >>>>> >>>> > but again it may just be easier to write a simple analysis pass >>>>> over >>>>> >>>> > the output in the linker. >>>>> >>>> > >>>>> >>>> > >>>> - Finding bad or circular dependencies between >>>>> sub-components. >>>>> >>>> > >>>> >>>>> >>>> > >>>> There would be many more analyses you want to run at the >>>>> linker input >>>>> >>>> > level. Currently, lld doesn't actively support such analyses. >>>>> There are a few >>>>> >>>> > options to make the linker emit dependency information (e.g. >>>>> --cref or -Map), >>>>> >>>> > but the output of the options is not comprehensive; you cannot >>>>> reconstruct a >>>>> >>>> > dependency graph from the output of the options. >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>> >>>>> >>>> > >>>> Dumping dependency graph >>>>> >>>> > >>>> So, I'm thinking if it would be desirable to add a new >>>>> feature to the >>>>> >>>> > linker to dump an entire dependency graph in such a way that a >>>>> graph can be >>>>> >>>> > reconstructed by reading it back. Once we have such feature, we >>>>> can link a >>>>> >>>> > program with the feature enabled and run any kind of dependency >>>>> analysis >>>>> >>>> > on the output. You can save dumps to compare to previous >>>>> builds. You can >>>>> >>>> > run any number of analyses on a dump, instead of invoking the >>>>> linker for >>>>> >>>> > each analysis. >>>>> >>>> > >>>> >>>>> >>>> > >>>> I don't have a concrete idea about the file output format, >>>>> but I believe >>>>> >>>> > it is essentially enough to emit triplets of (<from input >>>>> section>, <symbol>, >>>>> >>>> > <to input section>), which represents an edge, to reconstruct a >>>>> graph. 
>>>>> >>>> > >>>> >>>>> >>>> > >>>> Thoughts? >>>>> >>>> > >>> >>>>> >>>> > >>> >>>>> >>>> > >>> Back when I worked on the linker I pretty much always had a >>>>> way to >>>>> >>>> > dump a graphviz dot file to look at things. Pretty much every >>>>> graph >>>>> >>>> > library/tool can read dot files, and they are easy to hack up a >>>>> parser for. You >>>>> >>>> > can also add attributes to nodes and edges to store arbitrary >>>>> data. >>>>> >>>> > >> >>>>> >>>> > >> >>>>> >>>> > >> That's an interesting idea. >>>>> >>>> > >> >>>>> >>>> > >>> As for what to put it in, it really depends on how detailed >>>>> it needs to be. >>>>> >>>> > Should symbols and sections be collapsed together? Should it >>>>> include >>>>> >>>> > relocation types? Symbol types/binding/size/etc? >>>>> >>>> > >> >>>>> >>>> > >> >>>>> >>>> > >> Maybe everything? We can for example emit all symbols and >>>>> input >>>>> >>>> > sections first, and then emit a graph as the second half of the >>>>> output. E.g. >>>>> >>>> > >> >>>>> >>>> > >> Symbols: >>>>> >>>> > >> <list of symbols> >>>>> >>>> > >> Sections: >>>>> >>>> > >> <list of sections> >>>>> >>>> > >> Graph: >>>>> >>>> > >> 1 2 3 // 1st section depends on 3rd section via 2nd symbol >>>>> >>>> > >> 5 1 4 // likewise >>>>> >>>> > > >>>>> >>>> > > >>>>> >>>> > > I suppose it's a question of if we want users to need to also >>>>> read the inputs >>>>> >>>> > if they want things like section size and other section/symbol >>>>> attributes. It >>>>> >>>> > would be pretty trivial to include that data as long as we have >>>>> a >>>>> >>>> > format/syntax for it. >>>>> >>>> > > >>>>> >>>> > > dot supports listing nodes first with attributes and then >>>>> referring to them by >>>>> >>>> > name later when listing edges. >>>>> >>>> > > >>>>> >>>> > > - Michael Spencer >>>>> >>>> > > >>>>> >>>> > >>>>> >>>> > I've experimented with dot files for this type of thing in the >>>>> past. 
>>>>> >>>> > The difficulty is that they get too large to be realistically >>>>> viewed >>>>> >>>> > very quickly. At that point you need to write scripts to >>>>> process the >>>>> >>>> > output and in that case you may as well use JSON or XML, which >>>>> I guess >>>>> >>>> > could easily be processed into dot files. To summarise, I think >>>>> we may >>>>> >>>> > be able to do quite well with some very simple extra analysis >>>>> in LLD, >>>>> >>>> > a machine readable dependency graph would also be very useful >>>>> for the >>>>> >>>> > more complex cases. >>>>> >>>> > >>>>> >>>> > Peter >>>>> >>>> > >>>>> >>>> > _______________________________________________ >>>>> >>>> > > LLVM Developers mailing list >>>>> >>>> > > llvm-dev at lists.llvm.org >>>>> >>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>> >>>> > _______________________________________________ >>>>> >>>> > LLVM Developers mailing list >>>>> >>>> > llvm-dev at lists.llvm.org >>>>> >>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>> >>>> _______________________________________________ >>>>> >>>> LLVM Developers mailing list >>>>> >>>> llvm-dev at lists.llvm.org >>>>> >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>> >> >>>>> >> _______________________________________________ >>>>> >> LLVM Developers mailing list >>>>> >> llvm-dev at lists.llvm.org >>>>> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>> > >>>>> > >>>>> > >>>>> > -- >>>>> > -- >>>>> > Peter >>>>> > _______________________________________________ >>>>> > LLVM Developers mailing list >>>>> > llvm-dev at lists.llvm.org >>>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>> >>>>> >>>>> >>>>> -- >>>>> 宋方睿 >>>>> _______________________________________________ >>>>> LLVM Developers mailing list >>>>> llvm-dev at lists.llvm.org >>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>> >>>> _______________________________________________ >>>> LLVM 
Developers mailing list >>>> llvm-dev at lists.llvm.org >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>>
--
宋方睿
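
For the "why is this symbol included" question raised in this thread, here is a minimal Python sketch of how a dump in the "Symbols: / Sections: / Graph:" layout floated earlier could be read back and queried. Everything in it is an assumption for illustration: lld does not emit this format, the 1-based indexing is only inferred from the "1 2 3" example, and the function names (parse_dump, why_included, to_dot) and the sample section names are hypothetical.

```python
from collections import deque

# Hypothetical dump in the layout sketched in the thread. Graph lines are
# "<from section> <symbol> <to section>", assumed to be 1-based indices.
EXAMPLE = """\
Symbols:
main
puts
_IO_stdout
Sections:
crt1.o:(.text)
hello.o:(.text.main)
libc.a(puts.o):(.text)
libc.a(stdio.o):(.data)
Graph:
1 1 2   // 1st section depends on 2nd section via 1st symbol
2 2 3
3 3 4
"""

def parse_dump(text):
    """Return a list of (from_section, symbol, to_section) edges."""
    symbols, sections, edges = [], [], []
    current = None
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if line in ("Symbols:", "Sections:", "Graph:"):
            current = line
        elif current == "Symbols:":
            symbols.append(line)
        elif current == "Sections:":
            sections.append(line)
        elif current == "Graph:":
            src, sym, dst = (int(tok) for tok in line.split())
            edges.append((sections[src - 1], symbols[sym - 1], sections[dst - 1]))
    return edges

def why_included(edges, root, target):
    """Breadth-first search: one shortest edge chain from root to target, or None."""
    out = {}
    for src, sym, dst in edges:
        out.setdefault(src, []).append((sym, dst))
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for sym, dst in out.get(node, []):
            if dst not in parent:
                parent[dst] = (node, sym)
                queue.append(dst)
    if target not in parent:
        return None
    chain, node = [], target
    while parent[node] is not None:
        prev, sym = parent[node]
        chain.append((prev, sym, node))
        node = prev
    chain.reverse()
    return chain

def to_dot(edges):
    """Render edges as graphviz dot (assumes names contain no double quotes)."""
    lines = ["digraph deps {"]
    for src, sym, dst in edges:
        lines.append(f'  "{src}" -> "{dst}" [label="{sym}"];')
    lines.append("}")
    return "\n".join(lines)

if __name__ == "__main__":
    edges = parse_dump(EXAMPLE)
    for src, sym, dst in why_included(edges, "crt1.o:(.text)", "libc.a(stdio.o):(.data)"):
        print(f"{src} --[{sym}]--> {dst}")
```

Because the triplets are kept as plain tuples, the same edge list can feed either the path query or the dot output, which matches the idea of running several analyses over one saved dump instead of re-invoking the linker.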