Andrew Grieve via llvm-dev
2019-Jun-21 01:51 UTC
[llvm-dev] Linker option to dump dependency graph
Just wanted to check in on this - did your patches make it past the prototype phase? On Tue, Mar 5, 2019 at 2:41 AM Fāng-ruì Sòng via llvm-dev < llvm-dev at lists.llvm.org> wrote:> > One thing a dependency graph might not capture is the order in which > events occur, this can be very useful when debugging problems caused by > library selection order. > > The event stream sounds like a more fine-grained --trace (-t). > > > (<from input section>, <symbol>, <to input section>) > > In --no-gc-sections mode and in some analysis, the file name part of > the input section should be good enough. > > > section size and other section/symbol attributes > > If such customization is favored and the complexity isn't a big issue, > it can probably be implemented as format specifiers (I'm thinking of > printf, ps -o, date, ...). The design of > https://github.com/Juniper/libxo can be used for reference. > > We shall flesh out the possible vertex/edge types and additional > information that users may expect. > > On Fri, Mar 1, 2019 at 1:18 PM Peter Collingbourne via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > > > You might have realized this already but it's probably not a good idea > to use InputSection::Relocations for this because that ends up missing > anything that becomes a dynamic relocation. I reckon that the code should > be doing exactly what MarkLive.cpp is doing. > > > > Peter > > > > On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> > >> I hacked up a patch to make lld output a dependency graph in the > graphviz "dot" format. > >> > >> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa -- > this is a patch, which is just visiting all input sections and relocations. > Note that this is far from completion but just a proof-of-concept. > >> > >> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is a > result of static-linking a "hello world" program. > >> > >> https://rui314.github.io/hello.svg -- I rendered the above dot file > with graphviz `sfdp` engine. The rendered graph is too large and very hard > to read. Apparently, I need a better visualization tool. > >> > >> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at google.com> > wrote: > >>> > >>> +1 for graphviz dot format, so that it can be consumed by any one of > many existing graph visualization tools. > >>> > >>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >>>> > >>>> >To summarise, I think we may > >>>> > be able to do quite well with some very simple extra analysis in > LLD, > >>>> > a machine readable dependency graph would also be very useful for > the > >>>> > more complex cases. > >>>> > >>>> Strongly agree. The linker based dependency graph would be very > useful for Uefi firmware. Below are my usage examples: > >>>> 1. I need to detect the redundant code in my firmware, and I once > wrote a analysis tool to compare the IR level symbols and call graph info > before any optimization and after full optimization (e.g. LTO). But the IR > level info does not support assembly code info well. So, there are many > dependency information missing and false positive in my analysis tool. It > will be more sound if the linker can help output complete and accurate > dependency graph for final executable. > >>>> 2. I need a tool to analyze and track the firmware module accurate > dependency for build cache soundness. Build performance is now a pain point > in our CI system because every patch need to verify on many build targets > in our side. We hope to enable the build cache (both module level and file > level) to accelerate the build time. For module level build cache enabling, > a very important problem is how to know the module's accurate dependency > efficiently. I'm looking forward to the linker based dependency graph > feature. > >>>> > >>>> > >>>> Thanks > >>>> Steven > >>>> > -----Original Message----- > >>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf > Of Peter > >>>> > Smith via llvm-dev > >>>> > Sent: Wednesday, February 27, 2019 6:37 PM > >>>> > To: Michael Spencer <bigcheesegs at gmail.com> > >>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org> > >>>> > Subject: Re: [llvm-dev] Linker option to dump dependency graph > >>>> > > >>>> > Hello, > >>>> > > >>>> > I think outputting a dependency graph is a good idea and would > enable > >>>> > some offline analysis. I think that there is some advantage to > >>>> > building some of the simpler ones in, particularly those that would > >>>> > need heavy annotations to the dependency graph, in particular unless > >>>> > we write a sample analysis tool that ships with the release, many > >>>> > users are going to miss out on useful features as they aren't going > to > >>>> > have the time to build one. I've put some comments inline: > >>>> > > >>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev > >>>> > <llvm-dev at lists.llvm.org> wrote: > >>>> > > > >>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com> > wrote: > >>>> > >> > >>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer > >>>> > <bigcheesegs at gmail.com> wrote: > >>>> > >>> > >>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <llvm- > >>>> > dev at lists.llvm.org> wrote: > >>>> > >>>> > >>>> > >>>> Hi, > >>>> > >>>> > >>>> > >>>> I've heard people say that they want to analyze dependencies > between > >>>> > object files at the linker level so that they can run a > whole-program analysis > >>>> > which cannot be done at the compiler that works for one compilation > unit at > >>>> > a time. I'd like to start a discussion as to what we can do with it > and how to > >>>> > make it possible. I'm also sharing my idea about how to make it > possible. > >>>> > >>>> > >>>> > >>>> Dependency analyses > >>>> > >>>> First, let me start with a few examples of analyses I'm heard > of or > >>>> > thinking about. Dependencies between object files can be > represented as a > >>>> > graph where vertices are input sections and edges are symbols and > >>>> > relocations. Analyses would work on the dependency graph. Examples > of > >>>> > analyses include but not limited to the following: > >>>> > >>>> > >>>> > >>>> - Figure out why some library or an object file gets linked. > >>>> > >>>> > >>>> > > >>>> > Arm's proprietary linker has a very helpful feature in verbose mode > >>>> > where it will report on object loading: global/weak definitions and > >>>> > global/weak references. For libraries you'd get a message like > >>>> > selecting member.o from library.a to define symbol S. This resulted > in > >>>> > quite an effective trace of the linker output that could answer most > >>>> > "why did this library and object file get loaded question?" One > thing > >>>> > a dependency graph might not capture is the order in which events > >>>> > occur, this can be very useful when debugging problems caused by > >>>> > library selection order. > >>>> > > >>>> > >>>> - Finding a candidate to eliminate dependency by finding a > "weak" link > >>>> > to a library. We can for example say the dependency to a library is > weak if > >>>> > the library in the graph can be unreachable if we remove N edges > from the > >>>> > graph (which is likely to correspond to removing N function calls > from the > >>>> > code), where N is a small number. > >>>> > >>>> > >>>> > >>>> - Understanding which of new dependencies increase the > executable > >>>> > size the most, compare to a previous build. > >>>> > >>>> > >>>> > > >>>> > Arm's linker, being focused on embedded systems has a useful feature > >>>> > that summarises the amount of content taken from each object broken > >>>> > down into code, ro-data, rw-date etc. This can be helpful in the > face > >>>> > of comdat group elimination and optimisations such as garbage > >>>> > collection and ICF that can be difficult to predict from a > dependency > >>>> > graph. It is true that this information could be added as attributes > >>>> > but again it may just be easier to write a simple analysis pass over > >>>> > the output in the linker. > >>>> > > >>>> > >>>> - Finding bad or circular dependencies between sub-components. > >>>> > >>>> > >>>> > >>>> There would be many more analyses you want to run at the > linker input > >>>> > level. Currently, lld doesn't actively support such analyses. There > are a few > >>>> > options to make the linker emit dependency information (e.g. --cref > or -Map), > >>>> > but the output of the options is not comprehensive; you cannot > reconstruct a > >>>> > dependency graph from the output of the options. > >>>> > > >>>> > > >>>> > > >>>> > >>>> > >>>> > >>>> Dumping dependency graph > >>>> > >>>> So, I'm thinking if it would be desirable to add a new feature > to the > >>>> > linker to dump an entire dependency graph in such a way that a > graph can be > >>>> > reconstructed by reading it back. Once we have such feature, we can > link a > >>>> > program with the feature enabled and run any kind of dependency > analysis > >>>> > on the output. You can save dumps to compare to previous builds. > You can > >>>> > run any number of analyses on a dump, instead of invoking the > linker for > >>>> > each analysis. > >>>> > >>>> > >>>> > >>>> I don't have a concrete idea about the file output format, but > I believe > >>>> > it is essentially enough to emit triplets of (<from input section>, > <symbol>, > >>>> > <to input section>), which represents an edge, to reconstruct a > graph. > >>>> > >>>> > >>>> > >>>> Thoughts? > >>>> > >>> > >>>> > >>> > >>>> > >>> Back when I worked on the linker I pretty much always had a way > to > >>>> > dump a graphviz dot file to look at things. Pretty much every graph > >>>> > library/tool can read dot files, and they are easy to hack up a > parser for. You > >>>> > can also add attributes to nodes and edges to store arbitrary data. > >>>> > >> > >>>> > >> > >>>> > >> That's an interesting idea. > >>>> > >> > >>>> > >>> As for what to put it in, it really depends on how detailed it > needs to be. > >>>> > Should symbols and sections be collapsed together? Should it > include > >>>> > relocation types? Symbol types/binding/size/etc? > >>>> > >> > >>>> > >> > >>>> > >> Maybe everything? We can for example emit all symbols and input > >>>> > sections first, and then emit a graph as the second half of the > output. E.g. > >>>> > >> > >>>> > >> Symbols: > >>>> > >> <list of symbols> > >>>> > >> Sections: > >>>> > >> <list of sections> > >>>> > >> Graph: > >>>> > >> 1 2 3 // 1st section depends on 3rd section via 2nd symbol > >>>> > >> 5 1 4 // likewise > >>>> > > > >>>> > > > >>>> > > I suppose it's a question of if we want users to need to also > read the inputs > >>>> > if they want things like section size and other section/symbol > attributes. It > >>>> > would be pretty trivial to include that data as long as we have a > >>>> > format/syntax for it. > >>>> > > > >>>> > > dot supports listing nodes first with attributes and then > referring to them by > >>>> > name later when listing edges. > >>>> > > > >>>> > > - Michael Spencer > >>>> > > > >>>> > > >>>> > I've experimented with dot files for this type of thing in the past. > >>>> > The difficulty is that they get too large to be realistically viewed > >>>> > very quickly. At that point you need to write scripts to process the > >>>> > output and in that case you may as well use JSON or XML, which I > guess > >>>> > could easily be processed into dot files. To summarise, I think we > may > >>>> > be able to do quite well with some very simple extra analysis in > LLD, > >>>> > a machine readable dependency graph would also be very useful for > the > >>>> > more complex cases. > >>>> > > >>>> > Peter > >>>> > > >>>> > _______________________________________________ > >>>> > > LLVM Developers mailing list > >>>> > > llvm-dev at lists.llvm.org > >>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >>>> > _______________________________________________ > >>>> > LLVM Developers mailing list > >>>> > llvm-dev at lists.llvm.org > >>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >>>> _______________________________________________ > >>>> LLVM Developers mailing list > >>>> llvm-dev at lists.llvm.org > >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >> > >> _______________________________________________ > >> LLVM Developers mailing list > >> llvm-dev at lists.llvm.org > >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > > > > > -- > > -- > > Peter > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > -- > 宋方睿 > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190620/cbf9682d/attachment.html>
Rui Ueyama via llvm-dev
2019-Jun-21 12:08 UTC
[llvm-dev] Linker option to dump dependency graph
No I didn't. On Fri, Jun 21, 2019 at 10:52 AM Andrew Grieve via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Just wanted to check in on this - did your patches make it past the > prototype phase? > > On Tue, Mar 5, 2019 at 2:41 AM Fāng-ruì Sòng via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> > One thing a dependency graph might not capture is the order in which >> events occur, this can be very useful when debugging problems caused by >> library selection order. >> >> The event stream sounds like a more fine-grained --trace (-t). >> >> > (<from input section>, <symbol>, <to input section>) >> >> In --no-gc-sections mode and in some analysis, the file name part of >> the input section should be good enough. >> >> > section size and other section/symbol attributes >> >> If such customization is favored and the complexity isn't a big issue, >> it can probably be implemented as format specifiers (I'm thinking of >> printf, ps -o, date, ...). The design of >> https://github.com/Juniper/libxo can be used for reference. >> >> We shall flesh out the possible vertex/edge types and additional >> information that users may expect. >> >> On Fri, Mar 1, 2019 at 1:18 PM Peter Collingbourne via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >> > >> > You might have realized this already but it's probably not a good idea >> to use InputSection::Relocations for this because that ends up missing >> anything that becomes a dynamic relocation. I reckon that the code should >> be doing exactly what MarkLive.cpp is doing. >> > >> > Peter >> > >> > On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >> >> >> I hacked up a patch to make lld output a dependency graph in the >> graphviz "dot" format. >> >> >> >> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa -- >> this is a patch, which is just visiting all input sections and relocations. >> Note that this is far from completion but just a proof-of-concept. >> >> >> >> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is a >> result of static-linking a "hello world" program. >> >> >> >> https://rui314.github.io/hello.svg -- I rendered the above dot file >> with graphviz `sfdp` engine. The rendered graph is too large and very hard >> to read. Apparently, I need a better visualization tool. >> >> >> >> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at google.com> >> wrote: >> >>> >> >>> +1 for graphviz dot format, so that it can be consumed by any one of >> many existing graph visualization tools. >> >>> >> >>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >>>> >> >>>> >To summarise, I think we may >> >>>> > be able to do quite well with some very simple extra analysis in >> LLD, >> >>>> > a machine readable dependency graph would also be very useful for >> the >> >>>> > more complex cases. >> >>>> >> >>>> Strongly agree. The linker based dependency graph would be very >> useful for Uefi firmware. Below are my usage examples: >> >>>> 1. I need to detect the redundant code in my firmware, and I once >> wrote a analysis tool to compare the IR level symbols and call graph info >> before any optimization and after full optimization (e.g. LTO). But the IR >> level info does not support assembly code info well. So, there are many >> dependency information missing and false positive in my analysis tool. It >> will be more sound if the linker can help output complete and accurate >> dependency graph for final executable. >> >>>> 2. I need a tool to analyze and track the firmware module accurate >> dependency for build cache soundness. Build performance is now a pain point >> in our CI system because every patch need to verify on many build targets >> in our side. We hope to enable the build cache (both module level and file >> level) to accelerate the build time. For module level build cache enabling, >> a very important problem is how to know the module's accurate dependency >> efficiently. I'm looking forward to the linker based dependency graph >> feature. >> >>>> >> >>>> >> >>>> Thanks >> >>>> Steven >> >>>> > -----Original Message----- >> >>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf >> Of Peter >> >>>> > Smith via llvm-dev >> >>>> > Sent: Wednesday, February 27, 2019 6:37 PM >> >>>> > To: Michael Spencer <bigcheesegs at gmail.com> >> >>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org> >> >>>> > Subject: Re: [llvm-dev] Linker option to dump dependency graph >> >>>> > >> >>>> > Hello, >> >>>> > >> >>>> > I think outputting a dependency graph is a good idea and would >> enable >> >>>> > some offline analysis. I think that there is some advantage to >> >>>> > building some of the simpler ones in, particularly those that would >> >>>> > need heavy annotations to the dependency graph, in particular >> unless >> >>>> > we write a sample analysis tool that ships with the release, many >> >>>> > users are going to miss out on useful features as they aren't >> going to >> >>>> > have the time to build one. I've put some comments inline: >> >>>> > >> >>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev >> >>>> > <llvm-dev at lists.llvm.org> wrote: >> >>>> > > >> >>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com> >> wrote: >> >>>> > >> >> >>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer >> >>>> > <bigcheesegs at gmail.com> wrote: >> >>>> > >>> >> >>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <llvm- >> >>>> > dev at lists.llvm.org> wrote: >> >>>> > >>>> >> >>>> > >>>> Hi, >> >>>> > >>>> >> >>>> > >>>> I've heard people say that they want to analyze dependencies >> between >> >>>> > object files at the linker level so that they can run a >> whole-program analysis >> >>>> > which cannot be done at the compiler that works for one >> compilation unit at >> >>>> > a time. I'd like to start a discussion as to what we can do with >> it and how to >> >>>> > make it possible. I'm also sharing my idea about how to make it >> possible. >> >>>> > >>>> >> >>>> > >>>> Dependency analyses >> >>>> > >>>> First, let me start with a few examples of analyses I'm heard >> of or >> >>>> > thinking about. Dependencies between object files can be >> represented as a >> >>>> > graph where vertices are input sections and edges are symbols and >> >>>> > relocations. Analyses would work on the dependency graph. Examples >> of >> >>>> > analyses include but not limited to the following: >> >>>> > >>>> >> >>>> > >>>> - Figure out why some library or an object file gets linked. >> >>>> > >>>> >> >>>> > >> >>>> > Arm's proprietary linker has a very helpful feature in verbose mode >> >>>> > where it will report on object loading: global/weak definitions and >> >>>> > global/weak references. For libraries you'd get a message like >> >>>> > selecting member.o from library.a to define symbol S. This >> resulted in >> >>>> > quite an effective trace of the linker output that could answer >> most >> >>>> > "why did this library and object file get loaded question?" One >> thing >> >>>> > a dependency graph might not capture is the order in which events >> >>>> > occur, this can be very useful when debugging problems caused by >> >>>> > library selection order. >> >>>> > >> >>>> > >>>> - Finding a candidate to eliminate dependency by finding a >> "weak" link >> >>>> > to a library. We can for example say the dependency to a library >> is weak if >> >>>> > the library in the graph can be unreachable if we remove N edges >> from the >> >>>> > graph (which is likely to correspond to removing N function calls >> from the >> >>>> > code), where N is a small number. >> >>>> > >>>> >> >>>> > >>>> - Understanding which of new dependencies increase the >> executable >> >>>> > size the most, compare to a previous build. >> >>>> > >>>> >> >>>> > >> >>>> > Arm's linker, being focused on embedded systems has a useful >> feature >> >>>> > that summarises the amount of content taken from each object broken >> >>>> > down into code, ro-data, rw-date etc. This can be helpful in the >> face >> >>>> > of comdat group elimination and optimisations such as garbage >> >>>> > collection and ICF that can be difficult to predict from a >> dependency >> >>>> > graph. It is true that this information could be added as >> attributes >> >>>> > but again it may just be easier to write a simple analysis pass >> over >> >>>> > the output in the linker. >> >>>> > >> >>>> > >>>> - Finding bad or circular dependencies between >> sub-components. >> >>>> > >>>> >> >>>> > >>>> There would be many more analyses you want to run at the >> linker input >> >>>> > level. Currently, lld doesn't actively support such analyses. >> There are a few >> >>>> > options to make the linker emit dependency information (e.g. >> --cref or -Map), >> >>>> > but the output of the options is not comprehensive; you cannot >> reconstruct a >> >>>> > dependency graph from the output of the options. >> >>>> > >> >>>> > >> >>>> > >> >>>> > >>>> >> >>>> > >>>> Dumping dependency graph >> >>>> > >>>> So, I'm thinking if it would be desirable to add a new >> feature to the >> >>>> > linker to dump an entire dependency graph in such a way that a >> graph can be >> >>>> > reconstructed by reading it back. Once we have such feature, we >> can link a >> >>>> > program with the feature enabled and run any kind of dependency >> analysis >> >>>> > on the output. You can save dumps to compare to previous builds. >> You can >> >>>> > run any number of analyses on a dump, instead of invoking the >> linker for >> >>>> > each analysis. >> >>>> > >>>> >> >>>> > >>>> I don't have a concrete idea about the file output format, >> but I believe >> >>>> > it is essentially enough to emit triplets of (<from input >> section>, <symbol>, >> >>>> > <to input section>), which represents an edge, to reconstruct a >> graph. >> >>>> > >>>> >> >>>> > >>>> Thoughts? >> >>>> > >>> >> >>>> > >>> >> >>>> > >>> Back when I worked on the linker I pretty much always had a >> way to >> >>>> > dump a graphviz dot file to look at things. Pretty much every >> graph >> >>>> > library/tool can read dot files, and they are easy to hack up a >> parser for. You >> >>>> > can also add attributes to nodes and edges to store arbitrary data. >> >>>> > >> >> >>>> > >> >> >>>> > >> That's an interesting idea. >> >>>> > >> >> >>>> > >>> As for what to put it in, it really depends on how detailed it >> needs to be. >> >>>> > Should symbols and sections be collapsed together? Should it >> include >> >>>> > relocation types? Symbol types/binding/size/etc? >> >>>> > >> >> >>>> > >> >> >>>> > >> Maybe everything? We can for example emit all symbols and input >> >>>> > sections first, and then emit a graph as the second half of the >> output. E.g. >> >>>> > >> >> >>>> > >> Symbols: >> >>>> > >> <list of symbols> >> >>>> > >> Sections: >> >>>> > >> <list of sections> >> >>>> > >> Graph: >> >>>> > >> 1 2 3 // 1st section depends on 3rd section via 2nd symbol >> >>>> > >> 5 1 4 // likewise >> >>>> > > >> >>>> > > >> >>>> > > I suppose it's a question of if we want users to need to also >> read the inputs >> >>>> > if they want things like section size and other section/symbol >> attributes. It >> >>>> > would be pretty trivial to include that data as long as we have a >> >>>> > format/syntax for it. >> >>>> > > >> >>>> > > dot supports listing nodes first with attributes and then >> referring to them by >> >>>> > name later when listing edges. >> >>>> > > >> >>>> > > - Michael Spencer >> >>>> > > >> >>>> > >> >>>> > I've experimented with dot files for this type of thing in the >> past. >> >>>> > The difficulty is that they get too large to be realistically >> viewed >> >>>> > very quickly. At that point you need to write scripts to process >> the >> >>>> > output and in that case you may as well use JSON or XML, which I >> guess >> >>>> > could easily be processed into dot files. To summarise, I think we >> may >> >>>> > be able to do quite well with some very simple extra analysis in >> LLD, >> >>>> > a machine readable dependency graph would also be very useful for >> the >> >>>> > more complex cases. >> >>>> > >> >>>> > Peter >> >>>> > >> >>>> > _______________________________________________ >> >>>> > > LLVM Developers mailing list >> >>>> > > llvm-dev at lists.llvm.org >> >>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >>>> > _______________________________________________ >> >>>> > LLVM Developers mailing list >> >>>> > llvm-dev at lists.llvm.org >> >>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >>>> _______________________________________________ >> >>>> LLVM Developers mailing list >> >>>> llvm-dev at lists.llvm.org >> >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> >> >> _______________________________________________ >> >> LLVM Developers mailing list >> >> llvm-dev at lists.llvm.org >> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > >> > >> > >> > -- >> > -- >> > Peter >> > _______________________________________________ >> > LLVM Developers mailing list >> > llvm-dev at lists.llvm.org >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> >> >> -- >> 宋方睿 >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190621/37ec3650/attachment-0001.html>
Rui Ueyama via llvm-dev
2019-Jun-21 12:10 UTC
[llvm-dev] Linker option to dump dependency graph
Sorry, I didn't notice that you are asking not to me but to Fangrui. Please disregard my previous email. On Fri, Jun 21, 2019 at 9:08 PM Rui Ueyama <ruiu at google.com> wrote:> No I didn't. > > On Fri, Jun 21, 2019 at 10:52 AM Andrew Grieve via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Just wanted to check in on this - did your patches make it past the >> prototype phase? >> >> On Tue, Mar 5, 2019 at 2:41 AM Fāng-ruì Sòng via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >>> > One thing a dependency graph might not capture is the order in which >>> events occur, this can be very useful when debugging problems caused by >>> library selection order. >>> >>> The event stream sounds like a more fine-grained --trace (-t). >>> >>> > (<from input section>, <symbol>, <to input section>) >>> >>> In --no-gc-sections mode and in some analysis, the file name part of >>> the input section should be good enough. >>> >>> > section size and other section/symbol attributes >>> >>> If such customization is favored and the complexity isn't a big issue, >>> it can probably be implemented as format specifiers (I'm thinking of >>> printf, ps -o, date, ...). The design of >>> https://github.com/Juniper/libxo can be used for reference. >>> >>> We shall flesh out the possible vertex/edge types and additional >>> information that users may expect. >>> >>> On Fri, Mar 1, 2019 at 1:18 PM Peter Collingbourne via llvm-dev >>> <llvm-dev at lists.llvm.org> wrote: >>> > >>> > You might have realized this already but it's probably not a good idea >>> to use InputSection::Relocations for this because that ends up missing >>> anything that becomes a dynamic relocation. I reckon that the code should >>> be doing exactly what MarkLive.cpp is doing. >>> > >>> > Peter >>> > >>> > On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >> >>> >> I hacked up a patch to make lld output a dependency graph in the >>> graphviz "dot" format. >>> >> >>> >> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa -- >>> this is a patch, which is just visiting all input sections and relocations. >>> Note that this is far from completion but just a proof-of-concept. >>> >> >>> >> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is a >>> result of static-linking a "hello world" program. >>> >> >>> >> https://rui314.github.io/hello.svg -- I rendered the above dot file >>> with graphviz `sfdp` engine. The rendered graph is too large and very hard >>> to read. Apparently, I need a better visualization tool. >>> >> >>> >> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at google.com> >>> wrote: >>> >>> >>> >>> +1 for graphviz dot format, so that it can be consumed by any one of >>> many existing graph visualization tools. >>> >>> >>> >>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>>> >>> >>>> >To summarise, I think we may >>> >>>> > be able to do quite well with some very simple extra analysis in >>> LLD, >>> >>>> > a machine readable dependency graph would also be very useful for >>> the >>> >>>> > more complex cases. >>> >>>> >>> >>>> Strongly agree. The linker based dependency graph would be very >>> useful for Uefi firmware. Below are my usage examples: >>> >>>> 1. I need to detect the redundant code in my firmware, and I once >>> wrote a analysis tool to compare the IR level symbols and call graph info >>> before any optimization and after full optimization (e.g. LTO). But the IR >>> level info does not support assembly code info well. So, there are many >>> dependency information missing and false positive in my analysis tool. It >>> will be more sound if the linker can help output complete and accurate >>> dependency graph for final executable. >>> >>>> 2. I need a tool to analyze and track the firmware module accurate >>> dependency for build cache soundness. Build performance is now a pain point >>> in our CI system because every patch need to verify on many build targets >>> in our side. We hope to enable the build cache (both module level and file >>> level) to accelerate the build time. For module level build cache enabling, >>> a very important problem is how to know the module's accurate dependency >>> efficiently. I'm looking forward to the linker based dependency graph >>> feature. >>> >>>> >>> >>>> >>> >>>> Thanks >>> >>>> Steven >>> >>>> > -----Original Message----- >>> >>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On >>> Behalf Of Peter >>> >>>> > Smith via llvm-dev >>> >>>> > Sent: Wednesday, February 27, 2019 6:37 PM >>> >>>> > To: Michael Spencer <bigcheesegs at gmail.com> >>> >>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org> >>> >>>> > Subject: Re: [llvm-dev] Linker option to dump dependency graph >>> >>>> > >>> >>>> > Hello, >>> >>>> > >>> >>>> > I think outputting a dependency graph is a good idea and would >>> enable >>> >>>> > some offline analysis. I think that there is some advantage to >>> >>>> > building some of the simpler ones in, particularly those that >>> would >>> >>>> > need heavy annotations to the dependency graph, in particular >>> unless >>> >>>> > we write a sample analysis tool that ships with the release, many >>> >>>> > users are going to miss out on useful features as they aren't >>> going to >>> >>>> > have the time to build one. I've put some comments inline: >>> >>>> > >>> >>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev >>> >>>> > <llvm-dev at lists.llvm.org> wrote: >>> >>>> > > >>> >>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com> >>> wrote: >>> >>>> > >> >>> >>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer >>> >>>> > <bigcheesegs at gmail.com> wrote: >>> >>>> > >>> >>> >>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <llvm- >>> >>>> > dev at lists.llvm.org> wrote: >>> >>>> > >>>> >>> >>>> > >>>> Hi, >>> >>>> > >>>> >>> >>>> > >>>> I've heard people say that they want to analyze dependencies >>> between >>> >>>> > object files at the linker level so that they can run a >>> whole-program analysis >>> >>>> > which cannot be done at the compiler that works for one >>> compilation unit at >>> >>>> > a time. I'd like to start a discussion as to what we can do with >>> it and how to >>> >>>> > make it possible. I'm also sharing my idea about how to make it >>> possible. >>> >>>> > >>>> >>> >>>> > >>>> Dependency analyses >>> >>>> > >>>> First, let me start with a few examples of analyses I'm >>> heard of or >>> >>>> > thinking about. Dependencies between object files can be >>> represented as a >>> >>>> > graph where vertices are input sections and edges are symbols and >>> >>>> > relocations. Analyses would work on the dependency graph. >>> Examples of >>> >>>> > analyses include but not limited to the following: >>> >>>> > >>>> >>> >>>> > >>>> - Figure out why some library or an object file gets linked. >>> >>>> > >>>> >>> >>>> > >>> >>>> > Arm's proprietary linker has a very helpful feature in verbose >>> mode >>> >>>> > where it will report on object loading: global/weak definitions >>> and >>> >>>> > global/weak references. For libraries you'd get a message like >>> >>>> > selecting member.o from library.a to define symbol S. This >>> resulted in >>> >>>> > quite an effective trace of the linker output that could answer >>> most >>> >>>> > "why did this library and object file get loaded question?" One >>> thing >>> >>>> > a dependency graph might not capture is the order in which events >>> >>>> > occur, this can be very useful when debugging problems caused by >>> >>>> > library selection order. >>> >>>> > >>> >>>> > >>>> - Finding a candidate to eliminate dependency by finding a >>> "weak" link >>> >>>> > to a library. We can for example say the dependency to a library >>> is weak if >>> >>>> > the library in the graph can be unreachable if we remove N edges >>> from the >>> >>>> > graph (which is likely to correspond to removing N function calls >>> from the >>> >>>> > code), where N is a small number. >>> >>>> > >>>> >>> >>>> > >>>> - Understanding which of new dependencies increase the >>> executable >>> >>>> > size the most, compare to a previous build. >>> >>>> > >>>> >>> >>>> > >>> >>>> > Arm's linker, being focused on embedded systems has a useful >>> feature >>> >>>> > that summarises the amount of content taken from each object >>> broken >>> >>>> > down into code, ro-data, rw-date etc. This can be helpful in the >>> face >>> >>>> > of comdat group elimination and optimisations such as garbage >>> >>>> > collection and ICF that can be difficult to predict from a >>> dependency >>> >>>> > graph. It is true that this information could be added as >>> attributes >>> >>>> > but again it may just be easier to write a simple analysis pass >>> over >>> >>>> > the output in the linker. >>> >>>> > >>> >>>> > >>>> - Finding bad or circular dependencies between >>> sub-components. >>> >>>> > >>>> >>> >>>> > >>>> There would be many more analyses you want to run at the >>> linker input >>> >>>> > level. Currently, lld doesn't actively support such analyses. >>> There are a few >>> >>>> > options to make the linker emit dependency information (e.g. >>> --cref or -Map), >>> >>>> > but the output of the options is not comprehensive; you cannot >>> reconstruct a >>> >>>> > dependency graph from the output of the options. >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> > >>>> >>> >>>> > >>>> Dumping dependency graph >>> >>>> > >>>> So, I'm thinking if it would be desirable to add a new >>> feature to the >>> >>>> > linker to dump an entire dependency graph in such a way that a >>> graph can be >>> >>>> > reconstructed by reading it back. Once we have such feature, we >>> can link a >>> >>>> > program with the feature enabled and run any kind of dependency >>> analysis >>> >>>> > on the output. You can save dumps to compare to previous builds. >>> You can >>> >>>> > run any number of analyses on a dump, instead of invoking the >>> linker for >>> >>>> > each analysis. >>> >>>> > >>>> >>> >>>> > >>>> I don't have a concrete idea about the file output format, >>> but I believe >>> >>>> > it is essentially enough to emit triplets of (<from input >>> section>, <symbol>, >>> >>>> > <to input section>), which represents an edge, to reconstruct a >>> graph. >>> >>>> > >>>> >>> >>>> > >>>> Thoughts? >>> >>>> > >>> >>> >>>> > >>> >>> >>>> > >>> Back when I worked on the linker I pretty much always had a >>> way to >>> >>>> > dump a graphviz dot file to look at things. Pretty much every >>> graph >>> >>>> > library/tool can read dot files, and they are easy to hack up a >>> parser for. You >>> >>>> > can also add attributes to nodes and edges to store arbitrary >>> data. >>> >>>> > >> >>> >>>> > >> >>> >>>> > >> That's an interesting idea. >>> >>>> > >> >>> >>>> > >>> As for what to put it in, it really depends on how detailed >>> it needs to be. >>> >>>> > Should symbols and sections be collapsed together? Should it >>> include >>> >>>> > relocation types? Symbol types/binding/size/etc? >>> >>>> > >> >>> >>>> > >> >>> >>>> > >> Maybe everything? We can for example emit all symbols and >>> input >>> >>>> > sections first, and then emit a graph as the second half of the >>> output. E.g. >>> >>>> > >> >>> >>>> > >> Symbols: >>> >>>> > >> <list of symbols> >>> >>>> > >> Sections: >>> >>>> > >> <list of sections> >>> >>>> > >> Graph: >>> >>>> > >> 1 2 3 // 1st section depends on 3rd section via 2nd symbol >>> >>>> > >> 5 1 4 // likewise >>> >>>> > > >>> >>>> > > >>> >>>> > > I suppose it's a question of if we want users to need to also >>> read the inputs >>> >>>> > if they want things like section size and other section/symbol >>> attributes. It >>> >>>> > would be pretty trivial to include that data as long as we have a >>> >>>> > format/syntax for it. >>> >>>> > > >>> >>>> > > dot supports listing nodes first with attributes and then >>> referring to them by >>> >>>> > name later when listing edges. >>> >>>> > > >>> >>>> > > - Michael Spencer >>> >>>> > > >>> >>>> > >>> >>>> > I've experimented with dot files for this type of thing in the >>> past. >>> >>>> > The difficulty is that they get too large to be realistically >>> viewed >>> >>>> > very quickly. At that point you need to write scripts to process >>> the >>> >>>> > output and in that case you may as well use JSON or XML, which I >>> guess >>> >>>> > could easily be processed into dot files. To summarise, I think >>> we may >>> >>>> > be able to do quite well with some very simple extra analysis in >>> LLD, >>> >>>> > a machine readable dependency graph would also be very useful for >>> the >>> >>>> > more complex cases. >>> >>>> > >>> >>>> > Peter >>> >>>> > >>> >>>> > _______________________________________________ >>> >>>> > > LLVM Developers mailing list >>> >>>> > > llvm-dev at lists.llvm.org >>> >>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >>>> > _______________________________________________ >>> >>>> > LLVM Developers mailing list >>> >>>> > llvm-dev at lists.llvm.org >>> >>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >>>> _______________________________________________ >>> >>>> LLVM Developers mailing list >>> >>>> llvm-dev at lists.llvm.org >>> >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >> >>> >> _______________________________________________ >>> >> LLVM Developers mailing list >>> >> llvm-dev at lists.llvm.org >>> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> > >>> > >>> > >>> > -- >>> > -- >>> > Peter >>> > _______________________________________________ >>> > LLVM Developers mailing list >>> > llvm-dev at lists.llvm.org >>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >>> >>> >>> -- >>> 宋方睿 >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190621/f97d77dc/attachment.html>