Teresa Johnson via llvm-dev
2017-Jun-08 23:55 UTC
[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format
Great! For the hotness, try creating a small test case with a very hot loop that iterates many times. Let me know if you are still having trouble. While the llvm-dis serialization is being discussed, I suppose at the very least this can go in with the rest of the existing YAML summary dumping and get emitted from llvm-lto2 using the patch Peter attached. Peter - do you want to add that to llvm-lto2 so that we have a way of dumping the existing YAML supported summary info to stdout, or would you rather have Charles take that one over and submit it (probably just needs a test case). Teresa On Thu, Jun 8, 2017 at 4:16 PM, Charles Saternos <charles.saternos at gmail.com> wrote:> Hey Teresa, > > I've updated the YAML to include the names and GUIDs for all > functions/global vars/aliases. I've also added the hotness info to the > output, but for some reason, none of my tests when running with FDO gave > anything besides Unknown. I'll be looking more into this tomorrow. > > Here's the current format: > > > ../build/bin/llvm-lto2 dump-summary b.o > --- > NamedGlobalValueMap: > : > - GUID: 3762489268811518743 > Kind: GlobalVar > Linkage: PrivateLinkage > NotEligibleToImport: true > Live: false > cold: > - GUID: 11668175513417606517 > Kind: Function > Linkage: ExternalLinkage > NotEligibleToImport: true > Live: false > InstCount: 5 > Calls: > - Name: puts > GUID: 8979701042202144121 > Hotness: Unknown > fib: > - GUID: 8667248078361406812 > Kind: Function > Linkage: ExternalLinkage > NotEligibleToImport: true > Live: false > InstCount: 26 > Calls: > - Name: fib > GUID: 8667248078361406812 > Hotness: Unknown > hot: > - GUID: 10177652421713147431 > Kind: Function > Linkage: ExternalLinkage > NotEligibleToImport: true > Live: false > InstCount: 14 > Calls: > - Name: fib > GUID: 8667248078361406812 > Hotness: Unknown > - Name: printf > GUID: 7383291119112528047 > Hotness: Unknown > llvm.used: > - GUID: 15665353970260777610 > Kind: GlobalVar > Linkage: AppendingLinkage > NotEligibleToImport: true > Live: true > TypeIdMap: > WithGlobalValueDeadStripping: false > ... > > Thanks, > Charles > > > On Wed, Jun 7, 2017 at 12:38 PM, Teresa Johnson <tejohnson at google.com> > wrote: > >> >> >> On Wed, Jun 7, 2017 at 8:58 AM, Charles Saternos < >> charles.saternos at gmail.com> wrote: >> >>> Alright, now it outputs YAML in the following format: >>> >>> --- >>> NamedGlobalValueMap: >>> X: >>> - Kind: GlobalVar >>> Linkage: ExternalLinkage >>> NotEligibleToImport: false >>> Live: false >>> a: >>> - Kind: Alias >>> Linkage: WeakAnyLinkage >>> NotEligibleToImport: false >>> Live: false >>> AliaseeGUID: 1881667236089500162 >>> afun: >>> - Kind: Function >>> Linkage: ExternalLinkage >>> NotEligibleToImport: false >>> Live: false >>> InstCount: 2 >>> testtest: >>> - Kind: Function >>> Linkage: ExternalLinkage >>> NotEligibleToImport: false >>> Live: false >>> InstCount: 2 >>> Calls: >>> - Function: 14471680721094503013 >>> TypeIdMap: >>> WithGlobalValueDeadStripping: false >>> ... >>> >>> Any thoughts on the new format? >>> >> >> Thanks, Charles. The main improvement I think we would want is to output >> value names instead of the GUID. Can you build up a map from GUID -> name >> ahead of time and use those like you were for your initial patch? Actually, >> I also think it would be useful to emit both the GUID and the name, since >> the combined index will eventually only have the GUID, so this would give a >> mapping to use for at least the visual inspection of the combined index. >> >> Also, would be good to see an example with FDO, to make sure the hotness >> info of the calls is emitted. >> >> Teresa >> >> >>> Thanks, >>> Charles >>> >>> On Tue, Jun 6, 2017 at 5:21 PM, Mehdi AMINI <joker.eph at gmail.com> wrote: >>> >>>> >>>> >>>> 2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie at gmail.com>: >>>> >>>>> >>>>> >>>>> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI <joker.eph at gmail.com> >>>>> wrote: >>>>> >>>>>> 2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev < >>>>>> llvm-dev at lists.llvm.org>: >>>>>> >>>>>>> I know there's been a bunch of discussion here already, but I was >>>>>>> wondering if perhaps someone (probably Teresa? Peter?) could: >>>>>>> >>>>>>> 1) summarize the current state >>>>>>> 2) describe the end-goal >>>>>>> 3) describe what steps (& how this patch relates) are planned to get >>>>>>> to (2) >>>>>>> >>>>>>> My naive thoughts, not being intimately familiar with any of this: >>>>>>> Usually bitcode and textual IR support go in together or around the same >>>>>>> time, and designed that way from the start (take r211920 for examaple, >>>>>>> which added an explicit representation of COMDATs to the IR). This seems to >>>>>>> have been an oversight in the implementation of IR summaries (is that an >>>>>>> accurate representation/statement?) >>>>>>> >>>>>> >>>>>> More or less: it was not an oversight. >>>>>> The summaries are not really part of the IR, it is more like an >>>>>> "analysis result" that is serialized. It can always be recomputed from the >>>>>> IR. This aspect makes it quite "special", it is the only analysis result >>>>>> that I know of that we serialize. >>>>>> >>>>> >>>>> The use list work seems pretty similar in some ways (granted, can't be >>>>> recomputed to match, hence the desire to serialize it for test case >>>>> implementation). >>>>> >>>> >>>> I see use-list as a leaky implementation detail of the IR that we >>>> serialized because it impact the processing of the IR. >>>> >>>> Summaries are more like serializing the CFG for example. >>>> >>>> >>>>> But it looks like the same is true here to a degree - there are test >>>>> cases that exercise the summary handling, so they want summaries for input >>>>> (for now, I think, I've seen test cases that run another LLVM tool to >>>>> insert/create a summary to then feed that back in for a test), or to test >>>>> that the resulting summary is correct. >>>>> >>>> >>>> We have cases were we want summaries as an input and check a combined >>>> summary as an output, and for these having the YAML representation will be >>>> useful (we didn't have it before). >>>> >>>> >>>>> >>>>> Can summaries be standalone? I thought they could (that'd be ideal for >>>>> the distributed situation - only the summary needs to go to the 'thin link' >>>>> step, I think? (currently maybe only the debug info is stripped for that - >>>>> but ideally other unused IR wouldn't be shipped there as well, I would >>>>> think) >>>>> >>>> >>>> Yes conceptually they can be standalone. >>>> >>>> >>>>> >>>>> >>>>>> >>>>>> >>>>>>> & now there's an effort to correct that. >>>>>>> >>>>>> >>>>>> The main motivation here, I believe, is more to help dev to have >>>>>> human readable/understandable dump for ThinLTO bitcodes. Having to inspect >>>>>> separately summaries is a pain. >>>>>> >>>>> >>>>> Not sure I quite follow - inspect separately? >>>>> >>>> >>>> llvm-dis does not display summaries today, so you can't just use >>>> llvm-dis like a "regular" flow. >>>> >>>> >>>>> How are they inspected today? >>>>> >>>> >>>> llvm-bcanalyzer? And now the YAML dump as well. >>>> >>>> >>>>> & also, I think there are test cases that want to/are currently >>>>> testing summary input but do so somewhat awkwardly by using another tool to >>>>> produce the summary first. Ideally the test case would have the summary >>>>> written in to start, I would think, if that's a codepath worth testing? >>>>> >>>> >>>> The IR already contains all the information, so why repeating it? This >>>> makes the test case harder to maintain, in the vast majority, I expect that >>>> if a test needs IR then it shouldn't need to include a summary as well (and >>>> vice-versa). >>>> >>>> In the majority of test we have we want to check if the importing does >>>> what it is supposed to do, and if the linkage are correctly adjusted. With >>>> a YAML (or other) serialization for the summaries this could indeed been >>>> done purely with summaries, without any IR involved. >>>> >>>> -- >>>> Mehdi >>>> >>>> >>>> >>>> >>>> >>>> >>>>> >>>>> - Dave >>>>> >>>>> >>>>>> >>>>>> -- >>>>>> Mehdi >>>>>> >>>>>> So it seems like that would start with a discussion of what the right >>>>>>> end-state would be: What the syntax in textual IR should be, then >>>>>>> implementing it. I can understand implementing such a thing in steps - it's >>>>>>> perhaps more involved than the COMDAT situation. In that case starting on >>>>>>> either side seems fine - implementing the emission first (hidden behind a >>>>>>> flag, so as not to break round-tripping in the interim) or the parsing >>>>>>> first (no need to hide it behind any flags - manually written examples can >>>>>>> be used as input tests). >>>>>>> >>>>>>> (& it sounds like there's some partially implemented functionality >>>>>>> using a YAML format that was intended to address how some test cases could >>>>>>> be written? & this might be a good basis for the syntax - but seems to me >>>>>>> like it might be a bit disjointed/out of place in the textual IR format >>>>>>> that's not otherwise YAML-based?) >>>>>>> >>>>>>> - Dave >>>>>>> >>>>>>> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via llvm-dev < >>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>> >>>>>>>> Hey all, >>>>>>>> >>>>>>>> Below is the proposed format for the dump of the ThinLTO module >>>>>>>> summary in the llvm-dis utility: >>>>>>>> >>>>>>>> > ../build/bin/llvm-dis t.o && cat t.o.ll >>>>>>>> ; ModuleID = '2.o' >>>>>>>> source_filename = "2.ll" >>>>>>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" >>>>>>>> target triple = "x86_64-unknown-linux-gnu" >>>>>>>> >>>>>>>> @X = constant i32 42, section "foo", align 4 >>>>>>>> >>>>>>>> @a = weak alias i32, i32* @X >>>>>>>> >>>>>>>> define void @afun() { >>>>>>>> %1 = load i32, i32* @a >>>>>>>> ret void >>>>>>>> } >>>>>>>> >>>>>>>> define void @testtest() { >>>>>>>> tail call void @boop() >>>>>>>> ret void >>>>>>>> } >>>>>>>> >>>>>>>> declare void @boop() >>>>>>>> >>>>>>>> ; Module summary: >>>>>>>> ; testtest (External linkage) >>>>>>>> ; Function (2 instructions) >>>>>>>> ; Calls: boop >>>>>>>> ; X (External linkage) >>>>>>>> ; Global Variable >>>>>>>> ; afun (External linkage) >>>>>>>> ; Function (2 instructions) >>>>>>>> ; Refs: >>>>>>>> ; a >>>>>>>> ; a (Weak any linkage) >>>>>>>> ; Alias (aliasee X) >>>>>>>> >>>>>>>> I've implemented the above format in the llvm-dis utility, since >>>>>>>> there currently isn't really a way of getting ThinLTO summaries in a >>>>>>>> human-readable format. >>>>>>>> >>>>>>>> Let me know what you think of this format, and what information you >>>>>>>> think should be added/removed. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Charles >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> LLVM Developers mailing list >>>>>>>> llvm-dev at lists.llvm.org >>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> LLVM Developers mailing list >>>>>>> llvm-dev at lists.llvm.org >>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>> >>>>>>> >>>> >>> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohnson at google.com | >> 408-460-2413 <(408)%20460-2413> >> > >-- Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170608/da136b53/attachment.html>
Peter Collingbourne via llvm-dev
2017-Jun-09 00:01 UTC
[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format
I'd ask Charles to take it over. I think it just needs a test case and an update to the usage message. Peter On Thu, Jun 8, 2017 at 4:55 PM, Teresa Johnson via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Great! For the hotness, try creating a small test case with a very hot > loop that iterates many times. Let me know if you are still having trouble. > While the llvm-dis serialization is being discussed, I suppose at the very > least this can go in with the rest of the existing YAML summary dumping and > get emitted from llvm-lto2 using the patch Peter attached. Peter - do you > want to add that to llvm-lto2 so that we have a way of dumping the existing > YAML supported summary info to stdout, or would you rather have Charles > take that one over and submit it (probably just needs a test case). > > Teresa > > On Thu, Jun 8, 2017 at 4:16 PM, Charles Saternos < > charles.saternos at gmail.com> wrote: > >> Hey Teresa, >> >> I've updated the YAML to include the names and GUIDs for all >> functions/global vars/aliases. I've also added the hotness info to the >> output, but for some reason, none of my tests when running with FDO gave >> anything besides Unknown. I'll be looking more into this tomorrow. >> >> Here's the current format: >> >> > ../build/bin/llvm-lto2 dump-summary b.o >> --- >> NamedGlobalValueMap: >> : >> - GUID: 3762489268811518743 >> Kind: GlobalVar >> Linkage: PrivateLinkage >> NotEligibleToImport: true >> Live: false >> cold: >> - GUID: 11668175513417606517 >> Kind: Function >> Linkage: ExternalLinkage >> NotEligibleToImport: true >> Live: false >> InstCount: 5 >> Calls: >> - Name: puts >> GUID: 8979701042202144121 >> Hotness: Unknown >> fib: >> - GUID: 8667248078361406812 >> Kind: Function >> Linkage: ExternalLinkage >> NotEligibleToImport: true >> Live: false >> InstCount: 26 >> Calls: >> - Name: fib >> GUID: 8667248078361406812 >> Hotness: Unknown >> hot: >> - GUID: 10177652421713147431 >> Kind: Function >> Linkage: ExternalLinkage >> NotEligibleToImport: true >> Live: false >> InstCount: 14 >> Calls: >> - Name: fib >> GUID: 8667248078361406812 >> Hotness: Unknown >> - Name: printf >> GUID: 7383291119112528047 >> Hotness: Unknown >> llvm.used: >> - GUID: 15665353970260777610 >> Kind: GlobalVar >> Linkage: AppendingLinkage >> NotEligibleToImport: true >> Live: true >> TypeIdMap: >> WithGlobalValueDeadStripping: false >> ... >> >> Thanks, >> Charles >> >> >> On Wed, Jun 7, 2017 at 12:38 PM, Teresa Johnson <tejohnson at google.com> >> wrote: >> >>> >>> >>> On Wed, Jun 7, 2017 at 8:58 AM, Charles Saternos < >>> charles.saternos at gmail.com> wrote: >>> >>>> Alright, now it outputs YAML in the following format: >>>> >>>> --- >>>> NamedGlobalValueMap: >>>> X: >>>> - Kind: GlobalVar >>>> Linkage: ExternalLinkage >>>> NotEligibleToImport: false >>>> Live: false >>>> a: >>>> - Kind: Alias >>>> Linkage: WeakAnyLinkage >>>> NotEligibleToImport: false >>>> Live: false >>>> AliaseeGUID: 1881667236089500162 >>>> afun: >>>> - Kind: Function >>>> Linkage: ExternalLinkage >>>> NotEligibleToImport: false >>>> Live: false >>>> InstCount: 2 >>>> testtest: >>>> - Kind: Function >>>> Linkage: ExternalLinkage >>>> NotEligibleToImport: false >>>> Live: false >>>> InstCount: 2 >>>> Calls: >>>> - Function: 14471680721094503013 >>>> TypeIdMap: >>>> WithGlobalValueDeadStripping: false >>>> ... >>>> >>>> Any thoughts on the new format? >>>> >>> >>> Thanks, Charles. The main improvement I think we would want is to output >>> value names instead of the GUID. Can you build up a map from GUID -> name >>> ahead of time and use those like you were for your initial patch? Actually, >>> I also think it would be useful to emit both the GUID and the name, since >>> the combined index will eventually only have the GUID, so this would give a >>> mapping to use for at least the visual inspection of the combined index. >>> >>> Also, would be good to see an example with FDO, to make sure the hotness >>> info of the calls is emitted. >>> >>> Teresa >>> >>> >>>> Thanks, >>>> Charles >>>> >>>> On Tue, Jun 6, 2017 at 5:21 PM, Mehdi AMINI <joker.eph at gmail.com> >>>> wrote: >>>> >>>>> >>>>> >>>>> 2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie at gmail.com>: >>>>> >>>>>> >>>>>> >>>>>> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI <joker.eph at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> 2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev < >>>>>>> llvm-dev at lists.llvm.org>: >>>>>>> >>>>>>>> I know there's been a bunch of discussion here already, but I was >>>>>>>> wondering if perhaps someone (probably Teresa? Peter?) could: >>>>>>>> >>>>>>>> 1) summarize the current state >>>>>>>> 2) describe the end-goal >>>>>>>> 3) describe what steps (& how this patch relates) are planned to >>>>>>>> get to (2) >>>>>>>> >>>>>>>> My naive thoughts, not being intimately familiar with any of this: >>>>>>>> Usually bitcode and textual IR support go in together or around the same >>>>>>>> time, and designed that way from the start (take r211920 for examaple, >>>>>>>> which added an explicit representation of COMDATs to the IR). This seems to >>>>>>>> have been an oversight in the implementation of IR summaries (is that an >>>>>>>> accurate representation/statement?) >>>>>>>> >>>>>>> >>>>>>> More or less: it was not an oversight. >>>>>>> The summaries are not really part of the IR, it is more like an >>>>>>> "analysis result" that is serialized. It can always be recomputed from the >>>>>>> IR. This aspect makes it quite "special", it is the only analysis result >>>>>>> that I know of that we serialize. >>>>>>> >>>>>> >>>>>> The use list work seems pretty similar in some ways (granted, can't >>>>>> be recomputed to match, hence the desire to serialize it for test case >>>>>> implementation). >>>>>> >>>>> >>>>> I see use-list as a leaky implementation detail of the IR that we >>>>> serialized because it impact the processing of the IR. >>>>> >>>>> Summaries are more like serializing the CFG for example. >>>>> >>>>> >>>>>> But it looks like the same is true here to a degree - there are test >>>>>> cases that exercise the summary handling, so they want summaries for input >>>>>> (for now, I think, I've seen test cases that run another LLVM tool to >>>>>> insert/create a summary to then feed that back in for a test), or to test >>>>>> that the resulting summary is correct. >>>>>> >>>>> >>>>> We have cases were we want summaries as an input and check a combined >>>>> summary as an output, and for these having the YAML representation will be >>>>> useful (we didn't have it before). >>>>> >>>>> >>>>>> >>>>>> Can summaries be standalone? I thought they could (that'd be ideal >>>>>> for the distributed situation - only the summary needs to go to the 'thin >>>>>> link' step, I think? (currently maybe only the debug info is stripped for >>>>>> that - but ideally other unused IR wouldn't be shipped there as well, I >>>>>> would think) >>>>>> >>>>> >>>>> Yes conceptually they can be standalone. >>>>> >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>>> & now there's an effort to correct that. >>>>>>>> >>>>>>> >>>>>>> The main motivation here, I believe, is more to help dev to have >>>>>>> human readable/understandable dump for ThinLTO bitcodes. Having to inspect >>>>>>> separately summaries is a pain. >>>>>>> >>>>>> >>>>>> Not sure I quite follow - inspect separately? >>>>>> >>>>> >>>>> llvm-dis does not display summaries today, so you can't just use >>>>> llvm-dis like a "regular" flow. >>>>> >>>>> >>>>>> How are they inspected today? >>>>>> >>>>> >>>>> llvm-bcanalyzer? And now the YAML dump as well. >>>>> >>>>> >>>>>> & also, I think there are test cases that want to/are currently >>>>>> testing summary input but do so somewhat awkwardly by using another tool to >>>>>> produce the summary first. Ideally the test case would have the summary >>>>>> written in to start, I would think, if that's a codepath worth testing? >>>>>> >>>>> >>>>> The IR already contains all the information, so why repeating it? This >>>>> makes the test case harder to maintain, in the vast majority, I expect that >>>>> if a test needs IR then it shouldn't need to include a summary as well (and >>>>> vice-versa). >>>>> >>>>> In the majority of test we have we want to check if the importing does >>>>> what it is supposed to do, and if the linkage are correctly adjusted. With >>>>> a YAML (or other) serialization for the summaries this could indeed been >>>>> done purely with summaries, without any IR involved. >>>>> >>>>> -- >>>>> Mehdi >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> - Dave >>>>>> >>>>>> >>>>>>> >>>>>>> -- >>>>>>> Mehdi >>>>>>> >>>>>>> So it seems like that would start with a discussion of what the >>>>>>>> right end-state would be: What the syntax in textual IR should be, then >>>>>>>> implementing it. I can understand implementing such a thing in steps - it's >>>>>>>> perhaps more involved than the COMDAT situation. In that case starting on >>>>>>>> either side seems fine - implementing the emission first (hidden behind a >>>>>>>> flag, so as not to break round-tripping in the interim) or the parsing >>>>>>>> first (no need to hide it behind any flags - manually written examples can >>>>>>>> be used as input tests). >>>>>>>> >>>>>>>> (& it sounds like there's some partially implemented functionality >>>>>>>> using a YAML format that was intended to address how some test cases could >>>>>>>> be written? & this might be a good basis for the syntax - but seems to me >>>>>>>> like it might be a bit disjointed/out of place in the textual IR format >>>>>>>> that's not otherwise YAML-based?) >>>>>>>> >>>>>>>> - Dave >>>>>>>> >>>>>>>> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via llvm-dev < >>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>> >>>>>>>>> Hey all, >>>>>>>>> >>>>>>>>> Below is the proposed format for the dump of the ThinLTO module >>>>>>>>> summary in the llvm-dis utility: >>>>>>>>> >>>>>>>>> > ../build/bin/llvm-dis t.o && cat t.o.ll >>>>>>>>> ; ModuleID = '2.o' >>>>>>>>> source_filename = "2.ll" >>>>>>>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" >>>>>>>>> target triple = "x86_64-unknown-linux-gnu" >>>>>>>>> >>>>>>>>> @X = constant i32 42, section "foo", align 4 >>>>>>>>> >>>>>>>>> @a = weak alias i32, i32* @X >>>>>>>>> >>>>>>>>> define void @afun() { >>>>>>>>> %1 = load i32, i32* @a >>>>>>>>> ret void >>>>>>>>> } >>>>>>>>> >>>>>>>>> define void @testtest() { >>>>>>>>> tail call void @boop() >>>>>>>>> ret void >>>>>>>>> } >>>>>>>>> >>>>>>>>> declare void @boop() >>>>>>>>> >>>>>>>>> ; Module summary: >>>>>>>>> ; testtest (External linkage) >>>>>>>>> ; Function (2 instructions) >>>>>>>>> ; Calls: boop >>>>>>>>> ; X (External linkage) >>>>>>>>> ; Global Variable >>>>>>>>> ; afun (External linkage) >>>>>>>>> ; Function (2 instructions) >>>>>>>>> ; Refs: >>>>>>>>> ; a >>>>>>>>> ; a (Weak any linkage) >>>>>>>>> ; Alias (aliasee X) >>>>>>>>> >>>>>>>>> I've implemented the above format in the llvm-dis utility, since >>>>>>>>> there currently isn't really a way of getting ThinLTO summaries in a >>>>>>>>> human-readable format. >>>>>>>>> >>>>>>>>> Let me know what you think of this format, and what information >>>>>>>>> you think should be added/removed. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Charles >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> LLVM Developers mailing list >>>>>>>>> llvm-dev at lists.llvm.org >>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> LLVM Developers mailing list >>>>>>>> llvm-dev at lists.llvm.org >>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>> >>>>>>>> >>>>> >>>> >>> >>> >>> -- >>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>> 408-460-2413 <(408)%20460-2413> >>> >> >> > > > -- > Teresa Johnson | Software Engineer | tejohnson at google.com | > 408-460-2413 <(408)%20460-2413> > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-- -- Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170608/298570d7/attachment-0001.html>
Charles Saternos via llvm-dev
2017-Jun-09 19:58 UTC
[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format
OK, I tested the hotness, and it works.> I'd ask Charles to take it over. I think it just needs a test case and anupdate to the usage message. Sure - I've added the message and a quick test. The patch is here: https://reviews.llvm.org/D34063 Thanks, Charles On Thu, Jun 8, 2017 at 8:01 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:> I'd ask Charles to take it over. I think it just needs a test case and an > update to the usage message. > > Peter > > On Thu, Jun 8, 2017 at 4:55 PM, Teresa Johnson via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Great! For the hotness, try creating a small test case with a very hot >> loop that iterates many times. Let me know if you are still having trouble. >> While the llvm-dis serialization is being discussed, I suppose at the very >> least this can go in with the rest of the existing YAML summary dumping and >> get emitted from llvm-lto2 using the patch Peter attached. Peter - do you >> want to add that to llvm-lto2 so that we have a way of dumping the existing >> YAML supported summary info to stdout, or would you rather have Charles >> take that one over and submit it (probably just needs a test case). >> >> Teresa >> >> On Thu, Jun 8, 2017 at 4:16 PM, Charles Saternos < >> charles.saternos at gmail.com> wrote: >> >>> Hey Teresa, >>> >>> I've updated the YAML to include the names and GUIDs for all >>> functions/global vars/aliases. I've also added the hotness info to the >>> output, but for some reason, none of my tests when running with FDO gave >>> anything besides Unknown. I'll be looking more into this tomorrow. >>> >>> Here's the current format: >>> >>> > ../build/bin/llvm-lto2 dump-summary b.o >>> --- >>> NamedGlobalValueMap: >>> : >>> - GUID: 3762489268811518743 >>> Kind: GlobalVar >>> Linkage: PrivateLinkage >>> NotEligibleToImport: true >>> Live: false >>> cold: >>> - GUID: 11668175513417606517 >>> Kind: Function >>> Linkage: ExternalLinkage >>> NotEligibleToImport: true >>> Live: false >>> InstCount: 5 >>> Calls: >>> - Name: puts >>> GUID: 8979701042202144121 >>> Hotness: Unknown >>> fib: >>> - GUID: 8667248078361406812 >>> Kind: Function >>> Linkage: ExternalLinkage >>> NotEligibleToImport: true >>> Live: false >>> InstCount: 26 >>> Calls: >>> - Name: fib >>> GUID: 8667248078361406812 >>> Hotness: Unknown >>> hot: >>> - GUID: 10177652421713147431 >>> Kind: Function >>> Linkage: ExternalLinkage >>> NotEligibleToImport: true >>> Live: false >>> InstCount: 14 >>> Calls: >>> - Name: fib >>> GUID: 8667248078361406812 >>> Hotness: Unknown >>> - Name: printf >>> GUID: 7383291119112528047 >>> Hotness: Unknown >>> llvm.used: >>> - GUID: 15665353970260777610 >>> Kind: GlobalVar >>> Linkage: AppendingLinkage >>> NotEligibleToImport: true >>> Live: true >>> TypeIdMap: >>> WithGlobalValueDeadStripping: false >>> ... >>> >>> Thanks, >>> Charles >>> >>> >>> On Wed, Jun 7, 2017 at 12:38 PM, Teresa Johnson <tejohnson at google.com> >>> wrote: >>> >>>> >>>> >>>> On Wed, Jun 7, 2017 at 8:58 AM, Charles Saternos < >>>> charles.saternos at gmail.com> wrote: >>>> >>>>> Alright, now it outputs YAML in the following format: >>>>> >>>>> --- >>>>> NamedGlobalValueMap: >>>>> X: >>>>> - Kind: GlobalVar >>>>> Linkage: ExternalLinkage >>>>> NotEligibleToImport: false >>>>> Live: false >>>>> a: >>>>> - Kind: Alias >>>>> Linkage: WeakAnyLinkage >>>>> NotEligibleToImport: false >>>>> Live: false >>>>> AliaseeGUID: 1881667236089500162 >>>>> afun: >>>>> - Kind: Function >>>>> Linkage: ExternalLinkage >>>>> NotEligibleToImport: false >>>>> Live: false >>>>> InstCount: 2 >>>>> testtest: >>>>> - Kind: Function >>>>> Linkage: ExternalLinkage >>>>> NotEligibleToImport: false >>>>> Live: false >>>>> InstCount: 2 >>>>> Calls: >>>>> - Function: 14471680721094503013 >>>>> TypeIdMap: >>>>> WithGlobalValueDeadStripping: false >>>>> ... >>>>> >>>>> Any thoughts on the new format? >>>>> >>>> >>>> Thanks, Charles. The main improvement I think we would want is to >>>> output value names instead of the GUID. Can you build up a map from GUID -> >>>> name ahead of time and use those like you were for your initial patch? >>>> Actually, I also think it would be useful to emit both the GUID and the >>>> name, since the combined index will eventually only have the GUID, so this >>>> would give a mapping to use for at least the visual inspection of the >>>> combined index. >>>> >>>> Also, would be good to see an example with FDO, to make sure the >>>> hotness info of the calls is emitted. >>>> >>>> Teresa >>>> >>>> >>>>> Thanks, >>>>> Charles >>>>> >>>>> On Tue, Jun 6, 2017 at 5:21 PM, Mehdi AMINI <joker.eph at gmail.com> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> 2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie at gmail.com>: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI <joker.eph at gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> 2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev < >>>>>>>> llvm-dev at lists.llvm.org>: >>>>>>>> >>>>>>>>> I know there's been a bunch of discussion here already, but I was >>>>>>>>> wondering if perhaps someone (probably Teresa? Peter?) could: >>>>>>>>> >>>>>>>>> 1) summarize the current state >>>>>>>>> 2) describe the end-goal >>>>>>>>> 3) describe what steps (& how this patch relates) are planned to >>>>>>>>> get to (2) >>>>>>>>> >>>>>>>>> My naive thoughts, not being intimately familiar with any of this: >>>>>>>>> Usually bitcode and textual IR support go in together or around the same >>>>>>>>> time, and designed that way from the start (take r211920 for examaple, >>>>>>>>> which added an explicit representation of COMDATs to the IR). This seems to >>>>>>>>> have been an oversight in the implementation of IR summaries (is that an >>>>>>>>> accurate representation/statement?) >>>>>>>>> >>>>>>>> >>>>>>>> More or less: it was not an oversight. >>>>>>>> The summaries are not really part of the IR, it is more like an >>>>>>>> "analysis result" that is serialized. It can always be recomputed from the >>>>>>>> IR. This aspect makes it quite "special", it is the only analysis result >>>>>>>> that I know of that we serialize. >>>>>>>> >>>>>>> >>>>>>> The use list work seems pretty similar in some ways (granted, can't >>>>>>> be recomputed to match, hence the desire to serialize it for test case >>>>>>> implementation). >>>>>>> >>>>>> >>>>>> I see use-list as a leaky implementation detail of the IR that we >>>>>> serialized because it impact the processing of the IR. >>>>>> >>>>>> Summaries are more like serializing the CFG for example. >>>>>> >>>>>> >>>>>>> But it looks like the same is true here to a degree - there are test >>>>>>> cases that exercise the summary handling, so they want summaries for input >>>>>>> (for now, I think, I've seen test cases that run another LLVM tool to >>>>>>> insert/create a summary to then feed that back in for a test), or to test >>>>>>> that the resulting summary is correct. >>>>>>> >>>>>> >>>>>> We have cases were we want summaries as an input and check a combined >>>>>> summary as an output, and for these having the YAML representation will be >>>>>> useful (we didn't have it before). >>>>>> >>>>>> >>>>>>> >>>>>>> Can summaries be standalone? I thought they could (that'd be ideal >>>>>>> for the distributed situation - only the summary needs to go to the 'thin >>>>>>> link' step, I think? (currently maybe only the debug info is stripped for >>>>>>> that - but ideally other unused IR wouldn't be shipped there as well, I >>>>>>> would think) >>>>>>> >>>>>> >>>>>> Yes conceptually they can be standalone. >>>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> & now there's an effort to correct that. >>>>>>>>> >>>>>>>> >>>>>>>> The main motivation here, I believe, is more to help dev to have >>>>>>>> human readable/understandable dump for ThinLTO bitcodes. Having to inspect >>>>>>>> separately summaries is a pain. >>>>>>>> >>>>>>> >>>>>>> Not sure I quite follow - inspect separately? >>>>>>> >>>>>> >>>>>> llvm-dis does not display summaries today, so you can't just use >>>>>> llvm-dis like a "regular" flow. >>>>>> >>>>>> >>>>>>> How are they inspected today? >>>>>>> >>>>>> >>>>>> llvm-bcanalyzer? And now the YAML dump as well. >>>>>> >>>>>> >>>>>>> & also, I think there are test cases that want to/are currently >>>>>>> testing summary input but do so somewhat awkwardly by using another tool to >>>>>>> produce the summary first. Ideally the test case would have the summary >>>>>>> written in to start, I would think, if that's a codepath worth testing? >>>>>>> >>>>>> >>>>>> The IR already contains all the information, so why repeating it? >>>>>> This makes the test case harder to maintain, in the vast majority, I expect >>>>>> that if a test needs IR then it shouldn't need to include a summary as well >>>>>> (and vice-versa). >>>>>> >>>>>> In the majority of test we have we want to check if the importing >>>>>> does what it is supposed to do, and if the linkage are correctly adjusted. >>>>>> With a YAML (or other) serialization for the summaries this could indeed >>>>>> been done purely with summaries, without any IR involved. >>>>>> >>>>>> -- >>>>>> Mehdi >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> - Dave >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Mehdi >>>>>>>> >>>>>>>> So it seems like that would start with a discussion of what the >>>>>>>>> right end-state would be: What the syntax in textual IR should be, then >>>>>>>>> implementing it. I can understand implementing such a thing in steps - it's >>>>>>>>> perhaps more involved than the COMDAT situation. In that case starting on >>>>>>>>> either side seems fine - implementing the emission first (hidden behind a >>>>>>>>> flag, so as not to break round-tripping in the interim) or the parsing >>>>>>>>> first (no need to hide it behind any flags - manually written examples can >>>>>>>>> be used as input tests). >>>>>>>>> >>>>>>>>> (& it sounds like there's some partially implemented functionality >>>>>>>>> using a YAML format that was intended to address how some test cases could >>>>>>>>> be written? & this might be a good basis for the syntax - but seems to me >>>>>>>>> like it might be a bit disjointed/out of place in the textual IR format >>>>>>>>> that's not otherwise YAML-based?) >>>>>>>>> >>>>>>>>> - Dave >>>>>>>>> >>>>>>>>> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via llvm-dev < >>>>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>>>> >>>>>>>>>> Hey all, >>>>>>>>>> >>>>>>>>>> Below is the proposed format for the dump of the ThinLTO module >>>>>>>>>> summary in the llvm-dis utility: >>>>>>>>>> >>>>>>>>>> > ../build/bin/llvm-dis t.o && cat t.o.ll >>>>>>>>>> ; ModuleID = '2.o' >>>>>>>>>> source_filename = "2.ll" >>>>>>>>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" >>>>>>>>>> target triple = "x86_64-unknown-linux-gnu" >>>>>>>>>> >>>>>>>>>> @X = constant i32 42, section "foo", align 4 >>>>>>>>>> >>>>>>>>>> @a = weak alias i32, i32* @X >>>>>>>>>> >>>>>>>>>> define void @afun() { >>>>>>>>>> %1 = load i32, i32* @a >>>>>>>>>> ret void >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> define void @testtest() { >>>>>>>>>> tail call void @boop() >>>>>>>>>> ret void >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> declare void @boop() >>>>>>>>>> >>>>>>>>>> ; Module summary: >>>>>>>>>> ; testtest (External linkage) >>>>>>>>>> ; Function (2 instructions) >>>>>>>>>> ; Calls: boop >>>>>>>>>> ; X (External linkage) >>>>>>>>>> ; Global Variable >>>>>>>>>> ; afun (External linkage) >>>>>>>>>> ; Function (2 instructions) >>>>>>>>>> ; Refs: >>>>>>>>>> ; a >>>>>>>>>> ; a (Weak any linkage) >>>>>>>>>> ; Alias (aliasee X) >>>>>>>>>> >>>>>>>>>> I've implemented the above format in the llvm-dis utility, since >>>>>>>>>> there currently isn't really a way of getting ThinLTO summaries in a >>>>>>>>>> human-readable format. >>>>>>>>>> >>>>>>>>>> Let me know what you think of this format, and what information >>>>>>>>>> you think should be added/removed. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Charles >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> LLVM Developers mailing list >>>>>>>>>> llvm-dev at lists.llvm.org >>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>>>> >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> LLVM Developers mailing list >>>>>>>>> llvm-dev at lists.llvm.org >>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>>> >>>>>>>>> >>>>>> >>>>> >>>> >>>> >>>> -- >>>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>>> 408-460-2413 <(408)%20460-2413> >>>> >>> >>> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohnson at google.com | >> 408-460-2413 <(408)%20460-2413> >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> > > > -- > -- > Peter >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170609/2db5579b/attachment.html>